Title

智慧存活分析響應系統

Parallel Title

SMART Survival Metadata Analysis Responsive Tool

DOI

10.6342/NTU201701902

Author

郭文宗

Keywords

Survival Analysis ; Metadata ; ETL ; Demographic Analysis ; Kaplan-Meier ; Cox proportional hazards model

Journal Name

National Taiwan University, Department of Computer Science and Information Engineering, Degree Theses

Volume and Issue / Publication Date

2017

Degree

Master's

Advisor

賴飛羆

Language

English

Chinese Abstract (translated)

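Background: The questions of greatest concern in cancer research are patient survival rates and the recurrence-free interval after treatment. Time-to-event analysis is the most suitable approach for such questions, and survival analysis in particular is a method researchers use routinely. However, the high cost of statistical software and its steep learning curve create a considerable barrier to entry for survival analysis. The Survival Metadata Analysis Responsive Tool (SMART) aims to overcome these limitations by assisting researchers with survival analysis in the form of free software. Methods and Main Contributions: SMART is a web application based on R-Shiny. It processes Hospital Information System (HIS) data in a user-friendly way. It provides researchers with a metadata Extract-Transform-Load (ETL) tool and helps them perform demographic analysis, Kaplan-Meier survival analysis, and Cox proportional hazard ratio analysis on the original HIS datasets, as well as hypothesis tests for comparing survival data between two or more groups. SMART also automatically generates all comparison tables and figures, which researchers can use directly in journal publications. Conclusion: SMART is open-source software for extracting clinical research data, refining research models, and performing survival analysis for cohort studies and randomized controlled trials.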

English Abstract

Background: Survival rate and progression-free duration are the most important measurements in cancer therapy research. Time-to-event analysis, also called survival analysis, is a key method for this kind of study and is therefore widely used in clinical and epidemiological follow-up studies. However, it has drawbacks, such as the difficulty of managing survival analysis procedures and the high cost of professional statistical software. To overcome these limitations, the Survival Metadata Analysis Responsive Tool (SMART) is a free web application that guides users through the steps of survival analysis. Methodology/Principal Findings: SMART is an R-Shiny based web application that handles survival analysis of Hospital Information System (HIS) data in a user-friendly manner. It provides metadata ETL tools for researchers and helps them perform demographic analysis, Kaplan-Meier survival analysis, and Cox proportional hazard ratio analysis simply by inputting the original HIS datasets. SMART automatically generates comparison tables and figures in a format ready for publication in SCI journals. Conclusions/Significance: SMART is public domain software for easy management of clinical research data, refinement of research models, and survival analysis of cohort and randomized controlled trial (RCT) research.
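
As an illustration of the Kaplan-Meier survival analysis named in the abstract, the following is a minimal sketch of the product-limit estimator. It is not SMART's implementation: SMART is built on R-Shiny, while this sketch uses plain Python for illustration only. The function name `kaplan_meier` and the sample follow-up data are assumptions made for this example and do not come from the thesis.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up duration for each subject
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)   # subjects leaving the risk set at t (event or censored)
    at_risk = len(times)     # everyone is at risk before the first time point
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Censored subjects count as at risk at t and are removed afterwards.
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times in months (illustrative only).
times = [6, 7, 10, 15, 19, 25]
events = [1, 0, 1, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored subjects lower the number at risk for later time points without producing a step in the curve, which is what separates this estimator from a simple proportion of survivors.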

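Subject Classification: Basic and Applied Sciences > Information Science
College of Electrical Engineering and Computer Science > Department of Computer Science and Information Engineering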
References
  1. 1 Farewell, V.T.: ‘The use of mixture models for the analysis of survival data with long-term survivors’, Biometrics, 1982, pp. 1041-1046.
  2. 2 Miller Jr, R.G.: ‘What Price Kaplan-Meier?’, Biometrics, 1983, pp. 1077-1081.
  3. 3 Kaplan, E.L., and Meier, P.: ‘Nonparametric estimation from incomplete observations’, Journal of the American Statistical Association, 1958, 53, (282), pp. 457-481.
  4. 4 Tolles, J., and Lewis, R.J.: ‘Time-to-Event Analysis’, JAMA, 2016, 315, (10), pp. 1046-1047.
  5. 5 Lin, D.: ‘Cox regression analysis of multivariate failure time data: the marginal approach’, Statistics in Medicine, 1994, 13, (21), pp. 2233-2247.
  6. 6 Allison, P.D.: ‘Survival analysis using SAS: a practical guide’ (SAS Institute, 2010).
  7. 7 Cleves, M.: ‘An introduction to survival analysis using Stata’ (Stata Press, 2008).
  8. 9 R Core Team: ‘R language definition’, Vienna, Austria: R Foundation for Statistical Computing, 2000.
  9. 11 Chang, W., Cheng, J., Allaire, J., Xie, Y., and McPherson, J.: ‘Shiny: web application framework for R’, R package version 0.11, 2015, 1.
  10. 13 Shapiro, S., and Wilk, M.: ‘An analysis of variance test for normality’, Biometrika, 1965, 52, (3), pp. 591-611.
  11. 14 Scholz, F.W., and Stephens, M.A.: ‘K-sample Anderson–Darling tests’, Journal of the American Statistical Association, 1987, 82, (399), pp. 918-924.
  12. 16 Schulz, R., and Beach, S.R.: ‘Caregiving as a risk factor for mortality: the Caregiver Health Effects Study’, JAMA, 1999, 282, (23), pp. 2215-2219.
  13. 17 Benz, R.L., Pressman, M.R., Hovick, E.T., and Peterson, D.D.: ‘Potential novel predictors of mortality in end-stage renal disease patients with sleep disorders’, American Journal of Kidney Diseases, 2000, 35, (6), pp. 1052-1060.
  14. 18 Kleinbaum, D.G., and Klein, M.: ‘Survival analysis: a self-learning text’ (Springer Science & Business Media, 2006).
  15. 25 Lo, C.M., Ngan, H., Tso, W.K., Liu, C.L., Lam, C.M., Poon, R.T.P., Fan, S.T., and Wong, J.: ‘Randomized controlled trial of transarterial lipiodol chemoembolization for unresectable hepatocellular carcinoma’, Hepatology, 2002, 35, (5), pp. 1164-1171.
  16. 29 Wickham, H.: ‘ggplot2: elegant graphics for data analysis’ (Springer, 2016).
  17. 34 Moertel, C.G., Fleming, T.R., Macdonald, J.S., Haller, D.G., Laurie, J.A., Tangen, C.M., Ungerleider, J.S., Emerson, W.A., Tormey, D.C., and Glick, J.H.: ‘Fluorouracil plus levamisole as effective adjuvant therapy after resection of stage III colon carcinoma: a final report’, Annals of Internal Medicine, 1995, 122, (5), pp. 321-326.
  18. 35 Wolmark, N., Fisher, B., Rockette, H., Redmond, C., Wickerham, D.L., Fisher, E.R., Jones, J., Glass, A., Lerner, H., and Lawrence, W.: ‘Postoperative Adjuvant Chemotherapy or BCG for Colon Cancer: Results From NSABP Protocol C-01’, JNCI: Journal of the National Cancer Institute, 1988, 80, (1), pp. 30-36.
  19. 8 Pallant, J.: ‘SPSS survival manual’ (McGraw-Hill Education (UK), 2013).
  20. 10 Therneau, T.: ‘A package for survival analysis in S. R package version 2.37-4’, See http://CRAN.R-project.org/package=survival, 2014.
  21. 12 Razali, N.M., and Wah, Y.B.: ‘Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests’, Journal of Statistical Modeling and Analytics, 2011, 2, (1), pp. 21-33.
  22. 15 Gross, J., and Ligges, U.: ‘nortest: Tests for Normality’, R package version 1.0-2, 2012.
  23. 19 Borgan, Ø.: ‘Nelson–Aalen Estimator’, Encyclopedia of Biostatistics, 2005.
  24. 20 Cox, D.R.: ‘Regression models and life-tables’: ‘Breakthroughs in statistics’ (Springer, 1992), pp. 527-541.
  25. 21 Fox, J.: ‘Cox proportional-hazards regression for survival data’.
  26. 22 RStudio Team: ‘RStudio: integrated development for R’, RStudio, Inc., Boston, MA, URL http://www.rstudio.com, 2015.
  27. 23 Hornik, K.: ‘The comprehensive R archive network’, Wiley Interdisciplinary Reviews: Computational Statistics, 2012, 4, (4), pp. 394-398.
  28. 24 Wickham, H., and Francois, R.: ‘dplyr: A grammar of data manipulation’, R package version 0.4, 2015, 1, pp. 20.
  29. 26 Attali, D.: ‘shinyjs: Perform Common JavaScript Operations in Shiny Apps using Plain R Code’, R package version 0.6, 2016.
  30. 27 Chang, W.: ‘shinythemes: Themes for Shiny’, R package version 1.0.1, 2015.
  31. 28 Plate, T., and Heiberger, R.: ‘abind: Combine multi-dimensional arrays’, R package version 1.3-0, 2011.
  32. 30 Horikoshi, M., and Tang, Y.: ‘ggfortify: data visualization tools for statistical analysis results’, R package version 0.0.4, 2015.
  33. 31 Merkel, D.: ‘Docker: lightweight linux containers for consistent development and deployment’, Linux Journal, 2014, 2014, (239), pp. 2.
  34. 32 Kaushal, V., and Bala, A.: ‘Autonomic fault tolerance using haproxy in cloud environment’, International Journal of Advanced Engineering Sciences and Technologies, 2011, 7, (2), pp. 222-227.
  35. 33 Arel-Bundock, V.: ‘Rdatasets: An archive of datasets distributed with R’, 2014, URL http://vincentarelbundock.github.io/Rdatasets.