Title

應用資料採礦技術分析臺灣健保資料庫-以攝護腺癌病人為例

Parallel Title

Application of Data Mining in National Health Insurance Database-Prostate Cancer as a Study Case

DOI

10.6338/JDA.201310_8(5).0004

Authors

童俊榮(Jiun-Rung Tung);劉志光(Chih-Kuang Liu)

Keywords

data mining (資料採礦) ; prostate cancer (攝護腺癌) ; Taiwan National Health Insurance database (臺灣健康保險資料庫) ; NHI claim data

Journal

Journal of Data Analysis

Volume, Issue / Publication Date

Vol. 8, No. 5 (2013/10/01)

Pages

67 - 83

Language

Traditional Chinese

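Chinese Abstract (translated)

According to statistics from the Department of Health, Executive Yuan, malignant tumors (cancer) had ranked first among the ten leading causes of death for 29 consecutive years as of 2011, and prostate cancer ranked seventh among causes of death in men, which makes prevention of the disease all the more important. The prostate is a gland in the male reproductive system; when gene mutations cause its cells to proliferate uncontrollably, cancer develops. Prostate cancer may cause symptoms such as pain, difficulty urinating, and sexual dysfunction, and it occurs most often in people over the age of 50. This study included prostate cancer patients whose inpatient diagnoses carried the diagnosis code "185" in the National Health Insurance database between 1 January 2000 and 31 December 2009. Logistic regression, decision tree, neural network, support vector machine, and random forest models were used to predict whether a patient would die and to identify risk factors, as a reference for medical research. The empirical results show that the decision tree had the highest overall average discrimination rate, indicating that the decision tree model discriminates best between patients who died and those who did not. In addition, the decision tree model selected three important variables, suggesting that these three variables are important to the classification model.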

English Abstract

According to statistics from the Department of Health, Executive Yuan, R.O.C., cancer had ranked first among the ten leading causes of death for 29 consecutive years as of 2011, and prostate cancer ranked seventh among causes of death in men. Prevention of the disease is therefore increasingly important. The prostate is a gland in the male reproductive system; when gene mutations cause its cells to proliferate uncontrollably, cancer develops. Prostate cancer may cause pain, difficulty urinating, and sexual dysfunction, and it occurs most often in people over the age of 50. This study included patients recorded with ICD-9-CM code "185" in the non-sampled NHI claim database from 1 January 2000 through 31 December 2009. It used logistic regression, decision tree, neural network, support vector machine, and random forest models to predict whether patients would die and to identify risk factors as a reference for medical research. The results show that the decision tree outperformed the other approaches in classification accuracy. In addition, the decision tree model selected three important variables, suggesting that these three variables are important to the classification model.
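The cohort definition in the abstract (code "185" within the 2000 to 2009 window) can be illustrated with a minimal Python sketch. The record layout, field names (`patient_id`, `admit_date`, `icd9_codes`), and sample rows below are hypothetical and are not taken from the actual NHI database schema or from the authors' method. The sketch also assumes that both boundary dates are inclusive and that the code is matched exactly.

```python
from datetime import date

# Hypothetical inpatient claim records; field names and values are
# illustrative only and do not reflect the real NHI claim data layout.
claims = [
    {"patient_id": "A01", "admit_date": date(2003, 5, 14), "icd9_codes": ["185", "4019"]},
    {"patient_id": "A02", "admit_date": date(1999, 11, 2), "icd9_codes": ["185"]},
    {"patient_id": "A03", "admit_date": date(2007, 8, 30), "icd9_codes": ["600", "25000"]},
    {"patient_id": "A01", "admit_date": date(2008, 1, 9), "icd9_codes": ["185"]},
    {"patient_id": "A04", "admit_date": date(2009, 12, 31), "icd9_codes": ["185"]},
]

# Study window stated in the abstract; treating both ends as inclusive
# is an assumption of this sketch.
STUDY_START = date(2000, 1, 1)
STUDY_END = date(2009, 12, 31)
PROSTATE_CANCER_CODE = "185"


def select_cohort(records):
    """Return sorted IDs of patients with an in-window claim coded 185."""
    cohort = set()
    for rec in records:
        in_window = STUDY_START <= rec["admit_date"] <= STUDY_END
        # Exact match on the code as written in the abstract (an assumption).
        has_code = PROSTATE_CANCER_CODE in rec["icd9_codes"]
        if in_window and has_code:
            cohort.add(rec["patient_id"])  # a set keeps each patient once
    return sorted(cohort)


cohort_ids = select_cohort(claims)
print(cohort_ids)
```

In this toy data, patient A02 is excluded because the admission predates the window, and A03 is excluded because no claim carries the code. A01 appears only once even though two claims qualify.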

Subject Classification

Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > Management