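Title

以健保資料庫建構頭頸癌併發吸入性肺炎高風險病患之預測模式 (Constructing a Prediction Model for Head and Neck Cancer Patients at High Risk of Aspiration Pneumonia Using the National Health Insurance Database)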

Parallel Title

A Prediction Model for Head and Neck Cancer Patient Complicated with Aspiration Pneumonia

Authors

Yen-Hsien Lee (李彥賢); Chia-Hsuan Lai (賴家玄); Jia-Ling Cai (蔡佳玲)

Keywords

頭頸癌 (head and neck cancer); 吸入性肺炎 (aspiration pneumonia); 國民健康保險資料庫 (National Health Insurance Research Database); 傾向分數配對 (propensity score matching); 整體學習演算法 (ensemble learning)

Journal

資訊管理學報 (Journal of Information Management)

Volume/Issue and Publication Date

Vol. 24, No. 3 (2017/07/31)

Pages

341 - 367

Language

Traditional Chinese

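Abstract (translated from Chinese)

Preventive medicine replaces the treatment of disease with the prevention of its occurrence. Its main goals are health promotion and disease prevention: by raising public awareness of disease and changing attitudes, health is managed through prevention. In recent years, shifts in population structure and disease patterns have drawn growing attention to preventive medicine. According to 2014 statistics from Taiwan's Ministry of Health and Welfare, head and neck cancer ranked fifth in mortality among all cancers. Depending on the patient's condition, treatment for head and neck cancer usually includes surgery, radiotherapy, and chemotherapy. However, treatment sequelae or the location of the tumor often cause swallowing problems that lead to choking, and in severe cases to aspiration pneumonia. Studies report that when head and neck cancer is complicated by aspiration pneumonia, the 12-month mortality rate is nearly 10%. Although prior studies have identified possible factors associated with aspiration pneumonia in head and neck cancer, the variables observed differ across studies, their results vary somewhat, and in practice no assessment guideline is yet available for physicians to evaluate patients. Based on health insurance claims data, this study applies classification learning techniques from data mining to construct a prediction model that helps identify head and neck cancer patients at high risk of aspiration pneumonia, so that such patients can receive appropriate health education to prevent aspiration pneumonia or detect related symptoms early, thereby reducing mortality risk and associated medical costs. The experimental evaluation shows that the sampling method used to build the training data clearly affects classifier performance. Among the ensemble learning methods, Boosting outperforms Bagging under general data conditions, while the performance of Bagging depends on the base learning algorithm used, with Decision Tree performing best. Nevertheless, all five algorithms evaluated in this study achieve quite good prediction performance, and Bagging with RBF-Kernel SVM as the base learner predicts particularly well on non-target-class data outside the training data (head and neck cancer patients without aspiration pneumonia).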
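
The abstract contrasts two ensemble strategies, Bagging and Boosting. The following is a minimal sketch of how the two differ, not the authors' implementation: it assumes a one-split decision tree (a "stump") as the base learner, uses made-up toy data rather than health insurance claims, and all function names (`fit_stump`, `bagging`, `adaboost`, `predict`, `accuracy`) are hypothetical. Bagging trains each stump on a bootstrap resample and gives every stump an equal vote; the boosting variant shown here is AdaBoost, which reweights the training examples after each round so that later stumps concentrate on earlier mistakes.

```python
import math
import random


def fit_stump(X, y, w):
    """Fit a weighted decision stump (one-split tree) on labels in {-1, +1}.

    Returns (weighted_error, feature_index, threshold, sign); the stump
    predicts `sign` when x[feature] > threshold and `-sign` otherwise.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign


def bagging(X, y, n_models=15, seed=0):
    """Train each stump on a bootstrap resample; every model gets an equal vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        models.append((1.0, fit_stump(Xb, yb, [1.0 / n] * n)))
    return models


def adaboost(X, y, n_rounds=15):
    """AdaBoost: reweight examples so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    models = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        models.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return models


def predict(models, row):
    """Weighted vote of all stumps; ties go to the positive class."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in models)
    return 1 if score >= 0 else -1


def accuracy(models, X, y):
    return sum(predict(models, row) == yi for row, yi in zip(X, y)) / len(X)


# Toy data (made up): one feature, +1 = "high risk", -1 = "not high risk".
X = [[1], [2], [3], [4], [10], [11], [12], [13]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

bagged = bagging(X, y)
boosted = adaboost(X, y)
print("bagging accuracy:", accuracy(bagged, X, y))
print("boosting accuracy:", accuracy(boosted, X, y))
```

The toy data are trivially separable, so this sketch only illustrates the mechanics; it says nothing about the relative performance reported in the study, which also evaluated other base learners such as RBF-Kernel SVM.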

English Abstract

Purpose - Treatment-related adverse effects of head and neck cancer and/or the anatomic location of tumors are likely to cause swallowing problems that may lead to complications such as choking, malnutrition, and aspiration pneumonia. Prior research indicated that the 12-month death rate of head and neck cancer patients with aspiration pneumonia is nearly 10%. The factors that cause aspiration pneumonia have been examined in prior studies, but the findings are inconclusive. This study aims to mine Taiwan’s National Health Insurance Research Database, the most comprehensive record of medical insurance claims in Taiwan, to construct a prediction model for head and neck cancer patients who are at risk of aspiration pneumonia.
Design/methodology/approach - We reviewed the literature to identify a collective set of thirteen factors that are relevant to head and neck cancer patients with aspiration pneumonia and whose data values are available in Taiwan’s National Health Insurance Research Database, and adopted them as independent variables. We used propensity score matching to create the training dataset and implemented bagging-based and boosting-based ensemble learning methods with different learning algorithms to construct prediction models.
Findings - The results suggest that the five investigated approaches were effective in predicting head and neck cancer patients at risk of aspiration pneumonia. The boosting-based ensemble learning methods achieved better prediction performance than the bagging-based ones. Overall, the proposed approach is promising for constructing a prediction model for head and neck cancer patients at higher risk of aspiration pneumonia using Taiwan’s National Health Insurance Research Database.
Research limitations/implications - This study applies ensemble learning to construct a model for predicting head and neck cancer patients at risk of aspiration pneumonia. The evaluation results reveal the effectiveness and practicability of the proposed method, which builds the prediction model from a health insurance database. This study contributes to the research area of health data mining. Nevertheless, the independent variables used to construct the prediction model are limited to medical insurance claim records. Future research is encouraged to incorporate other data sets, such as medical records, into the construction of prediction models.
Practical implications - The proposed method can be developed into a decision support system that helps physicians assess head and neck cancer patients who are at risk of aspiration pneumonia. Such patients can receive health education in advance to prevent the occurrence of aspiration pneumonia. Developing such a system is feasible because the medical insurance claim records required for constructing the prediction model are readily available.
Originality/value - This study investigated the factors that may cause aspiration pneumonia and, on that basis, constructed a prediction model from the health insurance database to predict head and neck cancer patients who are at risk. We developed a method for database preprocessing, training dataset creation, and prediction model construction. The evaluation results suggest the practicability and effectiveness of the proposed method.
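
The abstract states that propensity score matching was used to create the training dataset, without giving the matching procedure. The following is a minimal sketch of one common variant, greedy 1:1 nearest-neighbour matching with a caliper. It assumes the propensity scores have already been estimated (for example by a logistic regression over the independent variables); the function name `greedy_match`, the `caliper` value, and the patient ids and scores are all hypothetical.

```python
def greedy_match(cases, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    cases, controls: dict mapping a patient id to its propensity score.
    Each case takes the closest still-unused control, provided the score
    difference is within `caliper`; otherwise the case stays unmatched.
    """
    available = dict(controls)  # copy so the caller's dict is not modified
    pairs = []
    for case_id, score in sorted(cases.items(), key=lambda kv: kv[1]):
        if not available:
            break
        nearest = min(available, key=lambda cid: abs(available[cid] - score))
        if abs(available[nearest] - score) <= caliper:
            pairs.append((case_id, nearest))
            del available[nearest]  # matching without replacement
    return pairs


# Hypothetical scores, assumed to come from a model fitted beforehand.
cases = {"p1": 0.30, "p2": 0.62, "p3": 0.90}
controls = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10, "c5": 0.75}

pairs = greedy_match(cases, controls)
print(pairs)  # [('p1', 'c1'), ('p2', 'c3')]
```

Here `p3` stays unmatched because its closest remaining control (`c5`, score 0.75) lies outside the caliper. Matched cases and controls would then form a balanced training set, which matters when the target class (patients with aspiration pneumonia) is much rarer than the non-target class.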

Subject Classification

Basic and Applied Sciences > Information Science
Social Sciences > Management

References

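  1. 陳錦華(2014).Methods for using the propensity score in estimating risk ratios [in Chinese].臺北醫學大學生物統計研究中心 eNews (Taipei Medical University Biostatistics Research Center eNews),Vol. 2.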
  2. Adnet, F.,Baud, F.(1996).Relation between Glasgow Coma Scale and aspiration pneumonia.Lancet,348(9020),123-124.
  3. Bauer, E.,Kohavi, R.(1999).An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants.Machine Learning,36(1),105-139.
  4. Baum, G.L.,Crapo, J.D.,Celli, B.R.,Karlinsky, J.B.(1998).Textbook of Pulmonary Diseases.Philadelphia:Lippincott Williams & Wilkins.
  5. Beasley, R.P.,Lin, C.C.,Hwang, L.Y.,Chien, C.S.(1981).Hepatocellular carcinoma and hepatitis B virus: a prospective study of 22 707 men in Taiwan.Lancet,318(8256),1129-1133.
  6. Becker, S.O.,Ichino, A.(2002).Estimation of average treatment effects based on propensity scores.The Stata Journal,2(4),358-377.
  7. Berry, M.J.,Linoff, G.S.(1997).Data mining Techniques: For Marketing, Sales, and Customer Support.Indianapolis, Indiana:Wiley Publishing, Inc..
  8. Breiman, L.(1996).Bagging predictors.Machine Learning,24(2),123-140.
  9. Chawla, N.V.,Bowyer, K.W.,Hall, L.O.,Kegelmeyer, W.P.(2002).SMOTE: Synthetic Minority Over-sampling Technique.Journal of Artificial Intelligence Research,16(1),321-357.
  10. Chu, C.N.,Muo, C.H.,Chen, S.W.,Lyu, S.Y.,Morisky, D.E.(2013).Incidence of pneumonia and risk factors among patients with head and neck cancer undergoing radiotherapy.BMC Cancer,13(370).
  11. Cortes, C.,Vapnik, V.(1995).Support-vector networks.Machine Learning,20(3),273-297.
  12. Daniels, S.K.,Brailey, K.,Priestly, D.H.,Herrington, L.R.,Weisberg, L.A.,Foundas, A.L.(1998).Aspiration in patients with acute stroke.Archives of Physical Medicine and Rehabilitation,79(1),14-19.
  13. Delgado, M.,Sánchez, D.,Martín-Bautista, M.J.,Vila, M.A.(2001).Mining association rules with improved semantics in medical databases.Artificial Intelligence in Medicine,21(1),241-245.
  14. Dietterich, T.G.(2000).Ensemble methods in machine learning.Proceedings of the First International Workshop on Multiple Classifier Systems,Cagliari, Italy:
  15. Dubray-Vautrin, A.,Ballivet de Régloix, S.,Girod, A.,Jouffroy, T.,Rodriguez, J.(2015).Epidemiology, diagnosis and treatment of head and neck cancers.Soins,60(798),32-35.
  16. Eisbruch, A.,Lyden, T.,Bradford, C.R.,Dawson, L.A.,Haxer, M.J.,Miller, A.E.,Wolf, G.T.(2002).Objective assessment of swallowing dysfunction and aspiration after radiation concurrent with chemotherapy for head-and-neck cancer.International Journal of Radiation Oncology, Biology, Physics,53(1),23-28.
  17. Frawley, W.J.,Piatetsky-Shapiro, G.,Matheus, C.J.(1991).Knowledge discovery in databases: An overview.AI Magazine,13(3),57-70.
  18. Freund, Y.,Schapire, R.E.(1996).Experiments with a new boosting algorithm.Proceedings of the Thirteenth International Conference on Machine Learning (ICML '96),Bari, Italy:
  19. Freund, Y.,Schapire, R.E.(1997).A decision-theoretic generalization of on-Line learning and an application to boosting.Journal of Computer and System Sciences,55(1),119-139.
  20. Freund, Y.,Schapire, R.E.(1999).A short introduction to boosting.Journal of Japanese Society for Artificial Intelligence,14(5),771-780.
  21. He, H.,Garcia, E.A.(2009).Learning from imbalanced data.IEEE Transactions on Knowledge and Data Engineering,21(9),1263-1284.
  22. Henley, W.E.,Hand, D.J.(1996).A k-nearest-neighbour classifier for assessing consumer credit risk.The Statistician,45(1),77-95.
  23. Irwin, R.S.,Cerra, F.B.,Rippe, J.M.(1999).Irwin and Rippe's Intensive Care Medicine.Philadelphia:Lippincott Williams & Wilkins.
  24. Kearns, M.,Valiant, L.(1989).Cryptographic limitations on learning boolean formulae and finite automata.Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing,Seattle, WA, USA:
  25. Langerman, A.,MacCracken, E.,Kasza, K.,Haraf, D.J.,Vokes, E.E.,Stenson, K.M.(2007).Aspiration in chemoradiated patients with head and neck cancer.Archives of Otolaryngology-Head & Neck Surgery,133(12),1289-1295.
  26. Lee, Y.H.,Hu, P.,Cheng, T.H.,Huang, T.C.,Chuang, W.Y.(2013).A preclustering-based ensemble learning technique for acute appendicitis diagnoses.Artificial Intelligence in Medicine,58(2),115-124.
  27. Lewis, D.,Catlett, J.(1994).Heterogeneous uncertainty sampling for supervised learning.Proceedings of the 11th International Conference on Machine Learning,New Brunswick, NJ:
  28. Lin, Z.,Hao, Z.,Yang, X.,Liu, X.(2009).Several SVM ensemble methods integrated with under-sampling for imbalanced data learning.Proceedings of the Fifth International Conference on Advanced Data Mining and Applications (ADMA'09),Beijing, China:
  29. Liu, X.Y.,Wu, J.,Zhou, Z.H.(2009).Exploratory undersampling for class-imbalance learning.IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics,39(2),539-550.
  30. Lüdemann, L.,Grieger, W.,Wurm, R.,Wust, P.,Zimmer, C.(2006).Glioma assessment using quantitative blood volume maps generated by T1-weighted dynamic contrast-enhanced magnetic resonance imaging: A receiver operating characteristic study.Acta Radiol,47(3),303-310.
  31. Marik, P.E.(2001).Aspiration pneumonitis and aspiration pneumonia.New England Journal of Medicine,344(9),665-671.
  32. Metz, C.E.(1978).Basic principles of ROC analysis.Seminars in Nuclear Medicine,8(4),283-298.
  33. Mingers, J.(1989).An empirical comparison of pruning methods for decision tree induction.Machine Learning,4(2),227-243.
  34. Mitchell, T.M.(1997).Machine learning.McGraw Hill.
  35. Mittal, B.B.,Pauloski, B.R.,Haraf, D.J.,Pelzer, H.J.,Argiris, A.,Vokes, E.E.,Rademaker, A.,Logemann, J.A.(2003).Swallowing dysfunction-preventative and rehabilitation strategies in patients with head-and-neck cancers treated with surgery, radiotherapy, and chemotherapy: a critical review.International Journal of Radiation Oncology, Biology, Physics,57(5),1219-1230.
  36. Mortensen, H.R.,Jensen, K.,Grau, C.(2013).Aspiration pneumonia in patients treated with radiotherapy for head and neck cancer.Acta Oncologica,52(2),270-276.
  37. Nilsson, N.J.(1965).Learning Machines.New York:McGraw-Hill.
  38. Obuchowski, N.A.(2003).Receiver operating characteristic curves and their use in radiology.Radiology,229(1),3-8.
  39. Parsons, L.S.(2001).Reducing bias in a propensity score matched-pair sample using greedy matching techniques.Proceedings of the Twenty-Sixth Annual SAS® Users Group International Conference,Long Beach, California, USA:
  40. Rosen, A.,Rhee, T.H.,Kaufman, R.(2001).Prediction of aspiration in patients with newly diagnosed untreated advanced head and neck cancer.Archives of Otolaryngology-Head & Neck Surgery,127(8),975-979.
  41. Roy, T.M.,Ossorio, M.A.,Cipolla, L.M.,Fields, C.L.,Snider, H.L.,Anderson, W.H.(1989).Pulmonary complications after tricyclic antidepressant overdose.CHEST Journal,96(4),852-856.
  42. Siegel, R.,Naishadham, D.,Jemal, A.(2012).Cancer statistics, 2012.CA: A Cancer Journal for Clinicians,62(1),10-29.
  43. Valiant, L.G.(1984).A theory of the learnable.Communications of the ACM,27(11),1134-1142.
  44. Ward, E.,Jemal, A.,Cokkinides, V.,Singh, G.K.,Cardinez, C.,Ghafoor, A.,Thun, M.(2004).Cancer disparities by race/ethnicity and socioeconomic status.CA: A Cancer Journal for Clinicians,52(4),78-93.
  45. Xu, B.,Boero, I.J.,Hwang, L.,Le, Q.T.,Moiseenko, V.,Sanghvi, P.R.,Cohen, E.E.,Mell, L.K.,Murphy, J.D.(2015).Aspiration pneumonia after concurrent chemoradiotherapy for head and neck cancer.Cancer,121(8),1303-1311.
  46. Zorman, M.,Eich, H.P.,Kokol, P.,Ohmann, C.(2001).Comparison of three databases with a decision tree approach in the medical field of acute appendicitis.Studies in Health Technology and Informatics,84(2),1414-1418.
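  47. 王宏銘,廖俊達,范網行,吳樹鏗,詹勝傑,閣紫宸(2009).New advances in the treatment of head and neck squamous cell carcinoma [in Chinese].腫瘤護理雜誌 (The Journal of Oncology Nursing),9(S),51-67.
  48. 曾淑芬(1999).The National Health Insurance database from a hospital management perspective [in Chinese].中華公共衛生雜誌 (Chinese Journal of Public Health),18(5),363-372.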