Title

應用機器學習演算法建立貧血疾病之分類模型以提升醫療品質 (Applying machine learning algorithms to build a classification model for anemia to improve healthcare quality)

Parallel title

APPLICATION OF MACHINE LEARNING FOR CLASSIFYING ANEMIA TYPE TO IMPROVE HEALTHCARE QUALITY

DOI

10.6220/joq.202108_28(4).0004

Authors

Wan-Hua Yang (楊婉華); Chuen-Sheng Cheng (鄭春生)

Keywords

anemia ; complete blood count ; machine learning ; random forest ; support vector machine

Journal

Journal of Quality (品質學報)

Volume/issue (publication date)

Vol. 28, No. 4 (2021 / 08 / 30)

Pages

283 - 295

Language of content

Traditional Chinese

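Abstract (translated from Chinese)

Anemia is one of the most common conditions among hematological diseases; its main cause is an insufficient concentration of hemoglobin (Hb) in the blood. Hb is essential for transporting oxygen (O_2) to tissues and organs, and when its level is low, organs and tissues develop a range of systemic symptoms because of hypoxia. Physicians can usually obtain Hb and related values from a complete blood count (CBC) and use them to determine whether a patient has anemia. Because prognosis and treatment differ by type, identifying the type of anemia early is clinically important. Using electronic medical records (EMRs) from a regional teaching hospital in northern Taiwan, this study applies machine learning (ML) algorithms to build a decision model that classifies four common types of anemia. After irrelevant variables were removed from the feature variables in the EMR database, 16 attributes were selected, including the patient's age at the visit, sex, height, weight, pulse, respiration, systolic/diastolic blood pressure, and Hb. Classification models were then built with two algorithms, random forest (RF) and support vector machine (SVM). The study evaluates the algorithms with relevant performance metrics. Both classify well, with SVM slightly outperforming RF. The proposed method can serve as an auxiliary tool for the early diagnosis of anemia.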

Abstract (English)

Among hematological diseases, anemia is one of the most common conditions. Its major cause is an insufficient concentration of hemoglobin (Hb). Hb is necessary for transporting oxygen (O_2) to tissues and organs. When the Hb level is low, organs and tissues develop systemic symptoms due to hypoxia. Generally, physicians can obtain data such as Hb through the complete blood count (CBC) to determine whether anemia exists. Because the prognosis and treatment differ by type of anemia, it is of great clinical significance to identify the type early. This study applied machine learning (ML) algorithms to establish a decision model for classifying four common types of anemia. The data were collected from the electronic medical records (EMRs) of a teaching hospital in northern Taiwan. After deleting the irrelevant features of the CBC database, 15 features were selected in this study, including the patient's age, height, Hb, etc. The resulting dataset was split into training and test datasets, on which random forest (RF) and support vector machine (SVM) algorithms were used to establish classification models. This study used various metrics to evaluate the performance of the ML algorithms. Both achieve satisfactory results, with SVM performing slightly better than RF. The method proposed in this research can be applied as an auxiliary tool for early diagnosis of anemia.
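The abstract says the classifiers were evaluated with "various metrics" but does not name them in this record. As a minimal sketch, assuming the metrics include accuracy and per-class precision, recall, and F1 (an assumption, not confirmed by the source), the following shows how such scores could be computed for a four-class problem. The labels `type_A` to `type_D` and the toy predictions are hypothetical placeholders, not the study's anemia categories or results.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f1) across classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for illustration only (not data from the study).
y_true = ["type_A", "type_A", "type_B", "type_B",
          "type_C", "type_C", "type_D", "type_D"]
y_pred = ["type_A", "type_B", "type_B", "type_B",
          "type_C", "type_C", "type_D", "type_A"]

scores = per_class_metrics(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
overall_accuracy = accuracy(y_true, y_pred)
```

Macro averaging weights each class equally, which can be informative when class sizes differ; whether the study reported macro or weighted averages is not stated in this record.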

Subject classification: Social Sciences > Management