Title |
資料採礦應用於乳癌患者之遺傳基因及生活因素探討 |
Parallel title |
Application of Data Mining on Hereditary Genes and Behavioral Factors of Breast Cancer Patients |
DOI |
10.6338/JDA.200906_4(3).0008 |
Authors |
侯藹玲(Ai-Ling Hour);朱國豪(Kuo-Hao Chu);蘇志雄(Chih-Hsiung Su) |
Keywords |
基因 ; 乳癌 ; 家族遺傳 ; 判別分析 ; Gene ; Microarray ; Breast Cancer ; Family Heredity ; T-test ; Discriminant Analysis |
Journal |
Journal of Data Analysis |
Volume/issue (publication date) |
Vol. 4, No. 3 (2009/06/01) |
Pages |
131 - 158 |
Language of content |
Traditional Chinese |
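Chinese abstract (translated) |
Modern medical technology usually cannot diagnose or treat a disease until after its onset. As a result, by the time most patients are identified, their condition is already severe and hard to cure, and the chance of a cure is correspondingly lower. Biochips combined with molecular medical imaging can not only reveal cellular physiological pathways and the causes of disease, but also allow gene expression in normal or tumor cells to be shown through molecular imaging. In addition, DNA chip technology (Microarray) makes it possible to search rapidly for large numbers of candidate genes and thereby diagnose disease-related changes at the molecular level, offering an entirely different kind of medical care. According to the latest cancer statistics, breast cancer now ranks first among the ten most common cancers in women in the country. Studies have found that risk factors for breast cancer include family history, age, smoking, and drinking. Among these, family history is the most significant factor: people with a family history have roughly 3 times the relative risk of developing breast cancer compared with those without. The purpose of this study is to use the characteristics data of breast cancer patients, retrieve the expression levels of the patients' 54675 genes from the NCBI database, and perform T-test comparisons of differences for breast cancer, family heredity, smoking, drinking, and metastasis. From these comparisons, 59 candidate genes most significantly associated with hereditary breast cancer were identified. These genes were used to build 31 discriminant models for the presence or absence of a family history. The overall predictive ability of the 31 discriminant analysis models lies between about 50% and 60%, and when test data were entered into the discriminant models, the classification matrix gave an accuracy of about 60%. Therefore, at the molecular stage when the disease is just beginning, the expression levels of a patient's 59 genes that mainly influence hereditary breast cancer can be entered into this model to determine whether the patient developed breast cancer because of family history. This allows patients to begin breast cancer treatment early and provides more complete medical care for breast cancer patients. |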
English abstract |
With modern medical technology, diagnosis and treatment are usually possible only after the onset of disease. Therefore, most diagnosed cancers require further tests to determine their stage, and by the time the stage is known the disease is usually at a late stage, when the cure rate is lower. Biochips combined with molecular medical imaging not only provide information on cell growth and the cause of disease, but also show gene expression in normal or tumor cells through molecular imaging. Moreover, DNA chip technology (Microarray) can quickly search for a large number of candidate genes and then diagnose lesions at the molecular level, providing a completely different kind of medical care. According to the latest statistics, women get breast cancer more than any other type of cancer. Risk factors for breast cancer have been found to include family history, age, smoking, drinking, and so on. Family history is the most significant factor: people with family heredity have about 3 times the risk of breast cancer of those without it. The purpose of this study is to make use of the characteristics of breast cancer patients with information from the NCBI database. Using the patients' 54675 gene expression values, T-tests are performed to compare differences by breast cancer, family heredity, smoking, drinking, and metastasis. From these comparisons, 59 candidate genes are found to be significantly related to breast cancer. These genes are used to build 31 discriminant models for whether a patient has a family history. The overall predictive ability of the 31 models is in the range of 50-60%. When testing data are entered into the discriminant models, the classification matrix shows a correct rate of about 60%. At the earliest stage of the disease, the expression values of the patient's 59 selected genes that affect breast cancer can be entered into the model to determine whether the patient got breast cancer because of family history, so that an appropriate treatment plan can be developed. |
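The workflow summarized in the abstract (a per-gene T-test to pick candidate genes, followed by a discriminant model evaluated on held-out test data) can be illustrated with a minimal sketch. This is not the authors' code. It runs on synthetic data rather than the NCBI data set, uses Welch's t statistic for ranking, and substitutes a diagonal linear discriminant (pooled per-gene variance, no covariances) for whatever discriminant analysis the paper used. All function names, sample sizes, and the choice of 5 selected genes are assumptions made for illustration.

```python
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def select_genes(X, y, k):
    """Return indices of the k genes with the largest |t| between classes."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append((abs(welch_t(pos, neg)), g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]

def fit_dlda(X, y, genes):
    """Diagonal LDA: per-class means plus a pooled per-gene variance."""
    means = {}
    for c in (0, 1):
        rows = [row for row, label in zip(X, y) if label == c]
        means[c] = [mean(r[g] for r in rows) for g in genes]
    pooled = []
    for j, g in enumerate(genes):
        ss = sum((row[g] - means[label][j]) ** 2 for row, label in zip(X, y))
        pooled.append(ss / (len(X) - 2))
    return {"genes": genes, "means": means, "var": pooled}

def predict(model, row):
    """Assign the class whose mean is nearest in variance-scaled distance."""
    def dist(c):
        return sum((row[g] - m) ** 2 / v
                   for g, m, v in zip(model["genes"], model["means"][c], model["var"]))
    return min((0, 1), key=dist)

def make_synthetic(n, n_genes, informative, rng):
    """Toy data (assumption): the first `informative` genes are shifted in class 1."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label if g < informative else 0.0, 1.0)
                  for g in range(n_genes)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(60, 200, 5, rng)   # training patients
X_test, y_test = make_synthetic(40, 200, 5, rng)     # held-out test patients

genes = select_genes(X_train, y_train, k=5)
model = fit_dlda(X_train, y_train, genes)
accuracy = sum(predict(model, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print("selected genes:", sorted(genes))
print("test accuracy:", accuracy)
```

The synthetic classes are deliberately well separated, so accuracy here will be much higher than the roughly 60% reported in the abstract. Note also that ranking 54675 genes by T-test without a multiple-comparison correction can admit many false positives, which is one reason to evaluate on separate test data as the study does.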
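Subject classification |
Basic and Applied Sciences > Information Science ; Basic and Applied Sciences > Statistics ; Social Sciences > Management |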