Title

資料採礦技術應用於微陣列資料分析以篩選乳癌期數候選基因之研究

Parallel title

Using Data Mining Technique to Rediscover Breast Cancer's Stage Candidate Genes from Microarray Data

DOI

10.6338/JDA.201110_6(5).0001

Authors

徐遠宥(Yuan-Yu Hsu);侯藹玲(Ai-Ling Hour);歐尚靈(Shang-Ling Ou)

Keywords

微陣列 ; 乳癌 ; 資料採礦 ; 隨機森林 ; Microarray ; Breast Cancer ; Data Mining ; Random Forest

Journal

Journal of Data Analysis

Volume and issue / Publication date

Vol. 6, No. 5 (2011/10/01)

Pages

1 - 10

Language

Traditional Chinese

Chinese abstract

根據行政院衛生署統計資料顯示,2009年惡性腫瘤(癌症)已連續27年為十大死因之首,其中乳癌則是高居女性死亡原因排名的第四名。乳癌是全世界女性最常見的癌症疾病,全世界每年約有50萬人的死因為乳癌。越早發現的癌症其治癒率會比癌症末期高出很多,本研究希望透過統計分析找出哪些基因在乳癌不同期的表現量有何差異,探討這些基因表現量的型態,期望日後能對乳癌及早的發現及預防有所幫助。本研究利用美國國家衛生研究中心(NCBI)資料庫所提供的Affymetrix GeneChip Human Genome U133 Plus 2.0 Array這組晶片所做出來的GSE2109、GSE3744、GSE7307三個微陣列資料集來進行分析,利用隨機森林判斷乳癌分期,以提供醫學研究方面的參考。

English abstract

According to statistics from the Department of Health, Executive Yuan, R.O.C., in 2009 cancer had topped the ten leading causes of death for 27 consecutive years, and breast cancer ranked fourth among causes of death in women. Breast cancer is the most common cancer in women worldwide, and about 500,000 people die of it each year. Because cancers detected early have a much higher cure rate than those found at a late stage, this study uses statistical analysis to identify genes whose expression levels differ across breast cancer stages and to examine their expression patterns, in the hope of aiding early detection and prevention. We analyze three microarray data sets, GSE2109, GSE3744, and GSE7307, all generated on the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array and obtained from the database of the U.S. National Center for Biotechnology Information (NCBI), and use random forests to classify breast cancer stage, providing a reference for medical research.

Subject classification

Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > Management