题名

結合基因演算法與模擬退火法於多重DNA序列之比對

并列篇名

Combination of GA and SA for Multiple DNA Sequence Alignment

DOI

10.6220/joq.2014.21(5).01

作者

葉進儀(Jinn-Yi Yeh);林軒仲(Syuan-Jhong Lin)

关键词

序列比對 ; 多重序列比對 ; 基因演算法 ; 模擬退火法 ; sequence alignment ; multiple sequence alignment ; genetic algorithms ; simulated annealing

期刊名称

品質學報

卷期/出版年月

21卷5期(2014 / 10 / 31)

页次

305 - 328

内容语文

繁體中文

中文摘要

序列比對是將蛋白質中的基因或氨基酸進行對齊的動作,藉此找出兩序列的相似程度,而多重序列比對則是同時比對多個DNA或蛋白質序列,找出此序列群組中最佳的比對結果,本研究結合基因演算法及模擬退火法,先利用基因演算法物競天擇的概念,隨著世代演進逐漸產生近似最佳解,再利用模擬退火法進行小區塊內之比對修正,實驗結果顯示,利用基因演算法與模擬退火法之結合,使得基因演算法在跳脫局部最佳解的時候能有更大空間移動,而且也讓模擬退火法能有效解決經由基因演算法初步比對之後所產生的不良區域,此結合之序列比對結果比任何單一演算法的結果好,因此可以提升整體比對表現,將來能夠為生物學家在判斷未知序列功能時提供適當的輔助。

英文摘要

There was more and more DNA and protein sequence has been founded. The similar DNA sequence or protein sequence may have the same features. Therefore, sequence alignment has become the most popular technology in the field of bioinformatics. In this work, the combination of genetic algorithms and simulated annealing is used for multiple sequence alignment. Genetic algorithms apply the concept of fittest with evolution generation to gradually produce near optimal solutions. Simulated annealing is used to correct the alignment in small blocks. Experimental results show that the proposed method can let genetic algorithms' solutions to escape local optimum and to have more space to move, and also let simulated annealing effectively solves the problems with bad initial solutions generated by genetic algorithms. The results obtained by the proposed method are better than the results of any single algorithm. Therefore, the proposed method can improve the overall ratio of performance and will be able to provide appropriate support to biologists in determining the function of unknown sequence.

主题分类 社會科學 > 管理學
参考文献
  1. Chen, S.-M.,Lin, C.-H.(2007).Multiple DNA sequence alignment based on genetic simulated annealing techniques.Information and Management Sciences,18(2),97-111.
  2. Edgar, R. C.,Batzoglou, S.(2006).Multiple sequence alignment.Current Opinion in Structural Biology,16,368-373.
  3. Gondro, C.,Kinghorn, B. P.(2007).A simple genetic algorithm for multiple sequence alignment.Genetics and Molecular Research,6(4),964-982.
  4. Horng, J.-T.,Wu, L.-C.,Lin, C.-M.,Yang, B.-H.(2005).A genetic algorithm for multiple sequence alignment.Soft Computing,9(6),407-420.
  5. Huo, H.,Stojkovic, V.(2007).A simulated annealing algorithm for multiple sequence alignment with guaranteed accuracy.Proceedings of Third International Conference on Natural Computation
  6. Jangam, S. R.,Chakraborti, N.(2007).A novel algorithm for DNA sequence alignment using ant colony optimization and genetic algorithms.Applied Soft Computing,7(3),1121-1130.
  7. Jones, N. C.,Pevzner, P. A.(2004).An Introduction to Bioinformatics Algorithms.Cambridge, MA.:Massachusetts Institute of Technology Press.
  8. Kanz, C.,Aldebert, P.,Althorpe, N.,Baker, W.,Baldwin, A.,Bates, K.(2005).The EMBL nucleotide sequence database.Nucleic Acids Research,33(Suppl. 1),D29-D33.
  9. Kim, J.,Pramanik, S.,Chung, M. J.(1994).Multiple sequence alignment using simulated annealing.Computer Application in Biosciences,10(4),419-426.
  10. Kirkpatrick, S.,Gelatt, C. D., Jr.,Vecchi, M. P.(1983).Optimization by simulated annealing.Science,220(4598),671-680.
  11. Laarhoven, P. J. M.,Aarts, E. H. L.(1987).Simulated Annealing: Theory and Applications.Norwell, MA.:Kluwer Academic.
  12. Lee, Z.-J.,Su, S.-F.,Chuang, C.-C.,Liu, K.-H.(2008).Genetic algorithm with ant colony optimization (GA-ACO) for multiple sequence alignment.Applied Soft Computing,8(1),55-78.
  13. Metropolis, N.,Rosenbluth, A. W.,Rosenbluth, M. N.,Teller, A. H.,Teller, E.(1953).Equation of state calculations by fast computing machines.The Journal of Chemical Physics,21(6),1087-1092.
  14. Mount, D. W.(2004).Bioinformatics: Sequence and Genome Analysis.New York:Cold Spring Harbor.
  15. Needleman, S. B.,Wunsch, C. D.(1970).A general method applicable to the search for similarities in the amino acid sequence of two proteins.Journal of Molecular Biology,48(3),443-453.
  16. Notredame, C.(2002).Recent progress in multiple sequence alignment: a survey.Pharmacogenomics,3(1),131-144.
  17. Notredame, C.,Higgins, D. G.(1996).SAGA: sequence alignment by genetic algorithm.Nucleic Acids Research,24(8),1515-1524.
  18. Omar, M. F.,Salam, R. A.,Abdullah, R.,Rashid, N. A.(2007).Multiple sequence algorithm using optimization algorithms.International Journal of Computer, Information, Systems and Control Engineering,1(5),1511-1519.
  19. Poirot, O.,O'Toole, E.,Notredame, C.(2003).Tcoffee@igs: a web server for computing, evaluating and combining multiple sequence alignments.Nucleic Acids Research,31(13),3503-3506.
  20. Rasmussen, T. K.,Krink, T.(2003).Improved hidden Markov model training for multiple sequence alignment by a particle swarm optimization-evolutionary algorithm hybrid.BioSystems,72(1-2),5-17.
  21. Reese, J. T.,Pearson, W. R.(2002).Empirical determination of effective gap penalties for sequence comparison.Bioinformatics,18(11),1500-1507.
  22. Riaz, T.,Wang, Y.,Li, K.-B.(2004).Multiple sequence alignment using Tabu search.Proceedings of the Second Conference on Asia-Pacific Bioinformatics
  23. Sarıyer, O. S.,Güven, C.(2010).Sequence alignment using simulated annealing.Physica A: Statistical Mechanics and Its Applications,389(15),3007-3012.
  24. Smith, L.,Yeganova, L.,Wilbur, W. J.(2003).Brief communication: hidden Markov models and optimized sequence alignments.Computational Biology and Chemistry,27(1),77-84.
  25. Smith, T. F.,Waterman, M. S.(1981).Identification of common molecular subsequences.Journal of Molecular Biology,147,195-197.
  26. Taylor, W. R.,Selensminde, G.,Eidhammer, I.(2000).Multiple protein sequence alignment using double-dynamic programming.Computers and Chemistry,24(1),3-12.
  27. Thompson, J. D.,Higgins, D. G.,Gibson, T. J.(1994).CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.Nucleic Acids Research,22(22),4673-4680.
  28. Wang, C.,Lefkowitz, E. J.(2005).Genomic multiple sequence alignments: refinement using a genetic algorithm.BMC Bioinformatics,6
  29. Wang, Z.,Zhang, K.(2005).Multiple RNA structure alignment.Journal of Bioinformatics and Computational Biology,3(3),609-626.
  30. 王聿泰、汪詩梅、范廷佳、許玉璇、陳淑美、黃彥華(2003)。生物資訊。臺北:教育部顧問室。
  31. 李龍緣、林應如、許晉詮、萬磊、徐媛曼、洪千惠(2008)。生物資訊。臺北:九州圖書文物。
  32. 黃建彰(2008)。碩士論文(碩士論文)。高雄,臺灣,義守大學工業工程與管理學系碩士班。