Title

Application of Retention Time Alignment, Denoising, and Compound Identification in Metabolomics

Parallel title

Alignment, Denoising, and Identification in Metabolomics

DOI

10.6342/NTU.2011.02371

Author

何宗融

Keywords

correlation optimized warping ; capillary electrophoresis ; high performance liquid chromatography ; UV detector ; retention time alignment ; liquid chromatography/mass spectrometry ; background subtraction ; peak picking ; metabolomics ; toxicological screening

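Journal title

National Taiwan University, Department of Computer Science and Information Engineering, degree theses

Volume/issue, publication date

2011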

Degree

Master's

Advisor

曾宇鳳

Language of content

Traditional Chinese

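Abstract (translated from Chinese)

This thesis presents two algorithms we developed to solve computational problems that arise when detecting small-molecule signals. Both grew out of applications in metabolomics and toxicological screening.

In the first part of this thesis, we develop Chromaligner, a chromatogram alignment tool for UV detector data. It aligns retention times for a variety of chromatographic methods and runs faster than the conventional correlation optimized warping (COW) algorithm. Chromatographic methods commonly used in metabolomics include capillary electrophoresis and high performance liquid chromatography. Capillary electrophoresis needs little sample and consumes little organic solvent, which makes it a green analytical method, and it separates analytes efficiently, but its retention time reproducibility is poor. High performance liquid chromatography offers good system stability and highly reproducible results, but retention time shifts readily occur under gradient elution. When a sample contains many compounds, retention time shifts in the chromatogram raise the compound misidentification rate, so a retention time alignment tool is needed. Most existing alignment tools are designed for mass spectrometers and cannot align UV detector data. Mass spectrometers are expensive, however, and many laboratories still rely on UV detectors as their main analytical method. Chromaligner provides a retention time alignment solution for UV detector data. It resolves peak shifts through constrained chromatogram alignment: after collecting the chromatograms, the user defines a set of peaks as constraints, and Chromaligner aligns the chromatograms on those peaks using the COW algorithm. Chromaligner is faster than the conventional COW algorithm by a factor equal to the square of the total number of peaks, which greatly speeds up alignment. Chromaligner can also align on peaks of known components to obtain the best result for subsequent chemometric analysis. Chromaligner is deployed on the web and is free to use online at: http://cmdd.csie.ntu.edu.tw/~chromaligner

In the second part of this thesis, we develop TIPick, a new denoising and peak picking algorithm that automatically, accurately, and sensitively detects target compounds in complex liquid chromatography time-of-flight mass spectrometry (Liquid Chromatography - Time of Flight) samples. Liquid chromatography time-of-flight mass spectrometry is an important analytical technique in toxicological screening and metabolomics research. The advantage of TIPick is that it makes no assumption about peak shape, so it can accurately pick many kinds of peaks, such as tailing peaks, low-intensity peaks, and split peaks. Split peaks are usually caused by instrumental (detector) saturation or by the background subtraction algorithm. TIPick comprises two main steps: background subtraction and peak picking. It uses background subtraction to eliminate compound signals that are already present in blank injections, and it uses the duration and intensity of chromatographic peaks to remove noise and pick peaks. In addition, TIPick uses duplicate analyses (duplicates) to raise the signal-to-noise ratio, which improves peak detection. For each specified target mass-to-charge ratio, TIPick automatically generates an extracted ion chromatogram, helping researchers measure the signals of specific compounds in complex mixtures. TIPick also automatically produces the total peak table of a sample, which can serve as a basis for analysis or for later customized analysis. TIPick was used to build the standards database of the National Taiwan University metabolomics core laboratory and has been applied to toxicological screening, providing a highly sensitive target compound detection algorithm for complex liquid chromatography time-of-flight mass spectrometry data.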
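The constrained alignment idea described for Chromaligner can be illustrated with a small sketch. This is not Chromaligner's code and not the COW algorithm: it assumes the matching peaks are already known in both traces and replaces COW's correlation-driven segment search with plain piecewise-linear warping between those peaks. The function name `warp_to_anchors` and its parameters are hypothetical.

```python
def warp_to_anchors(signal, sample_peaks, reference_peaks):
    """Warp `signal` so each sample peak index lands on its reference index.

    Simplified stand-in for constrained alignment (an assumption, not COW):
    the time axis is stretched linearly between consecutive anchor peaks.
    Both peak lists must be sorted and have the same length.
    """
    n = len(signal)
    # Pin the first and last points so the whole trace is covered.
    src = [0] + list(sample_peaks) + [n - 1]
    dst = [0] + list(reference_peaks) + [n - 1]
    warped = []
    seg = 0
    for t in range(n):
        # Move to the reference-axis segment that contains t.
        while seg < len(dst) - 2 and t > dst[seg + 1]:
            seg += 1
        span = dst[seg + 1] - dst[seg]
        frac = (t - dst[seg]) / span if span else 0.0
        # Map t back to a (fractional) position on the sample axis.
        pos = src[seg] + frac * (src[seg + 1] - src[seg])
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        # Linear interpolation between the two neighbouring sample points.
        warped.append((1 - w) * signal[lo] + w * signal[hi])
    return warped


# A trace whose peak sits at index 4 is aligned to a reference peak at index 6.
trace = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0, 0]
aligned = warp_to_anchors(trace, sample_peaks=[4], reference_peaks=[6])
```

Because the anchors are fixed in advance, the warp only has to be solved between user-defined peaks rather than over every candidate segment boundary, which is the intuition behind constraining the alignment.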

English abstract

This dissertation presents two algorithms for computational problems in detecting small molecules in metabolomics analysis and toxicological screening.

In the first part of this dissertation, we present Chromaligner, a chromatogram alignment tool that aligns retention times for chromatographic methods coupled to spectrophotometers, such as high performance liquid chromatography and capillary electrophoresis, in metabolomics work. Chromaligner resolves peak shifts by constrained chromatogram alignment. Given a collection of chromatograms and a set of defined peaks, Chromaligner aligns the chromatograms on the defined peaks using correlation optimized warping (COW). Chromaligner is faster than the original COW algorithm by a factor of k^2, where k is the number of defined peaks in a chromatogram. It also provides alignments based on known component peaks to reach the best results for further chemometric analysis.

In the second part of this dissertation, we present TIPick. Liquid chromatography time-of-flight mass spectrometry has become an important technique in toxicological screening and metabolomic analysis. TIPick is an algorithm for target analysis that accurately and sensitively detects target compounds in complex samples. It comprises two major steps: background subtraction and peak picking. By subtracting the blank chromatogram, TIPick eliminates chemical signals that appear in blank injections. TIPick uses the length and intensity of chromatographic peaks to perform peak enhancement and peak picking; it can therefore detect low-intensity peaks and split peaks, which may arise from either instrumental saturation or a mathematical background subtraction algorithm. Because it does not assume a peak shape in advance, TIPick can also detect tailing and fronting peaks. Furthermore, TIPick uses duplicate injections to enhance peak signals, which improves peak detection power. TIPick can generate extracted ion chromatograms for target m/z values and automatically provides the total peak table of the sample. TIPick has been used to construct the NTU MetaCore standard library and has facilitated toxicological screening.
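The two TIPick steps named in the abstract, background subtraction and peak picking by length and intensity, can be sketched on a single extracted ion chromatogram. This is an illustrative sketch only, not TIPick's implementation: the function names, the averaging of duplicates, and the thresholds `min_intensity` and `min_length` are all assumptions introduced for the example.

```python
def combine_duplicates(run_a, run_b):
    """Average two duplicate injections (assumed way to raise signal-to-noise)."""
    return [(a + b) / 2.0 for a, b in zip(run_a, run_b)]


def subtract_blank(sample, blank):
    """Subtract the blank trace point by point, clipping negatives to zero."""
    return [max(s - b, 0.0) for s, b in zip(sample, blank)]


def pick_peaks(trace, min_intensity, min_length):
    """Return (start, end, apex) index triples for qualifying peaks.

    A peak is a run of consecutive points above `min_intensity` that lasts
    at least `min_length` points. No peak shape is assumed.
    """
    peaks = []
    start = None
    # The -inf sentinel closes a run that reaches the end of the trace.
    for i, value in enumerate(list(trace) + [float("-inf")]):
        if value > min_intensity:
            if start is None:
                start = i
        elif start is not None:
            end = i - 1
            if end - start + 1 >= min_length:
                apex = max(range(start, end + 1), key=lambda j: trace[j])
                peaks.append((start, end, apex))
            start = None
    return peaks


# Hypothetical intensities: run_a has a one-point spike at index 3 that the
# duplicate does not reproduce, and both runs share a real peak near index 8.
blank = [2.0] * 12
run_a = [2, 3, 2, 9, 2, 2, 4, 8, 12, 7, 3, 2]
run_b = [2, 2, 3, 2, 2, 2, 5, 9, 11, 8, 2, 2]

cleaned = subtract_blank(combine_duplicates(run_a, run_b), blank)
peaks = pick_peaks(cleaned, min_intensity=1.0, min_length=3)
```

In this toy data the isolated spike survives the intensity threshold but fails the length requirement, while the four-point peak is kept. Filtering on run length rather than fitting a model shape is what lets tailing or fronting peaks pass unchanged.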

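Subject classification: Basic and Applied Sciences > Information Science
College of Electrical Engineering and Computer Science > Department of Computer Science and Information Engineering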