Bioinformatics

Incorporating peak grouping information for alignment of multiple liquid chromatography-mass spectrometry datasets

Wandy, J., Daly, R., Breitling, R., Rogers, S..

Motivation: The combination of liquid chromatography and mass spectrometry (LC/MS) has been widely used for large-scale comparative studies in systems biology, including proteomics, glycomics and metabolomics. In almost all experimental design, it is necessary to compare chromatograms across biological or technical replicates and across sample groups. Central to this is the peak alignment step, which is one of the most important but challenging preprocessing steps. Existing alignment tools do not take into account the structural dependencies between related peaks that coelute and are derived from the same metabolite or peptide. We propose a direct matching peak alignment method for LC/MS data that incorporates related peaks information (within each LC/MS run) and investigate its effect on alignment performance (across runs). The groupings of related peaks necessary for our method can be obtained from any peak clustering method and are built into a pair-wise peak similarity score function. The similarity score matrix produced is used by an approximation algorithm for the weighted matching problem to produce the actual alignment result.

Results: We demonstrate that related peak information can improve alignment performance. The performance is evaluated on a set of benchmark datasets, where our method performs competitively compared to other popular alignment tools.

Availability: The proposed alignment method has been implemented as a stand-alone application in Python, available for download at http://github.com/joewandy/peak-grouping-alignment.

Contact: Simon.Rogers@glasgow.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.