Bioinformatics

FERAL: network-based classifier with application to breast cancer outcome prediction

Allahyar, A., de Ridder, J.

Motivation: Breast cancer outcome prediction based on gene expression profiles is an important strategy for personalized patient care. To improve on the performance of the initial molecular classifiers and on the consistency of the markers they discover, network-based outcome prediction methods (NOPs) have been proposed. In spite of the initial claims, recent studies revealed that neither performance nor consistency can be improved using these methods. NOPs typically rely on the construction of meta-genes by averaging the expression of several genes connected in a network that encodes protein interactions or pathway information. In this article, we expose several fundamental issues in NOPs that impede prediction power, reduce the consistency of discovered markers and obscure biological interpretation.

Results: To overcome these issues, we propose FERAL, a network-based classifier built on the Sparse Group Lasso, which performs simultaneous selection of marker genes and training of the prediction model. An important feature of FERAL, and a significant departure from existing NOPs, is that it uses multiple operators to summarize genes into meta-genes. This gives the classifier the opportunity to select the most relevant meta-gene for each gene set. Extensive evaluation revealed that the discovered markers are markedly more stable across independent datasets. Moreover, interpretation of the marker genes detected by FERAL reveals valuable mechanistic insight into the etiology of breast cancer.
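
As an illustration of the idea described above, the following is a minimal sketch, not the FERAL implementation. It summarizes each gene set with several operators and evaluates a Sparse Group Lasso penalty in which all meta-genes derived from the same gene set form one group. The operator choice (mean, median, max, min), the function names build_meta_genes and sparse_group_lasso_penalty, and the parameters lam and alpha are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

# Hypothetical summary operators; the operators actually used by FERAL
# are not listed in the abstract.
OPERATORS = {
    "mean": np.mean,
    "median": np.median,
    "max": np.max,
    "min": np.min,
}


def build_meta_genes(expression, gene_sets):
    """Summarize each gene set with every operator.

    expression: array of shape (n_samples, n_genes).
    gene_sets:  list of lists of gene (column) indices, e.g. a gene and
                its network neighbours.
    Returns (features, groups), where features has one column per
    (gene set, operator) pair and groups[j] is the gene-set index of
    column j.
    """
    columns, groups = [], []
    for set_id, genes in enumerate(gene_sets):
        sub = expression[:, genes]
        for op in OPERATORS.values():
            columns.append(op(sub, axis=1))
            groups.append(set_id)
    return np.column_stack(columns), np.array(groups)


def sparse_group_lasso_penalty(weights, groups, lam=1.0, alpha=0.5):
    """Standard Sparse Group Lasso penalty (assumed form):
    lam * (alpha * ||w||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||w_g||_2).
    """
    l1_term = np.abs(weights).sum()
    group_term = 0.0
    for g in np.unique(groups):
        w_g = weights[groups == g]
        group_term += np.sqrt(w_g.size) * np.linalg.norm(w_g)
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)
```

Under this grouping, the group term can drop an entire gene set, while the L1 term can keep only the most relevant meta-gene within a retained set, which corresponds to the selection behaviour described in the Results.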

Availability and implementation: All code is available for download at: http://homepage.tudelft.nl/53a60/resources/FERAL/FERAL.zip.

Contact: j.deridder@tudelft.nl

Supplementary information: Supplementary data are available at Bioinformatics online.