Bioinformatics

FourCSeq: Analysis of 4C sequencing data

Klein, F. A., Pakozdi, T., Anders, S., Ghavi-Helm, Y., Furlong, E. E. M., Huber, W.

Motivation: Circularized Chromosome Conformation Capture (4C) is a powerful technique for studying the spatial interactions of a specific genomic region, called the "viewpoint", with the rest of the genome, either in a single condition or when comparing different experimental conditions or cell types. Observed ligation frequencies typically show a strong, regular dependence on genomic distance from the viewpoint, on top of which specific interaction peaks are superimposed. Here, we address the computational task of finding these specific peaks and detecting changes between different biological conditions.

Results: We model the overall trend of decreasing interaction frequency with genomic distance by fitting a smooth, monotonically decreasing function to suitably transformed count data. Based on the fit, z-scores are calculated from the residuals, and high z-scores are interpreted as peaks providing evidence for specific interactions. To compare different conditions, we normalize fragment counts between samples and test for differential contact frequencies using DESeq2, a statistical method adapted from RNA-Seq analysis.
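
The peak-calling idea described above can be illustrated with a minimal Python sketch. It is not the FourCSeq implementation: it assumes a log2(count + 1) transformation as a stand-in for the "suitable" transformation, uses an unsmoothed isotonic (pool-adjacent-violators) fit in place of the smooth monotone function, and scales residuals by their median absolute deviation. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def monotone_decreasing_fit(y):
    """Least-squares non-increasing fit via pool-adjacent-violators.

    Assumption: an unsmoothed isotonic fit stands in for the smooth
    monotone function described in the text.
    """
    means, sizes = [], []
    # Fit a non-decreasing sequence to -y, then negate the result.
    for v in -np.asarray(y, dtype=float):
        means.append(v)
        sizes.append(1)
        # Pool adjacent blocks while they violate the ordering.
        while len(means) > 1 and means[-2] > means[-1]:
            n = sizes[-2] + sizes[-1]
            m = (means[-2] * sizes[-2] + means[-1] * sizes[-1]) / n
            means[-2:] = [m]
            sizes[-2:] = [n]
    return -np.repeat(means, sizes)


def distance_zscores(counts, distance):
    """z-scores of transformed counts around the monotone distance trend."""
    counts = np.asarray(counts, dtype=float)
    order = np.argsort(distance)
    # Assumption: log2(count + 1) as a simple variance-reducing transform.
    transformed = np.log2(counts + 1.0)
    fit = np.empty_like(transformed)
    fit[order] = monotone_decreasing_fit(transformed[order])
    residuals = transformed - fit
    # Robust scale estimate: MAD, rescaled to match a normal std. deviation.
    scale = 1.4826 * np.median(np.abs(residuals - np.median(residuals)))
    if scale == 0:
        raise ValueError("residual scale is zero; z-scores are undefined")
    return residuals / scale


# Synthetic example: power-law decay with a deterministic wiggle and one
# artificially enriched fragment at index 99.
idx = np.arange(200)
distance_kb = idx + 1.0
expected = 2000.0 * distance_kb ** -0.8
counts = np.round(expected * (1.0 + 0.1 * np.sin(idx))).astype(int)
counts[99] *= 8
z = distance_zscores(counts, distance_kb)
candidate_peaks = np.flatnonzero(z > 3)  # the threshold 3 is an arbitrary choice
```

In this sketch, fragments whose z-score exceeds the chosen threshold would be reported as candidate interaction peaks. A real analysis would also need replicate handling and multiple-testing control, which are omitted here.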

Availability and Implementation: An end-to-end analysis pipeline is implemented in the R package FourCSeq, available from Bioconductor at www.bioconductor.org.

Contact: felix.klein@embl.de, whuber@embl.de