Bioinformatics

bio-samtools 2: a package for analysis and visualization of sequence and alignment data with SAMtools in Ruby

Etherington, G. J., Ramirez-Gonzalez, R. H., MacLean, D.

Motivation: bio-samtools is a Ruby language interface to SAMtools, the highly popular library that provides utilities for manipulating high-throughput sequence alignments in the Sequence Alignment/Map format. Advances in Ruby now allow us to improve the analysis capabilities and increase the utility of bio-samtools, so that users can accomplish a large amount of analysis with very little code. bio-samtools can also be easily extended to include additional SAMtools methods and hence stay current with the latest SAMtools releases.

Results: We have added new Ruby classes for the MPileup and Variant Call Format (VCF) data formats emitted by SAMtools and introduced further methods for variant analysis, including alternative allele calculation and allele frequency calling for SNPs. Our new implementation of bio-samtools also ensures that all the functionality of the SAMtools library is now supported and that bio-samtools can be easily extended to include future changes in SAMtools. bio-samtools 2 also provides methods that allow the user to directly produce visualizations of alignment data.
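
As an illustration of what allele frequency calling from pileup data involves, the following is a minimal sketch in plain Ruby. It does not use the bio-samtools API: the helper name `allele_frequencies` and the example read-bases string are assumptions made for this illustration, and the parsing follows the standard SAMtools pileup read-bases encoding (`.` and `,` for reference matches, letters for mismatches, `^` plus a quality character for read starts, `$` for read ends, `+n`/`-n` for indels).

```ruby
# Hypothetical helper for illustration only; not part of bio-samtools.
# Returns a hash mapping each observed base to its fraction of the
# base calls at one pileup position.
def allele_frequencies(ref_base, read_bases)
  counts = Hash.new(0)
  i = 0
  while i < read_bases.length
    ch = read_bases[i]
    case ch
    when '^'
      # Read start: skip '^' and the mapping-quality character after it.
      i += 2
      next
    when '+', '-'
      # Indel: skip the sign, the length digits, and the indel sequence.
      len_str = read_bases[(i + 1)..-1][/\A\d+/].to_s
      i += 1 + len_str.length + len_str.to_i
      next
    when '.', ','
      # Match to the reference on the forward or reverse strand.
      counts[ref_base.upcase] += 1
    when /[ACGTNacgtn]/
      # Mismatch: count the observed base regardless of strand.
      counts[ch.upcase] += 1
    end
    # Any other symbol ('$', '*', '<', '>') is ignored.
    i += 1
  end

  total = counts.values.sum
  return {} if total.zero?

  counts.transform_values { |n| n.to_f / total }
end

# Assumed example: reference base 'A' with eight base calls at one position.
freqs = allele_frequencies('A', '.,.,T^].t$+2AG.')
freqs.each { |base, freq| puts format('%s: %.2f', base, freq) }
```

In this sketch the alternative allele at a position would simply be the most frequent non-reference base. A real caller would also take base qualities and strand bias into account.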

Availability and implementation: bio-samtools is available as a BioGem from http://www.biogems.info or as source code from https://github.com/helios/bioruby-samtools under the MIT License.

Contact: dan.maclean@tsl.ac.uk