Coevolution with Bacteriophages Drives Genome-Wide Host Evolution and Constrains the Acquisition of Abiotic-Beneficial Mutations

Studies of antagonistic coevolution between hosts and parasites typically focus on resistance and infectivity traits. However, coevolution could also have genome-wide effects on the hosts due to pleiotropy, epistasis, or selection for evolvability. Here, we investigate these effects in the bacterium Pseudomonas fluorescens SBW25 during approximately 400 generations of evolution in the presence or absence of bacteriophage (coevolution or evolution treatments, respectively). Coevolution resulted in variable phage resistance, lower competitive fitness in the absence of phages, and greater genome-wide divergence both from the ancestor and between replicates, in part due to the evolution of increased mutation rates. Hosts from coevolution and evolution treatments had different suites of mutations. A high proportion of mutations observed in coevolved hosts were associated with a known phage target binding site, the lipopolysaccharide (LPS), and correlated with altered LPS length and phage resistance. Mutations in evolved bacteria were correlated with higher fitness in the absence of phages. However, the benefits of these growth-promoting mutations were completely lost when these bacteria were subsequently coevolved with phages, indicating that they were not beneficial in the presence of resistance mutations (consistent with negative epistasis). Our results show that in addition to affecting genome-wide evolution in loci not obviously linked to parasite resistance, coevolution can also constrain the acquisition of mutations beneficial for growth in the abiotic environment.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

The Effect of Selection Environment on the Probability of Parallel Evolution

Across the great diversity of life, there are many compelling examples of parallel and convergent evolution: similar evolutionary changes arising in independently evolving populations. Parallel evolution is often taken to be strong evidence of adaptation occurring in populations that are highly constrained in their genetic variation. Theoretical models suggest a few potential factors driving the probability of parallel evolution, but experimental tests are needed. In this study, we quantify the degree of parallel evolution in 15 replicate populations of Pseudomonas fluorescens evolved in five different environments that varied in resource type and arrangement. We identified repeat changes across multiple levels of biological organization, from phenotype to gene to nucleotide, and tested the impact of 1) selection environment, 2) the degree of adaptation, and 3) the degree of heterogeneity in the environment on the degree of parallel evolution at the gene level. We saw, as expected, that parallel evolution occurred more often between populations evolved in the same environment; however, the extent of parallel evolution varied widely. The degree of adaptation did not significantly explain variation in the extent of parallelism in our system, but the number of available beneficial mutations correlated negatively with parallel evolution. In addition, the degree of parallel evolution was significantly higher in populations evolved in a spatially structured, multiresource environment, suggesting that environmental heterogeneity may be an important factor constraining adaptation. Overall, our results stress the importance of environment in driving parallel evolutionary changes and point to a number of avenues for future work on understanding when evolution is predictable.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

Shared Selective Pressures on Fungal and Human Metabolic Pathways Lead to Divergent yet Analogous Genetic Responses

Reduced metabolic efficiency, toxic intermediate accumulation, and deficits of molecular building blocks, which all stem from disruptions of flux through metabolic pathways, reduce organismal fitness. Although these represent shared selection pressures across organisms, the genetic signatures of the responses to them may differ. In fungi, a frequently observed signature is the physical linkage of genes from the same metabolic pathway. In contrast, human metabolic genes are rarely tightly linked; rather, they tend to show tissue-specific coexpression. We hypothesized that the physical linkage of fungal metabolic genes and the tissue-specific coexpression of human metabolic genes are divergent yet analogous responses to the range of selective pressures imposed by disruptions of flux. To test this, we examined the degree to which the human homologs of physically linked metabolic genes in fungi (fungal linked homologs or FLOs) are coexpressed across six human tissues. We found that FLOs are significantly more correlated in their expression profiles across human tissues than other metabolic genes. We obtained similar results in analyses of the same six tissues from chimps, gorillas, orangutans, and macaques. We suggest that when selective pressures remain stable across large evolutionary distances, evidence of selection in a given evolutionary lineage can become a highly reliable predictor of the signature of selection in another, even though the specific adaptive response in each lineage is markedly different.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

Discovery of Novel Genes Derived from Transposable Elements Using Integrative Genomic Analysis

Complex eukaryotes contain millions of transposable elements (TEs), comprising large fractions of their nuclear genomes. TEs consist of structural, regulatory, and coding sequences that are ordinarily associated with transposition, but that occasionally confer on the organism a selective advantage and may thereby become exapted. Exapted transposable element genes (ETEs) are known to play critical roles in diverse systems, from vertebrate adaptive immunity to plant development. Yet despite their evident importance, most ETEs have been identified fortuitously and few systematic searches have been conducted, suggesting that additional ETEs may await discovery. To explore this possibility, we develop a comprehensive systematic approach to searching for ETEs. We use TE-specific conserved domains to identify with high precision genes derived from TEs and screen them for signatures of exaptation based on their similarities to reference sets of known ETEs, conventional (non-TE) genes, and TE genes across diverse genetic attributes including repetitiveness, conservation of genomic location and sequence, and levels of expression and repressive small RNAs. Applying this approach in the model plant Arabidopsis thaliana, we discover a surprisingly large number of novel high confidence ETEs. Intriguingly, unlike known plant ETEs, several of the novel ETE families form tandemly arrayed gene clusters, whereas others are relatively young. Our results not only identify novel TE-derived genes that may have practical applications but also challenge the notion that TE exaptation is merely a relic of ancient life, instead suggesting that it may continue to fundamentally drive evolution.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

Positive Selection Drives Preferred Segment Combinations during Influenza Virus Reassortment

Influenza A virus (IAV) has a segmented genome that allows for the exchange of genome segments between different strains. This reassortment accelerates evolution by breaking linkage, helping IAV cross species barriers to potentially create highly virulent strains. Challenges associated with monitoring the process of reassortment in molecular detail have limited our understanding of its evolutionary implications. We applied a novel deep sequencing approach with quantitative analysis to assess the in vitro temporal evolution of genomic reassortment in IAV. The combination of H1N1 and H3N2 strains reproducibly generated a new H1N2 strain with the hemagglutinin and nucleoprotein segments originating from H1N1 and the remaining six segments from H3N2. By deep sequencing the entire viral genome, we monitored the evolution of reassortment, quantifying the relative abundance of all IAV genome segments from the two parent strains over time and measuring the selection coefficients of the reassorting segments. Additionally, we observed several mutations coemerging with reassortment that were not found during passaging of pure parental IAV strains. Our results demonstrate how reassortment of the segmented genome can accelerate viral evolution in IAV, potentially enabled by the emergence of a small number of individual mutations.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

Model-Based Verification of Hypotheses on the Origin of Modern Japanese Revisited by Bayesian Inference Based on Genome-Wide SNP Data

Various hypotheses for the peopling of the Japanese archipelago have been proposed, which can be classified into three models: transformation, replacement, and hybridization. In recent years, one of the hybridization models ("dual-structure model") has been widely accepted. According to this model, Neolithic hunter-gatherers known as Jomon, who are assumed to have originated in southeast Asia and lived in the Japanese archipelago more than 10,000 years ago, admixed with an agricultural people known as Yayoi, who were migrants from the East Asian continent 2,000–3,000 years ago. Meanwhile, some anthropologists propose instead that morphological differences between the Jomon and Yayoi people can be explained by microevolution following the lifestyle change. To resolve this controversy, we compared three demographic models by approximate Bayesian computation using genome-wide single nucleotide polymorphism (gwSNP) data from the Ainu people, who are thought to be direct descendants of indigenous Jomon. If we assume Chinese people sampled in Beijing from HapMap have the same ancestry as Yayoi, then the hybridization model is predicted to be 29 and 63 times more likely than the replacement and transformation models, respectively. Furthermore, our data provide strong support for a model in which the Jomon lineages had population structure diversified in local areas before the admixture event. Initial divergence between the Jomon and Yayoi ancestries was dated to the late Pleistocene, followed by the divergence of Jomon lineages in the early Holocene. These results suggest that gwSNP data provide a detailed picture of the complex hybridization model for Japanese population history.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

A "Developmental Hourglass" in Fungi

The "developmental hourglass" concept suggests that intermediate developmental stages are most resistant to evolutionary changes and that differences between species arise through divergence later in development. This high conservation during middevelopment is illustrated by the "waist" of the hourglass and it represents a low probability of evolutionary change. Earlier molecular surveys both on animals and on plants have shown that the genes expressed at the waist stage are more ancient and more conserved in their expression. The existence of such a developmental hourglass has not been explored in fungi, another eukaryotic kingdom. In this study, we generated a series of transcriptomic data covering the entire lifecycle of a model mushroom-forming fungus, Coprinopsis cinerea, and we observed a molecular hourglass over its development. The "young fruiting body" is the stage that expresses the evolutionarily oldest (lowest transcriptome age index) transcriptome and gives the strongest signal of purifying selection (lowest transcriptome divergence index). We also demonstrated that all three kingdoms—animals, plants, and fungi—display high expression levels of genes in "information storage and processing" at the waist stages, whereas the genes in "metabolism" become more highly expressed later. Besides, the three kingdoms all show underrepresented "signal transduction mechanisms" at the waist stages. The synchronic existence of a molecular "hourglass" across the three kingdoms reveals a mutual strategy for eukaryotes to incorporate evolutionary innovations.
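The abstract refers to the transcriptome age index (TAI) without defining it. As a minimal sketch, assuming the commonly used definition (the expression-weighted mean phylostratum of all genes at a stage, where a lower phylostratum means an older gene), the index could be computed as below. The gene ages and expression values are hypothetical toy numbers, not data from the study.

```python
def transcriptome_age_index(phylostrata, expression):
    """Expression-weighted mean phylostratum for one developmental stage.

    phylostrata: gene ages as integers (1 = oldest); assumed encoding.
    expression:  expression level of each gene at this stage.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("total expression must be positive")
    return sum(ps * e for ps, e in zip(phylostrata, expression)) / total


# Hypothetical toy data: three genes measured at three stages.
phylostrata = [1, 2, 5]
stage_expression = {
    "early": [10.0, 10.0, 30.0],
    "waist": [40.0, 10.0, 5.0],
    "late": [10.0, 10.0, 40.0],
}

tai = {
    stage: transcriptome_age_index(phylostrata, levels)
    for stage, levels in stage_expression.items()
}
oldest_stage = min(tai, key=tai.get)  # stage with the lowest TAI
```

In this toy example the "waist" stage is dominated by the oldest gene, so it has the lowest index, which is the pattern an hourglass would predict.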

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

Evolution of an Ancient Venom: Recognition of a Novel Family of Cnidarian Toxins and the Common Evolutionary Origin of Sodium and Potassium Neurotoxins in Sea Anemone

Despite Cnidaria (sea anemones, corals, jellyfish, and hydroids) being the oldest venomous animal lineage, structure–function relationships, phyletic distributions, and the molecular evolutionary regimes of toxins encoded by these intriguing animals are poorly understood. Hence, we have comprehensively elucidated the phylogenetic and molecular evolutionary histories of pharmacologically characterized cnidarian toxin families, including peptide neurotoxins (voltage-gated Na+ and K+ channel-targeting toxins: NaTxs and KTxs, respectively), pore-forming toxins (actinoporins, aerolysin-related toxins, and jellyfish toxins), and the newly discovered small cysteine-rich peptides (SCRiPs). We show that despite long evolutionary histories, most cnidarian toxins remain conserved under the strong influence of negative selection—a finding that is in striking contrast to the rapid evolution of toxin families in evolutionarily younger lineages, such as cone snails and advanced snakes. In contrast to the previous suggestions that implicated SCRiPs in the biomineralization process in corals, we demonstrate that they are potent neurotoxins that are likely involved in the envenoming function, and thus represent the first family of neurotoxins from corals. We also demonstrate the common evolutionary origin of type III KTxs and NaTxs in sea anemones. We show that type III KTxs have evolved from NaTxs under the regime of positive selection, and likely represent a unique evolutionary innovation of the Actinioidea lineage. We report a correlation between the accumulation of episodically adaptive sites and the emergence of novel pharmacological activities in this rapidly evolving neurotoxic clade.

  • Molecular Biology and Evolution
  • 10 years ago
  • Discoveries

The Effects of Partitioning on Phylogenetic Inference

Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice.

  • Molecular Biology and Evolution
  • 10 years ago
  • Methods

Reconstructing (Super)Trees from Data Sets with Missing Distances: Not All Is Lost

The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make best use of this data wealth, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data. Here, we introduce Lasso, a novel heuristic approach for reconstructing rooted phylogenetic trees from distance matrices with missing values, for data sets where a molecular clock may be assumed. Contrary to other phylogenetic methods on partial data sets, Lasso possesses desirable properties such as its reconstructed trees being both unique and edge-weighted. These properties are achieved by Lasso restricting its leaf set to a large subset of all possible taxa, which in many practical situations is the entire taxa set. Furthermore, the Lasso approach is distance-based, rendering it very fast to run and suitable for data sets of all sizes, including large data sets such as those generated by modern Next Generation Sequencing technologies. To better understand the performance of Lasso, we assessed it by means of artificial and real biological data sets, showing its effectiveness in the presence of missing data. Furthermore, by formulating the supermatrix problem as a particular case of the missing data problem, we assessed Lasso’s ability to reconstruct supertrees. We demonstrate that, although not specifically designed for such a purpose, Lasso performs better than or comparably with five leading supertree algorithms on a challenging biological data set. Finally, we make freely available a software implementation of Lasso so that researchers may, for the first time, perform both rooted tree and supertree reconstruction with branch lengths on their own partial data sets.

  • Molecular Biology and Evolution
  • 10 years ago
  • Methods

Evolution of tRNA Repertoires in Bacillus Inferred with OrthoAlign

OrthoAlign, an algorithm for the gene order alignment problem (alignment of orthologs), accounting for most genome-wide evolutionary events such as duplications, losses, rearrangements, and substitutions, was presented. OrthoAlign was used in a phylogenetic framework to infer the evolution of transfer RNA repertoires of 50 fully sequenced bacteria in the Bacillus genus. A prevalence of gene duplications and losses over rearrangement events was observed. The average rate of duplications inferred in Bacillus was 24 times lower than the one reported in Escherichia coli, whereas the average rates of losses and inversions were both 12 times lower. These rates were extremely low, suggesting a strong selective pressure acting on tRNA gene repertoires in Bacillus. An exhaustive analysis of the type, location, distribution, and length of evolutionary events was provided, together with ancestral configurations. OrthoAlign can be downloaded at: http://www.iro.umontreal.ca/~mabrouk/.

  • Molecular Biology and Evolution
  • 10 years ago
  • Methods

Identification of Lineage-Specific Cis-Regulatory Modules Associated with Variation in Transcription Factor Binding and Chromatin Activity Using Ornstein-Uhlenbeck Models

Scoring the impact of noncoding variation on the function of cis-regulatory regions, on their chromatin state, and on the qualitative and quantitative expression levels of target genes is a fundamental problem in evolutionary genomics. A particular challenge is how to model the divergence of quantitative traits and to identify relationships between the changes across the different levels of the genome, the chromatin activity landscape, and the transcriptome. Here, we examine the use of the Ornstein–Uhlenbeck (OU) model to infer selection at the level of predicted cis-regulatory modules (CRMs), and link these with changes in transcription factor binding and chromatin activity. Using publicly available cross-species ChIP-Seq and STARR-Seq data we show how OU can be applied genome-wide to identify candidate transcription factors for which binding site and CRM turnover is correlated with changes in regulatory activity. Next, we profile open chromatin in the developing eye across three Drosophila species. We identify the recognition motifs of the chromatin remodelers, Trithorax-like and Grainyhead as mostly correlating with species-specific changes in open chromatin. In conclusion, we show in this study that CRM scores can be used as quantitative traits and that motif discovery approaches can be extended towards more complex models of divergence.

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

Integrated Stochastic Model of DNA Damage Repair by Non-homologous End Joining and p53/p21-Mediated Early Senescence Signalling

by David W. P. Dolan, Anze Zupanic, Glyn Nelson, Philip Hall, Satomi Miwa, Thomas B. L. Kirkwood, Daryl P. Shanley

Unrepaired or inaccurately repaired DNA damage can lead to a range of cell fates, such as apoptosis, cellular senescence or cancer, depending on the efficiency and accuracy of DNA damage repair and on the downstream DNA damage signalling. DNA damage repair and signalling have been studied and modelled in detail separately, but it is not yet clear how they integrate with one another to control cell fate. In this study, we have created an integrated stochastic model of DNA damage repair by non-homologous end joining and of gamma irradiation-induced cellular senescence in human cells that are not apoptosis-prone. The integrated model successfully explains the changes that occur in the dynamics of DNA damage repair after irradiation. Simulations of p53/p21 dynamics after irradiation agree well with previously published experimental studies, further validating the model. Additionally, the model predicts, and we offer some experimental support, that low-dose fractionated irradiation of cells leads to temporal patterns in p53/p21 that lead to significant cellular senescence. The integrated model is valuable for studying the processes of DNA damage induced cell fate and predicting the effectiveness of DNA damage related medical interventions at the cellular level.

  • PLOS Computational Biology
  • 10 years ago

Summary of the DREAM8 Parameter Estimation Challenge: Toward Parameter Identification for Whole-Cell Models

by Jonathan R. Karr, Alex H. Williams, Jeremy D. Zucker, Andreas Raue, Bernhard Steiert, Jens Timmer, Clemens Kreutz, DREAM8 Parameter Estimation Challenge Consortium, Simon Wilkinson, Brandon A. Allgood, Brian M. Bot, Bruce R. Hoff, Michael R. Kellen, Markus W. Covert, Gustavo A. Stolovitzky, Pablo Meyer

Whole-cell models that explicitly represent all cellular components at the molecular level have the potential to predict phenotype from genotype. However, even for simple bacteria, whole-cell models will contain thousands of parameters, many of which are poorly characterized or unknown. New algorithms are needed to estimate these parameters and enable researchers to build increasingly comprehensive models. We organized the Dialogue for Reverse Engineering Assessments and Methods (DREAM) 8 Whole-Cell Parameter Estimation Challenge to develop new parameter estimation algorithms for whole-cell models. We asked participants to identify a subset of parameters of a whole-cell model given the model’s structure and in silico “experimental” data. Here we describe the challenge, the best performing methods, and new insights into the identifiability of whole-cell models. We also describe several valuable lessons we learned toward improving future challenges. Going forward, we believe that collaborative efforts supported by inexpensive cloud computing have the potential to solve whole-cell model parameter estimation.

  • PLOS Computational Biology
  • 10 years ago

Experimental and Computational Analysis of a Large Protein Network That Controls Fat Storage Reveals the Design Principles of a Signaling Network

by Bader Al-Anzi, Patrick Arpp, Sherif Gerges, Christopher Ormerod, Noah Olsman, Kai Zinn

An approach combining genetic, proteomic, computational, and physiological analysis was used to define a protein network that regulates fat storage in budding yeast (Saccharomyces cerevisiae). A computational analysis of this network shows that it is not scale-free, and is best approximated by the Watts-Strogatz model, which generates “small-world” networks with high clustering and short path lengths. The network is also modular, containing energy level sensing proteins that connect to four output processes: autophagy, fatty acid synthesis, mRNA processing, and MAP kinase signaling. The importance of each protein to network function is dependent on its Katz centrality score, which is related both to the protein’s position within a module and to the module’s relationship to the network as a whole. The network is also divisible into subnetworks that span modular boundaries and regulate different aspects of fat metabolism. We used a combination of genetics and pharmacology to simultaneously block output from multiple network nodes. The phenotypic results of this blockage define patterns of communication among distant network nodes, and these patterns are consistent with the Watts-Strogatz model.
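The abstract does not say how the Katz centrality scores were computed. As a rough illustration only, the standard definition (each node's score is a baseline `beta` plus `alpha` times the sum of its neighbours' scores) can be evaluated by simple fixed-point iteration. The five-node network, the node names, and the parameter values below are hypothetical and are not taken from the study.

```python
def katz_centrality(neighbors, alpha=0.1, beta=1.0, iterations=100):
    """Katz centrality by fixed-point iteration on an undirected graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    alpha must be smaller than 1 / (largest adjacency eigenvalue) for the
    iteration to converge; 0.1 is safe here because the maximum degree is 3.
    """
    scores = {node: 0.0 for node in neighbors}
    for _ in range(iterations):
        scores = {
            node: alpha * sum(scores[n] for n in neighbors[node]) + beta
            for node in neighbors
        }
    return scores


# Hypothetical toy network: hub "A" with a short chain hanging off "D".
network = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "E"],
    "E": ["D"],
}

scores = katz_centrality(network)
most_central = max(scores, key=scores.get)
```

Because the score accumulates contributions from neighbours of neighbours, a node's rank depends on its position in the wider network and not only on its own degree.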

  • PLOS Computational Biology
  • 10 years ago

Predicting Cortical Dark/Bright Asymmetries from Natural Image Statistics and Early Visual Transforms

by Emily A. Cooper, Anthony M. Norcia

The nervous system has evolved in an environment with structure and predictability. One of the ubiquitous principles of sensory systems is the creation of circuits that capitalize on this predictability. Previous work has identified predictable non-uniformities in the distributions of basic visual features in natural images that are relevant to the encoding tasks of the visual system. Here, we report that the well-established statistical distributions of visual features -- such as visual contrast, spatial scale, and depth -- differ between bright and dark image components. Following this analysis, we go on to trace how these differences in natural images translate into different patterns of cortical input that arise from the separate bright (ON) and dark (OFF) pathways originating in the retina. We use models of these early visual pathways to transform natural images into statistical patterns of cortical input. The models include the receptive fields and non-linear response properties of the magnocellular (M) and parvocellular (P) pathways, with their ON and OFF pathway divisions. The results indicate that there are regularities in visual cortical input beyond those that have previously been appreciated from the direct analysis of natural images. In particular, several dark/bright asymmetries provide a potential account for recently discovered asymmetries in how the brain processes visual features, such as violations of classic energy-type models. On the basis of our analysis, we expect that the dark/bright dichotomy in natural images plays a key role in the generation of both cortical and perceptual asymmetries.

  • PLOS Computational Biology
  • 10 years ago

Predicted Role of NAD Utilization in the Control of Circadian Rhythms during DNA Damage Response

by Augustin Luna, Geoffrey B. McFadden, Mirit I. Aladjem, Kurt W. Kohn

The circadian clock is a set of regulatory steps that oscillate with a period of approximately 24 hours, influencing many biological processes. These oscillations are robust to external stresses, and in the case of genotoxic stress (i.e. DNA damage), the circadian clock responds through phase shifting with primarily phase advancements. The effect of DNA damage on the circadian clock and the mechanism through which this effect operates remain to be thoroughly investigated. Here we build an in silico model to examine damage-induced circadian phase shifts by investigating a possible mechanism linking circadian rhythms to metabolism. The proposed model involves two DNA damage response proteins, SIRT1 and PARP1, that are each consumers of nicotinamide adenine dinucleotide (NAD), a metabolite involved in oxidation-reduction reactions and in ATP synthesis. This model builds on two key findings: 1) that SIRT1 (a protein deacetylase) is involved in both the positive (i.e. transcriptional activation) and negative (i.e. transcriptional repression) arms of the circadian regulation and 2) that PARP1 is a major consumer of NAD during the DNA damage response. In our simulations, we observe that increased PARP1 activity may be able to trigger SIRT1-induced circadian phase advancements by decreasing SIRT1 activity through competition for NAD supplies. We show how this competitive inhibition may operate through protein acetylation in conjunction with phosphorylation, consistent with reported observations. These findings suggest a possible mechanism through which multiple perturbations, each dominant during different points of the circadian cycle, may result in the phase advancement of the circadian clock seen during DNA damage.

  • PLOS Computational Biology
  • 10 years ago

Inferring Horizontal Gene Transfer

by Matt Ravenhall, Nives Škunca, Florent Lassalle, Christophe Dessimoz

Horizontal or Lateral Gene Transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages [1]. Computational identification of HGT events relies upon the investigation of sequence composition or evolutionary history of genes. Sequence composition-based ("parametric") methods search for deviations from the genomic average, whereas evolutionary history-based ("phylogenetic") approaches identify genes whose evolutionary history significantly differs from that of the host species. The evaluation and benchmarking of HGT inference methods typically rely upon simulated genomes, for which the true history is known. On real data, different methods tend to infer different HGT events, and as a result it can be difficult to ascertain all but simple and clear-cut HGT events.
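To make the "parametric" idea concrete, here is a deliberately simplified sketch that flags genome windows whose GC content deviates strongly from the genomic average. GC content is only one of several composition signals, and the window size, the z-score cutoff, and the toy sequence are all assumptions chosen for illustration; this is not a method described in the abstract.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=20, z_cutoff=2.0):
    """Return (start, gc) for non-overlapping windows with unusual GC content.

    A window is flagged when its GC content lies more than z_cutoff
    standard deviations from the mean over all windows.
    """
    starts = range(0, len(genome) - window + 1, window)
    gcs = [gc_content(genome[s:s + window]) for s in starts]
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    return [(s, gc) for s, gc in zip(starts, gcs) if abs(gc - mu) / sigma > z_cutoff]


# Hypothetical toy genome: AT-rich background with one GC-rich island.
toy_genome = "AT" * 100 + "GC" * 10 + "AT" * 100
flagged = atypical_windows(toy_genome)
```

On the toy sequence only the window starting at position 200 is flagged. Real genomes are far noisier, which is one reason different methods tend to disagree on all but the clearest events.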

  • PLOS Computational Biology
  • 10 years ago

A novel termini analysis theory using HTS data alone for the identification of Enterococcus phage EF4-like genome termini

Background: Enterococcus faecalis and Enterococcus faecium are typical enterococcal bacterial pathogens. Because of antibiotic resistance, the identification of novel E. faecalis and E. faecium phages against antibiotic-resistant Enterococcus has an important impact on public health. In this study, the E. faecalis phage IME-EF4, E. faecium phage IME-EFm1, and both their hosts were antibiotic resistant. To characterize the genome termini of these two phages, a termini analysis theory was developed to provide a wealth of terminal sequence information directly, using only high-throughput sequencing (HTS) read frequency statistics. Results: The complete genome sequences of phages IME-EF4 and IME-EFm1 were determined, and our termini analysis theory was used to determine the genome termini of these two phages. Analysis of HTS read frequencies showed 9 nt 3′ protruding cohesive ends in both the IME-EF4 and IME-EFm1 genomes. For the positive strands of their genomes, the 9 nt 3′ protruding cohesive ends are 5′-TCATCACCG-3′ (IME-EF4) and 5′-GGGTCAGCG-3′ (IME-EFm1). Further experiments confirmed these results. These experiments included mega-primer polymerase chain reaction sequencing, terminal run-off sequencing, and adaptor ligation followed by run-off sequencing. Conclusion: Using this termini analysis theory, the termini of two newly isolated antibiotic-resistant Enterococcus phages, IME-EF4 and IME-EFm1, were identified as a byproduct of HTS. Molecular biology experiments confirmed the identification. Because it does not require time-consuming wet lab termini analysis experiments, the termini analysis theory is a fast and easy means of identifying phage DNA genome termini using HTS read frequency statistics alone. It may aid understanding of phage DNA packaging.
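The abstract does not spell out the statistics behind the termini analysis. One plausible reading, sketched here purely as an assumption, is that reads beginning exactly at a fixed genome terminus are strongly over-represented relative to reads beginning elsewhere. The function name, the fold threshold, and the simulated read starts are hypothetical and stand in for the authors' actual procedure.

```python
from collections import Counter


def candidate_termini(read_starts, fold=10.0):
    """Return start positions whose read count far exceeds the average.

    read_starts: iterable of genome positions where mapped reads begin.
    fold:        how many times the mean per-position count a position
                 must reach to be reported (assumed threshold).
    """
    counts = Counter(read_starts)
    if not counts:
        return []
    average = sum(counts.values()) / len(counts)
    return sorted(pos for pos, n in counts.items() if n >= fold * average)


# Simulated toy data: uniform background plus a pile-up at position 42.
toy_starts = list(range(100)) + [42] * 200
termini = candidate_termini(toy_starts)
```

In practice each strand would be analysed separately, and any candidate would still need the kind of experimental confirmation the abstract describes.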

  • BMC Genomics 2015, null:414
  • 10 years ago
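The abstract above identifies genome termini from HTS read frequency statistics alone. The following is a minimal sketch of that general idea, not the authors' method: it assumes that reads beginning at a fixed genome terminus are strongly over-represented among read starts. The function names, the k-mer length `k`, the cut-off `min_ratio`, and the toy reads are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def read_start_frequencies(reads, k=9):
    """Count how often each k-mer occurs as the first k bases of a read."""
    return Counter(read[:k] for read in reads if len(read) >= k)


def candidate_terminus(reads, k=9, min_ratio=10.0):
    """Return (kmer, count, ratio) for the most frequent read start, or None.

    ratio is the top count divided by the mean count of all other observed
    start k-mers. min_ratio is a hypothetical cut-off, not a published value.
    """
    counts = read_start_frequencies(reads, k)
    if len(counts) < 2:
        return None  # nothing to compare the top start against
    (top_kmer, top_count), *rest = counts.most_common()
    mean_other = sum(count for _, count in rest) / len(rest)
    ratio = top_count / mean_other
    if ratio < min_ratio:
        return None  # no start is clearly over-represented
    return top_kmer, top_count, ratio


# Toy data (synthetic, not from the paper): 50 reads share one start,
# four other reads start elsewhere.
toy_reads = ["ACGTTGCAAGGTA"] * 50 + [
    "CCGTACGTACGTA",
    "GGGTTTAAACCCG",
    "CCCAAATTTGGGA",
    "TTTGGGCCCAAAT",
]
result = candidate_terminus(toy_reads)
```

On real data the reads would come from a FASTQ file, and both strands would need to be examined; the sketch leaves those steps out.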

Unravelling the genome of Holy basil: an “incomparable” “elixir of life” of traditional Indian medicine

Background: Ocimum sanctum L. (O. tenuiflorum), of the family Lamiaceae, is an important component of the Indian tradition of medicine as well as of culture around the world, and hence is known as “Holy basil” in India. This plant is mentioned in the ancient texts of Ayurveda as an “elixir of life” (life-saving) herb and has been worshipped for over 3000 years for its healing properties. Although it is used for various ailments, validation of molecules for differential activities is yet to be fully analyzed, as about 80 % of the patents on this plant are on extracts or plant parts and mainly focus on essential oil components. To understand the full metabolic potential of this plant, whole nuclear and chloroplast genomes were sequenced for the first time, combining the sequence data from four libraries and three NGS platforms. Results: The saturated draft assembly of the genome was about 386 Mb, along with the plastid genome of 142,245 bp, turning out to be the smallest in Lamiaceae. In addition to SSR markers, 136 proteins were identified as homologous to five important plant genomes. Pathway analysis indicated an abundance of phenylpropanoids in O. sanctum. Phylogenetic analysis of the chloroplast proteome placed Salvia miltiorrhiza as the nearest neighbor. Comparison of the chemical compounds and gene availability in O. sanctum and S. miltiorrhiza indicated the potential for the discovery of new active molecules. Conclusion: The genome sequence and annotation of O. sanctum provide new insights into the function of genes and the medicinal nature of the metabolites synthesized in this plant. This information is highly beneficial for mining biosynthetic pathways for important metabolites in related species.

  • BMC Genomics 2015, null:413
  • 10 years ago

Polyketide synthesis genes associated with toxin production in two species of Gambierdiscus (Dinophyceae)

Background: Marine microbial protists, in particular dinoflagellates, produce polyketide toxins with ecosystem-wide and human health impacts. Species of Gambierdiscus produce the polyether ladder compounds ciguatoxins and maitotoxins, which can lead to ciguatera fish poisoning, a serious human illness associated with reef fish consumption. Genes associated with the biosynthesis of polyether ladder compounds are yet to be elucidated; however, stable isotope feeding studies of such compounds consistently support their polyketide origin, indicating that polyketide synthases are involved in their biosynthesis. Results: Here, we report the toxicity, genome size, gene content and transcriptome of Gambierdiscus australes and G. belizeanus. G. australes produced maitotoxin-1 and maitotoxin-3, while G. belizeanus produced maitotoxin-3, for which cell extracts were toxic to mice by IP injection (LD50 = 3.8 mg kg⁻¹). The gene catalogues comprised 83,353 and 84,870 unique contigs, with genome sizes of 32.5 ± 3.7 Gbp and 35 ± 0.88 Gbp, respectively, and are amongst the most comprehensive yet reported from a dinoflagellate. We found three hundred and six genes involved in polyketide biosynthesis, including one hundred and ninety-two ketoacyl synthase transcripts, which formed five unique phylogenetic clusters. Conclusions: Two clusters were unique to these maitotoxin-producing dinoflagellate species, suggesting that they may be associated with maitotoxin biosynthesis. This work represents a significant step forward in our understanding of the genetic basis of polyketide production in dinoflagellates, in particular in the species responsible for ciguatera fish poisoning.

  • BMC Genomics 2015, null:410
  • 10 years ago

Ambivalent covariance models

Background: Evolutionary variations let us define a set of similar nucleic acid sequences as a family if these different molecules execute a common function. Capturing their sequence variation by using, e.g., position-specific scoring matrices significantly improves the sensitivity of detection tools. Members of a functional (non-coding) RNA family are affected by these variations not only on the sequence level, but also on the structural level. For example, some transfer RNAs exhibit a fifth helix in addition to the typical cloverleaf structure. Current covariance models – the unrivaled homology search approach for structured RNA – do not benefit from structural variation within a family, but rather penalize it. This leads to artificial subdivision of families and loss of information in the Rfam database. Results: We propose an extension to the fundamental architecture of covariance models to allow for several compatible consensus structures. The resulting models are called ambivalent covariance models. Evaluation on several Rfam families shows that coalescence of structural variation within a family by using ambivalent consensus models is superior to subdividing the family into multiple classical covariance models. Conclusion: A prototype and source code are available at http://bibiserv.cebitec.uni-bielefeld.de/acms.

  • BMC Bioinformatics 2015, null:178
  • 10 years ago

Learning-guided automatic three dimensional synapse quantification for drosophila neurons

Background: The subcellular distribution of synapses is fundamentally important for the assembly, function, and plasticity of the nervous system. Automated and effective quantification tools are a prerequisite to large-scale studies of the molecular mechanisms of subcellular synapse distribution. Common practices for synapse quantification in neuroscience labs remain largely manual or semi-manual. This is mainly due to computational challenges in automatic quantification of synapses, including large volume, high dimensions and staining artifacts. In the case of confocal imaging, the optical limit and xy-z resolution disparity also require special consideration to achieve the necessary robustness. Results: A novel algorithm is presented for learning-guided automatic recognition and quantification of synaptic markers in 3D confocal images. The method builds a discriminative model based on 3D feature descriptors that detects the centers of synaptic markers. It uses adaptive thresholding and multi-channel co-localization to improve robustness. The detected markers then guide the splitting of synapse clumps, which further improves the precision and recall of the detected synapses. The algorithms were tested on lobula plate tangential cells (LPTCs) in the brain of Drosophila melanogaster, for GABAergic synaptic markers on axon terminals as well as dendrites. Conclusions: The presented method was able to overcome the staining artifacts and the fuzzy boundaries of synapse clumps in 3D confocal images, and to automatically quantify synaptic markers in a complex neuron such as an LPTC. Comparison with some existing tools used in automatic 3D synapse quantification also demonstrated the effectiveness of the proposed method.

  • BMC Bioinformatics 2015, null:177
  • 10 years ago

MANTIS: an R package that simulates multilocus models of pathogen evolution

Background: In host-pathogen systems the development of immunity by the host places pressure on pathogens, by setting up competition between genetic variants due to the establishment of cross-protective responses. These pressures can lead to pathogen-specific, ubiquitous dynamic behaviours. Understanding the evolutionary forces that shape these patterns is one of the key goals of computationally simulated epidemiological models. Despite the contribution of such research methods in recent years to our current understanding of pathogen evolution, free software tools for the general public remain scarce. Results: We developed the Multilocus ANTIgenic Simulator (MANTIS) software package for the R statistical environment. MANTIS can simulate and analyse epidemiological time-series generated under the biological assumptions of the strain theory of host-pathogen systems by Gupta et al. Conclusions: MANTIS wraps a C/C++ ordinary differential equation system and Runge-Kutta solver into a set of user-friendly R functions. These include routines to numerically simulate the system and others to analyse, visualize and export results. For this, the package offers its own set of time-series plotting and export functions. MANTIS’s main goal is to serve as a free, ready-to-use academic software tool. Its open source nature further provides an opportunity for users with advanced programming skills to expand its capabilities. Here, we describe the background theory, implementation, basic functionality and usage of this package. MANTIS is freely available from http://www.eeid.ox.ac.uk/mantis under the GPL license.

  • BMC Bioinformatics 2015, null:176
  • 10 years ago
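The MANTIS abstract above mentions a Runge-Kutta solver applied to a system of ordinary differential equations. As a rough illustration of that kind of numerical integration, here is a minimal sketch in Python. It is not MANTIS code and does not reproduce its API or its multilocus strain model: the classical fourth-order scheme, the simple SIR system used as a stand-in, and every function and parameter name are assumptions made for the example.

```python
def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def sir(beta, gamma):
    """Stand-in model (hypothetical, not the strain model used by MANTIS)."""
    def f(t, y):
        s, i, r = y
        return [-beta * s * i, beta * s * i - gamma * i, gamma * i]
    return f


def simulate(f, y0, t_end, h):
    """Integrate from t = 0 to t_end with fixed step h; return (t, y) pairs."""
    n_steps = round(t_end / h)
    y = list(y0)
    series = [(0.0, y)]
    for step in range(n_steps):
        y = rk4_step(f, step * h, y, h)
        series.append(((step + 1) * h, y))
    return series


# Illustrative parameters only: transmission 0.5, recovery 0.1, 1 % infected.
epidemic = simulate(sir(0.5, 0.1), [0.99, 0.01, 0.0], t_end=50.0, h=0.1)
t_final, (s_final, i_final, r_final) = epidemic[-1]
```

A fixed step size keeps the sketch short; a production solver would more likely use adaptive step control.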