Data-dependent bucketing improves reference-free compression of sequencing reads

Motivation: The storage and transmission of high-throughput sequencing data consumes significant resources. As our capacity to produce such data continues to increase, this burden will only grow. One approach to reduce storage and transmission requirements is to compress this sequencing data.

Results: We present a novel technique to boost the compression of sequencing reads, based on the concept of bucketing similar reads so that they appear nearby in the file. We demonstrate that, by adopting a data-dependent bucketing scheme and employing a number of encoding ideas, we can achieve substantially better compression ratios than existing de novo sequence compression tools, including other bucketing and reordering schemes. Our method, Mince, achieves up to a 45% reduction in file sizes (28% on average) compared with existing state-of-the-art de novo compression schemes.

Availability and implementation: Mince is written in C++11, is open source and has been made available under the GPLv3 license. It is available at http://www.cs.cmu.edu/~ckingsf/software/mince.

Contact: carlk@cs.cmu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
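
The bucketing idea in the Results paragraph can be illustrated with a minimal sketch. It is not Mince's actual algorithm: as a simplifying assumption, each read is assigned to a bucket labelled by its lexicographically smallest k-mer (a minimizer), and buckets are written out in sorted label order so that reads sharing a label end up adjacent. The names `minimizer` and `bucket_reads` are hypothetical.

```python
from collections import defaultdict


def minimizer(read: str, k: int) -> str:
    """Return the lexicographically smallest k-mer of a read.

    Reads shorter than k are used whole as their own label.
    """
    if len(read) < k:
        return read
    return min(read[i:i + k] for i in range(len(read) - k + 1))


def bucket_reads(reads, k=3):
    """Group reads by minimizer and emit the buckets in sorted label order.

    Illustrative stand-in for data-dependent bucketing: reads that share
    a k-mer label become neighbours in the output list.
    """
    buckets = defaultdict(list)
    for read in reads:
        buckets[minimizer(read, k)].append(read)
    return [read for label in sorted(buckets) for read in buckets[label]]


reads = ["ACGTTT", "GGGACG", "TTTTTT", "CCACGT"]
reordered = bucket_reads(reads, k=3)
```

Here three of the four reads share the label `ACG` and are placed next to each other, which is the property a downstream general-purpose compressor can exploit.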

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

pez: phylogenetics for the environmental sciences

Summary: pez is an R package that permits measurement, modelling and simulation of phylogenetic structure in ecological data. pez contains the first implementation of many methods in R, and aggregates existing data structures and methods into a single, coherent package.

Availability and implementation: pez is released under the GPL v3 open-source license, available on the Internet from CRAN (http://cran.r-project.org). The package is under active development, and the authors welcome contributions (see http://github.com/willpearse/pez).

Contact: will.pearse@gmail.com

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

Phylodynamic inference with kernel ABC and its application to HIV epidemiology

The shapes of phylogenetic trees relating virus populations are determined by the adaptation of viruses within each host, and by the transmission of viruses among hosts. Phylodynamic inference attempts to reverse this flow of information, estimating parameters of these processes from the shape of a virus phylogeny reconstructed from a sample of genetic sequences from the epidemic. A key challenge to phylodynamic inference is quantifying the similarity between two trees in an efficient and comprehensive way. In this study, I demonstrate that a new distance measure, based on a subset tree kernel function from computational linguistics, confers a significant improvement over previous measures of tree shape for classifying trees generated under different epidemiological scenarios. Next, I incorporate this kernel-based distance measure into an approximate Bayesian computation (ABC) framework for phylodynamic inference. ABC bypasses the need for an analytical solution of model likelihood, since it only requires the ability to simulate data from the model. I validate this ‘kernel-ABC’ method for phylodynamic inference by estimating parameters from data simulated under a simple epidemiological model. Results indicate that kernel-ABC attained greater accuracy for parameters associated with virus transmission than leading software on the same data sets. Lastly, I apply the kernel-ABC framework to study a recent outbreak of a recombinant HIV subtype in China. Kernel-ABC provides a versatile framework for phylodynamic inference because it can fit a broader range of models than methods that rely on the computation of exact likelihoods.
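
The statement that ABC only requires the ability to simulate data from the model can be made concrete with a minimal rejection-sampling sketch. This is a toy illustration under stated assumptions, not the kernel-ABC method itself: the model is a Gaussian with unknown mean, the prior is uniform on [0, 10], and the distance is the absolute difference of sample means, standing in for the tree-kernel distance described in the abstract. All function names are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model: n draws from a Gaussian with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def distance(a, b):
    """Stand-in distance between datasets (difference of sample means)."""
    return abs(statistics.fmean(a) - statistics.fmean(b))


def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated data fall within eps of the observed data.

    No likelihood is evaluated anywhere; only simulation and comparison.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)          # draw from the (assumed) prior
        simulated = simulate(theta, len(observed), rng)
        if distance(simulated, observed) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(5.0, 50, rng)               # pretend this is the real data
posterior = abc_rejection(observed, 2000, 0.5, rng)
```

The accepted values approximate the posterior for the mean; swapping in a different simulator and distance function changes the model without changing the inference loop.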

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

IgSimulator: a versatile immunosequencing simulator

Motivation: The recent introduction of next generation sequencing technologies to antibody studies has resulted in a growing number of immunoinformatics tools for antibody repertoire analysis. However, benchmarking these newly emerging tools remains problematic since the gold standard datasets that are needed to validate these tools are typically not available.

Results: Since simulating antibody repertoires is often the only feasible way to benchmark new immunoinformatics tools, we developed the IgSimulator tool that addresses various complications in generating realistic antibody repertoires. IgSimulator’s code has a modular structure and can be easily adapted to new simulation requirements.

Availability: IgSimulator is open source and freely available as a C++ and Python program running on all Unix-compatible platforms. The source code is available from yana-safonova.github.io/ig_simulator.

Contact: safonova.yana@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

Identifying high-affinity aptamer ligands with defined cross-reactivity using high-throughput guided systematic evolution of ligands by exponential enrichment

Oligonucleotide aptamers represent a novel platform for creating ligands with desired specificity, and they offer many potentially significant advantages over monoclonal antibodies in terms of feasibility, cost, and clinical applicability. However, the isolation of high-affinity aptamer ligands from random oligonucleotide pools has been challenging. Although high-throughput sequencing (HTS) promises to significantly facilitate systematic evolution of ligands by exponential enrichment (SELEX) analysis, the enormous datasets generated in the process pose new challenges for identifying those rare, high-affinity aptamers present in a given pool. We show that emulsion PCR preserves library diversity, preventing the loss of rare high-affinity aptamers that are difficult to amplify. We also demonstrate the importance of using reference targets to eliminate binding candidates with reduced specificity. Using a combination of bioinformatics and functional analyses, we show that the rate of amplification is more predictive than prevalence with respect to binding affinity and that the mutational landscape within a cluster of related aptamers can guide the identification of high-affinity aptamer ligands. Finally, we demonstrate the power of this selection process for identifying cross-species aptamers that can bind human receptors and cross-react with their murine orthologs.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

DNA3'pp5'G de-capping activity of aprataxin: effect of cap nucleoside analogs and structural basis for guanosine recognition

DNA3'pp5'G caps synthesized by the 3'-PO4/5'-OH ligase RtcB have a strong impact on enzymatic reactions at DNA 3'-OH ends. Aprataxin, an enzyme that repairs A5'pp5'DNA ends formed during abortive ligation by classic 3'-OH/5'-PO4 ligases, is also a DNA 3' de-capping enzyme, converting DNAppG to DNA3'p and GMP. By taking advantage of RtcB's ability to utilize certain GTP analogs to synthesize DNAppN caps, we show that aprataxin hydrolyzes inosine and 6-O-methylguanosine caps, but is not adept at removing a deoxyguanosine cap. We report a 1.5 Å crystal structure of aprataxin in a complex with GMP, which reveals that: (i) GMP binds at the same position and in the same anti nucleoside conformation as AMP; and (ii) aprataxin makes more extensive nucleobase contacts with guanine than with adenine, via a hydrogen bonding network to the guanine O6, N1, N2 base edge. Alanine mutations of catalytic residues His147 and His149 abolish DNAppG de-capping activity, suggesting that the 3' de-guanylylation and 5' de-adenylylation reactions follow the same pathway of nucleotidyl transfer through a covalent aprataxin-(His147)–NMP intermediate. Alanine mutation of Asp63, which coordinates the guanosine ribose hydroxyls, impairs DNAppG de-capping.

  • Nucleic Acids Research
  • 10 years ago
  • Nucleic Acid Enzymes

A modular open platform for systematic functional studies under physiological conditions

Any profound comprehension of gene function requires detailed information about the subcellular localization, molecular interactions and spatio-temporal dynamics of gene products. We developed a multifunctional integrase (MIN) tag for rapid and versatile genome engineering that serves not only as a genetic entry site for the Bxb1 integrase but also as a novel epitope tag for standardized detection and precipitation. For the systematic study of epigenetic factors, including Dnmt1, Dnmt3a, Dnmt3b, Tet1, Tet2, Tet3 and Uhrf1, we generated MIN-tagged embryonic stem cell lines and created a toolbox of prefabricated modules that can be integrated via Bxb1-mediated recombination. We used these functional modules to study protein interactions and their spatio-temporal dynamics as well as gene expression and specific mutations during cellular differentiation and in response to external stimuli. Our genome engineering strategy provides a versatile open platform for efficient generation of multiple isogenic cell lines to study gene function under physiological conditions.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Spring loading a pre-cleavage intermediate for hairpin telomere formation

The Borrelia telomere resolvase, ResT, forms the unusual hairpin telomeres of the linear Borrelia replicons in a process referred to as telomere resolution. Telomere resolution is a DNA cleavage and rejoining reaction that proceeds from a replicated telomere intermediate in a reaction with mechanistic similarities to that catalyzed by type IB topoisomerases. Previous reports have implicated the hairpin-binding module, at the end of the N-terminal domain of ResT, in distorting the DNA between the scissile phosphates so as to promote DNA cleavage and hairpin formation by the catalytic domain. We report that unwinding the DNA between the scissile phosphates, prior to DNA cleavage, is a key cold-sensitive step in telomere resolution. Through the analysis of ResT mutants, rescued by substrate modifications that mimic DNA unwinding between the cleavage sites, we show that formation and/or stabilization of an underwound pre-cleavage intermediate depends upon cooperation of the hairpin-binding module and catalytic domain. The phenotype of the mutants argues that the pre-cleavage intermediate promotes strand ejection to favor the forward reaction and that subsequent hairpin capture is a reversible reaction step. These reaction features are proposed to promote hairpin formation over strand resealing while allowing reversal back to substrate of aborted reactions.

  • Nucleic Acids Research
  • 10 years ago
  • Nucleic Acid Enzymes

Two mechanisms coordinate replication termination by the Escherichia coli Tus-Ter complex

The Escherichia coli replication terminator protein (Tus) binds to Ter sequences to block replication forks approaching from one direction. Here, we used single molecule and transient state kinetics to study responses of the heterologous phage T7 replisome to the Tus–Ter complex. The T7 replisome was arrested at the non-permissive end of Tus–Ter in a manner that is explained by a composite mousetrap and dynamic clamp model. An unpaired C(6) that forms a lock by binding into the cytosine binding pocket of Tus was most effective in arresting the replisome and mutation of C(6) removed the barrier. Isolated helicase was also blocked at the non-permissive end, but unexpectedly the isolated polymerase was not, unless C(6) was unpaired. Instead, the polymerase was blocked at the permissive end. This indicates that the Tus–Ter mechanism is sensitive to the translocation polarity of the DNA motor. The polymerase tracking along the template strand traps the C(6) to prevent lock formation; the helicase tracking along the other strand traps the complementary G(6) to aid lock formation. Our results are consistent with the model where strand separation by the helicase unpairs the GC(6) base pair and triggers lock formation immediately before the polymerase can sequester the C(6) base.

  • Nucleic Acids Research
  • 10 years ago
  • Genome Integrity, Repair and Replication

Dynamics of MBD2 deposition across methylated DNA regions during malignant transformation of human mammary epithelial cells

DNA methylation is thought to induce transcriptional silencing through the combination of two mechanisms: the repulsion of transcriptional activators unable to bind their target sites when methylated, and the recruitment of transcriptional repressors with specific affinity for methylated DNA. The Methyl CpG Binding Domain proteins MeCP2, MBD1 and MBD2 belong to the latter category. Here, we present MBD2 ChIPseq data obtained from the endogenous MBD2 in an isogenic cellular model of oncogenic transformation of human mammary cells. In immortalized (HMEC-hTERT) or transformed (HMLER) cells, MBD2 was found in a large proportion of methylated regions and associated with transcriptional silencing. A redistribution of MBD2 on methylated DNA occurred during oncogenic transformation, frequently independently of local DNA methylation changes. Genes downregulated during HMEC-hTERT transformation preferentially gained MBD2 on their promoter. Furthermore, depletion of MBD2 induced an upregulation of MBD2-bound genes methylated at their promoter regions, in HMLER cells. Among the 3,160 genes downregulated in transformed cells, 380 genes were methylated at their promoter regions in both cell lines, specifically associated with MBD2 in HMLER cells, and upregulated upon MBD2 depletion in HMLER. The transcriptional MBD2-dependent downregulation occurring during oncogenic transformation was also observed in two additional models of mammary cell transformation. Thus, the dynamics of MBD2 deposition across methylated DNA regions was associated with the oncogenic transformation of human mammary cells.

  • Nucleic Acids Research
  • 10 years ago
  • Gene regulation, Chromatin and Epigenetics

Destruction of a distal hypoxia response element abolishes trans-activation of the PAG1 gene mediated by HIF-independent chromatin looping

A crucial step in the cellular adaptation to oxygen deficiency is the binding of hypoxia-inducible factors (HIFs) to hypoxia response elements (HREs) of oxygen-regulated genes. Genome-wide HIF-1α/2α/β DNA-binding studies revealed that the majority of HREs reside distant from the promoter regions, but the function of these distal HREs has only been marginally studied in the genomic context. We used chromatin immunoprecipitation (ChIP), gene editing (TALEN) and chromosome conformation capture (3C) to localize and functionally characterize an 82 kb upstream HRE that solely drives oxygen-regulated expression of the newly identified HIF target gene PAG1. PAG1, a transmembrane adaptor protein involved in Src signalling, was hypoxically induced in various cell lines and mouse tissues. ChIP and reporter gene assays demonstrated that the –82 kb HRE regulates PAG1, but not an equally distant gene further upstream, by direct interaction with HIF. Ablation of the consensus HRE motif abolished the hypoxic induction of PAG1 but not general oxygen signalling. 3C assays revealed that the –82 kb HRE physically associates with the PAG1 promoter region, independent of HIF-DNA interaction. These results demonstrate a constitutive interaction between the –82 kb HRE and the PAG1 promoter, suggesting a physiologically important rapid response to hypoxia.

  • Nucleic Acids Research
  • 10 years ago
  • Gene regulation, Chromatin and Epigenetics

Tailor: a computational framework for detecting non-templated tailing of small silencing RNAs

Small silencing RNAs, including microRNAs, endogenous small interfering RNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs), have been shown to play important roles in fine-tuning gene expression, defending against viruses and controlling transposons. Loss of small silencing RNAs or components in their pathways often leads to severe developmental defects, including lethality and sterility. Recently, non-templated addition of nucleotides to the 3' end, namely tailing, was found to associate with the processing and stability of small silencing RNAs. Next Generation Sequencing has made it possible to detect such modifications at nucleotide resolution and at unprecedented throughput. Unfortunately, detecting such events from millions of short reads confounded by sequencing errors and RNA editing is still a tricky problem. Here, we developed a computational framework, Tailor, driven by an efficient and accurate aligner specifically designed for capturing the tailing events directly from the alignments without extensive post-processing. The performance of Tailor was fully tested and compared favorably with other general-purpose aligners using both simulated and real datasets for tailing analysis. Moreover, to show the broad utility of Tailor, we used Tailor to reanalyze published datasets and revealed novel findings worth further experimental validation. The source code and the executable binaries are freely available at https://github.com/jhhung/Tailor.
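
What "capturing tailing events directly from the alignments" means can be shown with a deliberately naive sketch. It is not Tailor's aligner: as an assumption for illustration, it finds the longest prefix of a read that occurs exactly in a reference string using plain substring search, and reports the unmatched 3' remainder as the candidate non-templated tail. It ignores sequencing errors, RNA editing and multiple mapping positions, and the name `find_tail` is hypothetical.

```python
def find_tail(read: str, reference: str):
    """Return (position, tail) for the longest exactly matching read prefix.

    position is the 0-based start of the matched prefix in the reference
    (-1 if no prefix matches); tail is the unmatched 3' remainder of the read.
    """
    for end in range(len(read), 0, -1):
        pos = reference.find(read[:end])
        if pos != -1:
            return pos, read[end:]
    return -1, read


reference = "CCGATTACAGGCC"
tailed = find_tail("ATTACAGTT", reference)    # templated prefix plus "TT"
untailed = find_tail("GATTACA", reference)    # fully templated read
```

A read whose reported tail is empty is fully templated; a non-empty tail marks the nucleotides that a real tool would then have to distinguish from sequencing error.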

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

StemChecker: a web-based tool to discover and explore stemness signatures in gene sets

Stem cells present unique regenerative abilities, offering great potential for treatment of prevalent pathologies such as diabetes, neurodegenerative and heart diseases. Various research groups have dedicated significant effort to identifying sets of genes (so-called stemness signatures) considered essential to define stem cells. However, their usage has been hindered by the lack of comprehensive resources and easy-to-use tools. To address this, we developed StemChecker, a novel stemness analysis tool, based on the curation of nearly fifty published stemness signatures defined by gene expression, RNAi screens, Transcription Factor (TF) binding sites, literature reviews and computational approaches. StemChecker allows researchers to explore the presence of stemness signatures in user-defined gene sets, without carrying out lengthy literature curation or data processing. To assist in exploring underlying regulatory mechanisms, we collected over 80 target gene sets of TFs associated with pluri- or multipotency. StemChecker presents an intuitive graphical display, as well as detailed statistical results in table format, which helps reveal transcriptional regulatory programs, indicating the putative involvement of stemness-associated processes in diseases like cancer. Overall, StemChecker substantially expands the available repertoire of online tools designed to assist the stem cell biology, developmental biology, regenerative medicine and human disease research community. StemChecker is freely accessible at http://stemchecker.sysbiolab.eu.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Regulation of the Type I-F CRISPR-Cas system by CRP-cAMP and GalM controls spacer acquisition and interference

The CRISPR-Cas prokaryotic ‘adaptive immune systems’ represent a sophisticated defence strategy providing bacteria and archaea with protection from invading genetic elements, such as bacteriophages or plasmids. Despite intensive research into their mechanism and application, how CRISPR-Cas systems are regulated is less clear, and nothing is known about the regulation of Type I-F systems. We used Pectobacterium atrosepticum, a Gram-negative phytopathogen, to study CRISPR-Cas regulation, since it contains a single Type I-F system. The CRP-cAMP complex activated the cas operon, increasing the expression of the adaptation genes cas1 and cas2–3 in addition to the genes encoding the Csy surveillance complex. Mutation of crp or cyaA (encoding adenylate cyclase) resulted in reductions in both primed spacer acquisition and interference. Furthermore, we identified a galactose mutarotase, GalM, which reduced cas operon expression in a CRP- and CyaA-dependent manner. We propose that the Type I-F system senses metabolic changes, such as sugar availability, and regulates cas genes to initiate an appropriate defence response. Indeed, elevated glucose levels reduced cas expression in a CRP- and CyaA-dependent manner. Taken together, these findings highlight that a metabolite-sensing regulatory pathway controls expression of the Type I-F CRISPR-Cas system to modulate levels of adaptation and interference.

  • Nucleic Acids Research
  • 10 years ago
  • Molecular Biology

Disturbance-free rapid solution exchange for magnetic tweezers single-molecule studies

Single-molecule manipulation technologies have been extensively applied to studies of the structures and interactions of DNA and proteins. An important aspect of such studies is to obtain the dynamics of interactions; however, the initial binding is often difficult to obtain due to large mechanical perturbation during solution introduction. Here, we report a simple disturbance-free rapid solution exchange method for magnetic tweezers single-molecule manipulation experiments, which is achieved by tethering the molecules inside microwells (typical dimensions: diameter (D) 40–50 μm, height (H) 100 μm; H:D ~ 2:1). Our simulations and experiments show that the flow speed can be reduced by several orders of magnitude near the bottom of the microwells from that in the flow chamber, effectively eliminating the flow disturbance to molecules tethered in the microwells. We demonstrate a wide scope of applications of this method by measuring the force-dependent DNA structural transitions in response to solution condition change, and polymerization dynamics of RecA on ssDNA/SSB-coated ssDNA/dsDNA of various tether lengths under constant forces, as well as the dynamics of vinculin binding to α-catenin at a constant force (< 5 pN) applied to the α-catenin protein.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

LYRA, a webserver for lymphocyte receptor structural modeling

The accurate structural modeling of B- and T-cell receptors is fundamental to gaining detailed insight into the mechanisms underlying immunity and to developing new drugs and therapies. The LYRA (LYmphocyte Receptor Automated modeling) web server (http://www.cbs.dtu.dk/services/LYRA/) implements a complete and automated method for building B- and T-cell receptor structural models starting from their amino acid sequence alone. The webserver is freely available and easy to use for non-specialists. Upon submission, LYRA automatically generates alignments using ad hoc profiles, predicts the structural class of each hypervariable loop, selects the best templates in an automatic fashion, and provides within minutes a complete 3D model that can be downloaded or inspected online. Experienced users can manually select or exclude template structures according to case-specific information. LYRA is based on the canonical structure method, which over the last 30 years has been successfully used to generate antibody models of high accuracy, and in our benchmarks this approach achieves similarly good results on TCR modeling, with a benchmarked average RMSD accuracy of 1.29 and 1.48 Å for B- and T-cell receptors, respectively. To the best of our knowledge, LYRA is the first automated server for the prediction of TCR structure.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Gene target specificity of the Super Elongation Complex (SEC) family: how HIV-1 Tat employs selected SEC members to activate viral transcription

The AF4/FMR2 proteins AFF1 and AFF4 act as a scaffold to assemble the Super Elongation Complex (SEC) that strongly activates transcriptional elongation of HIV-1 and cellular genes. Although they can dimerize, it is unclear whether the dimers exist and function within a SEC in vivo. Furthermore, it is unknown whether AFF1 and AFF4 function similarly in mediating SEC-dependent activation of diverse genes. Providing answers to these questions, our current study shows that AFF1 and AFF4 reside in separate SECs that display largely distinct gene target specificities. While the AFF1-SEC is more potent in supporting HIV-1 transactivation by the viral Tat protein, the AFF4-SEC is more important for HSP70 induction upon heat shock. The functional difference between AFF1 and AFF4 in Tat-transactivation has been traced to a single amino acid variation between the two proteins, which causes them to enhance the affinity of Tat for P-TEFb, a key SEC component, with different efficiency. Finally, genome-wide analysis confirms that the genes regulated by AFF1-SEC and AFF4-SEC are largely non-overlapping and perform distinct functions. Thus, the SEC represents a family of related complexes that exist to increase the regulatory diversity and gene control options during transactivation of diverse cellular and viral genes.

  • Nucleic Acids Research
  • 10 years ago
  • Gene regulation, Chromatin and Epigenetics

A Novel Essential Domain Perspective for Exploring Gene Essentiality

Motivation: Genes with indispensable functions are identified as essential; however, the traditional gene-level studies of essentiality have several limitations. In this study, we characterized gene essentiality from a new perspective of protein domains, the independent structural or functional units of a polypeptide chain.

Results: To identify such essential domains, we have developed an Expectation-Maximization (EM) algorithm-based Essential Domain Prediction (EDP) Model. With simulated datasets, the model provided convergent results given different initial values and offered accurate predictions even with noise. We then applied the EDP model to six microbial species and predicted 1,879 domains to be essential in at least one species, ranging from 10% to 23% in each species. The predicted essential domains were more conserved than either non-essential domains or essential genes. Comparing essential domains in prokaryotes and eukaryotes revealed an evolutionary distance consistent with that inferred from ribosomal RNA. When utilizing these essential domains to reproduce the annotation of essential genes, we obtained accurate results that suggest protein domains are more basic units for the essentiality of genes. Furthermore, we presented several examples to illustrate how the combination of essential and non-essential domains can lead to genes with divergent essentiality. In summary, we have described the first systematic analysis of gene essentiality at the level of domains.

Contact: Long.Lu@cchmc.org

Supplementary Information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

Metabolic Needs and Capabilities of Toxoplasma gondii through Combined Computational and Experimental Analysis

by Stepan Tymoshenko, Rebecca D. Oppenheim, Rasmus Agren, Jens Nielsen, Dominique Soldati-Favre, Vassily Hatzimanikatis

Toxoplasma gondii is a human pathogen prevalent worldwide that poses a challenging and unmet need for novel treatment of toxoplasmosis. Using a semi-automated reconstruction algorithm, we reconstructed a genome-scale metabolic model, ToxoNet1. The reconstruction process and flux-balance analysis of the model offer a systematic overview of the metabolic capabilities of this parasite. Using ToxoNet1 we have identified significant gaps in the current knowledge of Toxoplasma metabolic pathways and have clarified its minimal nutritional requirements for replication. By probing the model via metabolic tasks, we have further defined sets of alternative precursors necessary for parasite growth. Within a human host cell environment, ToxoNet1 predicts a minimal set of 53 enzyme-coding genes and 76 reactions to be essential for parasite replication. Double-gene-essentiality analysis identified 20 pairs of genes for which simultaneous deletion is deleterious. To validate several predictions of ToxoNet1 we have performed experimental analyses of cytosolic acetyl-CoA biosynthesis. ATP-citrate lyase and acetyl-CoA synthase were localised and their corresponding genes disrupted, establishing that each of these enzymes is dispensable for the growth of T. gondii; however, together they form a synthetic lethal pair.

  • PLOS Computational Biology
  • 10 years ago

Improving 3D Genome Reconstructions Using Orthologous and Functional Constraints

by Alon Diament, Tamir Tuller

The study of the 3D architecture of chromosomes has been advancing rapidly in recent years. While a number of methods for 3D reconstruction of genomic models based on Hi-C data have been proposed, most of the analyses in the field have been performed on different 3D representation forms (such as graphs). Here, we reproduce most of the previous results on the 3D genomic organization of the eukaryote Saccharomyces cerevisiae using analysis of 3D reconstructions. We show that many of these results can be reproduced in sparse reconstructions, generated from a small fraction of the experimental data (5% of the data), and study the properties of such models. Finally, we propose, for the first time, an approach for improving the accuracy of 3D reconstructions by introducing additional predicted physical interactions to the model, based on orthologous interactions in an evolutionarily related organism and on predicted functional interactions between genes. We demonstrate that this approach indeed leads to the reconstruction of improved models.

  • PLOS Computational Biology
  • 10 years ago