Free Energy Landscape and Multiple Folding Pathways of an H-Type RNA Pseudoknot

by Yunqiang Bian, Jian Zhang, Jun Wang, Jihua Wang, Wei Wang

How RNA sequences fold to specific tertiary structures is one of the key problems for understanding their dynamics and functions. Here, we study the folding process of an H-type RNA pseudoknot by performing a large-scale all-atom MD simulation and bias-exchange metadynamics. The folding free energy landscapes are obtained and several folding intermediates are identified. It is suggested that the folding occurs via multiple mechanisms, including a step-wise mechanism starting either from the first helix or the second, and a cooperative mechanism with both helices forming simultaneously. Despite the multiple-mechanism nature of folding, the ensemble folding kinetics estimated from a Markov state model is single-exponential. It is also found that the correlation between folding and binding of metal ions is significant, and the bound ions mediate long-range interactions in the intermediate structures. Non-native interactions are found to be dominant in the unfolded state and are also present in some intermediates, possibly hindering the folding process of the RNA.
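
The link between a Markov state model and single-exponential kinetics can be illustrated with a minimal sketch. It assumes a hypothetical three-state transition matrix (the states, probabilities, and lag time below are invented for illustration and are not taken from the study) and uses the standard relation t_i = -τ / ln λ_i between the eigenvalues of the transition matrix and its relaxation timescales. When one timescale is much longer than all others, the ensemble relaxation looks single-exponential even if several pathways contribute.

```python
import numpy as np

def implied_timescales(T, lag_time):
    """Relaxation timescales t_i = -lag_time / ln(lambda_i) of a row-stochastic matrix T."""
    eigvals = np.linalg.eigvals(T)
    # Sort eigenvalue magnitudes in descending order; the largest is 1
    # (the stationary distribution) and carries no relaxation timescale.
    eigvals = np.sort(np.abs(eigvals))[::-1]
    return -lag_time / np.log(eigvals[1:])

# Hypothetical model: unfolded (0), intermediate (1), folded (2).
# Each row sums to 1; the numbers are illustrative assumptions only.
T = np.array([
    [0.98, 0.02, 0.00],
    [0.10, 0.60, 0.30],
    [0.00, 0.01, 0.99],
])

timescales = implied_timescales(T, lag_time=1.0)
# A large ratio between the slowest and the next timescale (a spectral gap)
# is what makes the overall kinetics appear single-exponential.
gap_ratio = timescales[0] / timescales[1]
print(timescales, gap_ratio)
```

In this toy model the slow process corresponds to net exchange between the unfolded and folded basins, while the fast process is the equilibration of the intermediate.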

  • PloS one
  • 10 years ago

Proteins Related to the Type I Secretion System Are Associated with Secondary SecA_DEAD Domain Proteins in Some Species of Planctomycetes, Verrucomicrobia, Proteobacteria, Nitrospirae and Chlorobi

by Olga K. Kamneva, Saroj Poudel, Naomi L. Ward

A number of bacteria belonging to the PVC (Planctomycetes-Verrucomicrobia-Chlamydiae) super-phylum contain unusual ribosome-bearing intracellular membranes. The evolutionary origins and functions of these membranes are unknown. Some proteins putatively associated with the presence of intracellular membranes in PVC bacteria contain signal peptides. Signal peptides mark proteins for translocation across the cytoplasmic membrane in prokaryotes, and the membrane of the endoplasmic reticulum in eukaryotes, by highly conserved Sec machinery. This suggests that proteins might be targeted to intracellular membranes in PVC bacteria via the Sec pathway. Here, we show that canonical signal peptides are significantly over-represented in proteins preferentially present in PVC bacteria possessing intracellular membranes, indicating involvement of Sec translocase in their cellular targeting. We also characterized Sec proteins using comparative genomics approaches, focusing on the PVC super-phylum. While we were unable to detect unique changes in Sec proteins conserved among membrane-bearing PVC species, we identified (1) SecA ATPase domain re-arrangements in some Planctomycetes, and (2) secondary SecA_DEAD domain proteins in the genomes of some Planctomycetes, Verrucomicrobia, Proteobacteria, Nitrospirae and Chlorobi. This is the first report of potentially duplicated SecA in Gram-negative bacteria. The phylogenetic distribution of secondary SecA_DEAD domain proteins suggests that the presence of these proteins is not related to the occurrence of PVC endomembranes. Further genomic analysis showed that secondary SecA_DEAD domain proteins are located within genomic neighborhoods that also encode three proteins possessing domains specific for the Type I secretion system.

  • PloS one
  • 10 years ago

Molecular Dynamic Simulations Reveal the Structural Determinants of Fatty Acid Binding to Oxy-Myoglobin

by Sree V. Chintapalli, Gaurav Bhardwaj, Reema Patel, Natasha Shah, Randen L. Patterson, Damian B. van Rossum, Andriy Anishkin, Sean H. Adams

The mechanism(s) by which fatty acids are sequestered and transported in muscle have not been fully elucidated. A potential key player in this process is the protein myoglobin (Mb). Indeed, there is a catalogue of empirical evidence supporting direct interaction of globins with fatty acid metabolites; however, the binding pocket and regulation of the interaction remain to be established. In this study, we employed a computational strategy to elucidate the structural determinants of fatty acid (palmitic and oleic acid) binding to Mb. Sequence analysis and docking simulations with a horse (Equus caballus) structural Mb reference reveal a fatty acid-binding site in the hydrophobic cleft near the heme region in Mb. Both palmitic acid and oleic acid attain a “U”-shaped structure similar to their conformation in pockets of other fatty acid-binding proteins. Specifically, we found that the carboxyl head group of palmitic acid coordinates with the amino group of Lys45, whereas the carboxyl group of oleic acid coordinates with the amino groups of both Lys45 and Lys63. The alkyl tails of both fatty acids are supported by surrounding hydrophobic residues Leu29, Leu32, Phe33, Phe43, Phe46, Val67, Val68 and Ile107. In the saturated palmitic acid, the hydrophobic tail moves freely and occasionally penetrates deeper inside the hydrophobic cleft, making additional contacts with Val28, Leu69, Leu72 and Ile111. Our simulations reveal a dynamic and stable binding pocket in which the oxygen molecule and heme group in Mb are required for additional hydrophobic interactions. Taken together, these findings support a mechanism in which Mb acts as a muscle transporter for fatty acids when it is in the oxygenated state and releases them when it converts to the deoxygenated state.

  • PloS one
  • 10 years ago

Small RNA-mediated DNA (cytosine-5) methyltransferase 1 inhibition leads to aberrant DNA methylation

Mammalian cells contain copious amounts of RNA, including both coding and noncoding RNA (ncRNA). Generally, ncRNAs function to regulate gene expression at the transcriptional and post-transcriptional levels. Among ncRNAs, long and small ncRNAs can affect histone modification, DNA methylation targeting and gene silencing. Here we show that endogenous DNA methyltransferase 1 (DNMT1) co-purifies with inhibitory ncRNAs. MicroRNAs (miRNAs) bind directly to DNMT1 with high affinity. The binding of miRNAs, such as miR-155-5p, leads to inhibition of DNMT1 enzyme activity. Exogenous miR-155-5p in cells induces aberrant DNA methylation of the genome, resulting in hypomethylation of low to moderately methylated regions. A small shift toward hypermethylation of previously hypomethylated regions was also observed. Furthermore, hypomethylation led to activation of genes. Based on these observations, overexpression of miR-155-5p resulted in aberrant DNA methylation by inhibiting DNMT1 activity, resulting in altered gene expression.

  • Nucleic Acids Research
  • 10 years ago
  • Nucleic Acid Enzymes

Characteristics of de novo structural changes in the human genome [RESEARCH]

Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de novo structural changes identified in whole genomes of 250 families, including complex indels, retrotransposon insertions, and interchromosomal events. These data indicate a mutation rate of 2.94 indels (1–20 bp) and 0.16 SVs (>20 bp) per generation. De novo structural changes affect on average 4.1 kbp of genomic sequence and 29 coding bases per generation, which is 91 and 52 times more nucleotides than de novo substitutions, respectively. This contrasts with the equal genomic footprint of inherited SVs and substitutions. An excess of structural changes originated on paternal haplotypes. Additionally, we observed a nonuniform distribution of de novo SVs across offspring. These results reveal the importance of different mutational mechanisms to changes in human genome structure across generations.

  • Genome Research
  • 10 years ago
  • RESEARCH

Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells [RESEARCH]

Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells.

  • Genome Research
  • 10 years ago
  • RESEARCH

ChIP-exo signal associated with DNA-binding motifs provides insight into the genomic binding of the glucocorticoid receptor and cooperating transcription factors [RESEARCH]

The classical DNA recognition sequence of the glucocorticoid receptor (GR) appears to be present at only a fraction of bound genomic regions. To identify sequences responsible for recruitment of this transcription factor (TF) to individual loci, we turned to the high-resolution ChIP-exo approach. We exploited this signal by determining footprint profiles of TF binding at single-base-pair resolution using ExoProfiler, a computational pipeline based on DNA binding motifs. When applied to our GR and the few available public ChIP-exo data sets, we find that ChIP-exo footprints are protein- and recognition sequence-specific signatures of genomic TF association. Furthermore, we show that ChIP-exo captures information about TFs other than the one directly targeted by the antibody in the ChIP procedure. Consequently, the shape of the ChIP-exo footprint can be used to discriminate between direct and indirect (tethering to other DNA-bound proteins) DNA association of GR. Together, our findings indicate that the absence of classical recognition sequences can be explained by direct GR binding to a broader spectrum of sequences than previously known, either as a homodimer or as a heterodimer binding together with a member of the ETS or TEAD families of TFs, or alternatively by indirect recruitment via FOX or STAT proteins. ChIP-exo footprints also bring structural insights and locate DNA:protein cross-link points that are compatible with crystal structures of the studied TFs. Overall, our generically applicable footprint-based approach uncovers new structural and functional insights into the diverse ways of genomic cooperation and association of TFs.

  • Genome Research
  • 10 years ago
  • RESEARCH

Genomic redistribution of GR monomers and dimers mediates transcriptional response to exogenous glucocorticoid in vivo [RESEARCH]

Glucocorticoids (GCs) are commonly prescribed drugs, but their anti-inflammatory benefits are mitigated by metabolic side effects. Their transcriptional effects, including tissue-specific gene activation and repression, are mediated by the glucocorticoid receptor (GR), which is known to bind as a homodimer to a palindromic DNA sequence. Using ChIP-exo in mouse liver under endogenous corticosterone exposure, we report here that monomeric GR interaction with a half-site motif is more prevalent than homodimer binding. Monomers colocalize with lineage-determining transcription factors in both liver and primary macrophages, and the GR half-site motif drives transcription, suggesting that monomeric binding is fundamental to GR's tissue-specific functions. In response to exogenous GC in vivo, GR dimers assemble on chromatin near ligand-activated genes, concomitant with monomer evacuation of sites near repressed genes. Thus, pharmacological GCs mediate gene expression by favoring GR homodimer occupancy at classic palindromic sites at the expense of monomeric binding. The findings have important implications for improving therapies that target GR.

  • Genome Research
  • 10 years ago
  • RESEARCH

Dynamics of chromatin accessibility and long-range interactions in response to glucocorticoid pulsing [RESEARCH]

Although physiological steroid levels are often pulsatile (ultradian), the genomic effects of this pulsatility are poorly understood. By utilizing glucocorticoid receptor (GR) signaling as a model system, we uncovered striking spatiotemporal relationships between receptor loading, lifetimes of the DNase I hypersensitivity sites (DHSs), long-range interactions, and gene regulation. We found that hormone-induced DHSs were enriched within ±50 kb of GR-responsive genes and displayed a broad spectrum of lifetimes upon hormone withdrawal. These lifetimes dictate the strength of the DHS interactions with gene targets and contribute to gene regulation from a distance. Our results demonstrate that pulsatile and constant hormone stimulations induce unique, treatment-specific patterns of gene and regulatory element activation. These modes of activation have implications for corticosteroid function in vivo and for steroid therapies in various clinical settings.

  • Genome Research
  • 10 years ago
  • RESEARCH

Antagonistic regulation of mRNA expression and splicing by CELF and MBNL proteins [RESEARCH]

RNA binding proteins of the conserved CUGBP1, Elav-like factor (CELF) family contribute to heart and skeletal muscle development and are implicated in myotonic dystrophy (DM). To understand their genome-wide functions, we analyzed the transcriptome dynamics following induction of CELF1 or CELF2 in adult mouse heart and of CELF1 in muscle by RNA-seq, complemented by crosslinking/immunoprecipitation-sequencing (CLIP-seq) analysis of mouse cells and tissues to distinguish direct from indirect regulatory targets. We identified hundreds of mRNAs bound in their 3' UTRs by both CELF1 and the developmentally induced MBNL1 protein, a threefold greater overlap in target messages than expected, including messages involved in development and cell differentiation. The extent of 3' UTR binding by CELF1 and MBNL1 predicted the degree of mRNA repression or stabilization, respectively, following CELF1 induction. However, CELF1's RNA binding specificity in vitro was not detectably altered by coincubation with recombinant MBNL1. These findings support a model in which CELF and MBNL proteins bind independently to mRNAs but functionally compete to specify down-regulation or localization/stabilization, respectively, of hundreds of mRNA targets. Expression of many alternative 3' UTR isoforms was altered following CELF1 induction, with 3' UTR binding associated with down-regulation of isoforms and genes. The splicing of hundreds of alternative exons was oppositely regulated by these proteins, confirming an additional layer of regulatory antagonism previously observed in a handful of cases. The regulatory relationships between CELFs and MBNLs in control of both mRNA abundance and splicing appear to have evolved to enhance developmental transitions in major classes of heart and muscle genes.

  • Genome Research
  • 10 years ago
  • RESEARCH

A nucleosome turnover map reveals that the stability of histone H4 Lys20 methylation depends on histone recycling in transcribed chromatin [RESEARCH]

Nucleosome composition actively contributes to chromatin structure and accessibility. Cells have developed mechanisms to remove or recycle histones, generating a landscape of differentially aged nucleosomes. This study aimed to create a high-resolution, genome-wide map of nucleosome turnover in Schizosaccharomyces pombe. The recombination-induced tag exchange (RITE) method was used to study replication-independent nucleosome turnover through the appearance of new histone H3 and the disappearance or preservation of old histone H3. The genome-wide location of histones was determined by chromatin immunoprecipitation–exonuclease methodology (ChIP-exo). The findings were compared with diverse chromatin marks, including histone variant H2A.Z, post-translational histone modifications, and Pol II binding. Finally, genome-wide mapping of the methylation states of H4K20 was performed to determine the relationship between methylation (mono, di, and tri) of this residue and nucleosome turnover. Our analysis showed that histone recycling resulted in low nucleosome turnover in the coding regions of active genes, stably expressed at intermediate levels. High levels of transcription resulted in the incorporation of new histones primarily at the end of transcribed units. H4K20 was methylated in low-turnover nucleosomes in euchromatic regions, notably in the coding regions of long genes that were expressed at low levels. This transcription-dependent accumulation of histone methylation was dependent on the histone chaperone complex FACT. Our data showed that nucleosome turnover is highly dynamic in the genome and that several mechanisms are at play to either maintain or suppress stability. In particular, we found that FACT-associated transcription conserves histones by recycling them and is required for progressive H4K20 methylation.

  • Genome Research
  • 10 years ago
  • RESEARCH

Widespread exon skipping triggers degradation by nuclear RNA surveillance in fission yeast [RESEARCH]

Exon skipping is considered a principal mechanism by which eukaryotic cells expand their transcriptome and proteome repertoires, creating different splice variants with distinct cellular functions. Here we analyze RNA-seq data from 116 transcriptomes in fission yeast (Schizosaccharomyces pombe), covering multiple physiological conditions as well as transcriptional and RNA processing mutants. We applied brute-force algorithms to detect all possible exon-skipping events, which were widespread but rare compared to normal splicing events. Exon-skipping events increased in cells deficient for the nuclear exosome or the 5'-3' exonuclease Dhp1, and also at late stages of meiotic differentiation when nuclear-exosome transcripts decreased. The pervasive exon-skipping transcripts were stochastic, did not increase in specific physiological conditions, and were mostly present at less than one copy per cell, even in the absence of nuclear RNA surveillance and during late meiosis. These exon-skipping transcripts are therefore unlikely to be functional and may reflect splicing errors that are actively removed by nuclear RNA surveillance. The average splicing rate by exon skipping was ~0.24% in wild type and ~1.75% in nuclear exonuclease mutants. We also detected approximately 250 circular RNAs derived from single or multiple exons. These circular RNAs were rare and stochastic, although a few became stabilized during quiescence and in splicing mutants. Using an exhaustive search algorithm, we also uncovered thousands of previously unknown splice sites, indicating pervasive splicing; yet most of these splicing variants were cryptic and increased in nuclear degradation mutants. This study highlights widespread but low frequency alternative or aberrant splicing events that are targeted by nuclear RNA surveillance.
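
The brute-force search described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes exons of a gene are indexed in transcript order, enumerates every donor/acceptor pair that would skip at least one exon, and computes a skipping rate as the fraction of junction reads that support skipping. The function names and the rate definition are assumptions made for this example.

```python
from itertools import combinations

def exon_skipping_junctions(n_exons):
    """Return all (donor_exon, acceptor_exon) index pairs that skip at least one exon."""
    # combinations() yields i < j; j - i > 1 means one or more exons lie in between.
    return [(i, j) for i, j in combinations(range(n_exons), 2) if j - i > 1]

def skipping_rate(skip_reads, canonical_reads):
    """Fraction of junction-spanning reads that support exon skipping (assumed definition)."""
    total = skip_reads + canonical_reads
    return skip_reads / total if total else 0.0

# A gene with 4 exons has three candidate skipping junctions: 0->2, 0->3 and 1->3.
candidates = exon_skipping_junctions(4)
print(candidates)

# Hypothetical read counts, chosen only to show the calculation.
rate = skipping_rate(skip_reads=24, canonical_reads=9976)
print(f"{rate:.2%}")
```

Enumerating all pairs grows quadratically with the number of exons per gene, which stays cheap for a genome such as fission yeast where most genes have few introns.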

  • Genome Research
  • 10 years ago
  • RESEARCH

Sumoylation of Rap1 mediates the recruitment of TFIID to promote transcription of ribosomal protein genes [RESEARCH]

Transcription factors are abundant Sumo targets, yet the global distribution of Sumo along the chromatin and its physiological relevance in transcription are poorly understood. Using Saccharomyces cerevisiae, we determined the genome-wide localization of Sumo along the chromatin. We discovered that Sumo-enriched genes are almost exclusively involved in translation, such as tRNA genes and ribosomal protein genes (RPGs). Genome-wide expression analysis showed that Sumo positively regulates their transcription. We also discovered that the Sumo consensus motif at RPG promoters is identical to the DNA binding motif of the transcription factor Rap1. We demonstrate that Rap1 is a molecular target of Sumo and that sumoylation of Rap1 is important for cell viability. Furthermore, Rap1 sumoylation promotes recruitment of the basal transcription machinery, and sumoylation of Rap1 cooperates with the target of rapamycin kinase complex 1 (TORC1) pathway to promote RPG transcription. Strikingly, our data reveal that sumoylation of Rap1 functions in a homeostatic feedback loop that sustains RPG transcription during translational stress. Taken together, Sumo regulates the cellular translational capacity by promoting transcription of tRNA genes and RPGs.

  • Genome Research
  • 10 years ago
  • RESEARCH

A pooling-based approach to mapping genetic variants associated with DNA methylation [METHOD]

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.

  • Genome Research
  • 10 years ago
  • METHOD

An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data [METHOD]

The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires fewer computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.

  • Genome Research
  • 10 years ago
  • METHOD

Mango: A bias correcting ChIA-PET analysis pipeline

Motivation: Chromatin Interaction Analysis by Paired-End Tag sequencing (ChIA-PET) is an established method for detecting genome-wide looping interactions at high resolution. Current ChIA-PET analysis software packages either fail to correct for non-specific interactions due to genomic proximity or only address a fraction of the steps required for data processing. We present Mango, a complete ChIA-PET data analysis pipeline that provides statistical confidence estimates for interactions and corrects for major sources of bias including differential peak enrichment and genomic proximity.

Results: Comparison with the existing software packages, ChIA-PET Tool and ChiaSig, revealed that Mango interactions exhibit much better agreement with high-resolution Hi-C data. Importantly, Mango executes all steps required for processing ChIA-PET data sets, whereas ChiaSig only completes 20% of the required steps. Application of Mango to multiple available ChIA-PET data sets permitted the independent rediscovery of known trends in chromatin loops, including enrichment of CTCF, RAD21, SMC3, and ZNF143 at the anchor regions of interactions as well as a strong bias for convergent CTCF motifs.

Availability: Mango is open source and distributed through GitHub at https://github.com/dphansti/mango.

Contact: mpsnyder@standford.edu, Dept. of Genetics, MC: 5120, 300 Pasteur Dr., M-344, Stanford, CA 94305-5120

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

Cas9-chromatin binding information enables more accurate CRISPR off-target prediction

The CRISPR system has become a powerful biological tool with a wide range of applications. However, improving targeting specificity and accurately predicting potential off-targets remain significant goals. Here, we introduce a web-based CRISPR/Cas9 Off-target Prediction and Identification Tool (CROP-IT) that performs improved off-target binding and cleavage site predictions. Unlike existing prediction programs that solely use DNA sequence information, CROP-IT integrates whole-genome-level biological information from existing Cas9 binding and cleavage data sets. Utilizing whole-genome chromatin state information from 125 human cell types further enhances its computational prediction power. Comparative analyses on experimentally validated data sets show that CROP-IT outperforms existing computational algorithms in predicting both Cas9 binding and cleavage sites. With a user-friendly web interface, CROP-IT outputs a scored and ranked list of potential off-targets that enables improved guide RNA design and more accurate prediction of Cas9 binding or cleavage sites.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Mechanism of heat stress-induced cellular senescence elucidates the exclusive vulnerability of early S-phase cells to mild genotoxic stress

Heat stress is one of the best-studied cellular stress factors; however, little is known about its delayed effects. Here, we demonstrate that heat stress induces p21-dependent cellular senescence-like cell cycle arrest. Notably, only early S-phase cells undergo such an arrest in response to heat stress. The encounter of DNA replication forks with topoisomerase I-generated single-stranded DNA breaks, resulting in the generation of persistent double-stranded DNA breaks, was found to be a primary cause of heat stress-induced cellular senescence in these cells. This investigation of heat stress-induced cellular senescence elucidates the mechanisms underlying the exclusive sensitivity of early S-phase cells to ultra-low doses of agents that induce single-stranded DNA breaks.

  • Nucleic Acids Research
  • 10 years ago
  • Genome Integrity, Repair and Replication

I-COMS: Interprotein-COrrelated Mutations Server

Interprotein contact prediction using multiple sequence alignments (MSAs) is a useful approach to help detect protein–protein interfaces. Different computational methods have been developed in recent years to approach this problem. However, as their results show discrepancies, there is still no consensus on which methodology performs best. To address this problem, I-COMS (Interprotein-COrrelated Mutations Server) is presented. I-COMS estimates covariation between residues of different proteins using four different covariation methods. It provides a graphical and interactive output that helps compare results obtained using different methods. I-COMS automatically builds the required MSA for the calculation and produces a rich visualization of intraprotein and/or interprotein covarying positions in a Circos representation. Furthermore, a comparison between any two methods is available, as well as the overlap between any or all four methodologies. In addition, as a complementary source of information, a matrix visualization of the corresponding scores is made available and density plots of the inter, intra and inter+intra score distributions are calculated. Finally, all the results (including MSAs, scores and graphics) can be downloaded for comparison, visualization and/or further analysis.
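
To make the notion of covariation between alignment positions concrete, here is a minimal sketch that scores two MSA columns with mutual information. Mutual information is used here only as a familiar example of a covariation measure; the abstract does not name the four methods offered by I-COMS, so this should not be read as the server's implementation. The function name and the toy columns are assumptions.

```python
import math
from collections import Counter

def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(col_a)                 # residue frequencies in column A
    count_b = Counter(col_b)                 # residue frequencies in column B
    count_ab = Counter(zip(col_a, col_b))    # joint residue-pair frequencies
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi

# Toy columns: one taken from protein X, one from protein Y, over four paired sequences.
covarying = column_mutual_information("AAGG", "CCTT")    # residues change together
independent = column_mutual_information("AAGG", "CTCT")  # no association
print(covarying, independent)
```

For interprotein analysis, the two columns come from different proteins whose sequences have been paired by species in a joint alignment; raw mutual information is also known to be inflated by phylogenetic signal and small sample sizes, which is one reason several corrected methods exist.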

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Phylesystem: a git-based data store for community-curated phylogenetic estimates

Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct.

Results: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements.

Availability and implementation: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator. Code for that tool is available from https://github.com/OpenTreeOfLife/opentree.

Contact: mtholder@gmail.com

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

FourCSeq: Analysis of 4C sequencing data

Motivation: Circularized Chromosome Conformation Capture (4C) is a powerful technique for studying the spatial interactions of a specific genomic region, called the "viewpoint", with the rest of the genome, either in a single condition or when comparing different experimental conditions or cell types. Observed ligation frequencies typically show a strong, regular dependence on genomic distance from the viewpoint, on top of which specific interaction peaks are superimposed. Here, we address the computational task of finding these specific peaks and detecting changes between different biological conditions.

Results: We model the overall trend of decreasing interaction frequency with genomic distance by fitting a smooth, monotonically decreasing function to suitably transformed count data. Based on the fit, z-scores are calculated from the residuals, and high z-scores are interpreted as peaks providing evidence for specific interactions. To compare different conditions, we normalize fragment counts between samples and call differential contact frequencies using the statistical method DESeq2, adapted from RNA-Seq analysis.
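The peak-calling idea described above can be illustrated with a minimal Python sketch. It is not the FourCSeq implementation: it assumes a log2(count + 1) transform in place of the package's transformation, and it swaps in a stepwise isotonic (pool-adjacent-violators) fit for the smooth monotone fit. The function names and the example counts are hypothetical.

```python
import math
from statistics import pstdev


def fit_monotone_decreasing(values):
    """Least-squares non-increasing fit using pool-adjacent-violators.

    Assumption: this stepwise fit stands in for the smooth monotone
    function described in the abstract.
    """
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while an earlier block has a lower mean than the next one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def peak_z_scores(counts):
    """Return z-scores for counts ordered by distance from the viewpoint."""
    # Assumed transform: log2(count + 1); the package uses its own transform.
    transformed = [math.log2(c + 1) for c in counts]
    trend = fit_monotone_decreasing(transformed)
    residuals = [t - f for t, f in zip(transformed, trend)]
    sd = pstdev(residuals)
    if sd == 0:
        return [0.0] * len(residuals)
    # Residuals of a least-squares isotonic fit already average to zero.
    return [r / sd for r in residuals]


# Hypothetical fragment counts with one interaction peak at index 5.
counts = [1000, 800, 600, 500, 400, 2000, 300, 250, 200, 150, 100, 80]
z = peak_z_scores(counts)
peak_index = max(range(len(z)), key=z.__getitem__)
```

In this toy example the fragment at index 5 receives the highest z-score because it rises well above the decaying distance trend. A real analysis would estimate the spread more robustly and use replicate information.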

Availability and Implementation: A full end-to-end analysis pipeline is implemented in the R package FourCSeq available at www.bioconductor.org.

Contact: felix.klein@embl.de, whuber@embl.de

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

Interactive analysis of large cancer copy number studies with Copy Number Explorer

Summary: Copy number abnormalities (CNAs) such as somatically acquired chromosomal deletions and duplications drive the development of cancer. As individual tumor genomes can contain tens or even hundreds of large and/or focal CNAs, a major difficulty is differentiating between important, recurrent pathogenic changes and benign changes unrelated to the subject’s phenotype. Here we present Copy Number Explorer, an interactive tool for mining large copy number datasets. Copy Number Explorer facilitates rapid visual and statistical identification of recurrent regions of gain or loss, identifies the genes most likely to drive CNA formation using the cghMCR method, and identifies recurrently broken genes that may be disrupted or fused. The software also allows users to identify recurrent CNA regions that may be associated with differential survival.

Availability and Implementation: Copy Number Explorer is available under the GNU General Public License (GPL-3). Source code is available at: https://sourceforge.net/projects/copynumberexplorer/

Contact: scott.newman@emory.edu

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

INSPEcT: a computational tool to infer mRNA synthesis, processing and degradation dynamics from RNA- and 4sU-seq time course experiments

Motivation: Cellular mRNA levels originate from the combined action of multiple regulatory processes, which can be recapitulated by the rates of pre-mRNA synthesis, pre-mRNA processing and mRNA degradation. Recent experimental and computational advances set the basis to study these intertwined levels of regulation. Nevertheless, software for the comprehensive quantification of RNA dynamics is still lacking.

Results: INSPEcT is an R package for the integrative analysis of RNA- and 4sU-seq data to study the dynamics of transcriptional regulation. INSPEcT provides gene-level quantification of these rates, and a modeling framework to identify which of these regulatory processes are most likely to explain the observed mRNA and pre-mRNA concentrations. Software performance is tested on a synthetic dataset, which is also instrumental in guiding the choice of the modeling parameters and the experimental design.
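To make the three rates concrete, the following Python sketch simulates one common way to relate them to pre-mRNA and mRNA levels. It assumes constant rates and first-order kinetics, which the abstract does not state explicitly; the function name, parameter names and example values are hypothetical and not taken from the INSPEcT package.

```python
def simulate_rna(synthesis, processing, degradation, hours, dt=0.001):
    """Euler integration of an assumed two-species model:

        d(pre_mrna)/dt = synthesis - processing * pre_mrna
        d(mrna)/dt     = processing * pre_mrna - degradation * mrna

    Both species start at zero. Returns (pre_mrna, mrna) after `hours`.
    """
    pre_mrna, mrna = 0.0, 0.0
    for _ in range(int(round(hours / dt))):
        d_pre = synthesis - processing * pre_mrna
        d_mrna = processing * pre_mrna - degradation * mrna
        pre_mrna += dt * d_pre
        mrna += dt * d_mrna
    return pre_mrna, mrna


# Hypothetical rates (arbitrary units per hour), simulated for 40 hours.
pre_level, mature_level = simulate_rna(
    synthesis=10.0, processing=5.0, degradation=0.5, hours=40.0
)
```

Under this assumed model the steady-state levels are synthesis / processing for pre-mRNA and synthesis / degradation for mature mRNA, so the same mRNA level can result from different combinations of rates. This is why total RNA levels alone cannot separate the three processes.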

Availability and implementation: INSPEcT has been submitted to Bioconductor and is currently available as Supplementary Additional File S1.

Contact: mattia.pelizzola@iit.it

Supplementary Information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

SpeeDB: fast structural protein searches

Motivation: Interactions between amino acids are important determinants of the structure, stability and function of proteins. Several tools have been developed for the identification and analysis of such interactions in proteins, based on the extensive studies carried out on high-resolution structures from the Protein Data Bank (PDB). Although these tools allow users to identify and analyze interactions, analysis can only be performed on one structure at a time. This makes it difficult and time-consuming to study the significance of these interactions on a large scale.

Results: SpeeDB is a web-based tool for the identification of protein structures based on structural properties. SpeeDB queries are executed on all structures in the PDB at once, quickly enough for interactive use. SpeeDB includes standard queries based on published criteria for identifying various structures: disulphide bonds, catalytic triads and aromatic–aromatic, sulphur–aromatic, cation–π and ionic interactions. Users can also construct custom queries in the user interface without any programming. Results can be downloaded in Comma Separated Value (CSV) format for further analysis with other tools. Case studies presented in this article demonstrate how SpeeDB can be used to answer various biological questions. Analysis of human proteases revealed that disulphide bonds are the predominant type of interaction and are located close to the active site, where they promote substrate specificity. When comparing the two homologous G protein-coupled receptors and the two protein kinase paralogs analyzed, the differences in the types of interactions responsible for stability account for the differences in specificity and functionality of the structures.
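As an illustration of what a distance-based structural query looks like, the Python sketch below flags candidate disulphide bonds and writes them as CSV. It is not SpeeDB's query engine: the atom tuple layout, the function names and the 2.5 Å cutoff are assumptions chosen for the example (a typical S–S bond length is roughly 2.0 Å), and the published criteria used by SpeeDB may differ.

```python
import csv
import io
import math
from itertools import combinations


def find_disulphide_candidates(atoms, max_sg_distance=2.5):
    """Return cysteine pairs whose SG atoms lie within the cutoff.

    atoms: iterable of (chain, residue_number, residue_name, atom_name, x, y, z).
    max_sg_distance: illustrative cutoff in angstroms (an assumption).
    """
    sulphurs = [a for a in atoms if a[2] == "CYS" and a[3] == "SG"]
    hits = []
    for a, b in combinations(sulphurs, 2):
        distance = math.dist(a[4:7], b[4:7])
        if distance <= max_sg_distance:
            hits.append((a[0], a[1], b[0], b[1], round(distance, 2)))
    return hits


def to_csv(hits):
    """Serialize hits as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["chain_1", "residue_1", "chain_2", "residue_2", "distance"])
    writer.writerows(hits)
    return buffer.getvalue()


# Hypothetical coordinates: two bonded cysteines, one distant cysteine,
# and one non-cysteine atom that the query must ignore.
example_atoms = [
    ("A", 6, "CYS", "SG", 0.0, 0.0, 0.0),
    ("A", 127, "CYS", "SG", 2.04, 0.0, 0.0),
    ("A", 30, "CYS", "SG", 10.0, 10.0, 10.0),
    ("A", 7, "GLY", "CA", 1.0, 0.0, 0.0),
]
hits = find_disulphide_candidates(example_atoms)
report = to_csv(hits)
```

Comparing every pair of sulphur atoms is adequate for a single structure. Running such a query over the whole PDB at interactive speed, as the abstract describes, would require indexing or parallel execution that this sketch does not attempt.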

Availability and implementation: SpeeDB is available at http://www.parallelcomputing.ca as a web service.

Contact: d@drobilla.net

Supplementary Information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER