MultiMeta: an R package for meta-analyzing multi-phenotype genome-wide association studies

Summary: As new methods for multivariate analysis of genome-wide association studies become available, it is important to be able to combine results from different cohorts in a meta-analysis. The R package MultiMeta provides an implementation of the inverse-variance-based method for meta-analysis, generalized to an n-dimensional setting.

Availability and implementation: The R package MultiMeta can be downloaded from CRAN.

Contact: dragana.vuckovic@burlo.trieste.it; vi1@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
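
The inverse-variance-based pooling named in the summary can be sketched in a few lines. This is only an illustration in plain Python, not the MultiMeta code (which is an R package): the function names `mat_inv`, `mat_vec` and `pool_effects` are hypothetical, and it assumes each cohort reports an n-dimensional effect vector together with its n x n covariance matrix. Each cohort is weighted by the inverse of its covariance matrix, and the pooled covariance is the inverse of the summed weights.

```python
def mat_inv(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment the matrix with the identity: [A | I]
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The right half now holds the inverse
    return [row[n:] for row in aug]


def mat_vec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def pool_effects(betas, covs):
    """Pool per-cohort effect vectors using inverse-covariance weights.

    betas: list of length-n effect vectors, one per cohort
    covs:  list of n x n covariance matrices, one per cohort
    Returns (pooled effect vector, pooled covariance matrix).
    """
    n = len(betas[0])
    weight_sum = [[0.0] * n for _ in range(n)]
    weighted_beta = [0.0] * n
    for beta, cov in zip(betas, covs):
        w = mat_inv(cov)          # weight = inverse covariance
        wb = mat_vec(w, beta)
        for i in range(n):
            weighted_beta[i] += wb[i]
            for j in range(n):
                weight_sum[i][j] += w[i][j]
    pooled_cov = mat_inv(weight_sum)
    pooled_beta = mat_vec(pooled_cov, weighted_beta)
    return pooled_beta, pooled_cov
```

With n = 1 this reduces to the familiar univariate formula, in which each cohort's estimate is weighted by the reciprocal of its variance.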

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

Yeast high mobility group protein HMO1 stabilizes chromatin and is evicted during repair of DNA double strand breaks

DNA is packaged into condensed chromatin fibers by association with histones and architectural proteins such as high mobility group (HMGB) proteins. However, this DNA packaging reduces accessibility of enzymes that act on DNA, such as proteins that process DNA after double strand breaks (DSBs). Chromatin remodeling overcomes this barrier. We show here that the Saccharomyces cerevisiae HMGB protein HMO1 stabilizes chromatin as evidenced by faster chromatin remodeling in its absence. HMO1 was evicted along with core histones during repair of DSBs, and chromatin remodeling events such as histone H2A phosphorylation and H3 eviction were faster in absence of HMO1. The facilitated chromatin remodeling in turn correlated with more efficient DNA resection and recruitment of repair proteins; for example, inward translocation of the DNA-end-binding protein Ku was faster in absence of HMO1. This chromatin stabilization requires the lysine-rich C-terminal extension of HMO1 as truncation of the HMO1 C-terminal tail phenocopies hmo1 deletion. Since this is reminiscent of the need for the basic C-terminal domain of mammalian histone H1 in chromatin compaction, we speculate that HMO1 promotes chromatin stability by DNA bending and compaction imposed by its lysine-rich domain and that it must be evicted along with core histones for efficient DSB repair.

  • Nucleic Acids Research
  • 10 years ago
  • Gene regulation, Chromatin and Epigenetics

CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts

The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I', II' and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with those made with CSI 3.0.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

SIFTER search: a web server for accurate phylogeny-based protein function prediction

We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. The SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

De novo design of heat-repressible RNA thermosensors in E. coli

RNA-based temperature sensing is common in bacteria that live in fluctuating environments. Most naturally-occurring RNA thermosensors are heat-inducible, have long sequences, and function by sequestering the ribosome binding site in a hairpin structure at lower temperatures. Here, we demonstrate the de novo design of short, heat-repressible RNA thermosensors. These thermosensors contain a cleavage site for RNase E, an enzyme native to Escherichia coli and many other organisms, in the 5' untranslated region of the target gene. At low temperatures, the cleavage site is sequestered in a stem–loop, and gene expression is unobstructed. At high temperatures, the stem–loop unfolds, allowing for mRNA degradation and turning off expression. We demonstrated that these thermosensors respond specifically to temperature and provided experimental support for the central role of RNase E in the mechanism. We also demonstrated the modularity of these RNA thermosensors by constructing a three-input composite circuit that utilizes transcriptional, post-transcriptional, and post-translational regulation. A thorough analysis of the 24 thermosensors allowed for the development of design guidelines for systematic construction of similar thermosensors in future applications. These short, modular RNA thermosensors can be applied to the construction of complex genetic circuits, facilitating rational reprogramming of cellular processes for synthetic biology applications.

  • Nucleic Acids Research
  • 10 years ago
  • Synthetic Biology and Bioengineering

DroughtDB: an expert-curated compilation of plant drought stress genes and their homologs in nine species

Plants are sessile and therefore exposed to a number of biotic and abiotic stresses. Drought is the major abiotic stress restricting plant growth worldwide. A number of genes involved in drought stress response have already been characterized, mainly in the model species Arabidopsis thaliana and Oryza sativa. However, with the aim to produce drought tolerant crop varieties, it is of importance to identify the respective orthologs for each species. We have developed DroughtDB, a manually curated compilation of molecularly characterized genes that are involved in drought stress response. DroughtDB includes information about the originally identified gene, its physiological and/or molecular function and mutant phenotypes and provides detailed information about computed orthologous genes in nine model and crop plant species including maize and barley. All identified orthologs are interlinked with the respective reference entry in MIPS/PGSB PlantsDB, which allows retrieval of additional information like genome context and sequence information. Thus, DroughtDB is a valuable resource and information tool for researchers working on drought stress and will facilitate the identification, analysis and characterization of genes involved in drought stress tolerance in agriculturally important crop plants.

Database URL: http://pgsb.helmholtz-muenchen.de/droughtdb/

  • Database
  • 10 years ago
  • Database Tool

Molecular evolution of freshwater snails with contrasting mating systems

Because mating systems affect population genetics and ecology, they are expected to impact the molecular evolution of species. Self-fertilizing species experience reduced effective population size, recombination rates and heterozygosity, which in turn should decrease the efficacy of natural selection, both adaptive and purifying, and the strength of meiotic drive processes such as GC-biased gene conversion. The empirical evidence is only partly congruent with these predictions: depending on the analyzed species, some, but not all, of the expected effects have been observed. One possible reason is that self-fertilization is an evolutionary dead-end, so that most current selfers recently evolved self-fertilization, and their genome has not yet been strongly impacted by selfing. Here we investigate the molecular evolution of two groups of freshwater snails in which mating systems have likely been stable for several millions of years. Analyzing coding sequence polymorphism, divergence and expression levels, we report a strongly reduced genetic diversity, decreased efficacy of purifying selection, slower rate of adaptive evolution and weakened codon usage bias/GC-biased gene conversion in the selfer Galba compared to the outcrosser Physa, in full agreement with theoretical expectations. Our results demonstrate that self-fertilization, when effective in the long run, is a major driver of population genomic and molecular evolutionary processes. Despite the genomic effects of selfing, Galba truncatula seems to escape the demographic consequences of the genetic load. We suggest that the particular ecology of the species may buffer the negative consequences of selfing, shedding new light on the dead-end hypothesis.

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

Likelihood Based Complex Trait Association Testing for Arbitrary Depth Sequencing Data

Summary: In next generation sequencing (NGS)-based genetic studies, researchers typically perform genotype calling first and then apply standard genotype-based methods for association testing. However, such a two-step approach ignores genotype calling uncertainty in the association testing step and may incur power loss and/or inflated type-I error. In the recent literature, a few robust and efficient likelihood based methods including both likelihood ratio test (LRT) and score test have been proposed to carry out association testing without intermediate genotype calling. These methods take genotype calling uncertainty into account by directly incorporating genotype likelihood function (GLF) of NGS data into association analysis. However, existing LRT methods are computationally demanding or do not allow covariate adjustment; while existing score tests are not applicable to markers with low minor allele frequency (MAF). We provide an LRT allowing flexible covariate adjustment, develop a statistically more powerful score test and propose a combination strategy (UNC combo) to leverage the advantages of both tests. We have carried out extensive simulations to evaluate the performance of our proposed LRT and score test. Simulations and real data analysis demonstrate the advantages of our proposed combination strategy: it offers a satisfactory trade-off in terms of computational efficiency, applicability (accommodating both common variants and variants with low MAF) and statistical power, particularly for the analysis of quantitative trait where the power gain can be up to ~60% when the causal variant is of low frequency (MAF < 0.01).

Availability and implementation: UNC combo and the associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/UNCcombo/

Contact: yunli@med.unc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

COREGNET: reconstruction and integrated analysis of co-regulatory networks

COREGNET is an R/Bioconductor package to analyze large-scale transcriptomic data by highlighting sets of co-regulators. Based on a transcriptomic dataset, COREGNET can be used to: reconstruct a large-scale co-regulatory network, integrate regulatory evidence such as transcription factor binding sites and ChIP data, estimate sample-specific regulator activity, identify cooperative transcription factors and analyze the sample-specific combinations of active regulators through an interactive visualization tool. In this study, COREGNET was used to identify driver regulators of bladder cancer.

Availability: COREGNET is available through Bioconductor: http://www.bioconductor.org/packages/release/bioc/html/CoRegNet.html

Contact: remy.nicolle@issb.genopole.fr

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

MEDUSA: a multi-draft based scaffolder

Motivation: Completing the genome sequence of an organism is an important task in comparative, functional and structural genomics. However, this remains a challenging issue from both a computational and an experimental viewpoint. Genome scaffolding (i.e. the process of ordering and orientating contigs) of de novo assemblies usually represents the first step in most genome finishing pipelines.

Results: In this article we present MeDuSa (Multi-Draft based Scaffolder), an algorithm for genome scaffolding. MeDuSa exploits information obtained from a set of (draft or closed) genomes from related organisms to determine the correct order and orientation of the contigs. MeDuSa formalizes the scaffolding problem by means of a combinatorial optimization formulation on graphs and implements an efficient constant factor approximation algorithm to solve it. In contrast to currently used scaffolders, it does not require either prior knowledge of the microorganism dataset under analysis (e.g. their phylogenetic relationships) or the availability of paired end read libraries. This makes usability and running time two additional important features of our method. Moreover, benchmarks and tests on real bacterial datasets showed that MeDuSa is highly accurate and, in most cases, outperforms traditional scaffolders. The possibility to use MeDuSa on eukaryotic datasets has also been evaluated, leading to interesting results.

Availability and implementation: MeDuSa web server: http://combo.dbe.unifi.it/medusa. A stand-alone version of the software can be downloaded from https://github.com/combogenomics/medusa/releases. All results presented in this work have been obtained with MeDuSa v. 1.3.

Contact: marco.fondi@unifi.it

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

Correction: GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

by Teresa K. Attwood, Erik Bongcam-Rudloff, Michelle E. Brazas, Manuel Corpas, Pascale Gaudet, Fran Lewitter, Nicola Mulder, Patricia M. Palagi, Maria Victoria Schneider, Celia W. G. van Gelder, GOBLET Consortium

  • PLOS Computational Biology
  • 10 years ago

ScaffMatch: scaffolding algorithm based on maximum weight matching

Motivation: Next-generation high-throughput sequencing has become a state-of-the-art technique in genome assembly. Scaffolding is one of the main stages of the assembly pipeline. During this stage, contigs assembled from the paired-end reads are merged into bigger chains called scaffolds. Because of a high level of statistical noise, chimeric reads, and genome repeats, scaffolding is a challenging task. Current scaffolding software packages vary widely in their quality and are highly dependent on the read data quality and genome complexity. There are no clear winners and multiple opportunities for further improvements of the tools still exist.

Results: This article presents an efficient scaffolding algorithm, ScaffMatch, that is able to handle reads with both short (<600 bp) and long (>35 000 bp) insert sizes, producing high-quality scaffolds. We evaluate our scaffolding tool with the F score and other metrics (N50, corrected N50) on eight datasets, comparing it with most of the available packages. Our experiments show that ScaffMatch is the tool of preference for most datasets.

Availability and implementation: The source code is available at http://alan.cs.gsu.edu/NGS/?q=content/scaffmatch.

Contact: mandric@cs.gsu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
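
For readers unfamiliar with the N50 metric cited in the Results, a minimal Python sketch of the standard definition follows: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` is an illustrative assumption and is not taken from ScaffMatch; the "corrected N50" variant, which depends on alignment to a reference, is not covered here.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

For example, the lengths 2 to 10 sum to 54; the three longest sequences (10, 9 and 8) already cover 27, so the N50 is 8.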

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

iFoldRNA v2: folding RNA with constraints

Summary: A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (<50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods.

Availability and implementation: http://ifoldrna.dokhlab.org

Contact: dokh@unc.edu

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

Urothelial cancer gene regulatory networks inferred from large-scale RNAseq, Bead and Oligo gene expression data

Background: Urothelial pathogenesis is a complex process driven by an underlying network of interconnected genes. The identification of novel genomic target regions and gene targets that drive urothelial carcinogenesis is crucial in order to improve our current limited understanding of urothelial cancer (UC) on the molecular level. The inference of genome-wide gene regulatory networks (GRN) from large-scale gene expression data provides a promising approach for a detailed investigation of the underlying network structure associated with urothelial carcinogenesis.

Methods: In our study we inferred and compared three GRNs by the application of the BC3Net inference algorithm to large-scale transitional cell carcinoma gene expression data sets from Illumina RNAseq (179 samples), Illumina Bead arrays (165 samples) and Affymetrix Oligo microarrays (188 samples). We investigated the structural and functional properties of GRNs for the identification of molecular targets associated with urothelial cancer.

Results: We found that the urothelial cancer (UC) GRNs show a significant enrichment of subnetworks that are associated with known cancer hallmarks including cell cycle, immune response, signaling, differentiation and translation. Interestingly, the most prominent subnetworks of co-located genes were found on chromosome regions 5q31.3 (RNAseq), 8q24.3 (Oligo) and 1q23.3 (Bead), which all represent known genomic regions frequently deregulated or aberrant in urothelial cancer and other cancer types. Furthermore, the identified hub genes of the individual GRNs, e.g., HID1/DMC1 (tumor development), RNF17/TDRD4 (cancer antigen) and CYP4A11 (angiogenesis/metastasis) are known cancer-associated markers. The GRNs were highly dataset specific on the interaction level between individual genes, but showed large similarities on the biological function level represented by subnetworks. Remarkably, the RNAseq UC GRN showed twice the proportion of significant functional subnetworks. Based on our analysis of inferential and experimental networks the Bead UC GRN showed the lowest performance compared to the RNAseq and Oligo UC GRNs.

Conclusion: To our knowledge, this is the first study investigating genome-scale UC GRNs. RNAseq based gene expression data is the data platform of choice for a GRN inference. Our study offers new avenues for the identification of novel putative diagnostic targets for subsequent studies in bladder tumors.

  • BMC Systems Biology 2015, null:21
  • 10 years ago

NPDock: a web server for protein-nucleic acid docking

Protein–RNA and protein–DNA interactions play fundamental roles in many biological processes. A detailed understanding of these interactions requires knowledge about protein–nucleic acid complex structures. Because the experimental determination of these complexes is time-consuming and perhaps futile in some instances, we have focused on computational docking methods starting from the separate structures. Docking methods are widely employed to study protein–protein interactions; however, only a few methods have been made available to model protein–nucleic acid complexes. Here, we describe NPDock (Nucleic acid–Protein Docking), a novel web server for predicting complexes of protein–nucleic acid structures, which implements a computational workflow that includes docking, scoring of poses, clustering of the best-scored models and refinement of the most promising solutions. The NPDock server provides a user-friendly interface and 3D visualization of the results. The smallest set of input data consists of a protein structure and a DNA or RNA structure in PDB format. Advanced options are available to control specific details of the docking process and obtain intermediate results. The web server is available at http://genesilico.pl/NPDock.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes

In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As an exclusive feature, ZCURVE 3.0 includes a post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

PrionW: a server to identify proteins containing glutamine/asparagine rich prion-like domains and their amyloid cores

Prions are a particular type of amyloids with the ability to self-perpetuate and propagate in vivo. Prion-like conversion underlies important biological processes but is also connected to human disease. Yeast prions are the best understood transmissible amyloids. In these proteins, prion formation from an initially soluble state involves a structural conversion, driven, in many cases, by specific domains enriched in glutamine/asparagine (Q/N) residues. Importantly, domains sharing this compositional bias are also present in the proteomes of higher organisms, thus suggesting that prion-like conversion might be an evolutionarily conserved mechanism. We have recently shown that the identification and evaluation of the potency of amyloid nucleating sequences in putative prion domains allows discrimination of genuine prions. PrionW is a web application that exploits this principle to scan sequences in order to identify proteins containing Q/N enriched prion-like domains (PrLDs) in large datasets. When used to scan the complete yeast proteome, PrionW identifies previously experimentally validated prions with high accuracy. Users can analyze up to 10 000 sequences at a time; PrLD-containing proteins are identified and their putative PrLDs and amyloid nucleating cores visualized and scored. The output files can be downloaded for further analysis. The PrionW server can be accessed at http://bioinf.uab.cat/prionw/.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Heterologous protein production using euchromatin-containing expression vectors in mammalian cells

Upon stable cell line generation, chromosomal integration site of the vector DNA has a major impact on transgene expression. Here we apply an active gene environment, rather than specified genetic elements, in expression vectors used for random integration. We generated a set of Bacterial Artificial Chromosome (BAC) vectors with different open chromatin regions, promoters and gene regulatory elements and tested their impact on recombinant protein expression in CHO cells. We identified the Rosa26 BAC as the most efficient vector backbone showing a nine-fold increase in both polyclonal and clonal production of the human IgG-Fc. Clonal protein production was directly proportional to integrated vector copy numbers and remained stable during 10 weeks without selection pressure. Finally, we demonstrated the advantages of BAC-based vectors by producing two additional proteins, HIV-1 glycoprotein CN54gp140 and HIV-1 neutralizing PG9 antibody, in bioreactors and shake flasks reaching a production yield of 1 g/l.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Web-Beagle: a web server for the alignment of RNA secondary structures

Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server computes the secondary structure prediction for the RNAs on-the-fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3' UTRs. All types of comparison output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity and running time, and better performance, than other available methods.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Longitudinal epigenetic and gene expression profiles analyzed by three-component analysis reveal down-regulation of genes involved in protein translation in human aging

Data on biological mechanisms of aging are mostly obtained from cross-sectional study designs. An inherent disadvantage of this design is that inter-individual differences can mask small but biologically significant age-dependent changes. A serially sampled design (same individual at different time points) would overcome this problem but is often limited by the relatively small numbers of available paired samples and the statistics being used. To overcome these limitations, we have developed a new vector-based approach, termed three-component analysis, which incorporates temporal distance, signal intensity and variance into one single score for gene ranking and is combined with gene set enrichment analysis. We tested our method on a unique age-based sample set of human skin fibroblasts and combined genome-wide transcription, DNA methylation and histone methylation (H3K4me3 and H3K27me3) data. Importantly, our method can now for the first time demonstrate a clear age-dependent decrease in expression of genes coding for proteins involved in translation and ribosome function. Using analogies with data from lower organisms, we propose a model where age-dependent down-regulation of protein translation-related components contributes to extend human lifespan.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

DIANA-miRPath v3.0: deciphering microRNA function with experimental support

The functional characterization of miRNAs is still an open challenge. Here, we present DIANA-miRPath v3.0 (http://www.microrna.gr/miRPathv3), an online software suite dedicated to the assessment of miRNA regulatory roles and the identification of controlled pathways. The new miRPath web server makes possible the functional annotation of one or more miRNAs using standard (hypergeometric distributions), unbiased empirical distributions and/or meta-analysis statistics. DIANA-miRPath v3.0 database and functionality have been significantly extended to support all analyses for KEGG molecular pathways, as well as multiple slices of Gene Ontology (GO) in seven species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Gallus gallus and Danio rerio). Importantly, more than 600 000 experimentally supported miRNA targets from DIANA-TarBase v7.0 have been incorporated into the new schema. Users of DIANA-miRPath v3.0 can harness this wealth of information and substitute or combine the available in silico predicted targets from DIANA-microT-CDS and/or TargetScan v6.2 with high quality experimentally supported interactions. A unique feature of DIANA-miRPath v3.0 is its redesigned Reverse Search module, which enables users to identify and visualize miRNAs significantly controlling selected pathways or belonging to specific GO categories based on in silico or experimental data. DIANA-miRPath v3.0 is freely available to all users without any login requirement.
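
The "standard (hypergeometric distributions)" statistic mentioned above is, in its generic textbook form, a one-sided over-representation test. The Python sketch below illustrates that generic test only; it is not the DIANA-miRPath implementation, and the function name `hypergeom_enrichment_p` and its parameter names are assumptions made for illustration. Given a background of `population` genes, of which `in_pathway` belong to a pathway, it returns the probability that a random set of `n_targets` genes contains at least `overlap` pathway members.

```python
from math import comb


def hypergeom_enrichment_p(population, in_pathway, n_targets, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap)."""
    if not (0 <= in_pathway <= population and 0 <= n_targets <= population):
        raise ValueError("counts must lie between 0 and the population size")
    max_overlap = min(in_pathway, n_targets)
    denominator = comb(population, n_targets)
    # Sum the probability mass for every overlap at least as large as observed
    tail = sum(
        comb(in_pathway, i) * comb(population - in_pathway, n_targets - i)
        for i in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / denominator
```

A small p-value suggests that the miRNA's targets fall in the pathway more often than expected by chance; in practice such values would still need correction for multiple testing across pathways.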

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

BCSearch: fast structural fragment mining over large collections of protein structures

Resources to mine the large number of protein structures available today are necessary to better understand how amino acid variations are compatible with conformation preservation, to assist protein design, engineering and, further, the development of biologic therapeutic compounds. BCSearch is a versatile service to efficiently mine large collections of protein structures. It relies on a new approach based on a Binet–Cauchy kernel that is more discriminative than the widely used root mean square deviation criterion. It has statistics independent of size even for short fragments, and is fast. The systematic mining of large collections of structures such as the complete SCOPe protein structural classification or comprehensive subsets of the Protein Data Bank can be performed in a few minutes. Based on this new score, we propose four innovative applications: BCFragSearch and BCMirrorSearch, respectively, search for fragments similar and anti-similar to a query and return information on the diversity of the sequences of the hits. BCLoopSearch identifies candidate fragments of fixed size matching the flanks of a gapped structure. BCSpecificitySearch analyzes a complete protein structure and returns information about sites having few similar fragments. BCSearch is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/BCSearch.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Background mutational features of the radiation-resistant bacterium Deinococcus radiodurans

Deinococcus bacteria are extremely resistant to radiation, oxidation, and desiccation. Resilience to these factors has been suggested to be due to enhanced damage prevention and repair mechanisms, as well as highly efficient antioxidant protection systems. Here, using mutation-accumulation experiments, we find that the GC-rich Deinococcus radiodurans has an overall background genomic mutation rate similar to that of E. coli, but differs in mutation spectrum, with the A/T to G/C mutation rate (based on a total count of 88 A:T→G:C transitions and 82 A:T→C:G transversions) per site per generation higher than that in the other direction (based on a total count of 157 G:C→A:T transitions and 33 G:C→T:A transversions). We propose that this unique spectrum is shaped mainly by the abundant uracil DNA glycosylases (UDG) reducing G:C→T:A transversions, adenine methylation elevating A:T→C:G transversions, and absence of cytosine methylation decreasing G:C→A:T transitions. As opposed to the >100× elevation of the mutation rate in MMR⁻ strains of most other organisms, MMR⁻ D. radiodurans only exhibits a four-fold elevation, raising the possibility that other DNA repair mechanisms compensate for a relatively low-efficiency DNA mismatch repair pathway. Since D. radiodurans has plentiful insertion sequence (IS) elements in the genome and the activities of IS elements are rarely directly explored, we also estimated the insertion (transposition) rate of the IS elements to be 2.50 × 10⁻³ per genome per generation in the wild-type strain; knocking out MMR did not elevate the IS element insertion rate in this organism.
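The raw counts above are lower in the A/T to G/C direction (88 + 82 = 170) than in the G/C to A/T direction (157 + 33 = 190), yet the per-site rate is reported as higher. The short sketch below shows why normalizing by the number of available sites can reverse the ordering in a GC-rich genome. The GC fraction of 0.67 is an assumption that does not appear in the abstract, and the function name is hypothetical.

```python
def per_site_bias(at_to_gc, gc_to_at, gc_fraction):
    """Ratio of the per-site A/T->G/C rate to the per-site G/C->A/T rate.

    Genome length and number of generations are the same in both
    directions, so they cancel and only the base composition matters.
    """
    at_sites = 1.0 - gc_fraction  # fraction of sites that can mutate A/T->G/C
    gc_sites = gc_fraction        # fraction of sites that can mutate G/C->A/T
    return (at_to_gc / at_sites) / (gc_to_at / gc_sites)


at_to_gc = 88 + 82    # counts quoted in the abstract
gc_to_at = 157 + 33   # counts quoted in the abstract
ASSUMED_GC = 0.67     # assumption: approximate GC content, not from the abstract

ratio = per_site_bias(at_to_gc, gc_to_at, ASSUMED_GC)
print(f"per-site A/T->G/C vs G/C->A/T rate ratio: {ratio:.2f}")
```

A ratio above 1 means that each A/T site mutates toward G/C more often than each G/C site mutates toward A/T, even though fewer such events were counted in total.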

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

Genome-wide dosage-dependent and -independent regulation contributes to gene expression and evolutionary novelty in plant polyploids

Polyploidy provides evolutionary and morphological novelties in many plants and some animals. However, the role of genome dosage and composition in gene expression changes remains poorly understood. Here, we generated a series of resynthesized Arabidopsis tetraploids that contain 0–4 copies of Arabidopsis thaliana and Arabidopsis arenosa genomes and investigated ploidy and hybridity effects on gene expression. Allelic expression can be defined as dosage dependent (expression levels correlate with genome dosages) or otherwise as dosage independent. We show that many dosage-dependent genes contribute to cell cycle, photosynthesis, and metabolism, whereas dosage-independent genes are enriched in biotic and abiotic stress responses. Interestingly, dosage-dependent genes tend to be preserved in ancient biochemical pathways present in both plant and nonplant species, whereas many dosage-independent genes belong to plant-specific pathways. This is confirmed by an independent analysis using an Arabidopsis phylostratigraphic map. For A. thaliana loci, the dosage-dependent alleles are devoid of TEs and tend to correlate with H3K9ac, H3K4me3, and CG methylation, whereas the majority of dosage-independent alleles are enriched with TEs and correspond to H3K27me1, H3K27me3, and CHG (H = A, T, or C) methylation. Furthermore, there is a parent-of-origin effect on nonadditively expressed genes in the reciprocal allotetraploids, especially when A. arenosa is used as the pollen donor, leading to metabolic and morphological changes. Thus, ploidy, epigenetic modifications, and cytoplasmic-nuclear interactions shape gene expression diversity in polyploids. Dosage-dependent expression can maintain growth and developmental stability, whereas dosage-independent expression can facilitate functional divergence between homeologs (subfunctionalization and/or neofunctionalization) during polyploid evolution.

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article