ENViz: a Cytoscape App for integrated statistical analysis and visualization of sample-matched data with multiple data types

Summary: ENViz (Enrichment Analysis and Visualization) is a Cytoscape app that performs joint enrichment analysis of two types of sample-matched datasets in the context of systematic annotations. Such datasets may be gene expression or any other high-throughput data collected in the same set of samples. The enrichment analysis is done in the context of pathway information, gene ontology or any custom annotation of the data. The results of the analysis consist of significant associations between profiled elements of one of the datasets and the annotation terms (e.g. miR-19 was associated with the cell-cycle process in breast cancer samples). The results of the enrichment analysis are visualized as an interactive Cytoscape network.

Availability and implementation: ENViz is publicly available in the Cytoscape App Store (http://apps.cytoscape.org/apps/enviz). For additional information please visit the tool website: http://www.agilent.com/labs/research/compbio/enviz/

Contact: israel_steinfeld@agilent.com

  • Bioinformatics
  • 10 years ago
  • SYSTEMS BIOLOGY

Functional Gene Networks: R/Bioc package to generate and analyse gene networks derived from functional enrichment and clustering

Summary: Functional Gene Networks (FGNet) is an R/Bioconductor package that generates gene networks derived from the results of functional enrichment analysis (FEA) and annotation clustering. The sets of genes enriched with specific biological terms (obtained from a FEA platform) are transformed into a network by establishing links between genes based on common functional annotations and common clusters. The network provides a new view of FEA results, revealing gene modules with similar functions and genes that are related to multiple functions. In addition to building the functional network, FGNet analyses the similarity between the groups of genes and provides a distance heatmap and a bipartite network of functionally overlapping genes. The application includes an interface to directly perform FEA queries using different external tools (DAVID, GeneTerm Linker, TopGO or GAGE) and a graphical interface to facilitate its use.
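
The link-building idea in this summary can be illustrated with a minimal Python sketch. It is not FGNet code (FGNet is an R package): the function name `build_gene_network` and the toy term-to-gene mapping are hypothetical, and the sketch assumes that two genes are linked whenever they share at least one enriched term.

```python
from collections import defaultdict
from itertools import combinations


def build_gene_network(term_to_genes):
    """Link every pair of genes that share at least one annotation term.

    Returns a dict mapping a sorted gene pair to the set of shared terms.
    """
    edges = defaultdict(set)
    for term, genes in term_to_genes.items():
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            edges[(gene_a, gene_b)].add(term)
    return dict(edges)


# Hypothetical enrichment result: term -> genes annotated with it.
toy_terms = {
    "cell cycle": ["CDK1", "CCNB1", "TP53"],
    "apoptosis": ["TP53", "BAX"],
    "DNA repair": ["TP53", "CDK1"],
}
network = build_gene_network(toy_terms)
for pair, shared in sorted(network.items()):
    print(pair, sorted(shared))
```

In this toy input, TP53 ends up connected to every other gene, which mirrors the kind of multi-function gene the summary says such a network can reveal.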

Availability and implementation: FGNet is available in Bioconductor, including a tutorial. URL: http://bioconductor.org/packages/release/bioc/html/FGNet.html

Contact: jrivas@usal.es

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • SYSTEMS BIOLOGY

QTLMiner: QTL database curation by mining tables in literature

Motivation: Figures and tables in biomedical literature record vast amounts of important experiment results. In scientific papers, for example, quantitative trait locus (QTL) information is usually presented in tables. However, most of the popular text-mining methods focus on extracting knowledge from unstructured free text. As far as we know, there are no published works on mining tables in biomedical literature. In this article, we propose a method to extract QTL information from tables and plain text found in literature. Heterogeneous and complex tables were converted into a structured database, combined with information extracted from plain text. Our method could greatly reduce labor burdens involved with database curation.

Results: We applied our method to the curation of a soybean QTL database, extracting 2278 records from 228 papers with a precision of 96.9% and a recall of 83.3%; the F value for the method is 89.6%.
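
As a quick check of how the reported F value follows from the precision and recall, here is a small Python sketch. It assumes the F value is the standard balanced F-measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision <= 0 or recall <= 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the Results paragraph.
f1 = f_measure(0.969, 0.833)
print(f"{f1:.3f}")  # 0.896
```

Under that assumption the computed value matches the 89.6% given in the abstract.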

Availability and implementation: QTLMiner is available at www.soyomics.com/qtlminer/.

Contact: yuanxh@iga.ac.cn

  • Bioinformatics
  • 10 years ago
  • DATA AND TEXT MINING

drexplorer: A tool to explore dose-response relationships and drug-drug interactions

Motivation: Nonlinear dose–response models are primary tools for estimating the potency [e.g. half-maximum inhibitory concentration (IC) known as IC50] of anti-cancer drugs. We present drexplorer software, which enables biologists to evaluate replicate reproducibility, detect outlier data points, fit different models, select the best model, estimate IC values at different percentiles and assess drug–drug interactions. drexplorer serves as a computation engine within the R environment and a graphical interface for users who do not have programming backgrounds.
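
To make "IC values at different percentiles" concrete, the following Python sketch uses a two-parameter Hill model. This is an assumption chosen for illustration: drexplorer is an R package that fits several models, and neither the function names nor the parameter values below come from it.

```python
def hill_viability(dose, ic50, hill=1.0):
    """Fraction of viable cells under a two-parameter Hill model (top 1, bottom 0)."""
    return 1.0 / (1.0 + (dose / ic50) ** hill)


def ic_at(percent, ic50, hill=1.0):
    """Dose that produces `percent` % inhibition under the same Hill model."""
    if not 0 < percent < 100:
        raise ValueError("percent must lie strictly between 0 and 100")
    p = percent / 100.0
    # Solve 1 / (1 + (d / ic50) ** hill) = 1 - p for d.
    return ic50 * (p / (1.0 - p)) ** (1.0 / hill)


# Hypothetical fitted parameters, for illustration only.
ic50, hill = 2.0, 1.5
for pct in (10, 50, 90):
    dose = ic_at(pct, ic50, hill)
    viability = hill_viability(dose, ic50, hill)
    print(f"IC{pct}: dose = {dose:.3f}, viability = {viability:.2f}")
```

By construction, `ic_at(50, ...)` returns the IC50 itself, and the viability at each returned dose equals one minus the requested inhibition fraction.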

Availability and implementation: The drexplorer R package is freely available from GitHub at https://github.com/nickytong/drexplorer. A graphical user interface is shipped with the package.

Contact: jingwang@mdanderson.org

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • DATA AND TEXT MINING

ADME SARfari: comparative genomics of drug metabolizing systems

Motivation: ADME SARfari is a freely available web resource that enables comparative analyses of drug-disposition genes. It does so by integrating a number of publicly available data sources, which have subsequently been used to build data mining services, predictive tools and visualizations for drug metabolism researchers. The data include the interactions of small molecules with ADME (absorption, distribution, metabolism and excretion) proteins responsible for the metabolism and transport of molecules; available pharmacokinetic (PK) data; protein sequences of ADME-related molecular targets for pre-clinical model species and human; alignments of the orthologues including information on known single nucleotide polymorphisms (SNPs) and information on the tissue distribution of these proteins. In addition, in silico models have been developed, which enable users to predict which ADME-relevant protein targets a novel compound is likely to interact with.

Availability and implementation: https://www.ebi.ac.uk/chembl/admesarfari

Contact: jpo@ebi.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • DATABASES AND ONTOLOGIES

WALTZ-DB: a benchmark database of amyloidogenic hexapeptides

Summary: Accurate prediction of amyloid-forming amino acid sequences remains an important challenge. We here present an online database that provides open access to the largest set of experimentally characterized amyloid forming hexapeptides. To this end, we expanded our previous set of 280 hexapeptides used to develop the Waltz algorithm with 89 peptides from literature review and by systematic experimental characterisation of the aggregation of 720 hexapeptides by transmission electron microscopy, dye binding and Fourier transform infrared spectroscopy. This brings the total number of experimentally characterized hexapeptides in the WALTZ-DB database to 1089, of which 244 are annotated as positive for amyloid formation.

Availability and implementation: The WALTZ-DB database is freely available without any registration requirement at http://waltzdb.switchlab.org.

Contact: frederic.rousseau@switch.vib-kuleuven.be or joost.schymkowitz@switch.vib-kuleuven.be

  • Bioinformatics
  • 10 years ago
  • DATABASES AND ONTOLOGIES

OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn.
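
The Venn-style summary counts described above amount to set operations over cluster membership. The Python sketch below is a toy illustration under stated assumptions: it is not OrthoVenn's implementation, and the cluster IDs, species names and function names are hypothetical.

```python
from collections import Counter


def venn_region_counts(cluster_species):
    """Count clusters by the exact set of species that contribute members."""
    return Counter(frozenset(species) for species in cluster_species.values())


def shared_by_all(cluster_species, all_species):
    """Return IDs of clusters that contain every species in `all_species`."""
    wanted = set(all_species)
    return sorted(cid for cid, sp in cluster_species.items() if wanted <= set(sp))


# Hypothetical clusters: cluster ID -> species with at least one member protein.
clusters = {
    "c1": {"wheat", "rice", "maize"},
    "c2": {"wheat", "rice"},
    "c3": {"wheat"},
    "c4": {"rice", "maize", "wheat"},
}
regions = venn_region_counts(clusters)
print(regions[frozenset({"wheat", "rice", "maize"})])
print(shared_by_all(clusters, ["wheat", "rice"]))
```

Each key of `regions` corresponds to one region of the Venn diagram (clusters found in exactly that combination of species), while `shared_by_all` gives the broader intersection regardless of which other species are present.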

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

mRNA transfection of a novel TAL effector nuclease (TALEN) facilitates efficient knockout of HIV co-receptor CCR5

Homozygosity for a natural deletion variant of the HIV-coreceptor molecule CCR5, CCR5Δ32, confers resistance toward HIV infection. Allogeneic stem cell transplantation from a CCR5Δ32-homozygous donor has resulted in the first cure from HIV (‘Berlin patient’). Based thereon, genetic disruption of CCR5 using designer nucleases was proposed as a promising HIV gene-therapy approach. Here we introduce a novel TAL-effector nuclease, CCR5-Uco-TALEN that can be efficiently delivered into T cells by mRNA electroporation, a gentle and truly transient gene-transfer technique. CCR5-Uco-TALEN mediated high-rate CCR5 knockout (>90% in PM1 and >50% in primary T cells) combined with low off-target activity, as assessed by flow cytometry, next-generation sequencing and a newly devised, very convenient gene-editing frequency digital-PCR (GEF-dPCR). GEF-dPCR facilitates simultaneous detection of wild-type and gene-edited alleles with remarkable sensitivity and accuracy as shown for the CCR5 on-target and CCR2 off-target loci. CCR5-edited cells were protected from infection with HIV-derived lentiviral vectors, but also with the wild-type CCR5-tropic HIV-1BaL strain. Long-term exposure to HIV-1BaL resulted in almost complete suppression of viral replication and selection of CCR5-gene edited T cells. In conclusion, we have developed a novel TALEN for the targeted, high-efficiency knockout of CCR5 and a useful dPCR-based gene-editing detection method.

  • Nucleic Acids Research
  • 10 years ago
  • Nucleic Acid Enzymes

CATH FunFHMMer web server: protein functional annotations using functional family assignments

The widening function annotation gap in protein databases and the increasing number and diversity of the proteins being sequenced present new challenges to protein function prediction methods. Multidomain proteins complicate the protein sequence–structure–function relationship further as new combinations of domains can expand the functional repertoire, creating new proteins and functions. Here, we present the FunFHMMer web server, which provides Gene Ontology (GO) annotations for query protein sequences based on the functional classification of the domain-based CATH-Gene3D resource. Our server also provides valuable information for the prediction of functional sites. The predictive power of FunFHMMer has been validated on a set of 95 proteins where FunFHMMer performs better than BLAST, Pfam and CDD. Recent validation by an independent international competition ranks FunFHMMer as one of the top function prediction methods in predicting GO annotations for both the Biological Process and Molecular Function Ontology. The FunFHMMer web server is available at http://www.cathdb.info/search/by_funfhmmer.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

RNA-Redesign: a web server for fixed-backbone 3D design of RNA

RNA is rising in importance as a design medium for interrogating fundamental biology and for developing therapeutic and bioengineering applications. While there are several online servers for design of RNA secondary structure, there are no tools available for the rational design of 3D RNA structure. Here we present RNA-Redesign (http://rnaredesign.stanford.edu), an online 3D design tool for RNA. This resource utilizes fixed-backbone design to optimize the sequence identity and nucleobase conformations of an RNA to match a desired backbone, analogous to fundamental tools that underlie rational protein engineering. The resulting sequences suggest thermostabilizing mutations that can be experimentally verified. Further, sequence preferences that differ between natural and computationally designed sequences can suggest whether natural sequences possess functional constraints besides folding stability, such as cofactor binding or conformational switching. Finally, for biochemical studies, the designed sequences can suggest experimental tests of 3D models, including concomitant mutation of base triples. In addition to the designs generated, detailed graphical analysis is presented through an integrated and user-friendly environment.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Species tree inference using a mixture model

Species tree reconstruction has been a subject of substantial research due to its central role across biology and medicine. A species tree is often reconstructed using a set of gene trees or by directly using sequence data. In either of these cases, one of the main confounding phenomena is the discordance between a species tree and a gene tree due to evolutionary events such as duplications and losses. Probabilistic methods can resolve the discordance by coestimating gene trees and the species tree but this approach poses a scalability problem for larger data sets. We present MixTreEM-DLRS: A two-phase approach for reconstructing a species tree in the presence of gene duplications and losses. In the first phase, MixTreEM, a novel structural expectation maximization algorithm based on a mixture model is used to reconstruct a set of candidate species trees, given sequence data for monocopy gene families from the genomes under study. In the second phase, PrIME-DLRS, a method based on the DLRS model (Åkerborg O, Sennblad B, Arvestad L, Lagergren J. 2009. Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proc Natl Acad Sci U S A. 106(14):5714–5719), is used for selecting the best species tree. PrIME-DLRS can handle multicopy gene families since DLRS, apart from modeling sequence evolution, models gene duplication and loss using a gene evolution model (Arvestad L, Lagergren J, Sennblad B. 2009. The gene evolution model and computing its associated probabilities. J ACM. 56(2):1–44). We evaluate MixTreEM-DLRS using synthetic and biological data, and compare its performance with a recent genome-scale species tree reconstruction method PHYLDOG (Boussau B, Szöllősi GJ, Duret L, Gouy M, Tannier E, Daubin V. 2013. Genome-scale coestimation of species and gene trees. Genome Res. 23(2):323–330) as well as with a fast parsimony-based algorithm Duptree (Wehe A, Bansal MS, Burleigh JG, Eulenstein O. 2008. Duptree: a program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics 24(13):1540–1541). Our method is competitive with PHYLDOG in terms of accuracy and runs significantly faster and our method outperforms Duptree in accuracy. The analysis constituted by MixTreEM without DLRS may also be used for selecting the target species tree, yielding a fast and yet accurate algorithm for larger data sets. MixTreEM is freely available at http://prime.scilifelab.se/mixtreem/.

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

Are Convergent and Parallel Amino Acid Substitutions in Protein Evolution More Prevalent Than Neutral Expectations?

Convergent and parallel amino acid substitutions in protein evolution, collectively referred to as molecular convergence here, have small probabilities under neutral evolution. For this reason, molecular convergence is commonly viewed as evidence for similar adaptations of different species. The surge in the number of reports of molecular convergence in the last decade raises the intriguing question of whether molecular convergence occurs substantially more frequently than expected under neutral evolution. We here address this question using all one-to-one orthologous proteins encoded by the genomes of 12 fruit fly species and those encoded by 17 mammals. We found that the expected amount of molecular convergence varies greatly depending on the specific neutral substitution model assumed at each amino acid site and that the observed amount of molecular convergence is explainable by neutral models incorporating site-specific information of acceptable amino acids. Interestingly, the total number of convergent and parallel substitutions between two lineages, relative to the neutral expectation, decreases with the genetic distance between the two lineages, regardless of the model used in computing the neutral expectation. We hypothesize that this trend results from differences in the amino acids acceptable at a given site among different clades of a phylogeny, due to prevalent epistasis, and provide simulation as well as empirical evidence for this hypothesis. Together, our study finds no genomic evidence for higher-than-neutral levels of molecular convergence, but suggests the presence of abundant epistasis that decreases the likelihood of molecular convergence between distantly related lineages.

  • Molecular Biology and Evolution
  • 10 years ago
  • Research Article

MYC regulates the core pre-mRNA splicing machinery as an essential step in lymphomagenesis

Deregulated expression of the MYC transcription factor occurs in most human cancers and correlates with high proliferation, reprogrammed cellular metabolism and poor prognosis. Overexpressed MYC binds to virtually all active promoters within a cell, although with different binding affinities, and modulates the expression of distinct subsets of genes. However, the critical effectors of MYC in tumorigenesis remain largely unknown. Here we show that during lymphomagenesis in Eµ-myc transgenic mice, MYC directly upregulates the transcription of the core small nuclear ribonucleoprotein particle assembly genes, including Prmt5, an arginine methyltransferase that methylates Sm proteins. This coordinated regulatory effect is critical for the core biogenesis of small nuclear ribonucleoprotein particles, effective pre-messenger-RNA splicing, cell survival and proliferation. Our results demonstrate that MYC maintains the splicing fidelity of exons with a weak 5′ donor site. Additionally, we identify pre-messenger-RNAs that are particularly sensitive to the perturbation of the MYC–PRMT5 axis, resulting in either intron retention (for example, Dvl1) or exon skipping (for example, Atr, Ep400). Using antisense oligonucleotides, we demonstrate the contribution of these splicing defects to the anti-proliferative/apoptotic phenotype observed in PRMT5-depleted Eµ-myc B cells. We conclude that, in addition to its well-documented oncogenic functions in transcription and translation, MYC also safeguards proper pre-messenger-RNA splicing as an essential step in lymphomagenesis.

  • Nature
  • 10 years ago
  • Letter

Automated determination of fibrillar structures by simultaneous model building and fiber diffraction refinement

A computational approach—including a cross-validation metric—for automated model building and refinement using X-ray fiber diffraction data is described and applied to solve structures of protein fibers.

  • Nature Methods
  • 10 years ago
  • Article

Design and bioinformatics analysis of genome-wide CLIP experiments

The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses.

  • Nucleic Acids Research
  • 10 years ago
  • SURVEY AND SUMMARY

Identification of human telomerase assembly inhibitors enabled by a novel method to produce hTERT

Telomerase is the enzyme that maintains the length of telomeres. It is minimally constituted of two components: a core reverse transcriptase protein (hTERT) and an RNA (hTR). Despite its significance as an almost universal cancer target, the understanding of the structure of telomerase and the optimization of specific inhibitors have been hampered by the limited amount of enzyme available. Here, we present a breakthrough method to produce unprecedented amounts of recombinant hTERT and to reconstitute human telomerase with purified components. This system provides a decisive tool to identify regulators of the assembly of this ribonucleoprotein complex. It also enables the large-scale screening of small molecules capable of interfering with telomerase assembly. Indeed, it has allowed us to identify a compound that inhibits telomerase activity when added prior to the assembly of the enzyme, while it has no effect on an already assembled telomerase. Therefore, the novel system presented here may accelerate the understanding of human telomerase assembly and facilitate the discovery of potent and mechanistically unique inhibitors.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Duplex stem-loop-containing quadruplex motifs in the human genome: a combined genomic and structural study

Duplex stem-loops and four-stranded G-quadruplexes have been implicated in (patho)biological processes. Overlap of stem-loop- and quadruplex-forming sequences could give rise to quadruplex–duplex hybrids (QDH), which combine features of both structural forms and could exhibit unique properties. Here, we present a combined genomic and structural study of stem-loop-containing quadruplex sequences (SLQS) in the human genome. Based on a maximum loop length of 20 nt, our survey identified 80 307 SLQS, embedded within 60 172 unique clusters. Our analysis suggested that these should cover close to half of total SLQS in the entire genome. Among these, 48 508 SLQS were strand-specifically located in genic/promoter regions, with the majority of genes displaying a low number of SLQS. Notably, genes containing abundant SLQS clusters were strongly associated with brain tissues. Enrichment analysis of SLQS-positive genes and mapping of SLQS onto transcriptional/mutagenesis hotspots and cancer-associated genes provided a statistical framework supporting the biological involvements of SLQS. In vitro formation of diverse QDH by selective SLQS hits was successfully verified by nuclear magnetic resonance spectroscopy. Folding topologies of two SLQS were elucidated in detail. We also demonstrated that sequence changes at mutation/single-nucleotide polymorphism loci could affect the structural conformations adopted by SLQS. Thus, our predicted SLQS offer novel insights into the potential involvement of QDH in diverse (patho)biological processes and could represent novel regulatory signals.

  • Nucleic Acids Research
  • 10 years ago
  • Structural Biology

The identity of the discriminator base has an impact on CCA addition

CCA-adding enzymes synthesize and maintain the C-C-A sequence at the tRNA 3'-end, generating the attachment site for amino acids. While tRNAs are the most prominent substrates for this polymerase, CCA additions on non-tRNA transcripts are described as well. To identify general features for substrate requirement, a pool of randomized transcripts was incubated with the human CCA-adding enzyme. Most of the RNAs accepted for CCA addition carry an acceptor stem-like terminal structure, consistent with tRNA as the main substrate group for this enzyme. While these RNAs show no sequence conservation, the position upstream of the CCA end was in most cases represented by an adenosine residue. In tRNA, this position is described as discriminator base, an important identity element for correct aminoacylation. Mutational analysis of the impact of the discriminator identity on CCA addition revealed that purine bases (with a preference for adenosine) are strongly favoured over pyrimidines. Furthermore, depending on the tRNA context, a cytosine discriminator can cause a dramatic number of misincorporations during CCA addition. The data correlate with a high frequency of adenosine residues at the discriminator position observed in vivo. Originally identified as a prominent identity element for aminoacylation, this position represents a likewise important element for efficient and accurate CCA addition.

  • Nucleic Acids Research
  • 10 years ago
  • RNA

Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences

With the avalanche of biological sequences generated in the post-genomic age, one of the most challenging problems in computational biology is how to effectively formulate the sequence of a biological sample (such as DNA, RNA or protein) with a discrete model or a vector that can effectively reflect its sequence pattern information or capture its key features. Although several web servers and stand-alone tools have been developed to address this problem, each of these tools can handle only one type of sample. Furthermore, the number of their built-in properties is limited, and hence it is often difficult for users to formulate the biological sequences according to their desired features or properties. In this article, we propose a much more flexible web server with a much larger number of built-in properties, called Pse-in-One (http://bioinformatics.hitsz.edu.cn/Pse-in-One/), which can, through its 28 different modes, generate nearly all the possible feature vectors for DNA, RNA and protein sequences. In particular, it can also generate feature vectors with properties defined by the users themselves. These feature vectors can be easily combined with machine-learning algorithms to develop computational predictors and analysis methods for various tasks in bioinformatics and systems biology. It is anticipated that the Pse-in-One web server will become a very useful tool in computational proteomics, genomics, as well as biological sequence analysis. Moreover, to maximize users’ convenience, its stand-alone version can also be downloaded from http://bioinformatics.hitsz.edu.cn/Pse-in-One/download/, and directly run on Windows, Linux, Unix and Mac OS.
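
As a concrete example of turning a sequence into a fixed-length vector, the Python sketch below computes a plain k-mer composition. This is only one simple, widely used encoding, shown here as an assumption for illustration; it does not reproduce Pse-in-One's modes or its interface, and the function name `kmer_vector` is hypothetical.

```python
from itertools import product


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Windows containing symbols outside `alphabet` (e.g. N) are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


vec = kmer_vector("ACGTACGTNACG", k=2)
print(len(vec), round(sum(vec), 6))
```

The vector length depends only on the alphabet size and `k` (16 for DNA dinucleotides), so sequences of different lengths map to comparable inputs for a machine-learning model.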

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Intercalation processes of copper complexes in DNA

The family of anticancer complexes that include the transition metal copper known as Casiopeínas® shows promising results. Two of these complexes are currently in clinical trials. The interaction of these compounds with DNA has been observed experimentally and several hypotheses regarding the mechanism of action have been developed, including the generation of reactive oxygen species, phosphate hydrolysis and/or base-pair intercalation. To advance the understanding of how these ligands interact with DNA, we present a molecular dynamics study of 21 Casiopeínas with a DNA dodecamer using 10 μs of simulation time for each compound. All the complexes were manually inserted into the minor groove as the starting point of the simulations. The binding energy of each complex and the observed representative type of interaction between the ligand and the DNA are reported. With this extended sampling time, we found that four of the compounds spontaneously flipped open a base pair and moved inside the resulting cavity and four compounds formed stacking interactions with the terminal base pairs. The complexes that formed the intercalation pocket led to more stable interactions.

  • Nucleic Acids Research
  • 10 years ago
  • Computational Biology

PhyloGene server for identification and visualization of co-evolving proteins using normalized phylogenetic profiles

Proteins that function in the same pathways, protein complexes or the same environmental conditions can show similar patterns of sequence conservation across phylogenetic clades. In species that no longer require a specific protein complex or pathway, these proteins, as a group, tend to be lost or diverge. Analysis of the similarity in patterns of sequence conservation across a large set of eukaryotes can predict functional associations between different proteins, identify new pathway members and reveal the function of previously uncharacterized proteins. We used normalized phylogenetic profiling to predict protein function and identify new pathway members and disease genes. The phylogenetic profiles of tens of thousands of conserved proteins in the human, mouse, Caenorhabditis elegans and Drosophila genomes can be queried on our new web server, PhyloGene. PhyloGene provides an intuitive and user-friendly platform to query the patterns of conservation across 86 animal, fungal, plant and protist genomes. A protein query can be submitted either by selecting the name from whole-genome protein sets of the intensively studied species or by entering a protein sequence. The graphic output shows the profile of sequence conservation for the query and the most similar phylogenetic profiles for the proteins in the genome of choice. The user can also download this output in numerical form.
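
Ranking proteins by the similarity of their conservation profiles can be sketched in a few lines of Python. The sketch assumes Pearson correlation as the similarity measure and uses made-up conservation scores; the abstract does not specify PhyloGene's normalization or scoring, so none of the names or numbers below come from the server.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    norm = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if norm == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return sum(a * b for a, b in zip(dx, dy)) / norm


def most_similar(query, profiles, top=2):
    """Names of the `top` profiles most correlated with the query profile."""
    scores = [(pearson(profiles[query], prof), name)
              for name, prof in profiles.items() if name != query]
    return [name for _, name in sorted(scores, reverse=True)[:top]]


# Hypothetical conservation scores across five species (one column per species).
profiles = {
    "protA": [0.9, 0.8, 0.1, 0.0, 0.7],
    "protB": [0.8, 0.9, 0.2, 0.1, 0.6],
    "protC": [0.1, 0.2, 0.9, 0.8, 0.3],
    "protD": [0.5, 0.5, 0.5, 0.5, 0.5],
}
print(most_similar("protA", profiles))
```

Here `protB` is lost and retained in the same species as `protA`, so it ranks first, whereas `protC` shows the opposite pattern and correlates negatively.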

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

NaviCell Web Service for network-based data visualization

Data visualization is an essential element of biological research, required for obtaining insights and formulating new hypotheses on mechanisms of health and disease. NaviCell Web Service is a tool for network-based visualization of ‘omics’ data which implements several data visual representation methods and utilities for combining them together. NaviCell Web Service uses Google Maps and semantic zooming to browse large biological network maps, represented in various formats, together with different types of the molecular data mapped on top of them. For achieving this, the tool provides standard heatmaps, barplots and glyphs as well as the novel map staining technique for grasping large-scale trends in numerical values (such as whole transcriptome) projected onto a pathway map. The web service provides a server mode, which allows automating visualization tasks and retrieving data from maps via RESTful (standard HTTP) calls. Bindings to different programming languages are provided (Python and R). We illustrate the purpose of the tool with several case studies using pathway maps created by different research groups, in which data visualization provides new insights into molecular mechanisms involved in systemic diseases such as cancer and neurodegenerative diseases.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue