The Hydrophobic Temperature Dependence of Amino Acids Directly Calculated from Protein Structures

by Erik van Dijk, Arlo Hoogeveen, Sanne Abeln

The hydrophobic effect is the main driving force in protein folding. One can estimate the relative strength of this hydrophobic effect for each amino acid by mining a large set of experimentally determined protein structures. However, the hydrophobic force is known to be strongly temperature dependent. This temperature dependence is thought to explain the denaturation of proteins at low temperatures. Here we investigate if it is possible to extract this temperature dependence directly from a large set of protein structures determined at different temperatures. Using NMR structures filtered for sequence identity, we were able to extract hydrophobicity propensities for all amino acids at five different temperature ranges (spanning 265-340 K). These propensities show that the hydrophobicity becomes weaker at lower temperatures, in line with current theory. Alternatively, one can conclude that the temperature dependence of the hydrophobic effect has a measurable influence on protein structures. Moreover, this work provides a method for probing the individual temperature dependence of the different amino acid types, which is difficult to obtain by direct experiment.

  • PLOS Computational Biology
  • 10 years ago

Structure-guided sequence specificity engineering of the modification-dependent restriction endonuclease LpnPI

The eukaryotic Set and Ring Associated (SRA) domains and structurally similar DNA recognition domains of prokaryotic cytosine modification-dependent restriction endonucleases recognize methylated, hydroxymethylated or glucosylated cytosine in various sequence contexts. Here, we report the apo-structure of the N-terminal SRA-like domain of the cytosine modification-dependent restriction enzyme LpnPI that recognizes modified cytosine in the 5'-C(mC)DG-3' target sequence (where mC is 5-methylcytosine or 5-hydroxymethylcytosine and D = A/T/G). Structure-guided mutational analysis revealed LpnPI residues involved in base-specific interactions and demonstrated binding site plasticity that allowed limited target sequence degeneracy. Furthermore, modular exchange of the LpnPI specificity loops by structural equivalents of related enzymes AspBHI and SgrTI altered sequence specificity of LpnPI. Taken together, our results pave the way for specificity engineering of the cytosine modification-dependent restriction enzymes.

  • Nucleic Acids Research
  • 10 years ago
  • Nucleic Acid Enzymes

Global transcription network incorporating distal regulator binding reveals selective cooperation of cancer drivers and risk genes

Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes.

  • Nucleic Acids Research
  • 10 years ago
  • Computational Biology

Haploinsufficiency predictions without study bias

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied ‘gold standard’ haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Bacillus subtilis RecO and SsbA are crucial for RecA-mediated recombinational DNA repair

Genetic data have revealed that the absence of Bacillus subtilis RecO and one of the end-processing avenues (AddAB or RecJ) renders cells as sensitive to DNA damaging agents as the null recA mutant, suggesting that both end-resection pathways require RecO for recombination. RecA, in the rATP·Mg2+ bound form (RecA·ATP), is unable to catalyze DNA recombination between linear double-stranded (ds) DNA and naked complementary circular single-stranded (ss) DNA. We showed that RecA·ATP could not nucleate and/or polymerize on SsbA·ssDNA or SsbB·ssDNA complexes. RecA·ATP nucleates and polymerizes on RecO·ssDNA·SsbA complexes more efficiently than on RecO·ssDNA·SsbB complexes. Limiting SsbA concentrations were sufficient to stimulate RecA·ATP assembly on the RecO·ssDNA·SsbB complexes. RecO and SsbA are necessary and sufficient to ‘activate’ RecA·ATP to catalyze DNA strand exchange, whereas the AddAB complex, RecO alone or RecO in concert with SsbB was not sufficient. In the presence of AddAB, RecO and SsbA are still necessary for efficient RecA·ATP-mediated three-strand exchange recombination. Based on genetic and biochemical data, we proposed that SsbA and RecO (or SsbA, RecO and RecR in vivo) are crucial for RecA activation in both the AddAB and RecJ–RecQ (RecS) recombinational repair pathways.

  • Nucleic Acids Research
  • 10 years ago
  • Genome Integrity, Repair and Replication

Global analysis of RNA cleavage by 5'-hydroxyl RNA sequencing

RNA cleavage by some endoribonucleases and self-cleaving ribozymes produces RNA fragments with 5'-hydroxyl (5'-OH) and 2',3'-cyclic phosphate termini. To identify 5'-OH RNA fragments produced by these cleavage events, we exploited the unique ligation mechanism of Escherichia coli RtcB RNA ligase to attach an oligonucleotide linker to RNAs with 5'-OH termini, followed by steps for library construction and analysis by massively parallel DNA sequencing. We applied the method to RNA from budding yeast and captured known 5'-OH fragments produced by tRNA Splicing Endonuclease (SEN) during processing of intron-containing pre-tRNAs and by Ire1 cleavage of HAC1 mRNA following induction of the unfolded protein response (UPR). We identified numerous novel 5'-OH fragments derived from mRNAs: some 5'-OH mRNA fragments were derived from single, localized cleavages, while others were likely produced by multiple, distributed cleavages. Many 5'-OH fragments derived from mRNAs were produced upstream of codons for highly electrostatic peptides, suggesting that the fragments may be generated by co-translational mRNA decay. Several 5'-OH RNA fragments accumulated during the induction of the UPR, some of which share a common sequence motif that may direct cleavage of these mRNAs. This method enables specific capture of 5'-OH termini and complements existing methods for identifying RNAs with 2',3'-cyclic phosphate termini.

  • Nucleic Acids Research
  • 10 years ago
  • Methods Online

Immunotherapy: Killer combo

Cytotoxic T lymphocyte-associated antigen 4 (CTLA4) and programmed cell death 1 (PD1) receptor inhibit antitumour immunity through complementary and non-redundant mechanisms. A new trial assessed the combination of ipilimumab (a CTLA4-specific monoclonal antibody) and nivolumab (a PD1-specific monoclonal antibody) in 142 patients with metastatic melanoma

  • Nature Reviews Cancer 15, 320 (2015)
  • 10 years ago
  • Research Highlight

Hijacked in cancer: the KMT2 (MLL) family of methyltransferases

Histone–lysine N-methyltransferase 2 (KMT2) family proteins methylate lysine 4 on the histone H3 tail at important regulatory regions in the genome and thereby impart crucial functions through modulating chromatin structures and DNA accessibility. Although the human KMT2 family was initially named the mixed-lineage leukaemia

  • Nature Reviews Cancer 15, 334 (2015)
  • 10 years ago
  • Review

Building better monoclonal antibody-based therapeutics

For 20 years, monoclonal antibodies (mAbs) have been a standard component of cancer therapy, but there is still much room for improvement. Efforts continue to build better cancer therapeutics based on mAbs. Anticancer mAbs function through various mechanisms, including directly targeting the malignant cells, modifying

  • Nature Reviews Cancer 15, 361 (2015)
  • 10 years ago
  • Review

Diffusion maps for high-dimensional single-cell analysis of differentiation data

Motivation: Single-cell technologies have recently gained popularity in cellular differentiation studies owing to their ability to resolve potential heterogeneities in cell populations. Analysing such high-dimensional single-cell data has its own statistical and computational challenges. Popular multivariate approaches are based on data normalisation, followed by dimension reduction and clustering to identify subgroups. However, in the case of cellular differentiation, we would not expect clear clusters to be present but instead expect the cells to follow continuous branching lineages.

Results: Here we propose the use of diffusion maps to deal with the problem of defining differentiation trajectories. We adapt this method to single-cell data by an adequate choice of kernel width and the inclusion of uncertainties or missing measurement values, which enables the establishment of a pseudo-temporal ordering of single cells in a high-dimensional gene expression space. We expect this output to reflect cell differentiation trajectories, where the data originates from intrinsic diffusion-like dynamics. Starting from a pluripotent stage, cells move smoothly within the transcriptional landscape towards more differentiated states with some stochasticity along their path. We demonstrate the robustness of our method with respect to extrinsic noise (e.g. measurement noise) and sampling density heterogeneities on simulated toy data as well as two single-cell quantitative polymerase chain reaction (qPCR) data sets (i.e. mouse haematopoietic stem cells and mouse embryonic stem cells) and an RNA-Seq data set of human pre-implantation embryos. We show that diffusion maps perform considerably better than Principal Component Analysis (PCA) and are advantageous over other techniques for non-linear dimension reduction such as t-distributed Stochastic Neighbour Embedding (t-SNE) for preserving the global structures and pseudotemporal ordering of cells.
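To make the idea concrete, the following is a minimal sketch of a generic diffusion map with a Gaussian kernel, written in Python with NumPy. It is not the authors' Matlab implementation: the function name `diffusion_map`, the user-supplied kernel width `sigma`, and the density normalisation step are assumptions made for illustration, and the handling of missing values and measurement uncertainties described in the abstract is omitted.

```python
import numpy as np


def diffusion_map(X, sigma, n_components=2):
    """Generic diffusion map embedding (illustrative sketch only).

    X            : (n_cells, n_genes) expression matrix.
    sigma        : Gaussian kernel width, chosen by the user (assumption).
    n_components : number of non-trivial diffusion components to return.
    Returns (eigenvalues, coordinates).
    """
    # Pairwise squared Euclidean distances between cells.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian kernel: nearby cells get weights close to 1.
    K = np.exp(-d2 / (2.0 * sigma ** 2))

    # Density normalisation, intended to reduce the effect of uneven sampling.
    q = K.sum(axis=1)
    K = K / np.outer(q, q)

    # Symmetric matrix that shares its eigenvalues with the row-normalised
    # Markov transition matrix D^-1 K.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))

    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]          # sort eigenvalues descending
    vals, vecs = vals[order], vecs[:, order]

    # Convert to right eigenvectors of the Markov matrix.
    psi = vecs / np.sqrt(d)[:, None]

    # Drop the trivial first component (eigenvalue 1, constant vector).
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]
```

On data that lie along a single smooth trajectory, sorting the cells by the first returned coordinate gives one possible pseudo-temporal ordering, up to the arbitrary sign of the eigenvector.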

Availability: The Matlab implementation of diffusion maps for single-cell data is available at https://www.helmholtz-muenchen.de/icb/single-cell-diffusion-map.

Contact: fbuettner.phys@gmail.com, fabian.theis@helmholtz-muenchen.de

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

PopGeV: A Web-based Large-scale Population Genome Browser

Motivation: The development of high-throughput sequencing technology has made it possible for more and more researchers to use population sequencing data to mine genes associated with specific traits. However, the massive amounts of sequencing data have also brought new challenges to researchers. The question of how to browse population genomic data in an easy and intuitive manner must be addressed. Web-based genome browsers allow users to conveniently view the results of genomic analyses, but heavy usage can reduce the response speed of the webpage, which limits their usefulness in the display of large-scale genome data. IndexedDB technology is a good solution to this problem; it is supported by web browsers and lets them create local databases. In this way, data can be read from local storage, achieving a smooth display of population genomic data.

Results: PopGeV has the following characteristics. First, it uses a new encoding method for compression of population SNP and INDEL data. IndexedDB technology is used to download the results to local storage so that users can browse the results smoothly even when the network traffic is heavy. Second, PopGeV identifies similar genomic regions between two individuals based on SNP data. Population diversity indexes are calculated when comparing two populations. Third, user-defined annotation information can be integrated for user-friendly mining of gene functions. Simulation shows that PopGeV can smoothly display analysis results for a population genome data set containing over 500 individuals with 2 million SNPs.

Availability: PopGeV is available at www.soyomics.com/popgev/

Contact: yuanxh@iga.ac.cn

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

pwOmics: An R package for pathway-based integration of time-series omics data using public database knowledge

Summary: Characterization of biological processes is progressively enabled with the increased generation of omics data on different signaling levels. Here we present a straightforward approach for the integrative analysis of data from different high-throughput technologies based on pathway and interaction models from public databases. pwOmics performs pathway-based level-specific data comparison of coupled human proteomic and genomic/transcriptomic data sets based on their log fold changes. Separate downstream and upstream analyses on the functional levels of pathways, transcription factors and genes/transcripts are performed in the cross-platform consensus analysis. These provide a basis for the combined interpretation of regulatory effects over time. Via network reconstruction and inference methods (Steiner tree, dynamic Bayesian network inference), consensus graphical networks can be generated for further analyses and visualization.

Availability: The R package pwOmics is freely available on Bioconductor (http://www.bioconductor.org/).

Contact: astrid.wachter@med.uni-goettingen.de

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

tmle.npvi: targeted, integrative search of associations between DNA copy number and gene expression, accounting for DNA methylation

Summary: We describe the implementation of the method introduced by Chambaz et al. (2012). We also demonstrate its genome-wide application to the integrative search of new regions with strong association between DNA copy number and gene expression accounting for DNA methylation in breast cancers.

Availability and implementation: An open-source R package tmle.npvi is available from CRAN (http://cran.r-project.org/).

Contact: pierre.neuvial@genopole.cnrs.fr

  • Bioinformatics
  • 10 years ago
  • APPLICATIONS NOTE

Bayesian mixture analysis for metagenomic community profiling

Motivation: Deep sequencing of clinical samples is now an established tool for the detection of infectious pathogens, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, in particular for viral pathogens. Here we present metaMix, a Bayesian mixture model framework for resolving complex metagenomic mixtures. We show that the use of parallel Monte Carlo Markov chains (MCMC) for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture.
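The ambiguity problem described above (a read matching several organisms) can be illustrated with a much simpler stand-in for the Bayesian approach: estimating species abundances in a mixture by expectation-maximisation (EM). This sketch is not metaMix and does not use MCMC; the function name `em_mixture_weights` and the read-by-species likelihood matrix `L` are hypothetical, and every read is assumed to match at least one species.

```python
import numpy as np


def em_mixture_weights(L, n_iter=500, tol=1e-12):
    """Estimate mixture weights by EM (illustrative stand-in, not metaMix).

    L : (n_reads, n_species) array; L[i, j] is the likelihood of read i
        under species j (0 if the read does not match that species).
        Each row is assumed to contain at least one positive entry.
    Returns the estimated abundance of each species (sums to 1).
    """
    n_reads, n_species = L.shape
    w = np.full(n_species, 1.0 / n_species)   # start from uniform abundances

    for _ in range(n_iter):
        # E-step: probability that each read originates from each species.
        r = L * w
        r /= r.sum(axis=1, keepdims=True)

        # M-step: new abundance = average responsibility over all reads.
        w_new = r.mean(axis=0)

        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break

    return w
```

In a toy case where some reads match only species A and others match both A and B, the ambiguous reads are progressively reassigned to A, because B has no reads of its own to support it. A full Bayesian treatment would additionally compare models containing different sets of species, which this sketch does not attempt.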

Results: We demonstrate the greater accuracy of metaMix compared to relevant methods, particularly for profiling complex communities consisting of several related species. We designed metaMix specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection; however, the principles are generally applicable to all types of metagenomic mixtures.

Availability: metaMix is implemented as a user friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix

Contact: sofia.morfopoulou.10@ucl.ac.uk

Supplementary Information: Supplementary Material is available at Bioinformatics online.

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

Reverse engineering of logic-based differential equation models using a mixed-integer dynamic optimisation approach

Motivation: Systems biology models can be used to test new hypotheses formulated on the basis of previous knowledge or new experimental data, contradictory with a previously existing model. New hypotheses often come in the shape of a set of possible regulatory mechanisms. This search is usually not limited to finding a single regulation link, but rather a combination of links subject to great uncertainty or no information about the kinetic parameters.

Results: In this work, we combine a logic-based formalism, to describe all the possible regulatory structures for a given dynamic model of a pathway, with mixed-integer dynamic optimization (MIDO). This framework aims to simultaneously identify the regulatory structure (represented by binary parameters) and the real-valued parameters that are consistent with the available experimental data, resulting in a logic-based differential equation model. The alternative to this would be to perform real-valued parameter estimation for each possible model structure, which is not tractable for models of the size presented in this work. The performance of the method presented here is illustrated with several case studies: a synthetic pathway problem of signaling regulation, a two component signal transduction pathway in bacterial homeostasis, and a signaling network in liver cancer cells.

Supplementary information: Supplementary materials are available at Bioinformatics online.

Contact: julio@iim.csic.es, saezrodriguez@ebi.ac.uk

  • Bioinformatics
  • 10 years ago
  • ORIGINAL PAPER

The Opponent Channel Population Code of Sound Location Is an Efficient Representation of Natural Binaural Sounds

by Wiktor Młynarski

In mammalian auditory cortex, sound source position is represented by a population of broadly tuned neurons whose firing is modulated by sounds located at all positions surrounding the animal. Peaks of their tuning curves are concentrated at lateral positions, while their slopes are steepest at the interaural midline, allowing for the maximum localization accuracy in that area. These experimental observations contradict initial assumptions that the auditory space is represented as a topographic cortical map. It has been suggested that a “panoramic” code has evolved to match specific demands of the sound localization task. This work provides evidence suggesting that properties of spatial auditory neurons identified experimentally follow from a general design principle: learning a sparse, efficient representation of natural stimuli. Natural binaural sounds were recorded and served as input to a hierarchical sparse-coding model. In the first layer, left and right ear sounds were separately encoded by a population of complex-valued basis functions which separated phase and amplitude. Both parameters are known to carry information relevant for spatial hearing. Monaural input converged in the second layer, which learned a joint representation of amplitude and interaural phase difference. Spatial selectivity of each second-layer unit was measured by exposing the model to natural sound sources recorded at different positions. The obtained tuning curves closely match the tuning characteristics of neurons in the mammalian auditory cortex. This study connects neuronal coding of the auditory space with natural stimulus statistics and generates new experimental predictions. Moreover, results presented here suggest that cortical regions with seemingly different functions may implement the same computational strategy: efficient coding.

  • PLOS Computational Biology
  • 10 years ago

Regulators Associated with Clinical Outcomes Revealed by DNA Methylation Data in Breast Cancer

by Matthew H. Ung, Frederick S. Varn, Shaoke Lou, Chao Cheng

The regulatory architecture of breast cancer is extraordinarily complex and gene misregulation can occur at many levels, with transcriptional malfunction being a major cause. This dysfunctional process typically involves additional regulatory modulators including DNA methylation. Thus, transcription factor (TF) binding and DNA methylation are two interacting components of a cancer regulatory interactome presumed to display correlated signals. As proof of concept, we performed a systematic motif-based in silico analysis to infer all potential TFs that are involved in breast cancer prognosis through an association with DNA methylation changes. Using breast cancer DNA methylation and clinical data derived from The Cancer Genome Atlas (TCGA), we carried out a systematic inference of TFs whose misregulation underlies different clinical subtypes of breast cancer. Our analysis identified TFs known to be associated with clinical outcomes of p53 and ER (estrogen receptor) subtypes of breast cancer, while also predicting new TFs that may be involved. Furthermore, our results suggest that misregulation in breast cancer can be caused by the binding of alternative factors to the binding sites of TFs whose activity has been ablated. Overall, this study provides a comprehensive analysis that links DNA methylation to TF binding to patient prognosis.

  • PLOS Computational Biology
  • 10 years ago

Primerize: automated primer assembly for transcribing non-coding RNA domains

Customized RNA synthesis is in demand for biological and biotechnological research. While chemical synthesis and gel or chromatographic purification of RNA is costly and difficult for sequences longer than tens of nucleotides, a pipeline of primer assembly of DNA templates, in vitro transcription by T7 RNA polymerase and kit-based purification provides a cost-effective and fast alternative for preparing RNA molecules. Nevertheless, designing template primers that optimize cost and avoid mispriming during polymerase chain reaction currently requires expert inspection, downloading specialized software or both. Online servers are currently not available or maintained for the task. We report here a server named Primerize that makes available an efficient algorithm for primer design developed and experimentally tested in our laboratory for RNA domains with lengths up to 300 nucleotides. Free access: http://primerize.stanford.edu.

  • Nucleic Acids Research
  • 10 years ago
  • Web Server Issue

Modulation of LSD1 phosphorylation by CK2/WIP1 regulates RNF168-dependent 53BP1 recruitment in response to DNA damage

A proper DNA damage response is essential for the maintenance of genome integrity. Deficiency of the E3 ligase RNF168 fully prevents both the initial recruitment and retention of 53BP1 at sites of DNA damage. In response to DNA damage, RNF168-dependent recruitment of the lysine-specific demethylase LSD1 to the site of DNA damage promotes local H3K4me2 demethylation and ubiquitination of H2A/H2AX, facilitating 53BP1 recruitment to sites of DNA damage. Alternatively, RNF168-mediated K63-linked ubiquitylation of 53BP1 is required for the initial recruitment of 53BP1 to sites of DNA damage and for its function in repair. We demonstrated here that phosphorylation and dephosphorylation of LSD1 at S131 and S137 were mediated by casein kinase 2 (CK2) and wild-type p53-induced phosphatase 1 (WIP1), respectively. LSD1, RNF168 and 53BP1 interacted with each other directly. CK2-mediated phosphorylation of LSD1 exhibited no impact on its interaction with 53BP1, but promoted its interaction with RNF168 and RNF168-dependent 53BP1 ubiquitination and subsequent recruitment to the DNA damage sites. Furthermore, overexpression of phosphorylation-defective mutants failed to restore LSD1 depletion-induced cellular sensitivity to DNA damage. Taken together, our results suggest that LSD1 phosphorylation modulated by CK2/WIP1 regulates RNF168-dependent 53BP1 recruitment directly in response to DNA damage and cellular sensitivity to DNA damaging agents.

  • Nucleic Acids Research
  • 10 years ago
  • Genome Integrity, Repair and Replication