Article, 2024

Conformational ensembles of the human intrinsically disordered proteome

Nature, ISSN 0028-0836, 1476-4687, Volume 626, Issue 8000, Pages 897-904, DOI 10.1038/s41586-023-07004-5

Contributors

  • Tesei, Giulio 0000-0003-4339-4460 (Corresponding author) [1]
  • Trolle, Anna Ida 0009-0005-9314-6627 [1]
  • Jonsson, Nicolas 0000-0002-7838-1814 [1]
  • Betz, Johannes 0000-0003-0910-6576 [1]
  • Knudsen, Frederik E. [1]
  • Pesce, Francesco 0000-0002-3403-2524 [1]
  • Johansson, Kristoffer Enøe 0000-0001-6054-0461 [1]
  • Lindorff-Larsen, Kresten 0000-0002-4750-6039 (Corresponding author) [1]

Affiliations

  1. [1] University of Copenhagen
     [NORA names: KU University of Copenhagen; University; Denmark; Europe, EU; Nordic; OECD]

Abstract

Intrinsically disordered proteins and regions (collectively, IDRs) are pervasive across proteomes in all kingdoms of life, help to shape biological functions and are involved in numerous diseases. IDRs populate a diverse set of transiently formed structures and defy conventional sequence–structure–function relationships (ref. 1). Developments in protein science have made it possible to predict the three-dimensional structures of folded proteins at the proteome scale (ref. 2). By contrast, there is a lack of knowledge about the conformational properties of IDRs, partly because the sequences of disordered proteins are poorly conserved and also because only a few of these proteins have been characterized experimentally. The inability to predict structural properties of IDRs across the proteome has limited our understanding of the functional roles of IDRs and how evolution shapes them. As a supplement to previous structural studies of individual IDRs (ref. 3), we developed an efficient molecular model to generate conformational ensembles of IDRs and thereby to predict their conformational properties from sequences (refs. 4,5). Here we use this model to simulate nearly all of the IDRs in the human proteome. Examining conformational ensembles of 28,058 IDRs, we show how chain compaction is correlated with cellular function and localization. We provide insights into how sequence features relate to chain compaction and, using a machine-learning model trained on our simulation data, show the conservation of conformational properties across orthologues. Our results recapitulate observations from previous studies of individual protein systems and exemplify how to link, at the proteome scale, conformational ensembles with cellular function and localization, amino acid sequence, evolutionary conservation and disease variants. Our freely available database of conformational properties will encourage further experimental investigation and enable the generation of hypotheses about the biological roles and evolution of IDRs.

Keywords

IDR, Intrinsically disordered proteins, Kingdom, acid sequence, amino, amino acid sequence, biological functions, biological role, cellular functions, chain, chain compaction, compaction, conformational ensembles, conformational properties, conservation, data, database, development, disease, disease variants, disordered proteins, disordered proteome, diverse set, ensemble, evolution, evolutionary conservation, experimental investigation, features, function, functional role, generation, human proteome, hypothesis, inability, intrinsically, investigation, kingdoms of life, knowledge, lack, lack of knowledge, life, localization, machine-learning models, model, molecular modeling, observations, orthologues, properties, protein, protein science, protein systems, proteomics, region, role, science, sequence, sequence features, sets, simulated data, simulation, structural properties, structural studies, structure, structures of folded proteins, study, supplementation, system, three-dimensional structure, three-dimensional structure of folded proteins, variants

Funders

  • Novo Nordisk Foundation
  • European Commission

Data Provider: Digital Science