Modeling the structural ensembles of intrinsically disordered proteins (IDPs), which lack a fixed structure, is essential for understanding their cellular functions and for revealing how they are regulated in the signaling pathways of related diseases (e.g., cancers and neurodegenerative disorders). Although an ensemble is widely believed to be the most accurate way to depict the 3D structures of IDPs, few traditional ensemble-based approaches effectively address the degeneracy problem, which occurs when multiple solutions are consistent with the experimental data and is the main challenge in IDP ensemble construction. In this paper, starting from a predefined conformational library, we formulate the structure ensemble construction problem in a least squares framework, which provides the optimal solution when the data constraints outnumber the unknown variables. To deal with the degeneracy problem, we further propose a regularized regression approach based on the elastic net technique, under the assumption that the weights to be estimated for the individual structures in the ensemble are sparse. We have validated our methods through a reference ensemble approach as well as by testing on real biological data for three proteins: alpha-synuclein, the translocation domain of colicin N, and the K18 domain of Tau protein.
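The regularized formulation described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the objective form, the nonnegativity constraint on the weights, the coordinate-descent solver, the penalty values `lam1` and `lam2`, and the synthetic "library" are all assumptions made for illustration. Here `A[i][j]` stands for observable `i` (e.g., a chemical shift) predicted for library structure `j`, and `b` for the corresponding experimental values.

```python
import random


def objective(A, b, w, lam1, lam2):
    """0.5*||A w - b||^2 + lam1*sum|w| + 0.5*lam2*||w||^2 (assumed objective form)."""
    fit = 0.0
    for row, target in zip(A, b):
        pred = sum(a * x for a, x in zip(row, w))
        fit += (pred - target) ** 2
    l1 = sum(abs(x) for x in w)
    l2 = sum(x * x for x in w)
    return 0.5 * fit + lam1 * l1 + 0.5 * lam2 * l2


def elastic_net_weights(A, b, lam1=0.01, lam2=0.01, n_iter=500):
    """Nonnegative elastic net fitted by cyclic coordinate descent.

    The constraint w >= 0 is an assumption of this sketch (weights are
    treated as populations); lam2 > 0 keeps every update well defined.
    """
    m, n = len(A), len(A[0])
    w = [0.0] * n
    residual = list(b)  # equals b - A w for the initial w = 0
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_iter):
        for j in range(n):
            # Correlation of column j with the residual that excludes structure j.
            rho = sum(A[i][j] * residual[i] for i in range(m)) + col_sq[j] * w[j]
            # Exact one-dimensional minimizer: soft-threshold, then clip at zero.
            new_wj = max(0.0, rho - lam1) / (col_sq[j] + lam2)
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic degenerate case: 20 candidate structures but only 8 observables,
# so plain least squares would not have a unique solution.
random.seed(0)
n_obs, n_struct = 8, 20
A = [[random.gauss(0.0, 1.0) for _ in range(n_struct)] for _ in range(n_obs)]
w_true = [0.0] * n_struct
w_true[3], w_true[11] = 0.6, 0.4  # hypothetical sparse "reference ensemble"
b = [sum(a * x for a, x in zip(row, w_true)) for row in A]

w_fit = elastic_net_weights(A, b, lam1=0.01, lam2=0.01)
selected = [(j, round(x, 3)) for j, x in enumerate(w_fit) if x > 1e-6]
print("structures with nonzero weight:", selected)
```

The L1 term drives most weights to exactly zero, which is how the sparsity assumption narrows the set of ensembles compatible with the data, while the L2 term stabilizes the choice among correlated library structures. In practice the two penalties would need to be tuned, for example by cross-validation.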
Keywords: Representative Structure; Structure Ensemble; Chemical Shift Data; Degeneracy Problem; Backbone Chemical Shift
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.