IDPEnsembleTools: An Open-Source Library for Analysis of Conformational Ensembles of Disordered Proteins
IDPEnsembleTools is a Python package for loading, analyzing, and comparing multiple conformational ensembles of intrinsically disordered proteins (IDPs).
It supports input formats such as .pdb, .xtc, and .dcd, and lets users extract both global and local structural features, perform dimensionality reduction, and compute similarity scores between ensembles.
Full documentation is available at:
https://bioComputingUP.github.io/EnsembleTools
With IDPEnsembleTools, you can:
- Extract global features of structural ensembles:
  - Radius of gyration (Rg)
  - Asphericity
  - Prolateness
  - End-to-end distance
- Extract local features:
  - Interatomic distances
  - Phi–psi angles
  - Alpha-helix content
- Perform dimensionality reduction on ensemble features:
  - PCA
  - UMAP
  - t-SNE
- Compare structural ensembles:
  - Compute the Jensen-Shannon (JS) divergence between ensembles
  - Visualize similarity matrices
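To make two of the listed quantities concrete, the following is a minimal NumPy sketch of how radius of gyration, end-to-end distance, and a histogram-based JS divergence can be computed. It does not use or reproduce the package's API: the function names (`radius_of_gyration`, `end_to_end_distance`, `js_divergence`, `random_walk_ensemble`), the equal-mass assumption, the bin count, and the synthetic random-walk ensembles are all assumptions made for illustration.

```python
import numpy as np


def radius_of_gyration(coords):
    """Rg per conformer; coords has shape (n_conformers, n_atoms, 3).

    Assumes equal atomic masses (illustrative simplification).
    """
    centered = coords - coords.mean(axis=1, keepdims=True)
    return np.sqrt((centered ** 2).sum(axis=2).mean(axis=1))


def end_to_end_distance(coords):
    """Distance between the first and last atom of each conformer."""
    return np.linalg.norm(coords[:, -1, :] - coords[:, 0, :], axis=1)


def js_divergence(x, y, bins=30):
    """JS divergence (log base 2, so within [0, 1]) between two 1D samples.

    Both samples are histogrammed on the same bin edges.
    """
    edges = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), bins + 1)
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Bins where a == 0 contribute nothing to the sum.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


rng = np.random.default_rng(0)


def random_walk_ensemble(n_conformers, n_atoms, step):
    """Synthetic stand-in for a real ensemble: Gaussian random-walk chains."""
    steps = rng.normal(scale=step, size=(n_conformers, n_atoms, 3))
    return np.cumsum(steps, axis=1)


compact = random_walk_ensemble(500, 50, step=1.0)
expanded = random_walk_ensemble(500, 50, step=1.5)

rg_compact = radius_of_gyration(compact)
rg_expanded = radius_of_gyration(expanded)

print("Mean Rg (compact): ", rg_compact.mean())
print("Mean Rg (expanded):", rg_expanded.mean())
print("JS divergence of Rg distributions:", js_divergence(rg_compact, rg_expanded))
```

With real data, the coordinate arrays would come from the loaded ensembles rather than from `random_walk_ensemble`. The bin count affects the estimate, so it should be kept fixed when comparing several ensemble pairs.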
The notebooks/ directory contains a collection of Jupyter notebooks that demonstrate how to use the EnsembleTools package. These examples cover key functionalities such as ensemble comparison, dimensionality reduction (PCA, t-SNE, UMAP), feature extraction, and visualization customization. They serve both as tutorials and as reproducible workflows for analyzing disordered protein ensembles.
Notebook | Description | Link |
---|---|---|
comparing_ensembles.ipynb | Compare multiple conformational ensembles using selected metrics and visualizations. | View |
featurization.ipynb | Generate numerical features from protein ensembles for downstream analysis. | View |
kpca_analysis.ipynb | Perform Kernel PCA to capture non-linear variance in ensemble structures. | View |
loading_data.ipynb | Load and preprocess ensemble data from various formats. | View |
pca_analysis.ipynb | Principal Component Analysis (PCA) for dimensionality reduction and visualization. | View |
plot_customization.ipynb | Customize plots for clarity and publication-quality visualizations. | View |
sh3_example.ipynb | Case study: global and local analysis of the SH3 domain of the Drk protein. | View |
tsne_analysis.ipynb | t-SNE embedding of ensemble features to explore local structure. | View |
umap_analysis.ipynb | UMAP embedding of ensemble features and visualization. | View |
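As a library-independent illustration of the featurization and PCA steps that the notebooks cover, the sketch below builds one feature vector per conformer from pairwise interatomic distances and projects the result onto its first two principal components using NumPy's SVD. The helper names (`pairwise_distance_features`, `pca`) and the synthetic coordinates are hypothetical and are not taken from the package or its notebooks.

```python
import numpy as np


def pairwise_distance_features(coords):
    """One feature vector per conformer: upper triangle of its distance matrix.

    coords has shape (n_conformers, n_atoms, 3).
    """
    n_atoms = coords.shape[1]
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(n_atoms, k=1)
    return dist[:, i, j]


def pca(features, n_components=2):
    """PCA via SVD of the mean-centered feature matrix.

    Returns the projected coordinates and the explained-variance ratio
    of the retained components.
    """
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    projected = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return projected, explained[:n_components]


# Synthetic stand-in for a loaded ensemble: 200 random-walk chains of 20 atoms.
rng = np.random.default_rng(1)
coords = np.cumsum(rng.normal(size=(200, 20, 3)), axis=1)

features = pairwise_distance_features(coords)
projected, explained = pca(features)

print("Feature matrix shape:", features.shape)
print("Projection shape:", projected.shape)
print("Explained variance ratio:", explained)
```

Pairwise distances are used here because they are invariant to rotation and translation, so conformers do not need to be superimposed before the projection.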
It is recommended to install idpet in a clean virtual environment to avoid conflicts with existing packages.
# Create and activate a new conda environment
conda create -n idpet-env python=3.9
conda activate idpet-env
# Install the package from PyPI
pip install idpet
# Create a new virtual environment (Python 3.7+)
python -m venv idpet-env
# Activate the environment
# On Linux/macOS:
source idpet-env/bin/activate
# On Windows:
idpet-env\Scripts\activate
# Upgrade pip and install the package
pip install --upgrade pip
pip install idpet
# Clone the repository and install it in editable (development) mode
git clone https://github.com/BioComputingUP/EnsembleTools.git
cd EnsembleTools
pip install -e .
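After any of the installation routes above, one way to confirm that the package is visible in the active environment is to query its installed distribution metadata. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical, and the only package-specific assumption is the distribution name `idpet` used in the `pip install` commands.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("idpet")
if found is None:
    print("idpet is not installed in this environment")
else:
    print(f"idpet {found} is installed")
```

Running this inside the activated environment checks that environment only, which helps catch the common mistake of installing into one environment and running from another.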
This project is licensed under the MIT License.