NEASE (Network Enrichment method for Alternative Splicing Events) a tool for functional enrichment of alternative splicing exons/events.
The python package NEASE (Network-based Enrichment method for Alternative Splicing Events) first detects protein features affected by AS such as domains, motifs and residues. Next, NEASE uses a protein-protein interactions integrated with domain-domain interactions, residue-level and domain-motif interactions to identify interaction partners likely affected by AS
Next, NEASE performs gene set overrepresentation analysis and identifies enriched pathways based on affected edges. Furthermore, since the statistical approach is network-based, it also prioritizes (differentially) spliced genes and finds new disease biomarkers candidates in case of aberrant splicing.
To install the package from PyPI please run:
pip install nease
To install the package from git:
git clone https://github.com/louadi/NEASE.git && cd NEASE
pip install .
Enjoy your instance of NEASE
The standard input of the package (also the recommended) is a DataFrame object with Ensembl IDs of the genes and the exon coordinates from human genome build hg38 (GRCh38).
- First column - genes IDs (Only Ensembl gene IDs can be used).
- Second column - start of the exon coordinate.
- Third column - end of the exon coordinate.
- Fourth column - dPSI (optionally)
Gene | Start | End | dpsi |
---|---|---|---|
ENSG00000154263 | 69314431 | 69315425 | -0.10 |
ENSG00000154265 | 87411893 | 87412033 | 0.13 |
In the case of differential splicing analysis, please make sure to filter events by significance beforehand, for example using a p-value cutoff.
The package also supports the output of multiple AS differential detection tools such as rMATs, Whippet and MAJIQ. For the newer version of MAJIQ, pleasee check the issue.
If you need help with your data or need to add support for another tools, open an issue or contact us.
Please note, that all functions are annotated with dockstrings for more details.
Import NEASE package and pandas:
import nease
import pandas as pd
table: Data input as DataFrame object as explained in "Data input".
organism: Either 'Human' or 'Mouse'.
input_type: Either 'Standard', 'Whippet', 'rmats'or "MAJIQ".
remove_non_in_frame: Remove exons that are predicted to disturb the ORF (Prediction source).
only_divisible_by_3: Remove exons not divisible by 3.
only_DDIs: Only use DDI annotations (No PDB and ELM).
p_value_cutoff: The enrichment p-value cutoff.
events=nease.run(table, organism='Human',input_type='Standard',
remove_non_in_frame=True, only_divisible_by_3=False)
Get statistics of your data.
events.get_stats()
Get a list of all affected domains.
events.get_domains()
Get a list of all affected linear motifs.
events.get_elm()
Get a list of all affected residues and their interactions from the PDB.
events.get_pdb()
List of affected interactions from domains and linear motif binding.
events.get_edges()
The main function of NEASE
database: a list of pathway databases to run enrichment on it.
# Supported databases:
database= ['PharmGKB', 'HumanCyc', 'Wikipathways', 'Reactome','KEGG', 'SMPDB',
'Signalink','NetPath', 'EHMN', 'INOH','BioCarta','PID']
# Run enrichment on Reactome only
events.enrich(database=['Reactome'])
path_id: a string representing the Pathway ID. You can find pathways id in the enrichment table results.
events.path_analysis('R-HSA-112314')
Generate an HTML file for visualizing the network example.
path_id: a string representing the Pathway ID.
file: a string representing a local file path for the HTML file.
save_pdf: If True, also save the figure as PDF.
k: a Float for the algorithm Fruchterman-Reingold force-directed for nodes positions, to be tuned by the user. You might need to run the following function multiple times for an optimal visualizations. more details.
events.Vis_path("R-HSA-5674135",file='AS data/enrichment/',save_pdf=False,k=0.8)
gseapy_databases: gseapy pathways databases
non_symmetrical_only: Run classical gene set enrichment for only non-symmetrical alternative exons, that are likely to cause NMD
# Run on KEGG gene set
events.classic_enrich(gseapy_databases=['KEGG_2019_Human'],non_symmetrical_only=True)
A step-by-step guide to use NEASE is available here.
A simple example for running NEASE on a standard input: (Notebook/Google Colab)
If you use NEASE, please cite:
Louadi, Z., Elkjaer, M.L., Klug, M. et al. Functional enrichment of alternative splicing events with NEASE reveals insights into tissue identity and diseases. Genome Biol 22, 327 (2021). https://doi.org/10.1186/s13059-021-02538-1