Statistical analysis, annotation and functional enrichment analysis of the IIS-FJD CNV database of allele frequencies.
These scripts were developed to analyse the CNV database of the Instituto de Investigación Sanitaria Fundación Jiménez Díaz (IIS-FJD). They are specific to this database and will not work with CNV databases that do not share its structure.
The datasets used in the project are not included in this repository as they contain information from IIS-FJD patients.
The files contained in this repository are:
- StatisticalAnalysis.R: statistical and exploratory data analysis of the database.
- DensityPlot.R: tools for visualizing the distribution of the genomic regions contained in the database.
- split_matrix.py: splits the CNV database in two, one for duplications and another for deletions.
- run_AnnotSV.sh: bash script to annotate the regions of the CNV database.
- genes.R: statistical analysis of the genes annotated in the database.
- GO_v2.R, KEGG_v2.R, InterPro_v2.R and HPO_v2: functional annotation of the genes mapped to the regions of the CNV database and enrichment analysis.
- EA_Results: results of the functional enrichment analysis performed with the data of the IIS-FJD cohort.
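
As an illustration of the splitting step described for split_matrix.py, the sketch below partitions CNV records into duplications and deletions. It is not the code from this repository: the column names (`chrom`, `start`, `end`, `type`, `frequency`), the `DUP`/`DEL` labels, the function name `split_cnvs` and the toy records are all assumptions made for the example, since the real database structure is not published here.

```python
import csv
import io


def split_cnvs(rows, type_column="type"):
    """Partition CNV records into (duplications, deletions).

    `rows` is an iterable of dicts. The column name and the DUP/DEL
    labels are assumptions, not the actual IIS-FJD database schema.
    """
    duplications, deletions = [], []
    for row in rows:
        cnv_type = row[type_column].strip().upper()
        if cnv_type == "DUP":
            duplications.append(row)
        elif cnv_type == "DEL":
            deletions.append(row)
        else:
            # Fail loudly rather than silently dropping unknown labels.
            raise ValueError(f"Unexpected CNV type: {row[type_column]!r}")
    return duplications, deletions


# Made-up records for illustration only; these are not patient data.
example_lines = [
    "chrom\tstart\tend\ttype\tfrequency",
    "chr1\t1000\t5000\tDUP\t0.02",
    "chr1\t8000\t9000\tDEL\t0.10",
    "chr2\t300\t700\tDEL\t0.01",
]
reader = csv.DictReader(io.StringIO("\n".join(example_lines)), delimiter="\t")
dups, dels = split_cnvs(reader)
print(len(dups), "duplications;", len(dels), "deletions")
```

Each resulting list could then be written to its own file, for example with `csv.DictWriter`, before annotation.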
Ana Solbas Casajús - a.solbas@alumnos.upm.es
Project Link: https://github.com/asolbas/CNVdb