-
Notifications
You must be signed in to change notification settings - Fork 3
code and data from Latent Structure modeling paper
poldrack/LatentStructure
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This repository contains code and relevant files for the Latent Structure modeling project described in: Poldrack RA, Mumford JA, Schonberg T, Kalar D, Barman B, Yarkoni T (2012). Discovering relations between mind, brain, and mental disorders using topic mapping. PLOS Computational Biology, in press. Preprint available at: http://talyarkoni.org/papers/Poldrack_et_al_in_press_PLoS_CompBio.pdf Guide to subdirectories: CCA - contains results from CCA analysis NIF-Disorders - contains files used for disorders topic modeling clustering - contains files used for clustering of disorders cogatlas - contains files used for cogatlas topic modeling src: contains the source files used for all analyses. these require the following external libraries: http://numpy.scipy.org/ http://scikit-learn.org/stable/ http://rpy.sourceforge.net/rpy2.html http://nipy.sourceforge.net/nibabel/ also note that src/utils needs to be in the python path 1_db_foci_to_image.py - obtains coordinates from neurosynth database and creates images - uses utils/tal_to_mni.py 1.1_merge_peakfiles.sh - creates merged image and computes mask of voxels with activation on at least 1% of papers 1.2_extract_nidag_text.py - extract full text of each article from database 2_get_cogat_concepts.py - read cognitive atlas concepts from RDF and get loading for each document 3_create_disorder_list.py - load NIF dysfunction ontology, grab synonyms, and add additional missing terms 4_get_disorders_NIF.py - get loading for each disorder term from corpus 5_mk_cogatlas_docs.py - make documents based on cogatlas loadings 6_mk_disorders_docs.py - make documents based on disorder loadings 7_mk_pickled_data.py - created a pickle with the full dataset to make loading easier 8_mk_8fold_cogatlas.py - make files for 8-fold CV 9_mk_8fold_disorders.py - make files for 8-fold CV 10_mk_8fold_cogatlas_mallet_data.py - make mallet data from 8-fold files 11_mk_8fold_disorders_mallet_data.py - make mallet data from 8-fold files 12_mk_8fold_cogatlas_topic_models.py - create scripts to run mallet jobs 13_mk_8fold_disorders_topic_models.py - create scripts to run mallet jobs 14_get_best_topic_likelihood.py - check likelihoods to get get dimenstionality 15.1_get_disorders_dimensionality.py - generate additional topic models to get disorder dimensionality 15.3_get_best_disorder_dimensionality.py - get dimensionality that has unique topic dists 15_run_topic_models.py - generate scripts to run final topic models 16_mk_cogatlas_loadingdata.py - load topic data and save loadingdata.txt 17_mk_disorders_loadingdata.py - load topic data and save loadingdata.txt 18_mk_all_chisq_maps.py - make all chisquare maps - uses utils/mk_chisq_maps_filter.py 18.2_mk_6mm_chisq_maps.sh - make 6mm versions of topic 19_mk_slice_images_cogatlas.py - make slice images using p values 20_mk_slice_images_disorders.py - make slice images using p values 22_mk_latexreport_cogatlas.py - create latex report 23_mk_latexreport_disorders.py - create latex report 24_run_CCA_nonneg.py - run CCA analysis 25_cluster_disorders.R - run clustering on disorders data 26_make_cognitive_topic_figure.py - generate figure 2 from initial submission 27_make_disorders_topic_figure.py - generate figure 3 from initial submission 28_mk_8fold_fulltext_topic_models.py - create scripts to run fulltext topic models 28.1_mk_8fold_fulltext_mallet_data.py - make 8-fold data for full text analysis 28.2_run_all_fulltext_nfold.sh - run mallet jobs on 8-fold full text 28.3_get_best_dimensionality.py - get best dimensionality for full text 30_get_wordhist.py - get histograms of # of docs for each filtered paper 31_count_locations.py - count # of locations reported across all papers 32_doc_topic_hist.py - make histograms of docs/topic and topics/doc (for Figure 3) 33_plot_likelihood.py - plot empirical likelihood as function of # of topics (for Figure 2) 34_run_CCA_on_topics.py - run CCA directly on topic distributions 35_mk_fold1_loadingdata.py - get loadingdata for fold1 for each ntopics 36.1_mk_slice_corr_images.disorders.py - make images using correlation rather than p value 36_mk_slice_corr_images.cogatlas.py - make images using correlation rather than p value (for figure 37_make_cognitive_topic_corr_figure.py - make Figure 4 38_make_disorder_topic_corr_figure.py - make Figure 6 39_get_topic_hierarchy.py - create graphviz .dot file for topic hierarchy (Figure 5)
About
code and data from Latent Structure modeling paper
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published