Code for running robust and repeatable LDA experiments. Use the -h
flag to view CLI parameters for any script.
- lda: Files related to the training and analysis of LDA topic models
- dlda: Files related to the training and analysis of dynamic topic models (using gensim's ldaseq implementation)
- list_common_words.py: Takes an experiment config file as a command line argument and runs all specified preprocessing before listing the top 50 words in the dataset which will be used in that experiment
- plot_data_quants.py: Driver function to use a TextParser to make plots of the quantities of data in time frames (especially useful for deciding time intervals for a dynamic topic model)
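
As a rough illustration of what the two helper scripts above compute, here is a minimal standard-library sketch. It is not the code in this repo: `top_words` and `docs_per_year` are hypothetical names, the input is assumed to be already tokenised documents and plain `datetime.date` values, and per-year counting is only one possible choice of time frame. Per-frame document counts like these are the kind of input gensim's `LdaSeqModel` expects in its `time_slice` argument.

```python
from collections import Counter
from datetime import date


def top_words(docs, n=50, stopwords=frozenset()):
    """Return the n most frequent tokens across already-tokenised documents.

    Hypothetical helper: `docs` is an iterable of token lists, and
    `stopwords` stands in for whatever preprocessing the config specifies.
    """
    counts = Counter(
        token
        for doc in docs
        for token in doc
        if token not in stopwords
    )
    return counts.most_common(n)


def docs_per_year(dates):
    """Count documents per calendar year, in chronological order.

    Hypothetical helper: `dates` is an iterable of datetime.date objects,
    one per document. Returns a list of (year, count) pairs.
    """
    counts = Counter(d.year for d in dates)
    return sorted(counts.items())


if __name__ == "__main__":
    docs = [["topic", "model", "topic"], ["topic", "data", "model"]]
    dates = [date(2020, 1, 1), date(2019, 5, 5), date(2020, 3, 3)]

    for word, count in top_words(docs, n=2):
        print(word, count)

    for year, count in docs_per_year(dates):
        print(year, count)
```

Looking at the per-frame counts before training makes it easier to spot time frames with too few documents and merge them into wider intervals.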
Install our ogm package and its dependencies.