This repository provides an in-depth exploration of several topic modeling techniques, implemented in Python with Jupyter Notebook (or JupyterLab). The following models are covered:
- Latent Dirichlet Allocation (LDA)
- Dynamic Topic Model (DTM)
- Topics Over Time (ToT)
- Correlated Topic Model (CTM)
- Structural Topic Model (STM)
- Biterm Topic Model (BTM)
- BERTopic
Before setting up the environment, ensure that Make is installed on your system.
For Windows users, please refer to this guide to install Make.
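As a quick sanity check before running the setup steps, the sketch below looks for a `make` executable on your `PATH`. It is only an illustration: the helper name `find_make` is hypothetical and not part of this repository, and finding the executable does not guarantee that the Makefile targets will succeed.

```python
import shutil


def find_make(name="make"):
    """Return the full path of the `make` executable, or None if it is not on PATH.

    `find_make` is a hypothetical helper for illustration only.
    """
    return shutil.which(name)


path = find_make()
if path:
    print(f"make found at {path}")
else:
    print("make not found on PATH; install it before running the setup steps")
```

On Windows, the executable may be installed under a different name depending on how Make was installed; in that case, pass that name as the `name` argument.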
- Clone this repository:
  git clone https://github.com/sorrychoe/topic-modeling-theory.git
  cd topic-modeling-theory
- Initialize the setup:
  make init
- Install dependencies:
  make install
Now you're ready to start exploring the topic modeling notebooks.
Each model is implemented in a separate Jupyter Notebook, providing explanations and code walkthroughs. You can start by running any of the notebooks to see how the models work and modify the code to fit your specific use case.
- notebook/Latent_Dirichlet_Allocation.ipynb:
  - Theory and Python sample code for Latent Dirichlet Allocation.
- notebook/Dynamic_Topic_Model.ipynb:
  - Theory and Python sample code for Dynamic Topic Model.
- notebook/Topic_over_Time.ipynb:
  - Theory and Python sample code (alternative) for Topics Over Time.
- notebook/Correlated_Topic_Model.ipynb:
  - Theory and Python sample code for Correlated Topic Model.
- notebook/Structural_Topic_Model.ipynb:
  - Theory and R sample code for Structural Topic Model.
- notebook/Biterm_Topic_Model.ipynb:
  - Theory and Python sample code for Biterm Topic Model.
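If you want to reuse code from the notebooks listed above outside Jupyter, one possible approach is to read the `.ipynb` files directly, since they are JSON documents with a top-level `cells` list. The following is a minimal sketch using only the standard library. The helper names `code_cells` and `summarize` are hypothetical and not part of this repository, and the default directory `notebook` is assumed from the paths above.

```python
import json
from pathlib import Path


def code_cells(notebook_path):
    """Return the source of every code cell in a .ipynb file as a list of strings."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as one string or as a list of lines.
        sources.append(source if isinstance(source, str) else "".join(source))
    return sources


def summarize(notebook_dir="notebook"):
    """Map each notebook file name in `notebook_dir` to its number of code cells."""
    return {
        path.name: len(code_cells(path))
        for path in sorted(Path(notebook_dir).glob("*.ipynb"))
    }
```

For example, calling `summarize()` from the repository root returns a dictionary keyed by notebook file name, and `"\n\n".join(code_cells(path))` gives the code of one notebook as a single script-like string that you can adapt to your own data.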
If you run into a problem while using this repository, please open an issue on GitHub.