JZ1015/DocumentClusteringAndTopicModeling_Unsupervised-learning

DocumentClusteringAndTopicModeling_Unsupervised-learning

Clustered unlabeled text from IMDB and Wiki using unsupervised K-means and Latent Dirichlet Allocation (LDA).
Preprocessed the text by tokenizing, stemming, and removing stop words, then extracted features with TF-IDF.
Identified the latent topics and keywords of each cluster, and visualized the training results after dimensionality reduction with Principal Component Analysis (PCA).
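
The preprocessing, TF-IDF, and K-means steps described above can be sketched in plain Python. This is only a minimal illustration, not the notebook's actual code: the stop-word list, the `crude_stem` suffix stripper, the sample documents, and every function name are assumptions made for this example. A real run would more likely use a proper stemmer and a library implementation, and the LDA and PCA steps are left out here.

```python
import math
import random
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "with", "on"}


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors), one vector per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF: log((1 + N) / (1 + df)) + 1
        vec = [
            counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t in vocab
        ]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def kmeans(vectors, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means with squared Euclidean distance."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each document to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Keywords of a cluster: the terms with the largest centroid weights."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [term for _, term in ranked[:n]]


# Hypothetical sample documents, not taken from the attached data.
docs = [
    "The detective investigates a murder in the city",
    "A detective and a murder mystery with a twist",
    "Space travelers exploring distant planets",
    "Planets and stars fill the space opera",
]
vocab, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2)
for c, centroid in enumerate(centroids):
    print(c, top_terms(vocab, centroid))
```

Reading the keywords off the largest centroid weights is one simple way to label clusters. Because the initial centroids are sampled at random, the cluster assignments depend on the seed.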

The source data is attached. Please open the notebook with Jupyter Notebook.
