Cluster the documents by the topics they discuss.
To run the model, follow these steps:
git clone https://github.com/ermalaliraj/python-lda-topic-modeling-ec-laws.git
cd python-lda-topic-modeling-ec-laws/ktrain
pip install -r requirements.txt
python train_model.py
python predict.py
Generate the visualization file showing the distribution of documents:
python visualize.py
- PROS:
  - Easy to implement
- CONS:
  - Each document is assigned a single topic rather than being expressed as a distribution over topics.
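
The limitation listed under CONS can be illustrated with a minimal sketch. It does not use the repository's code or the ktrain API; the topic names, probability values, and the helper functions `hard_assign` and `soft_assign` are hypothetical and exist only to show what is lost when a document is reduced to a single topic.

```python
# Hypothetical topic probabilities for one document (illustrative values only,
# not produced by the scripts in this repository).
doc_topic_probs = {"agriculture": 0.48, "trade": 0.45, "energy": 0.07}


def hard_assign(topic_probs):
    """Return the single most probable topic, discarding all others."""
    return max(topic_probs, key=topic_probs.get)


def soft_assign(topic_probs, threshold=0.2):
    """Return every topic whose probability is at least `threshold`."""
    return {topic: p for topic, p in topic_probs.items() if p >= threshold}


single_topic = hard_assign(doc_topic_probs)
topic_mix = soft_assign(doc_topic_probs)

print(single_topic)
print(topic_mix)
```

With these made-up values, the single-topic view reports only one label, even though a second topic is almost equally likely. A distribution-based view would keep both.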