
Latent Dirichlet Allocation (LDA) Topic Modelling Platform for Cultural Strategists


lvxhnat/osprey


LDA Topic Modelling Dashboard

This repo builds on the work of Cornell jsLDA and pyLDAvis, adding functionality that makes the topics within a document corpus easier to understand.

Getting Started

The dashboard is not yet deployed or dockerized. For now, the instructions below are for running the dashboard on your local machine.

1 In your shell, change to the osprey_admin directory, which contains the Django code

cd osprey_admin 

2 Install required packages

pip3 install -r requirements.txt

3 Run the Django development server locally on port 8000

python manage.py runserver 8000

4 Open http://127.0.0.1:8000/ in your browser to view the dashboard

5 Running the Dashboard Model

a. Select the number of topics to train the LDA model with.
b. In Column to Size, enter the exact name of the CSV column that the LDA algorithm should size.
c. In Defining Column ID, enter the name of the column that uniquely identifies each value in Column to Size.
d. Upload the CSV and run the model.
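
As a rough illustration of the input described in steps b and c, the sketch below builds a small CSV with one identifier column and one text column. The column names doc_id and text, the sample rows, and the helper names are hypothetical, and it assumes that Column to Size refers to the column holding the text to be modelled; none of this is taken from the osprey code.

```python
import csv
import io

# Hypothetical sample data: "doc_id" would go in Defining Column ID,
# "text" would go in Column to Size (assumed to be the text column).
SAMPLE_ROWS = [
    {"doc_id": "1", "text": "street food markets and night culture"},
    {"doc_id": "2", "text": "sustainable fashion and thrift shopping"},
    {"doc_id": "3", "text": "home cooking trends during lockdown"},
]


def ids_are_unique(rows, id_column="doc_id"):
    """Return True when every row has a distinct value in the ID column."""
    ids = [row[id_column] for row in rows]
    return len(ids) == len(set(ids))


def build_sample_csv(rows, id_column="doc_id", text_column="text"):
    """Serialise the rows to CSV text with a header line."""
    if not ids_are_unique(rows, id_column):
        raise ValueError(f"duplicate values found in {id_column!r}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[id_column, text_column])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_csv = build_sample_csv(SAMPLE_ROWS)
```

Writing sample_csv to a file such as sample.csv would give something to upload in step d, with text entered as Column to Size and doc_id as Defining Column ID under the assumptions above.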


Click on the various bubbles to navigate the different topics.

[Animated .gif of the dashboard]

Notes

Charts are coded in D3.js and Chart.js. Backend calculations are done in Django. Latent Dirichlet allocation is described in Blei et al. (2003) and Pritchard et al. (2000). Intertopic distances are calculated as described in LDAvis: the Jensen-Shannon divergence between the topics' word distributions (Endres, D. M.; Schindelin, J. E., 2003) is computed, and the resulting multi-dimensional data is scaled onto 2 dimensions using Principal Component Analysis (PCA) (Pearson, K., 1901).
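
The intertopic distance calculation described above can be sketched as follows. This is a minimal illustration using NumPy, not the code used in this repo: the function names are hypothetical, the input is assumed to be a topics-by-words matrix of probabilities, and PCA is applied here to the rows of the pairwise distance matrix, which is only one reading of the description.

```python
import numpy as np


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0.
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def intertopic_coordinates(topic_word):
    """Project topics onto 2 dimensions from their pairwise JS divergences."""
    topic_word = np.asarray(topic_word, dtype=float)
    k = topic_word.shape[0]

    # Symmetric matrix of pairwise divergences between topics.
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = jensen_shannon_divergence(topic_word[i], topic_word[j])
            dist[i, j] = dist[j, i] = d

    # PCA via SVD: centre the columns, then keep the first two components.
    centred = dist - dist.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T


# Hypothetical topic-word probabilities: 3 topics over a 4-word vocabulary.
example_topics = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])
coords = intertopic_coordinates(example_topics)
```

Each row of coords is a 2-dimensional position for one topic, which is the kind of layout a bubble chart of topics could be drawn from. The divergence is bounded above by ln 2, reached when two topics share no words.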

** The dashboard design shown in the .gif may lag behind the current state of the project.

Upcoming Patches and Features

Upcoming features, design plans and progress can be found on my Notion page here

Other Implementations

  • pyLDAvis here
  • Cornell jsLDA website here
