clustering_text_classification

Using supervised and unsupervised machine learning methods to process and classify raw text data.

Objective: Pick a set of texts, process them, and apply a series of unsupervised clustering methods to group them. Then analyze which clustering method groups the texts most consistently with respect to the person of interest. Apply supervised and unsupervised permutations of feature selection and generation to build a model that classifies the texts by person of interest. Lastly, evaluate this model against a holdout group of 25%, analyze the consistency of its predictions, and explain any notable divergences.
Data Source: Wikiquote.org
Pull 700 quotes from 10 Wikiquote archives (70 per archive).
Persons of interest: Plato, Socrates, Sigmund Freud, Friedrich Nietzsche, René Descartes, Immanuel Kant, David Hume, Bertrand Russell, John Locke, Noam Chomsky
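
The two evaluation steps in the objective can be sketched in plain Python. This is only an illustration, not the notebook's code. The helper names `stratified_holdout` and `cluster_purity` are hypothetical. Two choices here are assumptions, since the README does not name a split method or a consistency metric: stratifying the 25% holdout by person, and using purity as the consistency measure.

```python
import random
from collections import Counter, defaultdict

# The ten persons of interest listed above, 70 quotes each (700 total).
AUTHORS = [
    "Plato", "Socrates", "Sigmund Freud", "Friedrich Nietzsche",
    "René Descartes", "Immanuel Kant", "David Hume", "Bertrand Russell",
    "John Locke", "Noam Chomsky",
]

def stratified_holdout(labels, holdout_frac=0.25, seed=0):
    """Split indices into (train, holdout), holding out the same
    fraction of each label (rounded down per label)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)
    train, holdout = [], []
    for label in sorted(by_label):
        idx = by_label[label][:]
        rng.shuffle(idx)
        k = int(len(idx) * holdout_frac)  # assumption: round down per label
        holdout.extend(idx[:k])
        train.extend(idx[k:])
    return sorted(train), sorted(holdout)

def cluster_purity(cluster_ids, labels):
    """Fraction of items that carry the majority label of their cluster.
    1.0 means every cluster contains quotes from a single person."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        by_cluster[cluster][label] += 1
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(labels)

# Placeholder labels only: one label per quote, no quote text is needed here.
labels = [author for author in AUTHORS for _ in range(70)]
train_idx, holdout_idx = stratified_holdout(labels)
```

With cluster assignments from any clustering method, `cluster_purity(cluster_ids, labels)` gives one number per method to compare. Purity tends to rise as the number of clusters grows, so it is most meaningful when every method is run with the same cluster count (for example, 10, one per person).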
