Sentimental-Analysis

Sentiment Analysis of Movie Reviews Using PyTorch

Preparing Data

One of the main concepts of TorchText is the Field. A Field defines how a piece of data should be processed, and its parameters specify the details. In our sentiment classification task, each example consists of the raw string of the review and its sentiment, either "pos" or "neg".
We use the TEXT field to define how the review should be processed, and the LABEL field to process the sentiment.
The TEXT field takes tokenize='spacy' as an argument. This specifies that tokenization (the act of splitting the string into discrete "tokens") should be done using the spaCy tokenizer. If no tokenize argument is passed, the default is to split the string on whitespace.
The LABEL field is defined by a LabelField.
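
To see why the choice of tokenizer matters, here is a minimal pure-Python sketch comparing a plain whitespace split with a punctuation-aware tokenizer. The function names are hypothetical, and the regex tokenizer is only a rough stand-in for spaCy, not its actual behavior.

```python
import re


def whitespace_tokenize(text):
    # Split on whitespace only, like the default described above.
    return text.split()


def simple_word_tokenize(text):
    # Hypothetical stand-in for a spaCy-style tokenizer:
    # words and punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)


review = "This film is great, truly great!"

# Whitespace splitting leaves punctuation glued to words,
# so "great," and "great!" count as two different tokens.
print(whitespace_tokenize(review))

# Separating punctuation maps both occurrences to the same token "great".
print(simple_word_tokenize(review))
```

With the whitespace split, the two occurrences of "great" end up as distinct vocabulary entries, which inflates the vocabulary; this is one reason a proper tokenizer is usually preferred.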

The number of unique words in our training set is over 100,000, which means that our one-hot vectors would have over 100,000 dimensions.
To reduce the dimensionality, we keep only the 25,000 most common words.
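
The idea of keeping only the most common words can be sketched in plain Python as follows. This is an illustration rather than the code in main.py: the helper names `build_vocab` and `numericalize` are hypothetical, and mapping dropped words to a shared `<unk>` token (alongside a `<pad>` token) is an assumption about how the remaining words are handled.

```python
from collections import Counter


def build_vocab(tokenized_reviews, max_size):
    # Count how often each token appears across all reviews.
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    # Keep only the max_size most frequent tokens, after two special tokens
    # (assumption: unknown-word and padding tokens come first).
    itos = ["<unk>", "<pad>"] + [tok for tok, _ in counts.most_common(max_size)]
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return itos, stoi


def numericalize(tokens, stoi):
    # Words outside the vocabulary fall back to the <unk> index.
    return [stoi.get(tok, stoi["<unk>"]) for tok in tokens]


# Toy data; the real training set would use max_size=25000.
reviews = [["great", "film"], ["great", "acting"], ["bad", "film"]]
itos, stoi = build_vocab(reviews, max_size=2)

print(itos)
print(numericalize(["bad", "film"], stoi))
```

Under these assumptions, the one-hot dimension becomes the vocabulary cap plus the two special tokens, rather than the full count of unique words.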
