IMDB Movie Reviews Sentiment Analysis

This repository contains a Python script for performing sentiment analysis on IMDB movie reviews using a Recurrent Neural Network (RNN) with an LSTM layer. The script preprocesses the text data, builds and trains the model, and evaluates its performance.

Dataset

The dataset is imdb_labelled.txt, which contains movie reviews paired with sentiment labels (0 for negative, 1 for positive).

Requirements

Python 3.x
pandas
numpy
nltk
tensorflow
scikit-learn

Installation

Clone the repository:

git clone https://github.com/Taufiq-ML/RNN-Sentiment-analysis-.git
cd RNN-Sentiment-analysis-

Install the required packages:

pip install pandas numpy nltk tensorflow scikit-learn

Download NLTK stopwords:
```
import nltk
nltk.download('stopwords')
```

Usage

Ensure you have the imdb_labelled.txt file in the same directory as the script.
Run the script:
```
python untitled0.py
```

Script Overview

Data Loading and Cleaning:
- Load the dataset using pandas.
- Clean the text data by removing HTML tags, non-alphabetical characters, converting to lowercase, and removing stopwords.
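
The cleaning step described above could look roughly like the sketch below. It is not the script's actual code: the function name clean_text is hypothetical, and the small STOPWORDS set is a stand-in for NLTK's English stopword list so that the example runs without any downloads.

```python
import re

# Assumption: a tiny stand-in for nltk.corpus.stopwords.words("english"),
# used here so the sketch runs without downloading NLTK data.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in"}


def clean_text(text: str) -> str:
    """Strip HTML tags and non-letters, lowercase, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)    # remove HTML tags such as <br />
    text = re.sub(r"[^a-zA-Z]", " ", text)  # keep alphabetical characters only
    tokens = text.lower().split()           # lowercase and split on whitespace
    return " ".join(t for t in tokens if t not in STOPWORDS)


cleaned = clean_text("This movie was <br /> GREAT, a 10/10!")
print(cleaned)  # prints "movie great"
```

With the real NLTK list, STOPWORDS would be built from stopwords.words("english") after running the download step from the Installation section.
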
Tokenization and Padding:
- Tokenize the text data and convert it into sequences.
- Pad the sequences to ensure uniform length.
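
To make the tokenization and padding step concrete, here is a dependency-free sketch of the idea. The script itself presumably relies on the Keras text utilities; the helper names below (fit_word_index, texts_to_sequences, pad_to_length) are hypothetical, and the choice to pad and truncate on the left is an assumption that follows the Keras defaults.

```python
from collections import Counter


def fit_word_index(texts, max_words):
    """Map the most frequent words to ids starting at 1 (0 is reserved for padding)."""
    counts = Counter(word for text in texts for word in text.split())
    most_common = counts.most_common(max_words - 1)
    return {word: i for i, (word, _) in enumerate(most_common, start=1)}


def texts_to_sequences(texts, word_index):
    """Replace each known word with its id; unknown words are dropped."""
    return [[word_index[w] for w in text.split() if w in word_index] for text in texts]


def pad_to_length(sequences, maxlen):
    """Left-pad with zeros and keep only the last maxlen ids of longer sequences."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([0] * (maxlen - len(seq)) + seq)
    return padded


texts = ["movie great", "movie bad plot"]
word_index = fit_word_index(texts, max_words=5000)
sequences = texts_to_sequences(texts, word_index)
padded = pad_to_length(sequences, maxlen=5)  # the example output implies a length of 100
print(padded)  # prints [[0, 0, 0, 1, 2], [0, 0, 1, 3, 4]]
```
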
Model Building:
- Build a Sequential model using Keras with an Embedding layer, LSTM layer, Dropout layer, and a Dense output layer.
Training and Evaluation:
- Compile the model with Adam optimizer and binary cross-entropy loss.
- Train the model and evaluate its performance on a test set.
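
A model with the layers listed above might be assembled as in the following sketch. The vocabulary size (5000), sequence length (100), embedding size (128) and LSTM width (32) are inferred from the model summary in the Example Output section; the dropout rate of 0.5 is purely an assumption, since the README does not state it.

```python
import tensorflow as tf

VOCAB_SIZE = 5000    # inferred: 640,000 embedding parameters / 128 dimensions
MAX_LEN = 100        # inferred from the (None, 100, 128) embedding output shape
EMBED_DIM = 128
LSTM_UNITS = 32
DROPOUT_RATE = 0.5   # assumption: the rate is not given in the README

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),                # padded sequences of word ids
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.LSTM(LSTM_UNITS),
    tf.keras.layers.Dropout(DROPOUT_RATE),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of a positive review
])

# Adam optimizer and binary cross-entropy loss, as described above.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

Training and evaluation would then go through model.fit and model.evaluate on the padded sequences and their labels.
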
Making Predictions:
- Use the trained model to predict the sentiment of sample reviews.
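
Since the output layer produces a single value per review, turning a prediction into a label comes down to a threshold. The helper below is a hypothetical illustration; the 0.5 cutoff is an assumption rather than something the README specifies.

```python
def label_from_probability(probability: float, threshold: float = 0.5) -> str:
    """Map a sigmoid output in [0, 1] to a sentiment label."""
    return "Positive" if probability >= threshold else "Negative"


# Example values standing in for model predictions on two sample reviews.
for p in (0.91, 0.12):
    print(f"Sentiment: {label_from_probability(p)}")
```
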

Example Output

Num GPUs Available:  1
Loading dataset...
Cleaning text data...
Tokenizing and padding sequences...
Splitting data into training and test sets...
Building the model...
Compiling the model...
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 100, 128)          640000
_________________________________________________________________
lstm (LSTM)                  (None, 32)                20608
_________________________________________________________________
dropout (Dropout)            (None, 32)                0
_________________________________________________________________
dense (Dense)                (None, 1)                 33
=================================================================
Total params: 660,641
Trainable params: 660,641
Non-trainable params: 0
_________________________________________________________________
Training the model...
...
Epoch 97/100
19/19 - 0s - 22ms/step - accuracy: 1.0000 - loss: 5.6306e-04 - val_accuracy: 0.7667 - val_loss: 1.4543
Epoch 98/100
19/19 - 0s - 24ms/step - accuracy: 1.0000 - loss: 9.6709e-04 - val_accuracy: 0.7667 - val_loss: 1.4433
Epoch 99/100
19/19 - 0s - 24ms/step - accuracy: 1.0000 - loss: 6.5609e-04 - val_accuracy: 0.7667 - val_loss: 1.4387
Epoch 100/100
19/19 - 0s - 23ms/step - accuracy: 1.0000 - loss: 4.8037e-04 - val_accuracy: 0.7667 - val_loss: 1.4504
Evaluating the model...
5/5 - 0s - 6ms/step - accuracy: 0.7667 - loss: 1.4504
Test Accuracy: 0.7666666507720947
Sentiment: Positive
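
The parameter counts in the model summary above can be checked by hand. The short calculation below assumes a vocabulary of 5000 words, which is what 640,000 embedding parameters divided by 128 dimensions implies.

```python
vocab_size, embed_dim, lstm_units = 5000, 128, 32

# Embedding: one embed_dim-sized vector per vocabulary entry.
embedding_params = vocab_size * embed_dim

# LSTM: 4 gates, each with input weights, recurrent weights, and a bias vector.
lstm_params = 4 * (lstm_units * (embed_dim + lstm_units) + lstm_units)

# Dense output: one weight per LSTM unit plus a single bias.
dense_params = lstm_units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# prints 640000 20608 33 660641
```
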

Contributing

If you have any suggestions or improvements, feel free to create a pull request or open an issue.

License

This project is licensed under the MIT License. See the LICENSE file for details.
