Sentiment Analysis of Bangla news comments. This work is implemented on a publicly available Bengali news comments dataset.

eftekhar-hossain/Bangla-News-Comments

Sentiment Analysis of Bangla News Comments Using Machine Learning: Project Overview

  • Developed a machine learning model that classifies the sentiment (positive, negative, or neutral) of a news comment written in Bangla.
  • Used a publicly available dataset of 12K news comments for the implementation.
  • Extracted TF-IDF features over n-grams to build the system.
  • Compared the performance of different machine learning algorithms on n-gram features using evaluation metrics such as accuracy, precision, recall, and F1-score.
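
As a rough illustration of the TF-IDF n-gram step, the sketch below builds one vectorizer per n-gram size with scikit-learn's `TfidfVectorizer`. The helper name `build_vectorizer`, the toy comments, the whitespace token pattern, and the choice of `ngram_range=(n, n)` are all assumptions made for this example; the project's actual settings may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy comments for illustration only; they are not taken from the dataset.
comments = [
    "খুব ভালো খবর",   # "very good news"
    "খুব খারাপ খবর",  # "very bad news"
]


def build_vectorizer(n):
    """Return a TF-IDF vectorizer that uses only word n-grams of length n."""
    # token_pattern=r"\S+" splits on whitespace. The default pattern can
    # split Bangla words at vowel signs, which are combining marks.
    return TfidfVectorizer(ngram_range=(n, n), token_pattern=r"\S+")


for n, name in [(1, "unigram"), (2, "bigram"), (3, "trigram")]:
    vectorizer = build_vectorizer(n)
    features = vectorizer.fit_transform(comments)
    # Print the feature matrix shape and the learned n-gram vocabulary.
    print(name, features.shape, sorted(vectorizer.vocabulary_))
```

Passing `ngram_range=(1, n)` instead would combine all n-gram sizes up to n in a single feature space, which is another common setup.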

Dataset Distribution

The dataset consists of 12K news comments in five sentiment categories. For ease of implementation, these five categories were converted into three.
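
The five-to-three conversion can be sketched as a simple label lookup. The five label names below are hypothetical, since the original category names are not listed here; only the three target classes (positive, negative, neutral) come from the project overview.

```python
# Hypothetical five-way labels; the original names are not given in this README.
FIVE_TO_THREE = {
    "very positive": "positive",
    "positive": "positive",
    "neutral": "neutral",
    "negative": "negative",
    "very negative": "negative",
}


def to_three_classes(labels):
    """Map each fine-grained label to positive, negative, or neutral."""
    return [FIVE_TO_THREE[label] for label in labels]


original = ["very positive", "neutral", "negative", "very negative"]
print(to_three_classes(original))
```

A dictionary lookup raises `KeyError` on an unexpected label, which makes mislabeled rows easy to spot before training.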

data_dist

Dataset summary, including the total number of words and unique words in each class:

data_dist

Model Evaluation

Different machine learning classifiers are trained and evaluated to assess the system's efficacy. The experiments are run on n-gram features, and performance is measured using several evaluation metrics.
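
A minimal sketch of such a comparison loop is shown below. The `evaluate` helper, the toy data (Latin placeholder tokens standing in for Bangla comments), the inclusion of Logistic Regression, and the weighted averaging are assumptions for illustration; the classifiers and averaging scheme used in the project may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def evaluate(classifiers, train_texts, train_labels, test_texts, test_labels, n=1):
    """Train each classifier on TF-IDF n-gram features and collect metrics."""
    results = {}
    for name, clf in classifiers.items():
        model = make_pipeline(
            TfidfVectorizer(ngram_range=(n, n), token_pattern=r"\S+"), clf
        )
        model.fit(train_texts, train_labels)
        predicted = model.predict(test_texts)
        # Weighted averaging is an assumption; macro averaging is also common.
        precision, recall, f1, _ = precision_recall_fscore_support(
            test_labels, predicted, average="weighted", zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(test_labels, predicted),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


# Placeholder tokens stand in for Bangla comments; this is not dataset content.
train_texts = ["good great", "great good news", "bad awful",
               "awful bad news", "plain report", "report plain news"]
train_labels = ["positive", "positive", "negative",
                "negative", "neutral", "neutral"]
test_texts = ["good", "awful", "plain"]
test_labels = ["positive", "negative", "neutral"]

classifiers = {
    "Multinomial Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

for name, scores in evaluate(classifiers, train_texts, train_labels,
                             test_texts, test_labels, n=1).items():
    print(name, {metric: round(float(value), 3) for metric, value in scores.items()})
```

Calling `evaluate` with `n=2` or `n=3` repeats the same comparison on bigram or trigram features.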

Performance on Unigram feature:

unigram

Performance on Bigram feature:

bigram

Performance on Trigram feature:

trigram

From the above analysis, Multinomial Naive Bayes performs well on trigram features across all evaluation metrics.
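
For reference, a TF-IDF plus Multinomial Naive Bayes model of this kind could be assembled as a scikit-learn `Pipeline`. This is only a sketch: the function name `build_trigram_nb`, the toy comments, the default smoothing, and `ngram_range=(1, 3)` are assumptions, not the project's confirmed configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def build_trigram_nb():
    """TF-IDF features up to trigrams feeding a Multinomial Naive Bayes model."""
    return Pipeline([
        # ngram_range=(1, 3) is an assumption; (3, 3) would use trigrams only.
        ("tfidf", TfidfVectorizer(ngram_range=(1, 3), token_pattern=r"\S+")),
        ("nb", MultinomialNB()),
    ])


# Toy comments for illustration only; they are not taken from the dataset.
texts = ["খুব ভালো খবর", "খুব খারাপ খবর"]  # "very good news", "very bad news"
labels = ["positive", "negative"]

model = build_trigram_nb().fit(texts, labels)
print(model.predict(["ভালো খবর"]))  # classify an unseen comment: "good news"
```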

Accuracy and F1-score Plot:

plot

References:

  1. Dataset Link

Resources Used

  • Python Version: 3.7
  • Packages: Scikit Learn, Numpy, Pandas, Matplotlib, Seaborn
