Text classification using NLP with Sequential model developed on LSTM architecture utilizing Embedding and Tokenizer

nickoaryad/Cloudeka2024-NewsClassification

Cloudeka2024-NewsClassification

Welcome to the README of Machine Learning: Developing an NLP Model Using TensorFlow.

The Dataset 📈

This repository uses the dataset available here.

The Devs ✒️

This repository was developed as the final assignment of the Belajar Pengembangan Machine Learning module, part of the Machine Learning learning path at Dicoding, awarded through Lintasarta Cloudeka Digischool 2024.

The Problems 📝

This repository focuses on the following requirements:

  • A dataset of at least 1,000 samples
  • Data split: 80% train and 20% validation
  • Building a sequential model
  • Using an LSTM layer within the model architecture
  • Using the Embedding layer and the Tokenizer class
  • Training time of no more than 30 minutes
  • At least 75% accuracy on both the train set and the validation set
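
The requirements above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual notebook: the toy texts, the two categories, the `VOCAB_SIZE` and `MAX_LEN` values, the layer sizes, and the `StopAtTarget` callback name are all assumptions made for the example. It also assumes a TensorFlow version that still ships the legacy `tf.keras.preprocessing.text.Tokenizer`.

```python
import numpy as np
import tensorflow as tf

# Toy corpus standing in for the news dataset (hypothetical data).
texts = [
    "stocks rally as markets open higher",
    "team wins the championship final",
    "central bank raises interest rates",
    "striker scores twice in derby match",
    "company reports record quarterly profit",
]
# One-hot labels for two assumed categories: [business, sport].
labels = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0]], dtype="float32")

VOCAB_SIZE = 1000  # assumed vocabulary cap
MAX_LEN = 10       # assumed padded sequence length

# Tokenizer maps words to integer ids; unseen words map to the OOV token.
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=VOCAB_SIZE, oov_token="<OOV>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
padded = tf.keras.preprocessing.sequence.pad_sequences(sequences, maxlen=MAX_LEN, padding="post")

# 80% train / 20% validation (4 and 1 samples in this toy case).
x_train, x_val = padded[:4], padded[4:]
y_train, y_val = labels[:4], labels[4:]

# Sequential model: Embedding -> LSTM -> softmax classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=16),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])


class StopAtTarget(tf.keras.callbacks.Callback):
    """Stop training once train and validation accuracy both reach the target."""

    def __init__(self, target=0.75):
        super().__init__()
        self.target = target

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get("accuracy", 0) >= self.target and logs.get("val_accuracy", 0) >= self.target:
            self.model.stop_training = True


history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=2,
    verbose=0,
    callbacks=[StopAtTarget(target=0.75)],
)
```

Stopping as soon as both accuracies reach 75% is one way to keep training within the 30-minute limit; a real run would use the full dataset and many more epochs.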

The Libraries 📚

  • numpy library to carry out numerical computation on arrays, multidimensional matrices, and vectors
  • pandas library to process, analyse, and manipulate data using DataFrames
  • matplotlib library to visualize results through plotting
  • os library to load data
  • zipfile library to extract archive files
  • scikit-learn library to split the dataset
  • tensorflow library to build and train the model
  • keras library to provide the Tokenizer, Embedding, and LSTM components
  • nltk library to carry out text preprocessing
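
As an illustration of how pandas and scikit-learn from the list above might work together for the 80/20 split, here is a small sketch. The DataFrame contents and the column names `text` and `category` are hypothetical stand-ins, not the real dataset's schema.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the news dataset: 10 rows, 2 categories.
df = pd.DataFrame({
    "text": [f"sample headline {i}" for i in range(10)],
    "category": ["business", "sport"] * 5,
})

# One-hot encode the category column so it suits a softmax output layer.
labels = pd.get_dummies(df["category"]).to_numpy(dtype="float32")
texts = df["text"].to_numpy()

# test_size=0.2 keeps 80% for training and 20% for validation.
x_train, x_val, y_train, y_val = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)

print(len(x_train), len(x_val))  # prints "8 2"
```

Fixing `random_state` makes the split reproducible between runs, which helps when comparing training results.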

Copyright © Nicko Arya Dharma 2024 All Rights Reserved
