Jupyter Notebook for NLP Tasks
Updated Jun 21, 2024 - Jupyter Notebook
This repository provides a complete workflow for text processing using Hugging Face Transformers and NLTK. It includes modules for sentence normalization, spelling correction, word embedding generation, positional encoding computation, and English-to-French translation.
The translation engine behind the Irish standardizer (Caighdeánaitheoir na Gaeilge), and the Scottish Gaelic/Manx→Irish translators
Simple tool to check if Unicode text files are Unicode-normalized
📢 Tha (ថា) - A Khmer Text Normalization and Verbalization Toolkit
Predict emotions (happiness, anger, sadness) from WhatsApp chat data using machine learning and deep learning models. Includes text normalization, vectorization (TF-IDF, BoW, Word2Vec, GloVe), and model evaluation.
Accurate categorization of eCommerce products improves user experience and boosts search engine visibility. The project goal is to classify products into 14 predefined categories using their descriptions sourced from an eCommerce platform.
Twitter Sentiment Analysis using Natural Language Processing (NLP)
This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation". It is intended for normalizing and cleaning Bengali and English text.
Japanese text normalizer for mecab-neologd
Extract text content from an HTML page, process it, and extract unique words from the processed text. This notebook utilizes various text processing techniques including cleaning, normalization, tokenization, lemmatization or stemming, and stop words removal.
Proper categorization of e-commerce products enhances the user experience and achieves better results with external search engines. The objective of the project is to classify a product into four given categories, based on its description available on an e-commerce platform.
Useful String extensions to save you time in production.
Welcome to my text scrapbook! Here you will find examples of text tokenization, normalization, n-grams, and lots of text-adjacent stuff.
Repository for text normalization research.
JS / Python 3 / PHP library to work with UTF-8 polytonic Greek and Latin
🧹 Python package for text cleaning
Chinese text normalization for speech processing
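Several of the entries above deal with Unicode normalization. As a minimal sketch of what such a check might look like (not taken from any of the listed repositories), the following uses Python's standard `unicodedata` module, assuming Python 3.8 or later. The helper name `is_nfc` and the sample strings are hypothetical and chosen only for illustration.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C.

    Hypothetical helper; unicodedata.is_normalized requires Python 3.8+.
    """
    return unicodedata.is_normalized("NFC", text)


# Illustrative samples: the same visible word in two encodings.
composed = "caf\u00e9"     # 'é' stored as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

print(is_nfc(composed))    # True
print(is_nfc(decomposed))  # False

# Normalizing the decomposed form yields the composed one.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The two strings render identically but compare unequal until they are normalized, which is why tools in this topic often normalize text before tokenization or comparison.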