e-olang/README.md

About Me

I'm a dedicated Machine Learning Researcher and AI enthusiast with a keen interest in Natural Language Processing (NLP) and Reinforcement Learning. As an experienced ML Engineer, I have a proven track record of developing cutting-edge solutions that leverage the power of AI to solve real-world challenges.
Download Résumé
LinkedIn Profile

💻Tech Stack

Python Java Anaconda MySQL Pandas PyTorch TensorFlow NumPy

🌱Current

  • 👯 I’m looking to collaborate on interesting Artificial Intelligence and/or Machine Learning projects
  • 📫 How to reach me: via email: [email protected]

Research & Projects

An attempt to develop a similarity search model that generates numeric text representations (embeddings) for more than one language, with the eventual goal of building a search engine. I am still reviewing related work to establish what is feasible, so feel free to reach out and help. A demo of semantic search can be seen here; although it is English-only, the tool tries to return research papers and other works similar to the user's query.
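The ranking step behind this kind of similarity search can be sketched in a few lines. This is only an illustration, not the project's model: the `embed` function below is a hypothetical stand-in that uses plain word counts where a real system would use a learned multilingual encoder, and the example documents are made up.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder (assumption): lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=3):
    """Return the top_k (score, document) pairs, most similar first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical mini-corpus, for illustration only.
documents = [
    "masked language modeling for swahili text",
    "reinforcement learning for robot control",
    "n-gram language modelling with smoothing",
]
results = search("swahili language modeling", documents, top_k=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Swapping `embed` for a sentence encoder would leave `search` unchanged apart from the vector type, which is the main reason to keep the two steps separate.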

A Transformer model pre-trained on a large corpus of Swahili data in a self-supervised fashion. This means it was pre-trained on raw text only, with no human labeling of any kind (which is why it can use large amounts of publicly available data); an automatic process generates the inputs and labels from those texts. More precisely, it was pre-trained with a masked language modeling objective.
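As a rough illustration of how a masked language modeling objective derives inputs and labels from raw text automatically, here is a minimal sketch. It is not the model's actual preprocessing: the `[MASK]` token string, the 15% default rate (a common convention), the whitespace tokenization, and the example sentence are all assumptions, and real pipelines often also replace some selected tokens with random tokens or leave them unchanged.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask symbol; real tokenizers define their own


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Build (inputs, labels) for masked language modeling.

    Each token is independently replaced by MASK_TOKEN with probability
    mask_prob. labels holds the original token at masked positions and
    None elsewhere, so a loss would only be computed on masked positions.
    """
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK_TOKEN)
            labels.append(token)
        else:
            inputs.append(token)
            labels.append(None)
    return inputs, labels


# Hypothetical example sentence, split on whitespace for simplicity.
tokens = "habari ya asubuhi rafiki yangu".split()
inputs, labels = mask_tokens(tokens, mask_prob=0.5, rng=random.Random(0))
print(inputs)
print(labels)
```

No human annotation is involved at any point: the labels are just the tokens that were hidden, which is what makes the setup self-supervised.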


Pinned

  1. NLP (Public)

    Jupyter Notebook

  2. Bag of words
    import sklearn
    from sklearn.feature_extraction.text import CountVectorizer

    texts = ["This is a good child", "This was a bad child"]
  3. ngram_language_modelling.ipynb
    {
      "cells": [
        {
          "cell_type": "markdown",
          "metadata": {