charent/README.md

Profile views · Followers · Blog homepage · Gitee homepage

Hi there 👋

Thanks for visiting my GitHub page. Here are some facts about me:

  • 🔭 I’m currently working on machine learning / deep learning, data analysis / risk control / data mining, and algorithms.
  • 🌱 I'm also working in these areas of NLP: text classification, information extraction, and text generation.
  • 🔬 I'm now interested in how to obtain high-quality text for training language models (for example, text-to-text models such as T5 and causal language models such as GPT2 / Phi), and in how to speed up LLM (Large Language Model) training, fine-tuning, and inference. The application of LLMs in vertical domains, for example with RAG (Retrieval Augmented Generation), is also a very interesting direction.
  • 📫 ······

My skills 🛠️

  • Languages:
    • Python, SQL, Shell, C++, a little Golang and a little Java.
  • Frameworks:
    • PyTorch, Hugging Face's NLP framework, pandas & NumPy, PySpark, Hive.
  • Development tools:
    • Linux, Git, Docker, VSCode, Markdown.

Contributions 🧑‍💻

GitHub Streak

Charent's GitHub stats · Top Langs

Links 🔗

Pinned

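  1. ChatLM-mini-Chinese (Public)

    A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B). All code for the full pipeline is open-sourced: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a triple information extraction fine-tuning example provided.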

    Python 883 104

  2. Phi2-mini-Chinese (Public)

    Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

    Jupyter Notebook 401 43

  3. pytorch_IE_model (Public)

    A text triple information extraction model based on binary tagging. It extracts multiple triples from a text; output format: [ (subject entity, relation, object entity), ......].

    Python 11 2

  4. QtPlayerServerAndClient (Public)

    A network video player based on Qt C++, including both the server and the client.

    C++ 3

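  5. Android_Sensor_Data_Collection (Public)

    Collects data from the gravity sensor, linear acceleration sensor, accelerometer, and gyroscope while the user types on an Android phone. The data is used to train a posture recognition model.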

    Java 2 1

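  6. Phone_Attitude_Recognition (Public)

    Phone input posture recognition. Data from the phone's gravity sensor, linear acceleration sensor, accelerometer, and gyroscope is collected under different usage postures, then used to train and deploy a model.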

    Python 3 1