dpo
Here are 47 public repositories matching this topic...
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
Updated Jun 14, 2024 - Python
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step
Updated Jun 12, 2024 - Python
CodeUltraFeedback: aligning large language models to coding preferences
Updated May 30, 2024 - Python
This is the DPO Pay plugin for WooCommerce.
Updated May 28, 2024 - PHP
An open-source framework designed to adapt pre-trained Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.
Updated May 27, 2024 - Python
Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon
Updated May 17, 2024 - Python
This is the DPO Group plugin for Gravity Forms.
Updated Apr 29, 2024 - PHP
Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"
Updated Apr 23, 2024 - Python
Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.
Updated Apr 9, 2024 - Jupyter Notebook