huggingface / trl Public

Fork 997
Star 8.3k

Issues: huggingface/trl

60 Open 844 Closed

Issues list

how to save v_head

#1650 opened May 20, 2024 by zyzhang1130

Adapter name for SFT trainer

#1649 opened May 18, 2024 by para-zhou

Set seed

#1648 opened May 17, 2024 by user799595

ValueError when training on a multi GPU setup and DPO

#1645 opened May 16, 2024 by miosturu

How to save and resume a checkpoint from PPOTrainer

#1643 opened May 14, 2024 by paraGONG

ImportError: cannot import name 'DPOConfig' from 'trl'

#1642 opened May 14, 2024 by AswiniNLP

Do we need to consider the chat template when doing DPO/KTO training?

#1640 opened May 11, 2024 by ZeroYuHuang

ImportError: cannot import name 'SFTConfig' from 'trl'

#1639 opened May 11, 2024 by brand17

When I used galore on orpo, the learning rate was set to 8e-6, but the training rate was 0.01

#1638 opened May 10, 2024 by Minami-su

How to use trl\trainer\kto_trainer.py

#1635 opened May 9, 2024 by mazhengyufreedom

Seq2SeqTrainer with DataCollatorForCompletionOnlyLM: incorrect masking for evaluation

#1634 opened May 8, 2024 by adamamer20

Seq2seq model with ppo_trainer samples strange output!

#1633 opened May 8, 2024 by sajastu

DDPO cannot use SDXL

#1630 opened May 8, 2024 by mao-code

PPOTrainer ignores data_collator keyword argument and uses provided collator inconsistently

#1629 opened May 8, 2024 by codezakh

Custom DPO Trainer CUDA OOM

#1626 opened May 7, 2024 by TheGhoul21

Learning to generate EOS tokens

#1623 opened May 6, 2024 by vwxyzjn

ConstantLengthDataset Ignore Some Texts

#1621 opened May 4, 2024 by TianyiPeng

kto error when assign dataset to device

#1620 opened May 4, 2024 by mostafamdy

Long data length causes CUDA out of memory when DPO training

#1619 opened May 4, 2024 by virt9

Have trouble in ppo example

#1618 opened May 3, 2024 by Shiguang-Guo

Error when Using 8-bit Quantization

#1616 opened May 3, 2024 by JhonDan1999

[enhancement] Implement IRPO training custom loss

#1611 opened May 1, 2024 by TheGhoul21

Training stops early

#1601 opened Apr 30, 2024 by Techinix

CLI utils class cases seem to be incorrect

#1600 opened Apr 29, 2024 by busycalibrating

[Question] How should I combine DPOTrainer and Accelerate for training?

#1597 opened Apr 29, 2024 by YangsongLan
