Releases: OpenLLMAI/OpenRLHF

Release v0.3.0

11 Jun 00:01

Latest

Changes

Upgraded the PyTorch NGC container to version 24.02 (@openllmai0, Xianyu)
Upgraded DeepSpeed to version 0.14.0 (@openllmai0, Xianyu)
Pinned vLLM to version 0.4.2 (Xianyu)
Cleaned up the codebase (@openllmai0, Xianyu)

Contributors

openllmai0

Release v0.2.9

02 Jun 02:39

Pre-release

Changes

Fixed OOM for --colocate_critic_reward and --colocate_actor_ref @openllmai0 (Xianyu)

Contributors

openllmai0

Release v0.2.8

24 May 13:56

Pre-release

Changes

Fixed DPO loss mask @openllmai0 (Xianyu)
Fixed vLLM generation corner case @openllmai0 (Xianyu)
Upgraded Ray and Transformers @openllmai0 (Xianyu)
Fixed typos in README.md @KT313
Added system prompt in datasets @hijkzzz

Contributors

hijkzzz, KT313, and openllmai0

Release v0.2.7

06 May 07:57

Pre-release

Changes

Added support for vLLM-v0.4.2 @hijkzzz
Added support for Jamba-v0.1 (currently incompatible with vLLM-v0.4.2) @hijkzzz
Added LoRA configs (--lora_dropout, --target_modules) @hijkzzz

Contributors

hijkzzz

Release v0.2.6

30 Apr 02:33

Pre-release

Changes

Upgraded vLLM to v0.4.1 @mgerstgrasser @wuxibin89 @hijkzzz
Upgraded Transformers to v4.40.1 and DeepSpeed to v0.14.0 @hijkzzz
Fixed typo in train_ppo_ray.py @mickelliu
Fixed size mismatch between output_state_dict (148) and state_dict (149) in model saving @hijkzzz
Added support for --colocate_actor_ref and --colocate_critic_reward in train_ppo_ray.py @hijkzzz
Added support for offloading the reward and reference models in Ray PPO @hijkzzz

Contributors

wuxibin89, mgerstgrasser, and 2 other contributors

Release v0.2.5

12 Apr 09:13

Changes

Added Chinese README.md @khazic
Added KD Trainer and Loss @ifromeast
Fixed num_training_steps @wuxibin89
Updated requirements.txt @kfertakis
Fixed an error caused by the 'margin' variable being a list in rm_trainer.py @StwayneXG

Contributors

wuxibin89, kfertakis, and 3 other contributors

Release v0.2.4

13 Mar 14:05

Changes

Fixed DPO masked loss function @hijkzzz
Fixed Yi-34B tokenizer (--disable_fast_tokenizer) #240 @hijkzzz
Added support for wandb.login() (--wandb True) #231 @mgerstgrasser

Contributors

mgerstgrasser and hijkzzz

Release v0.2.3

04 Mar 00:14

Changes

Fixed #191 "deepspeed.zero.Init causes very strange spikes in PPO policy_loss" @hijkzzz
Added dockerfile for vLLM @hijkzzz

Contributors

hijkzzz

Release v0.2.2

01 Mar 01:44

Changes

Fixed LlamaRotaryEmbedding for Transformers v4.38.1 @hijkzzz
Used a lazy vLLM engine @wuxibin89
Added Chinese PR docs @catqaq
Fixed tensor shape docs @Thecats-Jfm

Contributors

wuxibin89, hijkzzz, and 2 other contributors

Release v0.2.1

22 Feb 02:50

Changes

Fixed position_ids for left padding #217 @hijkzzz
Added support for input_key in custom datasets @hijkzzz
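
For context on the left-padding fix listed above: with left padding, position indices are commonly derived from the attention mask, so that the first real token gets position 0 no matter how many pad tokens precede it. The sketch below illustrates that general technique in plain Python. It is not taken from the OpenRLHF code; the function name position_ids_from_mask and the pad_value parameter are hypothetical, chosen only for this illustration.

```python
from itertools import accumulate


def position_ids_from_mask(attention_mask, pad_value=1):
    """Derive per-token position ids from a 0/1 attention mask.

    Hypothetical helper for illustration only (not OpenRLHF's API).
    Real tokens get positions 0, 1, 2, ... counted from the first
    non-pad token; padded slots get an arbitrary placeholder value,
    since they are masked out of attention anyway.
    """
    position_ids = []
    for row in attention_mask:
        # Running count of real tokens seen so far, including this one.
        running = accumulate(row)
        position_ids.append(
            [count - 1 if is_real else pad_value
             for count, is_real in zip(running, row)]
        )
    return position_ids


# Two sequences padded on the left to a common length of 5.
mask = [
    [0, 0, 1, 1, 1],  # two pad tokens, then three real tokens
    [1, 1, 1, 1, 1],  # no padding
]
print(position_ids_from_mask(mask))
```

Without a correction of this kind, a naive 0..N-1 range would assign the first real token of the padded sequence position 2 rather than 0, so the same prompt would see different positions depending on how much padding its batch required.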

Contributors

hijkzzz
