Releases: OpenLLMAI/OpenRLHF

Release v0.3.0

11 Jun 00:01

Changes

  • Upgraded the PyTorch NGC container to version 24.02 (@openllmai0, Xianyu)
  • Upgraded DeepSpeed to version 0.14.0 (@openllmai0, Xianyu)
  • Pinned vLLM to version 0.4.2 (Xianyu)
  • Cleaned up the codebase (@openllmai0, Xianyu)
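
A minimal sketch of how one might check a local environment against the DeepSpeed and vLLM versions named in this release. The helper name `find_mismatches` and the `PINNED` table are hypothetical and are not part of OpenRLHF. The PyPI distribution names `deepspeed` and `vllm` are assumed, and a locally built package may report a suffixed version string that this exact comparison would flag.

```python
from importlib import metadata

# Versions taken from the v0.3.0 notes above; distribution names are assumed.
PINNED = {"deepspeed": "0.14.0", "vllm": "0.4.2"}


def find_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each package whose installed
    version differs from the pinned one. `found` is None if not installed."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the check can be exercised without installing either package.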

Release v0.2.9

02 Jun 02:39
Pre-release

Changes

  • Fixed out-of-memory (OOM) errors with --colocate_critic_reward and --colocate_actor_ref (@openllmai0, Xianyu)

Release v0.2.8

24 May 13:56
Pre-release

Changes

Release v0.2.7

06 May 07:57
Pre-release

Changes

  • Added support for vLLM-v0.4.2 @hijkzzz
  • Added support for Jamba-v0.1 (currently incompatible with vLLM-v0.4.2) @hijkzzz
  • Added LoRA configs (--lora_dropout, --target_modules) @hijkzzz

Release v0.2.6

30 Apr 02:33
9af18be
Pre-release

Changes

  • Upgraded vLLM to v0.4.1 @mgerstgrasser @wuxibin89 @hijkzzz
  • Upgraded Transformers to v4.40.1 and DeepSpeed to v0.14.0 @hijkzzz
  • Fixed typo in train_ppo_ray.py @mickelliu
  • Fixed a size mismatch between output_state_dict (148) and state_dict (149) when saving models @hijkzzz
  • Added support for --colocate_actor_ref and --colocate_critic_reward in train_ppo_ray.py @hijkzzz
  • Added support for offloading the reward and reference models in Ray PPO @hijkzzz

Release v0.2.5

12 Apr 09:13
74a8b73

Changes

Release v0.2.4

13 Mar 14:05
bed10e1

Changes

Release v0.2.3

04 Mar 00:14

Changes

  • Fixed #191 "deepspeed.zero.Init causes very strange spikes in PPO policy_loss" @hijkzzz
  • Added a Dockerfile for vLLM @hijkzzz

Release v0.2.2

01 Mar 01:44
ad7fb49

Changes

Release v0.2.1

22 Feb 02:50
1e69ccd

Changes