Releases · predibase/lorax
lorax-0.3.0
LoRAX is the open-source framework for serving hundreds of fine-tuned LLMs in production for the price of one.
lorax-0.2.1
v0.2.1
v0.2.0
What's Changed
Enhancements
- Implement sparse SGMV by @tgaddair in #64
- Implement tensor parallel SGMV by @tgaddair in #79
- Add adapter support for all linear layers in Llama and Mistral by @tgaddair in #75
- Add 4-bit quantization support by @flozi00 in #66
- Add ExLlamaV2 support by @flozi00 in #60
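The sparse and tensor-parallel SGMV work above speeds up serving many LoRA adapters at once. As an illustrative sketch only (not LoRAX's CUDA kernel), the core idea is to apply each request's low-rank delta `x @ A @ B` by grouping batch rows that share an adapter, rather than looping one request at a time (the fallback path that `DISABLE_SGMV` forces); all shapes and adapter names below are toy values:

```python
# Toy SGMV-style grouping vs. per-request loop for multi-adapter LoRA serving.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 16, 8
W = rng.normal(size=(d_in, d_out))  # shared base weight

# Two adapters with different LoRA ranks (the "ragged" case SGMV must handle).
adapters = {
    "adapter_a": (rng.normal(size=(d_in, 4)), rng.normal(size=(4, d_out))),
    "adapter_b": (rng.normal(size=(d_in, 2)), rng.normal(size=(2, d_out))),
}

x = rng.normal(size=(6, d_in))  # batch of 6 requests
ids = ["adapter_a", "adapter_b", "adapter_a",
       "adapter_b", "adapter_a", "adapter_b"]

def loop_forward(x, ids):
    """Naive fallback: one pair of low-rank matmuls per request."""
    out = x @ W
    for i, aid in enumerate(ids):
        A, B = adapters[aid]
        out[i] += x[i] @ A @ B
    return out

def grouped_forward(x, ids):
    """Grouped (SGMV-style) path: one pair of matmuls per distinct adapter."""
    out = x @ W
    for aid in set(ids):
        rows = [i for i, a in enumerate(ids) if a == aid]
        A, B = adapters[aid]
        out[rows] += x[rows] @ A @ B
    return out

assert np.allclose(loop_forward(x, ids), grouped_forward(x, ids))
```

Both paths compute the same output; the grouped version does fewer, larger matmuls, which is where the speedup on GPU comes from.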
Bugfixes
- Switch to a custom SGMV kernel to fix an issue with certain adapter ranks by @tgaddair in #70
- fix: Allow using unsupported base models without adapter loading by @tgaddair in #76
Maintenance
- Add DISABLE_SGMV env var to explicitly fallback to loop by @tgaddair in #69
- Upgrade the README Discord badge and use an invite link that doesn't expire by @justinxzhao in #73
New Contributors
- @justinxzhao made their first contribution in #73
Full Changelog: v0.1.2...v0.2.0
v0.1.2
v0.1.1
What's Changed
- Add Helm charts to deploy models by @abidwael in #27
- change defaults for helm chart by @noyoshi in #38
- add helm release wf by @noyoshi in #39
- Added support for YARN scaling by @tgaddair in #45
- Fixed tensor parallelism splits by @tgaddair in #47
- enh: enable CodeLlama by @geoffreyangus in #48
- Fallback when Punica is not installed by @tgaddair in #49
- add transformers gptq weights by @flozi00 in #52
- Add support for paged attention v2 and update flash attention v2 by @tgaddair in #54
- Fixed adapter loading for GPTQ base models by @tgaddair in #58
- Update gha to be able to automatically push images with release tags by @magdyksaleh in #59
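The Helm chart PRs in this release add a Kubernetes deployment path. As a hypothetical illustration of what a values override for such a chart might look like (the keys, image location, and chart layout below are assumptions, not the chart's actual schema; consult the charts in the repo for the real one):

```yaml
# Hypothetical values.yaml override for a LoRAX Helm deployment.
# All keys are illustrative only.
replicaCount: 1
image:
  repository: ghcr.io/predibase/lorax   # assumed image location
  tag: "0.1.1"
resources:
  limits:
    nvidia.com/gpu: 1                   # one GPU per replica
```

Applied with something like `helm install lorax <chart-path> -f values.yaml`, where the chart path depends on how the charts from #27 are published.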
Full Changelog: v0.1.0...v0.1.1
lorax-0.1.0