
Releases: bentoml/OpenLLM

v0.5.0-alpha

15 Mar 09:28
Pre-release
Release 0.5.0-alpha [generated by GitHub Actions]

v0.4.44

06 Feb 03:17

Installation

pip install openllm==0.4.44

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.44

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.44 start HuggingFaceH4/zephyr-7b-beta
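Once the server is running (for example via the commands above), it can be queried over HTTP. A minimal sketch, assuming the default port 3000 and the `/v1/generate` endpoint of OpenLLM 0.4.x; the exact payload fields (`llm_config` keys in particular) may differ for your model and deployment:

```python
import json
from urllib.request import Request, urlopen

def build_generate_request(prompt, host="http://localhost:3000"):
    # Assumed request shape for an OpenLLM 0.4.x server; adjust the
    # llm_config keys to your model's supported generation options.
    payload = json.dumps(
        {"prompt": prompt, "llm_config": {"max_new_tokens": 64}}
    ).encode()
    return Request(
        f"{host}/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("What does OpenLLM do?")
# With a running server: print(json.load(urlopen(req)))
```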

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix: remove vllm dependency for pytorch bento by @larme in #893

Full Changelog: v0.4.43...v0.4.44

v0.4.43

05 Feb 10:58

Installation

pip install openllm==0.4.43

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.43

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.43 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix: limit BentoML version range by @larme in #881
  • chore: bump up bentoml version to 1.1.11 by @larme in #883
  • Bump BentoML version in tools by @larme in #884

Full Changelog: v0.4.42...v0.4.43

v0.4.42

02 Feb 12:31

Installation

pip install openllm==0.4.42

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.42

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.42 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.41...v0.4.42

v0.4.41

18 Dec 18:18

GPTQ Support

The vLLM backend now supports GPTQ via upstream vLLM:

openllm start TheBloke/Mistral-7B-Instruct-v0.2-GPTQ --backend vllm --quantise gptq

Installation

pip install openllm==0.4.41

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.41

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • docs: add notes about dtypes usage. by @aarnphm in #786
  • chore(deps): bump taiki-e/install-action from 2.22.0 to 2.22.5 by @dependabot in #790
  • chore(deps): bump github/codeql-action from 2.22.9 to 3.22.11 by @dependabot in #794
  • chore(deps): bump sigstore/cosign-installer from 3.2.0 to 3.3.0 by @dependabot in #793
  • chore(deps): bump actions/download-artifact from 3.0.2 to 4.0.0 by @dependabot in #791
  • chore(deps): bump actions/upload-artifact from 3.1.3 to 4.0.0 by @dependabot in #792
  • ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #796
  • fix(cli): avoid runtime __origin__ check for older Python by @aarnphm in #798
  • feat(vllm): support GPTQ with 0.2.6 by @aarnphm in #797
  • fix(ci): lock to v3 iteration of actions/artifacts workflow by @aarnphm in #799

Full Changelog: v0.4.40...v0.4.41

v0.4.40

15 Dec 16:57

Installation

pip install openllm==0.4.40

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.40

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix(infra): conform ruff to 150 LL by @aarnphm in #781
  • infra: update blame ignore to formatter hash by @aarnphm in #782
  • perf: upgrade mixtral to use expert parallelism by @aarnphm in #783

Full Changelog: v0.4.39...v0.4.40

v0.4.39

14 Dec 19:30

Installation

pip install openllm==0.4.39

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.39

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.38...v0.4.39

v0.4.38

13 Dec 23:36

Installation

pip install openllm==0.4.38

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.38

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix(mixtral): correct chat templates to remove additional spacing by @aarnphm in #774
  • fix(cli): correct set arguments for openllm import and openllm build by @aarnphm in #775
  • fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors by @aarnphm in #776

Full Changelog: v0.4.37...v0.4.38

v0.4.37

13 Dec 14:22

Installation

pip install openllm==0.4.37

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.37

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • feat(mixtral): correct support for mixtral by @aarnphm in #772
  • chore: running all script when installation by @aarnphm in #773

Full Changelog: v0.4.36...v0.4.37

v0.4.36

12 Dec 06:44

Mixtral Support

Adds support for Mixtral on BentoCloud with vLLM and all required dependencies.

Bentos built with OpenLLM now default to Python 3.11 for this change to work.

Installation

pip install openllm==0.4.36

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.36

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.36 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • feat(openai): supports echo by @aarnphm in #760
  • fix(openai): logprobs when echo is enabled by @aarnphm in #761
  • ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #767
  • chore(deps): bump docker/metadata-action from 5.2.0 to 5.3.0 by @dependabot in #766
  • chore(deps): bump actions/setup-python from 4.7.1 to 5.0.0 by @dependabot in #765
  • chore(deps): bump taiki-e/install-action from 2.21.26 to 2.22.0 by @dependabot in #764
  • chore(deps): bump aquasecurity/trivy-action from 0.14.0 to 0.16.0 by @dependabot in #763
  • chore(deps): bump github/codeql-action from 2.22.8 to 2.22.9 by @dependabot in #762
  • feat: mixtral support by @aarnphm in #770

Full Changelog: v0.4.35...v0.4.36