v0.1.0

github-actions released this 17 May 16:26
· 55 commits to main since this release

Major changes:

  • Added a Python wrapper and published the scalellm package to PyPI.
  • Added an OpenAI-compatible REST API server: 'python3 -m scalellm.serve.api_server'
  • Install scalellm with pip: 'pip install scalellm'
  • Added examples for offline inference and async streaming.
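The install and server commands above combine into a short quick start. The server port, endpoint path, and request payload below are not stated in these notes; they are assumptions based on how OpenAI-compatible servers are typically queried.

```shell
# Install the scalellm wheel from PyPI (command from the notes above)
pip install scalellm

# Start the OpenAI-compatible REST API server (command from the notes above)
python3 -m scalellm.serve.api_server

# Query it with a standard OpenAI-style completion request.
# NOTE: the port (8080) and model name are illustrative assumptions.
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt2", "prompt": "Hello", "max_tokens": 16}'
```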

What's Changed

  • [fix] use the pybind11 from libtorch and fix model download issue. by @guocuimi in #167
  • [misc] upgrade torch to 2.3 and use gcc-12 by @guocuimi in #168
  • [feat] added python rest api server skeleton by @guocuimi in #169
  • [refactor] combine sequence and request outputs by @guocuimi in #170
  • [feat] added python LLMEngine skeleton by @guocuimi in #171
  • [refactor] move proto definitions into proto namespace by @guocuimi in #173
  • [feat] implement async llm engine for python wrapper by @guocuimi in #172
  • [refactor] consolidate handlers to share llm_handler between python rest api server and grpc server by @guocuimi in #174
  • [python] move request handling logic into separate file from api server by @guocuimi in #175
  • [python] added model check for rest api by @guocuimi in #176
  • [feat] added status handling for grpc server by @guocuimi in #177
  • [misc] some changes to cmake file by @guocuimi in #180
  • [kernel] change head_dim list to reduce binary size by @guocuimi in #181
  • [CI] added base docker image for python wheel build by @guocuimi in #182
  • [ci] build python wheels by @guocuimi in #183
  • [CI] fix docker image issues and build wheel for different python, pytorch versions by @guocuimi in #184
  • [fix] added manylinux support by @guocuimi in #185
  • [fix] added cuda 11.8 support for manylinux by @guocuimi in #186
  • [feat] added version suffix to include cuda and torch version by @guocuimi in #187
  • [CI] Upload wheels to release as assets by @guocuimi in #188
  • [fix] fix extension typo for wheel publish workflow by @guocuimi in #189
  • [python] added LLM for offline inference and stream examples for chat and complete by @guocuimi in #190
  • [python] added requirements into package by @guocuimi in #191
  • [Release] prepare 0.1.0 release by @guocuimi in #192
  • [Release] added workflow to publish wheels to PyPI by @guocuimi in #193

Full Changelog: v0.0.9...v0.1.0