
Medusa-v0.1

harveyp123 released this 11 Sep 20:35

Medusa is an easy-to-use framework that democratizes acceleration techniques for LLM generation. Medusa-v0.1 adds several lightweight extra decoding heads to the base model, which removes the need for a separate draft model.
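To make the idea of extra decoding heads concrete, here is a minimal toy sketch in plain Python. It is not Medusa's code or API: the sizes, the random weights, and the helper names (`propose`, `accept_prefix`) are all assumptions made for illustration, and the acceptance rule shown is a simplified greedy prefix match. The point is only that each extra head reads the same hidden state and guesses a token further ahead, and the base model then keeps the guesses it agrees with.

```python
import random

VOCAB = 8       # toy vocabulary size (assumption)
HIDDEN = 4      # toy hidden-state size (assumption)
NUM_HEADS = 3   # number of extra decoding heads (assumption)


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for a trained linear head."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix by a vector, giving one logit per vocabulary entry."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def argmax(values):
    """Index of the largest value (greedy token choice)."""
    return max(range(len(values)), key=values.__getitem__)


def propose(hidden, lm_head, extra_heads):
    """Return the base head's next token followed by one guess per extra head.

    All heads read the same last hidden state; extra head k guesses the token
    k + 1 positions after the base head's token.
    """
    next_token = argmax(matvec(lm_head, hidden))
    guesses = [argmax(matvec(head, hidden)) for head in extra_heads]
    return [next_token] + guesses


def accept_prefix(candidate, verified):
    """Keep candidate tokens up to the first disagreement with the base model.

    `verified` holds the tokens the base model itself would emit at each
    position when it checks the candidate in a single forward pass.
    """
    accepted = []
    for guess, truth in zip(candidate, verified):
        if guess != truth:
            break
        accepted.append(guess)
    return accepted


rng = random.Random(0)
hidden = [rng.uniform(-1, 1) for _ in range(HIDDEN)]
lm_head = make_matrix(VOCAB, HIDDEN, rng)
extra_heads = [make_matrix(VOCAB, HIDDEN, rng) for _ in range(NUM_HEADS)]

candidate = propose(hidden, lm_head, extra_heads)
print("proposed tokens:", candidate)
```

When several proposed tokens are accepted in one step, generation advances by more than one token per forward pass of the base model, and no second model is needed to produce the proposals.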