
Support Phi 3 model #3550

Open
iseeyuan opened this issue May 8, 2024 · 5 comments
Assignees
Labels
high priority; triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@iseeyuan (Contributor) commented May 8, 2024

With limited memory on most phones, there are community requests to support a smaller model like Phi-3 mini. It may be supported out of the box, but this needs verification, evaluation, and profiling.
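To make the memory constraint concrete, here is a quick back-of-envelope estimate of the weight footprint at different quantization levels (Phi-3 mini is a 3.8B-parameter model; the helper name is ours, and the figures exclude KV cache, activations, and runtime overhead):

```python
# Back-of-envelope estimate of on-device weight size for a candidate model.

def weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weight tensors alone, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

PHI3_MINI_PARAMS = 3.8e9  # Phi-3 mini parameter count (public figure)

for bits in (16, 8, 4):
    print(f"Phi-3 mini @ {bits}-bit: {weight_size_gb(PHI3_MINI_PARAMS, bits):.2f} GB")
# Phi-3 mini @ 16-bit: 7.60 GB
# Phi-3 mini @ 8-bit: 3.80 GB
# Phi-3 mini @ 4-bit: 1.90 GB
```

At 4-bit quantization the weights alone come in under 2 GB, which is why a model of this size is plausible on recent phones while a 7B+ model generally is not.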

@iseeyuan iseeyuan assigned iseeyuan and helunwencser and unassigned iseeyuan May 9, 2024
@pytorch-bot pytorch-bot bot added the triage review (Items requiring a triage review) label May 9, 2024
@iseeyuan iseeyuan added triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) and removed triage review (Items requiring a triage review) labels May 9, 2024
@mikekgfb (Contributor)

Exciting. Can we try this in torchchat?

I was also looking at how to run a model on watchOS -- we have iOS and macOS with ExecuTorch now, so I'm looking for the next exciting platform. And then... Max size: 75 MB?!

This might be an interesting experiment. It may also be an exciting item for a community member to prototype!

@salykova (Contributor) commented May 16, 2024

@iseeyuan

In addition to Phi-3, here is a list of some of the most popular tiny LLMs used within the open-source community:

  1. OpenELM https://huggingface.co/apple/OpenELM
  2. TinyLlama 1.1B https://github.com/jzhang38/TinyLlama
  3. StableLM 1.6B and 3B https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b https://huggingface.co/stabilityai/stablelm-2-1_6b-chat https://huggingface.co/stabilityai/stablelm-zephyr-3b
  4. RWKV 1.6B and 3B models https://huggingface.co/RWKV/rwkv-5-world-3b https://huggingface.co/RWKV/rwkv-6-world-3b https://huggingface.co/RWKV/rwkv-6-world-1b6 https://huggingface.co/RWKV/rwkv-5-world-1b5
  5. Qwen 1.8B and 4B models https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat https://huggingface.co/Qwen/Qwen1.5-4B-Chat

TinyLlama should work out of the box, since it uses the same model architecture and the same tokenizer as Llama. TinyLlama is of particular interest because almost every mobile phone can run it.
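The "same architecture" point above suggests a simple pre-flight check: a checkpoint whose Hugging Face `config.json` declares the Llama architecture should be able to reuse the existing Llama path, while other architectures need their own enablement work. A minimal sketch (the helper name is hypothetical, and the config dicts are illustrative snippets rather than fetched files):

```python
# Hypothetical pre-flight check: does this checkpoint declare the Llama
# architecture in its Hugging Face config?

def reuses_llama_path(config: dict) -> bool:
    """Return True if the config's model_type is "llama"."""
    return config.get("model_type") == "llama"

# Illustrative config.json snippets (assumed values, not downloaded):
tinyllama_cfg = {"model_type": "llama", "hidden_size": 2048}
phi3_cfg = {"model_type": "phi3", "hidden_size": 3072}

print(reuses_llama_path(tinyllama_cfg))  # True
print(reuses_llama_path(phi3_cfg))       # False
```

Models that fail this check (e.g. the RWKV entries in the list, which are not transformers at all) would need dedicated support rather than a config tweak.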

@helunwencser

I am working on getting Phi-3 enabled via ExecuTorch and will post here once it's done. After that, we can try it in torchchat as well.

@iseeyuan (Contributor, Author)

@salykova Thank you for the list! We picked Phi-3 as it's relatively new and popular, but we are definitely considering enabling other models. The long-term goal is to improve our infrastructure while enabling the first couple of models, so that the community can use that infrastructure to contribute support for other models. With that said, you are welcome to enable other models, and feel free to submit a PR!

@devYonz commented May 25, 2024

Would love to see 27 tokens/s on mobile with ExecuTorch, like Phi Silica: https://learn.microsoft.com/en-us/windows/ai/apis/phi-silica
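To put the 27 tokens/s target in perspective, a quick conversion between decode throughput and wall-clock time (helper names are ours; prefill latency is ignored):

```python
# Convert a decode-throughput target into per-token and per-reply latency.

def ms_per_token(tokens_per_s: float) -> float:
    """Per-token decode budget in milliseconds."""
    return 1000.0 / tokens_per_s

def seconds_for_reply(tokens_per_s: float, reply_tokens: int) -> float:
    """Rough decode time for a reply of the given length."""
    return reply_tokens / tokens_per_s

print(f"{ms_per_token(27):.1f} ms/token")  # 37.0 ms/token
print(f"{seconds_for_reply(27, 256):.1f} s for a 256-token reply")  # 9.5 s
```

So 27 tokens/s means each decode step must finish in about 37 ms on the device, including memory traffic for the weights, which is a demanding budget for a multi-billion-parameter model on a phone.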
