Apple Silicon Neural Engine: Core ML model package format support #7105
Comments
Do you have a quick start on this? Haven't looked into Core ML at all.
Hi, yes. @ntindle This is about running LLMs locally on Apple Silicon. Core ML is a framework that can distribute a model's workload across the CPU, GPU, and Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones and Macs (A14 or newer and M1 or newer). Ideally, we want to run LLMs on the ANE as much as possible, since it is optimized for ML workloads in a way the GPU is not. Apple claims "deploying your Transformer models on Apple devices with an A14 or newer and M1 or newer chip to achieve up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".
https://machinelearning.apple.com/research/neural-engine-transformers
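As a rough quick start, here is a minimal sketch of loading a Core ML model package from Python with `coremltools` and asking Core ML to prefer the ANE. The helper names and the `model.mlpackage` path are hypothetical, not part of this project. Note that Core ML has no "ANE only" option as far as I know; `CPU_AND_NE` is the closest setting, and Core ML still decides per layer where things actually run. This assumes coremltools 6 or newer.

```python
def compute_unit_name(prefer_ane: bool = True) -> str:
    """Return the name of the coremltools ComputeUnit member to request.

    CPU_AND_NE keeps work off the GPU; ALL lets Core ML use CPU, GPU and ANE.
    """
    return "CPU_AND_NE" if prefer_ane else "ALL"


def load_coreml_model(path: str, prefer_ane: bool = True):
    """Load a .mlpackage with the requested compute units (hypothetical helper)."""
    # Imported lazily so the helper above works without coremltools installed.
    import coremltools as ct

    units = getattr(ct.ComputeUnit, compute_unit_name(prefer_ane))
    return ct.models.MLModel(path, compute_units=units)


if __name__ == "__main__":
    # "model.mlpackage" is a placeholder path; running predictions requires macOS.
    print("Requesting compute units:", compute_unit_name(prefer_ane=True))
    # model = load_coreml_model("model.mlpackage")
```

Swapping `prefer_ane=False` falls back to letting Core ML schedule across all available units, which matches the "ANE + GPU" case from the issue description.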
We don't package any models with our code. Is it possible to use tools like Llamafile to do this?
I mean, that was a general overview; you don't have to package models, you just need to be able to use Core ML packaged models. There is work in progress on a Core ML implementation for whisper.cpp (ggerganov/whisper.cpp#548) that you might be interested in; they see 3x performance improvements for some models. You might also be interested in another implementation, Swift Transformers. Example of a Core ML application
I'll be interested to look into this at some point
Duplicates
Summary 💡
Problem
Please consider adding Core ML model package format support to utilize the Apple Silicon Neural Engine + GPU.
Examples 🌈
Additional context
List of Core ML package format models
https://github.com/likedan/Awesome-CoreML-Models
Motivation 🔦
Utilize both ANE & GPU, not just GPU on Apple Silicon