Releases: ollama/ollama
v0.1.14
New Models
- StableLM Zephyr: A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.
- Magicoder: A family of 7B-parameter models trained on 75K synthetic instruction examples using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.
What's Changed
- New Chat API for sending a history of messages:

  ```shell
  curl http://localhost:11434/api/chat -d '{
    "model": "mistral",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant that answers concisely." },
      { "role": "user", "content": "why is the sky blue?" }
    ]
  }'
  ```
- Linewrap now works when resizing the terminal with `ollama run`
- Fixed an issue where ctrl+z would not suspend `ollama run` as expected
- Fixed an issue where requests to `/api/generate` would not work when waiting for another request to finish
- Fixed an issue where extra whitespace after a `FROM` command would cause an error
- Ollama will now warn you if there's a version mismatch when connecting remotely with `OLLAMA_HOST`
- New `/api/version` API for checking Ollama's version
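The chat endpoint streams its reply as one JSON object per line, each carrying a fragment of the assistant message. A minimal Python sketch of reassembling such a stream (the chunk contents below are illustrative, not captured server output):

```python
import json

def merge_chat_stream(chunks):
    """Accumulate streamed /api/chat chunks into the full assistant reply.

    Intermediate chunks carry a fragment in message.content; the final
    chunk has "done" set to true and no further content.
    """
    content = []
    for raw in chunks:
        part = json.loads(raw)
        msg = part.get("message") or {}
        content.append(msg.get("content", ""))
        if part.get("done"):
            break
    return "".join(content)

# Simulated stream, shaped like the line-delimited chat responses:
stream = [
    '{"model":"mistral","message":{"role":"assistant","content":"The sky "},"done":false}',
    '{"model":"mistral","message":{"role":"assistant","content":"is blue."},"done":false}',
    '{"model":"mistral","done":true}',
]
print(merge_chat_stream(stream))  # The sky is blue.
```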
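The `/api/version` endpoint returns a small JSON object containing the running version, which makes the remote-mismatch warning easy to reproduce client-side. A sketch in Python (the comparison helper is illustrative, not Ollama's actual logic, and the endpoint is only queried if you call `server_version` against a live server):

```python
import json
import urllib.request

def server_version(host="http://localhost:11434"):
    """Query /api/version; the response body looks like {"version": "0.1.14"}."""
    with urllib.request.urlopen(f"{host}/api/version") as resp:
        return json.load(resp)["version"]

def version_mismatch(client, server):
    """True when the client and server report different versions,
    mirroring the warning Ollama now prints for remote OLLAMA_HOST use."""
    parse = lambda v: tuple(int(x) for x in v.lstrip("v").split("."))
    return parse(client) != parse(server)

print(version_mismatch("0.1.14", "0.1.13"))  # True
```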
New Contributors
- @ruecat made their first contribution in #1364
- @calderonsamuel made their first contribution in #1399
- @Xe made their first contribution in #1406
Full Changelog: v0.1.13...v0.1.14
v0.1.13
New models
- Starling: a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
- Meditron: Open-source medical large language model adapted from Llama 2 to the medical domain.
- DeepSeek LLM: An advanced language model crafted with 2 trillion bilingual tokens.
What's Changed
- Improved progress bar when running `ollama pull`, with a simpler design that displays a more consistent download speed and remaining time
- The system prompt can now be set in `ollama run` using `/set system <system prompt>`
- Parameters can now be set in `ollama run` using `/set parameter <parameter> <value>`. Examples:
  - Set the context size to 16K: `/set parameter num_ctx 16384`
  - Set the temperature to 1: `/set parameter temperature 1`
  - Set the seed: `/set parameter seed 1048`
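The same parameters can also travel with an API request via the `options` field of `/api/generate`. A brief Python sketch building such a request body (the payload is illustrative and is not sent anywhere):

```python
import json

def generate_payload(model, prompt, **options):
    """Build a /api/generate request body whose "options" field carries
    the same parameters set interactively with /set parameter."""
    return {"model": model, "prompt": prompt, "options": options}

body = generate_payload("mistral", "why is the sky blue?",
                        num_ctx=16384, temperature=1, seed=1048)
print(json.dumps(body, indent=2))
```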
- Fixed issue where Linux installer script would encounter an error when installing on Red Hat Enterprise Linux with an Nvidia GPU
New Contributors
- @kasumi-1 made their first contribution in #1281
- @rootedbox made their first contribution in #1287
- @ftorto made their first contribution in #1299
- @ToasterUwU made their first contribution in #1301
- @jeremiahbuckley made their first contribution in #1321
- @smartalecH made their first contribution in #994
Full Changelog: v0.1.12...v0.1.13
v0.1.12
New Models
- Yi Chat: the chat variant of the popular Yi 34b model is now available.
What's Changed
- Improved multi-line prompts (starting & ending with `"""`) and pasting functionality in `ollama run`
- Option (or alt) + backspace will now delete words in `ollama run`
- Fixed issue where older Intel Macs would receive an error when trying to run a model
- Fixed issues with YaRN models output and performance
New Contributors
- @ex3ndr made their first contribution in #1225
- @kejcao made their first contribution in #1223
- @longy2k made their first contribution in #1239
- @wookayin made their first contribution in #1261
- @vinjn made their first contribution in #1262
Full Changelog: v0.1.11...v0.1.12
v0.1.11
New Models
- Orca 2: A fine-tuned version of Meta's Llama 2 model, designed to excel particularly in reasoning.
- DeepSeek Coder: A capable coding model trained from scratch. Available in 1.3B, 6.7B and 33B parameter counts.
- Alfred: A robust conversational model designed to be used for both chat and instruct use cases.
What's Changed
- Improved progress bar design
- Fixed issue where `ollama create` would error with `invalid cross-device link`
- Fixed issue where `ollama run` would exit with an error on macOS Big Sur and Monterey
- `q5_0` and `q5_1` models will now use the GPU
- Fixed several `max retries exceeded` errors when running `ollama pull` or `ollama push`
- Fixed issue where `ollama create` would result in a "file not found" error when `FROM` referred to a local file
- Fixed issue where resizing the terminal while running `ollama pull` would cause repeated progress bar messages
- Minor performance improvements on Intel Macs
- Improved error messages on Linux when using Nvidia GPUs
Full Changelog: v0.1.10...v0.1.11
v0.1.10
New models
- OpenChat: An open-source chat model trained on a wide variety of data, surpassing ChatGPT on various benchmarks.
- Neural-chat: New chat model by Intel
- Goliath: A large chat model created by combining two fine-tuned versions of Llama 2 70B
What's Changed
- JSON mode can now be used with `ollama run`:
  - Pass the `--format json` flag, or
  - Use `/set format json` to change the current chat session to use JSON mode
- Prompts can now be passed in via standard input to `ollama run`. For example:

  ```shell
  head -30 README.md | ollama run codellama "how do I install Ollama on Linux?"
  ```

- `ollama create` now works with `OLLAMA_HOST` to build models using Ollama running on a remote machine
- Fixed crashes on Intel Macs
- Fixed issue where `ollama pull` progress would reverse when re-trying a failed connection
- Fixed issue where `ollama show --modelfile` would show an incorrect `FROM` command
- Fixed issue where word wrap wouldn't work when piping in data to `ollama run` via standard input
- Fixed permission denied issues when running `ollama create` on Linux
- Added FAQ entry for proxy support on Linux
- Fixed installer error on Debian 12
- Fixed issue where `ollama push` would result in a 405 error
- `ollama push` will now return a better error when trying to push to a namespace the current user does not have access to
New Contributors
- @dhiltgen made their first contribution in #1075
- @dansreis made their first contribution in #1055
- @breitburg made their first contribution in #1106
- @enricoros made their first contribution in #1078
- @huynle made their first contribution in #1115
- @bnodnarb made their first contribution in #1098
- @danemadsen made their first contribution in #1120
- @pieroit made their first contribution in #1124
- @yanndegat made their first contribution in #1151
Full Changelog: v0.1.9...v0.1.10
v0.1.9
New models
- Yi: a high-performing, bilingual model supporting both English and Chinese.
What's Changed
- JSON mode: instruct models to always return valid JSON when calling `/api/generate` by setting the `format` parameter to `json`
- Raw mode: bypass any templating done by Ollama by passing `{"raw": true}` to `/api/generate`
- Better error descriptions when downloading and uploading models with `ollama pull` and `ollama push`
- Fixed issue where Linux installer would encounter an error when running as the `root` user
- Improved progress bar design when running `ollama pull` and `ollama push`
- Fixed issue where running on a machine with less than 2GB of VRAM would be slow
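Both modes are plain fields on the `/api/generate` request body. A small Python sketch of constructing such a body (illustrative only; nothing is sent):

```python
def generate_request(model, prompt, *, json_mode=False, raw=False):
    """Build a /api/generate body: format="json" asks the model to emit
    valid JSON, and raw=True bypasses Ollama's prompt templating."""
    body = {"model": model, "prompt": prompt}
    if json_mode:
        body["format"] = "json"
    if raw:
        body["raw"] = True
    return body

req = generate_request("yi", "Name three colors as a JSON array.", json_mode=True)
print(req)
```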
New Contributors
- @pepperoni21 made their first contribution in #995
- @lgrammel made their first contribution in #1020
- @ej52 made their first contribution in #999
- @David-Kunz made their first contribution in #996
- @tjbck made their first contribution in #943
- @omagdy7 made their first contribution in #1029
- @upchui made their first contribution in #1034
- @kevinhermawan made their first contribution in #1043
- @amithkoujalgi made their first contribution in #1044
- @mpldr made their first contribution in #1042
- @aashish2057 made their first contribution in #992
- @nickanderson made their first contribution in #1062
Full Changelog: v0.1.8...v0.1.9
v0.1.8
New Models
- CodeBooga: A high-performing code instruct model created by merging two existing code models.
- Dolphin 2.2 Mistral: An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.
- MistralLite: A fine-tuned model based on Mistral with enhanced capabilities for processing long contexts.
- Yarn Mistral: An extension of Mistral to support a context window of up to 128K tokens
- Yarn Llama 2: An extension of Llama 2 to support a context window of up to 128K tokens
What's Changed
- Ollama will now honour large context sizes on models such as `codellama` and `mistrallite`
- Fixed issue where repeated characters would be output on long contexts
- `ollama push` is now much faster: 7B models will push at up to ~100MB/s and large models (70B+) at up to 1GB/s if network speeds permit
New Contributors
- @dloss made their first contribution in #948
- @noahgitsham made their first contribution in #983
Full Changelog: v0.1.7...v0.1.8
v0.1.7
What's Changed
- Fixed an issue when running `ollama run` where certain key combinations, such as Ctrl+Space, would lead to an unresponsive prompt
- Fixed issue in `ollama run` where retrieving the previous prompt from history would require two up-arrow key presses instead of one
- Exiting `ollama run` with Ctrl+D will now put the cursor on the next line
Full Changelog: v0.1.6...v0.1.7
v0.1.6
New models
- Dolphin 2.1 Mistral: an instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.
- Zephyr Beta: this is the second model in the series based on Mistral, and has strong performance that compares to and even exceeds Llama 2 70b in several categories. It’s trained on a distilled dataset, improving grammar and yielding even better chat results.
What's Changed
- Pasting multi-line strings in `ollama run` is now possible
- Fixed various issues when writing prompts in `ollama run`
- The library models have been refreshed and revamped, including `llama2`, `codellama`, and more:
  - All `chat` or `instruct` models now support setting the `system` parameter, or the `SYSTEM` command in the `Modelfile`
  - Parameters (`num_ctx`, etc.) have been updated for library models
  - Slight performance improvements for all models
- Model storage can now be configured with `OLLAMA_MODELS`. See the FAQ for more info on how to configure this.
- `OLLAMA_HOST` will now default to port `443` when `https://` is specified, and port `80` when `http://` is specified
- Fixed trailing slashes causing an error when using `OLLAMA_HOST`
- Fixed issue where `ollama pull` would retry multiple times when out of space
- Fixed various `out of memory` issues when using Nvidia GPUs
- Fixed performance issue previously introduced on AMD CPUs
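The port-defaulting rule for `OLLAMA_HOST` can be sketched in Python (the hostnames below are placeholders; this mirrors the behaviour described above, not the actual Go implementation):

```python
from urllib.parse import urlparse

def ollama_host_port(host):
    """Pick the port for an OLLAMA_HOST value: an explicit port wins;
    otherwise https:// implies 443 and http:// implies 80."""
    parsed = urlparse(host.rstrip("/"))  # trailing slashes no longer cause an error
    if parsed.port:
        return parsed.port
    return 443 if parsed.scheme == "https" else 80

print(ollama_host_port("https://ollama.example.com"))   # 443
print(ollama_host_port("http://ollama.example.com/"))   # 80
print(ollama_host_port("http://ollama.example.com:11434"))  # 11434
```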
Full Changelog: v0.1.5...v0.1.6
v0.1.5
What's Changed
- Fixed an issue where an error would occur when running `falcon` or `starcoder` models
Full Changelog: v0.1.4...v0.1.5