Issues: ggerganov/llama.cpp
- Is Infini-attention support possible? [enhancement] #7213, opened May 11, 2024 by sdmorrey
- Native Intel IPEX-LLM Support [enhancement] #7190, opened May 10, 2024 by iamhumanipromise
- Build error at server.cpp: undefined reference to `json_schema_to_grammar [bug-unconfirmed] #7189, opened May 10, 2024 by jarviszeng-zjc
- Third-party applications are overwhelmingly slow for subsequent prompt evaluation compared to examples/main and examples/server [enhancement] #7185, opened May 9, 2024 by khimaros
- BF16 prompt processing has half the performance compared to F16 and F32 on AMD Ryzen Embedded V3000 (Zen 3) [enhancement] #7182, opened May 9, 2024 by lemmi
- llamacpp --prompt-cache-all: more than a year has passed and it is still not fully implemented [enhancement] #7179, opened May 9, 2024 by mirek190
- Selects too many cores by default on Orange Pi 5 (2x slower) [bug-unconfirmed] #7176, opened May 9, 2024 by calculatortamer
- Should we add an autolabeler for PRs? [devops, enhancement, help wanted] #7174, opened May 9, 2024 by mofosyne
- Add support for Mistral Dutch and Armenian models: Tweeties/tweety-7b-dutch-v24a and Tweeties/tweety-7b-armenian-v24a [enhancement] #7170, opened May 9, 2024 by JohnClaw
- Support for Consistency Large Language Models? [enhancement] #7168, opened May 9, 2024 by unoexperto
- How can I modify the settings to make it answer in Chinese by default? [enhancement] #7167, opened May 9, 2024 by LiangZeFenglzf
- Add metadata override and also generate dynamic default filename when converting GGUF [enhancement, help wanted, need feedback] #7165, opened May 9, 2024 by mofosyne
- Looking for help using llama.cpp with the Phi-3 model and LoRA [bug-unconfirmed] #7164, opened May 9, 2024 by SHIMURA0
- Gibberish response from server, and main exits, on M1 Mac Studio Ultra with GPU (CPU ok) [bug-unconfirmed] #7159, opened May 9, 2024 by jrozentur
- Impact of BF16 on Llama 3 8B perplexity? [enhancement] #7148, opened May 8, 2024 by jim-plus
- error: implicit declaration of function ‘vld1q_s8_x4’; did you mean ‘vld1q_s8_x2’? [bug-unconfirmed] #7147, opened May 8, 2024 by CaptainOfHacks
ProTip! Exclude everything labeled bug with -label:bug.
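The `-label:` qualifier is part of GitHub's issue search syntax, and qualifiers can be combined freely in the search box. As an illustration (using label names that appear in this list), a query that shows only open feature requests while hiding unconfirmed bug reports might look like:

```
is:issue is:open label:enhancement -label:bug-unconfirmed
```

The same filter string can be pasted directly into the search field on the repository's Issues tab.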