Actions: ggerganov/llama.cpp

Benchmark

This workflow was disabled manually.
2,232 workflow runs
CUDA: revise q8_1 data layout for mul_mat_q
Benchmark #2163: Pull request #7824 opened by JohannesGaessler
June 7, 2024 21:53 48m 22s
move BLAS to a separate backend
Benchmark #2162: Pull request #6210 synchronize by slaren
June 7, 2024 18:00 1h 58m 46s
move BLAS to a separate backend
Benchmark #2161: Pull request #6210 synchronize by slaren
June 7, 2024 17:41 Queued
Arm AArch64: optimized GEMV and GEMM kernels for q4_0_q8_0, and q8_0_q8_0 quantization
Benchmark #2160: Pull request #5780 synchronize by Dibakar
June 7, 2024 17:39 1h 32m 58s
Server: Unix Socket Support
Benchmark #2159: Pull request #6413 synchronize by adrianliechti
June 7, 2024 17:36 49m 19s
server : Smart selection of available slot using Longest Common Prefix
Benchmark #2158: Pull request #7728 synchronize by sasha0552
June 7, 2024 14:29 47m 3s
server : Smart selection of available slot using Longest Common Prefix
Benchmark #2157: Pull request #7728 synchronize by sasha0552
June 7, 2024 14:27 1m 53s
server : Smart selection of available slot using Longest Common Prefix
Benchmark #2156: Pull request #7728 synchronize by sasha0552
June 7, 2024 14:16 11m 30s
move BLAS to a separate backend
Benchmark #2155: Pull request #6210 synchronize by slaren
June 7, 2024 13:32 48m 40s
Allow pooled embeddings on any model
Benchmark #2154: Pull request #7477 synchronize by iamlemec
June 7, 2024 11:52 43m 47s
Add Qwen2MoE 57B-A14B
Benchmark #2153: Pull request #7814 opened by CISC
June 7, 2024 10:29 48m 38s
move BLAS to a separate backend
Benchmark #2152: Pull request #6210 synchronize by slaren
June 7, 2024 07:55 2h 27m 39s
feat: add changes to handle jina v2 chinese code
Benchmark #2151: Pull request #7795 synchronize by JoanFM
June 7, 2024 07:55 1h 41m 34s
server : do not get prompt in infill mode (#7286)
Benchmark #2150: Commit a5cabd7 pushed by ggerganov
June 7, 2024 07:09 1h 38m 54s master
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2149: Pull request #6869 synchronize by zhouwg
June 7, 2024 06:56 1h 4m 55s
check for nans in imatrix and quantize (#7807)
Benchmark #2148: Commit c9ee711 pushed by ggerganov
June 7, 2024 06:01 1h 13m 10s master
Poro-34B-chat tokenizer support
Benchmark #2147: Pull request #7713 synchronize by ezosa
June 7, 2024 05:13 1h 13m 7s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2146: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:51 47m 53s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2145: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:51 14s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2144: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:50 59s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2143: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:26 23m 47s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2142: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:26 20s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2141: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:26 28s
ggml-qnn: add Qualcomm QNN(Qualcomm Neural Network,aka Qualcomm AI Engine Direct) backend
Benchmark #2140: Pull request #6869 synchronize by zhouwg
June 7, 2024 04:20 6m 19s
Benchmark
Benchmark #2139: Scheduled
June 7, 2024 02:21 1h 28m 44s master