Default Branch

137435ff15 · kleidiai : add sme fp16 compute path for q4_0 gemm on aarch64 (#20043) · Updated 2026-03-03 10:40:26 +01:00

Branches

386892ec61 · sync : ggml · Updated 2025-07-19 10:46:12 +02:00 · 2255 behind, 1 ahead

cfe5e98423 · graph : fix graph reuse reset of params · Updated 2025-07-18 16:50:32 +02:00 · 2258 behind, 1 ahead

9106d7595d · model : fix build after merge conflict · Updated 2025-07-18 10:50:59 +02:00 · 2261 behind, 1 ahead

05baa62a73 · kv-cache : fix k-shift for multiple streams · Updated 2025-07-17 19:18:36 +02:00 · 2270 behind, 1 ahead

07908a824a · server : pre-calculate EOG logit biases · Updated 2025-07-16 12:47:05 +02:00 · 2283 behind, 1 ahead

9f8d285901 · server : fix handling of the ignore_eos flag · Updated 2025-07-16 06:37:18 +02:00 · 2288 behind, 1 ahead

f68669d50f · fix and opt kernel launch · Updated 2025-07-15 13:28:26 +02:00 · 2323 behind, 3 ahead

942c55cd57 · imatrix : avoid using imatrix.dat in README · Updated 2025-07-12 22:50:10 +02:00 · 2308 behind, 32 ahead

1180752835 · cuda : support Falcon-H1 state size for SSM_SCAN · Updated 2025-07-09 18:18:37 +02:00 · 2338 behind, 1 ahead

4d6a179c68 · gguf-py : avoid adding duplicate tensor mappings for Jamba · Updated 2025-07-09 17:58:35 +02:00 · 2338 behind, 61 ahead

b7c6ece5b5 · ggml-ci · Updated 2025-07-09 14:13:34 +02:00 · 2343 behind, 24 ahead

7634d14d7a · test-model-random : fix seq_id buffer overflow · Updated 2025-07-09 00:23:58 +02:00 · 2343 behind, 18 ahead

2ff3354c33 · memory : fix broken batch splits for recurrent cache · Updated 2025-07-08 03:23:14 +02:00 · 2354 behind, 1 ahead

996195299e · up. · Updated 2025-07-07 23:42:40 +02:00 · 2506 behind, 6 ahead

bf8b39015f · metal : reuse graphs · Updated 2025-07-07 20:37:07 +02:00 · 2362 behind, 3 ahead

886da0a2c5 · kv-cache : prepare K/V buffers for separation · Updated 2025-07-04 09:13:16 +02:00 · 2373 behind, 1 ahead

dfceb012ee · llama : add "virtual sequences" · Updated 2025-07-02 19:26:55 +02:00 · 2384 behind, 8 ahead

71bef66591 · cuda : graceful fallback for Mamba-1 models with weird embd size · Updated 2025-07-02 09:49:36 +02:00 · 2393 behind, 44 ahead

6179578988 · batch : require non-coupled batch with sequential split_equal · Updated 2025-06-25 16:20:46 +02:00 · 2450 behind, 29 ahead

37bdfbef8c · wip 3 · Updated 2025-06-24 10:04:18 +02:00 · 2450 behind, 21 ahead