Default Branch

ecd99d6a9a · docs: Fix intel documentation link (#20040) · Updated 2026-03-03 14:50:00 +01:00

Branches

ae96333923 · metal : fix thread-safety · Updated 2025-06-20 15:42:54 +02:00

2477 behind · 1 ahead

6fb2f2e8a9 · ggml : fix repack work size for mul_mat_id · Updated 2025-06-20 09:34:16 +02:00

2480 behind · 1 ahead

59fee24c72 · recurrent : rework graph inputs + add TODOs · Updated 2025-06-18 08:29:51 +02:00

2504 behind · 31 ahead

d3d06debe3 · server : add pidfile option · Updated 2025-06-17 22:47:53 +02:00

2505 behind · 1 ahead

4b2233befb · Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer · Updated 2025-06-17 22:25:42 +02:00

2507 behind · 1 ahead

36fce98281 · server : re-enable swa speculative decoding · Updated 2025-06-12 10:51:15 +02:00

2548 behind · 1 ahead

ed99a8ea04 · cont : fix comments · Updated 2025-06-12 09:43:55 +02:00

2551 behind · 3 ahead

4b6fb6524b · context : round n_tokens to next multiple of n_seqs when reserving · Updated 2025-06-11 22:19:17 +02:00

2554 behind · 1 ahead

62a9f34bae · llama-graph : fix recurrent state copy · Updated 2025-06-10 06:26:30 +02:00

2578 behind · 3 ahead

c257a8871c · cont : fix defrag erasing cells that didn't move · Updated 2025-06-09 19:45:56 +02:00

2585 behind · 3 ahead

ac35e50c16 · Update tools/llama-bench/llama-bench.cpp · Updated 2025-06-01 00:38:37 +02:00

2645 behind · 3 ahead

d3a2eb592d · disable on windows · Updated 2025-05-31 23:17:18 +02:00

2636 behind · 12 ahead

9065ca71a2 · tests : sampling tests use min_keep == 0 · Updated 2025-05-27 10:30:41 +02:00

2691 behind · 3 ahead

108d484ab2 · tts : fix n_ubatch + make WavTokenizer cache-less · Updated 2025-05-22 20:58:10 +02:00

2735 behind · 1 ahead

b06a954bbc · llama_encode : only force non-causal attention for enc-dec models · Updated 2025-05-19 19:43:59 +02:00

2768 behind · 1 ahead

8282d74692 · bench : handle decode errors · Updated 2025-05-14 21:36:29 +02:00

2806 behind · 1 ahead

237acc7cd5 · server : update readme + return json for "meta" field · Updated 2025-05-14 14:30:12 +02:00

2815 behind · 2 ahead

78d70223c3 · metal : use FA-vec kernel up to batch size 20 · Updated 2025-05-13 09:38:06 +02:00

2832 behind · 3 ahead

6107303ab0 · llama : remove logits_all flag + reorder llama_context_params · Updated 2025-05-08 12:01:41 +02:00

2885 behind · 2 ahead

16843dba33 · metal : pad mm results · Updated 2025-05-04 08:13:52 +02:00

2921 behind · 1 ahead