Default Branch

fc2b0053ff · ggml-cuda: Repost of 21896: Blackwell native NVFP4 support (#22196) · Updated 2026-04-29 00:47:42 +02:00

Branches

b84dda3ea0 · wip · Updated 2026-04-27 22:23:20 +02:00

47 behind · 19 ahead

fd6f79c7a4 · download : prefer q8_0 when q4_k not available · Updated 2026-04-27 11:08:25 +02:00

21 behind · 1 ahead

cb9fc575e4 · common : use pimpl in debug.h to reduce header dependencies · Updated 2026-04-26 08:49:28 +02:00

42 behind · 3 ahead

b9421898b6 · add for Q4_0 · Updated 2026-04-23 09:33:19 +02:00

184 behind · 2 ahead

a5355a0226 · server: keep router model refcount to avoid unloading models that have running requests · Updated 2026-04-22 10:07:13 +02:00

97 behind · 15 ahead

cf0ebc4e64 · load directly from downloaded state · Updated 2026-04-21 14:36:34 +02:00

97 behind · 14 ahead

35df147d80 · cont : remove /api/tags · Updated 2026-04-20 14:45:42 +02:00

112 behind · 2 ahead

ac735691d1 · Converge implementation with export-graph-ops · Updated 2026-04-15 22:44:51 +02:00

160 behind · 8 ahead

4943e3a396 · gen-libllama-abi: compile sort-key regex once outside the lambda · Updated 2026-04-15 14:04:44 +02:00

167 behind · 4 ahead

c5b682b25c · various clean up · Updated 2026-04-13 17:39:14 +02:00

188 behind · 3 ahead

4cabbe36e0 · state · Updated 2026-04-09 13:00:31 +02:00

264 behind · 16 ahead

d5344395d0 · benchmark · Updated 2026-04-08 18:26:50 +02:00

264 behind · 1 ahead

a30369d515 · cpu: fix ARM NEON nvfp4 vec dot · Updated 2026-04-06 10:27:03 +02:00

295 behind · 1 ahead

c30e012253 · contrib : rewrite AGENTS.md, make it more clear about project values (#21270) · Updated 2026-04-01 23:31:51 +02:00

340 behind · 0 ahead · Included

2985be3324 · update hw info · Updated 2026-03-31 03:24:40 +02:00

577 behind · 2 ahead

1c128d941e · remove junk · Updated 2026-03-29 16:31:04 +02:00

903 behind · 51 ahead

f0fea264b0 · cont : rand hadamard matrices · Updated 2026-03-27 19:11:47 +01:00

429 behind · 4 ahead

ff76c6731d · cont : cache shift support · Updated 2026-03-27 13:39:14 +01:00

429 behind · 4 ahead

4cd732f445 · better wording · Updated 2026-03-25 19:46:17 +01:00

439 behind · 2 ahead

07a6fd8775 · kleidiai: removed cpu feature detection from CI run script · Updated 2026-03-24 18:24:41 +01:00

679 behind · 3 ahead