Default Branch

fc2b0053ff · ggml-cuda: Repost of 21896: Blackwell native NVFP4 support (#22196) · Updated 2026-04-29 00:47:42 +02:00

Branches

5f14aa8e43 · gguf-py : do not align the data start offset · Updated 2025-12-22 15:49:54 +01:00 · 1466 behind, 1 ahead

a95df75322 · add test model · Updated 2025-12-18 23:44:01 +01:00 · 1530 behind, 13 ahead

6b1394ed74 · prof: fix tensor dims formatter · Updated 2025-12-18 02:11:21 +01:00 · 1503 behind, 3 ahead

e47a082fc9 · security : add collaborator guidance · Updated 2025-12-16 09:16:46 +01:00 · 1544 behind, 1 ahead

4574ab6f40 · preset: handle negated arg, reverse the meaning if needed · Updated 2025-12-14 21:44:41 +01:00 · 1564 behind, 1 ahead

357f999381 · graph: add f_attn_temp_offset · Updated 2025-12-14 12:12:12 +01:00 · 1568 behind, 1 ahead

292f8e231c · model-conversion : cast logits to float32 · Updated 2025-12-13 21:24:21 +01:00 · 1580 behind, 1 ahead

2a615b27e4 · ggml : remove redundant src in ggml_cast · Updated 2025-12-09 10:16:15 +01:00 · 1637 behind, 1 ahead

31436df5ae · contrib : stale PRs · Updated 2025-12-05 21:49:15 +01:00 · 1677 behind, 1 ahead

dad7571ff2 · tests : better input range for unary operators · Updated 2025-12-04 11:18:24 +01:00 · 1701 behind, 1 ahead

01c9e9fd5c · llama : fix sanity checks during quantization · Updated 2025-12-03 10:10:11 +01:00 · 1721 behind, 1 ahead

874c877bde · revise · Updated 2025-11-30 17:54:44 +01:00 · 1768 behind, 2 ahead

c6bba89ea9 · arch : add description about LLM_TENSOR_INFOS · Updated 2025-11-27 15:03:09 +01:00 · 1791 behind, 1 ahead

d93ff58322 · models : fix LFM2 tensors · Updated 2025-11-27 13:54:51 +01:00 · 1791 behind, 1 ahead

05429433a1 · examples: add model-backend-compare tool to compare intermediate device tensors with CPU reference · Updated 2025-11-25 18:05:56 +01:00 · 1813 behind, 1 ahead

72f80499ee · server : headers cleanup · Updated 2025-11-24 11:50:50 +01:00 · 1873 behind, 5 ahead

722f9defe9 · vulkan: intel mmv fix attempt · Updated 2025-11-23 10:13:19 +01:00 · 1835 behind, 1 ahead

c0b9903a1a · more readable · Updated 2025-11-20 17:45:37 +01:00 · 1848 behind, 2 ahead

6cdda87baf · ci : disable op offload in some tests · Updated 2025-11-20 16:16:50 +01:00 · 1880 behind, 3 ahead

dba1cbceb3 · tune for RDNA3 · Updated 2025-11-16 20:21:22 +01:00 · 1888 behind, 4 ahead