Default Branch

319146247e · vulkan: improve partial offloading performance on AMD (#19976) · Updated 2026-03-01 17:32:14 +01:00

Branches

d7f794eadb · convert : avoid dequantizing mxfp4 for GPT-OSS · Updated 2025-10-24 13:56:26 +02:00 · 1353 behind, 1 ahead

93fbd407f3 · Merge branch 'master' into compilade/convert-prequant · Updated 2025-10-23 20:23:12 +02:00 · 1356 behind, 6 ahead

f0076dc5a0 · metal : adjust .get_alloc_size to be alloc friendly · Updated 2025-10-19 16:20:54 +02:00 · 1386 behind, 1 ahead

96f9f391c7 · ggml : fix unaligned access in AMX code · Updated 2025-09-29 09:37:15 +02:00 · 1566 behind, 1 ahead

a8b0089a5b · ggml : remove SVE paths · Updated 2025-09-28 19:26:03 +02:00 · 1566 behind, 1 ahead

837b1b4563 · ggml : remove KQ mask padding · Updated 2025-09-28 17:10:17 +02:00 · 1569 behind, 6 ahead

17ca6ed540 · Implement llama-pull tool · Updated 2025-09-20 18:25:21 +02:00 · 1657 behind, 1 ahead

e83ef74733 · one less magic number · Updated 2025-09-20 07:58:36 +02:00 · 1676 behind, 6 ahead

652d303b32 · metal : fuse add + rms · Updated 2025-09-18 15:29:25 +02:00 · 1674 behind, 1 ahead

64c6dcbe6d · metal : make the NSG a function constant in mul_mv kernels · Updated 2025-09-18 10:31:59 +02:00 · 1679 behind, 2 ahead

6045c5a263 · cont : put all buffers in the same virtual address space · Updated 2025-09-14 14:46:57 +02:00 · 1715 behind, 2 ahead

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 16:06:46 +02:00 · 1757 behind, 9 ahead

3f62ee8bee · metal : back to a single queue per device · Updated 2025-09-09 16:06:46 +02:00 · 1757 behind, 9 ahead

7b717fb4b2 · Rewrite llama-run to use llama-server · Updated 2025-09-05 18:22:36 +02:00 · 1794 behind, 1 ahead

9f2636b7dc · wip · Updated 2025-09-01 10:17:56 +02:00 · 1843 behind, 1 ahead

4317d5abf5 · wip · Updated 2025-08-28 12:55:21 +02:00 · 1877 behind, 1 ahead

dc2187d48d · ggml : fix SSM_SCAN for n_groups > 1 · Updated 2025-08-27 23:37:04 +02:00 · 1882 behind, 1 ahead

7a152de3bb · vulkan: enable Conv2D for Apple after MoltenVK fixed the bug · Updated 2025-08-23 15:57:15 +02:00 · 1927 behind, 1 ahead

fb573f4440 · ggml-quants : avoid division by zero in make_q3_quants · Updated 2025-08-18 00:26:02 +02:00 · 1997 behind, 2 ahead

220860aa0c · graph : use F32 accumulators for gpt-oss · Updated 2025-08-14 15:08:31 +02:00 · 2024 behind, 1 ahead