Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

2affe8730a · Add mqkv and rcache for Gemma3 · Updated 2025-11-16 16:38:59 +01:00

4211
4003

a63a3492d3 · Fix rtr when mqkv is enabled · Updated 2025-11-16 15:50:53 +01:00

210
1

8e2661afc8 · Add ability to use RoPE cache to DeepSeek models · Updated 2025-11-16 14:32:49 +01:00

211
1

4e2f4b739d · Allow distinct output tensor for Gemma models · Updated 2025-11-16 11:08:19 +01:00

4211
4000

6bbc2c42ba · Fix ggml_cuda_fattn_is_supported · Updated 2025-11-16 10:46:30 +01:00

4211
3998

3502b9793b · Fix RoPE cache on multi-GPU setup · Updated 2025-11-16 07:01:50 +01:00

215
1

ca03d07bb6 · Add --chat-template-file to usage · Updated 2025-11-14 10:07:32 +01:00

4211
3994

03f43ee612 · Merge branch 'main' into ik/graph_reuse · Updated 2025-11-13 18:27:30 +01:00

4211
3995

9519a834ac · Fix fused up+gate when mmq is not supported · Updated 2025-11-13 17:43:21 +01:00

4211
3988

d296ab79be · Fix q4_0_r8 also on Zen4 · Updated 2025-11-13 11:52:25 +01:00

4211
3991

aba78ceafa · Set default MLA to 3 also in llama-bench · Updated 2025-11-13 08:50:23 +01:00

4211
3984

e1d669fb34 · Add missing AVX512 operators for MSVC · Updated 2025-11-13 08:09:58 +01:00

4211
3984

6d799ea36b · Also fix mrope and vision · Updated 2025-11-13 07:48:38 +01:00

4211
3986

14e06e26a5 · Add mainline compatible FA command line option · Updated 2025-11-12 10:12:34 +01:00

4211
3984

e9e9fc3dfd · Set mla=3 by default · Updated 2025-11-12 09:55:41 +01:00

4211
3983

b73d66f76e · Formatting · Updated 2025-11-11 18:09:04 +01:00

4211
3984

82780dfd55 · Enable fusion by default · Updated 2025-11-11 09:26:13 +01:00

4211
3982

5266eeea18 · Opt from #880 also for iqk cuda gemv · Updated 2025-11-11 08:59:56 +01:00

4211
3981

7854e64231 · Add usage · Updated 2025-11-11 07:41:30 +01:00

4211
3981

febf3df389 · Add rcache to llama-bench · Updated 2025-11-10 16:48:05 +01:00

4211
3979