Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

1cebf75ba4 · Minor: remove unnecessary calls to build_inp_out_ids · Updated 2025-11-10 16:37:33 +01:00

4211
3978

5d90f711d4 · Model loading and compute graph · Updated 2025-11-10 10:27:18 +01:00

4211
3978

2bb57b4900 · This seems better · Updated 2025-11-10 08:53:07 +01:00

4211
3977

ef64b1a171 · Use fused gemv+add only for TG · Updated 2025-11-10 06:43:40 +01:00

4211
3973

19ecaaad42 · Remove forgotten printf · Updated 2025-11-09 17:27:59 +01:00

4211
3974

aae817e50b · DeepSeek TG optimizations for TG · Updated 2025-11-09 06:54:05 +01:00

4211
3967

02676a999c · Better -no-fmoe TG on CUDA · Updated 2025-11-08 15:13:14 +01:00

4211
3969

54848a4c7e · Adapt to latest main · Updated 2025-11-08 06:45:14 +01:00

4211
3967

fc31862add · Adopt fix from mainline PR 17089 · Updated 2025-11-08 06:41:28 +01:00

4211
3965

3549305b7a · Disable add + fused_rms_norm fusion · Updated 2025-11-07 17:47:52 +01:00

4211
3962

e49cfff302 · Fix PPL increase caused by mmq_id · Updated 2025-11-07 12:56:24 +01:00

4211
3962

4fe0705abe · Fix iqk_mul_mat when number of rows is not multiple of repack rows · Updated 2025-11-06 18:00:25 +01:00

4211
3959

06e9fcd4d8 · Also llama-bench · Updated 2025-11-06 17:08:03 +01:00

4211
3960

bdfa4bbe29 · Disable CUDA fusion by default for now · Updated 2025-11-05 09:56:00 +01:00

4211
3956

5aa5ebcb97 · Adding cmake option to disable CUDA fusion · Updated 2025-11-05 05:20:36 +01:00

4211
3951

ba593f3ba6 · Fix compilation failure after merging #883 · Updated 2025-11-04 18:27:52 +01:00

4211
3950

202924d9fe · Much better CPU TG performance at long context for GLM-4.5 · Updated 2025-11-04 16:47:36 +01:00

4211
3949

42c34f5c49 · sweep-bench: be able to set TG tokens via -n · Updated 2025-11-04 11:55:04 +01:00

4211
3948

04e57f4356 · Allow quantization of ffn_gate_inp · Updated 2025-11-04 10:34:10 +01:00

4211
3948

24f3cb644c · Make V mul mat follow QK mul mat · Updated 2025-11-04 09:39:23 +01:00

4211
3950