Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

bb21114ab4 · Slightly better PP · Updated 2025-09-01 08:17:19 +02:00

4211
3877

86e927bfe9 · minor · Updated 2025-08-30 11:48:12 +02:00

344
2

3bc7acf1bd · Add command line option · Updated 2025-08-30 11:09:55 +02:00

4211
3872

411606d73b · Chat fixes · Updated 2025-08-30 07:21:55 +02:00

4211
3868

b0a1c63279 · This is slightly better · Updated 2025-08-29 14:47:22 +02:00

4211
3871

f5b3ca8c95 · Make yarn_log_multiplier optional · Updated 2025-08-28 08:47:23 +02:00

4211
3866

aa340974f6 · Add more checks for iq3_k, iq3_ks · Updated 2025-08-28 06:47:37 +02:00

4211
3880

91d056209a · Add checks for more quantizagtion types · Updated 2025-08-27 08:15:31 +02:00

4211
3868

adee94976c · Heuristics for mmq_id -> original threshold · Updated 2025-08-27 07:11:46 +02:00

4211
3863

24fb00637e · Adding forgotten q8_0_r8 to num_rows() · Updated 2025-08-27 07:00:14 +02:00

4211
3865

c411d443ee · Add CUDA fp8 header · Updated 2025-08-26 07:43:59 +02:00

4211
3880

c213189c2b · Fix compile without flag on systems without it installed · Updated 2025-08-24 14:42:51 +02:00

4211
3881

2e86b76476 · Remove the 16 · Updated 2025-08-23 13:58:14 +02:00

4211
3858

277426f040 · Sanitize importances for KT quantization · Updated 2025-08-23 08:08:35 +02:00

4211
3857

7845ae4a8d · Fix more Q8_0 repacking mess on AVX2 · Updated 2025-08-23 08:03:30 +02:00

4211
3856

6c01bedbb1 · Minor · Updated 2025-08-22 17:21:04 +02:00

4211
3856

01eee24f0f · Use bperm trick for iq2_k_r4 gemv -> ~7% gain · Updated 2025-08-21 18:10:12 +02:00

4211
3857

90379f3d51 · Use bperm trick for iq3_k gemv -> 4.5% gain · Updated 2025-08-21 15:02:15 +02:00

4211
3856

8d91235b0e · Use get_int_from_table_16 everywhere for 4-bit quants · Updated 2025-08-21 10:27:39 +02:00

4211
3851

79f34c4e1d · Just always set num_rows to 16 · Updated 2025-08-21 07:10:19 +02:00

4211
3864