Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

5f3e6faac8 · Enable q6_0 for flash attention · Updated 2024-10-21 15:30:10 +02:00

4211
3470

599a2b7806 · Fix typo, which is not really a bug · Updated 2024-10-21 12:12:14 +02:00

4211
3474

a3fe796f6c · Bitnet: make the scale tensors optional · Updated 2024-10-19 18:37:33 +02:00

4211
3467

0e76d21b96 · Adding agray3's graph caching approach · Updated 2024-10-18 17:01:08 +02:00

4211
3465

e732da1f57 · Attempt to blindly fix Windows build failure · Updated 2024-10-18 11:35:47 +02:00

4211
3465

c4292bf2d9 · iq4_knn: Metal - predictably bad · Updated 2024-10-18 10:48:00 +02:00

4211
3468

9612cd79d6 · iq4_kss: very slightly faster Metal dot product · Updated 2024-10-16 14:08:15 +02:00

4211
3473

3e0c2519d3 · iq4_ks: faster dot product on Metal · Updated 2024-10-16 13:04:59 +02:00

4211
3462

55f91a98f1 · iq3_k: slightly faster Metal dot product · Updated 2024-10-14 09:41:26 +02:00

4211
3461

f74905d649 · iq2_k: optimize Metal dot product · Updated 2024-10-13 13:09:53 +02:00

4211
3461

f9f15c27b6 · iq2_ks: faster Metal · Updated 2024-10-13 11:23:14 +02:00

4211
3470

e441c897a4 · Better model info · Updated 2024-10-10 16:38:59 +02:00

4211
3456

e734e888e1 · iq3_ks: AVX2 · Updated 2024-10-10 09:48:42 +02:00

4211
3463

f61c37967a · iq3_kl: use iq4_ks instead of iq4_k/iq4_xs · Updated 2024-10-09 11:50:43 +02:00

4211
3467

df2bd86a31 · WIP · Updated 2024-10-06 08:09:51 +02:00

4211
3458

acaa4869af · Move scale fudge factors to quantization · Updated 2024-10-04 15:14:52 +02:00

4211
3453

a553eb191a · Make the entire project c++17 · Updated 2024-10-04 13:23:21 +02:00

4211
3453

ed477f1cdc · Do not quantize activations if not necessary also for MoE models · Updated 2024-10-04 10:11:02 +02:00

4211
3452

38eb7fa499 · q6_0: this is slightly better · Updated 2024-10-02 17:07:55 +02:00

4211
3451

a8e932b734 · Fused y*unary(x) op: Metal · Updated 2024-10-02 15:51:29 +02:00

4211
3452