Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

de91911d7a · Fix Zen4 Flash Attention · Updated 2024-09-02 14:51:10 +02:00

4211
3408

6bc273c1d6 · Do not process prompts containing binary data for escapes · Updated 2024-09-02 08:12:08 +02:00

4211
3407

a66d1fc562 · Update FlashAttn comment · Updated 2024-09-01 11:47:50 +02:00

4211
3409

1b834ac6e4 · Flash attention: templated implementation · Updated 2024-08-31 12:10:36 +02:00

4211
3421

eb046ae5f3 · Fix build when iqk_mul_mat is disabled · Updated 2024-08-31 07:52:08 +02:00

4211
3405

3b4fe65e1c · Minor · Updated 2024-08-28 17:31:27 +02:00

4211
3417

e4f200098b · Flash attention with softcap: Metal · Updated 2024-08-26 18:34:43 +02:00

4211
3409

c9116e9eca · softcap: minor improvement · Updated 2024-08-21 11:34:35 +02:00

4211
3403

cc3d42e60b · softcap, tanh: avoid NaNs for large arguments (NEON) · Updated 2024-08-20 14:46:30 +02:00

4211
3408

3e97ec87a2 · iq4_k: use iq5_k also when n_gqa = 2 · Updated 2024-08-20 08:29:45 +02:00

4211
3401

6b6e2f2dbc · AVX2 quantization for Q8_K · Updated 2024-08-19 14:31:55 +02:00

4211
3400

ff471dfd61 · quantize_stats: print rmse and max error as fraction of <x> · Updated 2024-08-19 12:47:19 +02:00

4211
3399

b2212f170c · iq2_k: slightly better bpw - accuracy compromise · Updated 2024-08-19 12:33:19 +02:00

4211
3398

686e75650e · Skip barriers of noops · Updated 2024-08-14 08:49:12 +02:00

4211
3397

28bb16556d · Remove CI check · Updated 2024-08-12 11:28:44 +02:00

4211
3397

2c9aaae809 · Fix Makefile · Updated 2024-08-09 16:29:32 +02:00

4211
3394

bf745357ec · Fix Zen4 implementation of iq3_k, iq4_k, iq5_k · Updated 2024-08-09 09:32:07 +02:00

4211
3393

8178075f84 · iq2_tn: small NEON improvement · Updated 2024-08-06 12:08:22 +02:00

4211
3393

54481695a0 · q2_K: allow it to detect ternary nets and quantize accordingly · Updated 2024-08-05 10:59:36 +02:00

4211
3382

de818b77d6 · iq3_k, iq5_k: faster quantization · Updated 2024-08-05 07:13:53 +02:00

4211
3380