Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

adb6b6fb3f · Update GGML_QUANT_SIZES · Updated 2025-04-24 06:06:26 +02:00 · 4211 behind | 3616 ahead
e79f523bcc · BitNet adjustments · Updated 2025-04-22 08:36:32 +02:00 · 4211 behind | 3642 ahead
3d7206e6ea · Support both model names · Updated 2025-04-22 04:47:17 +02:00 · 4211 behind | 3643 ahead
d75c151624 · Attempt fix 13 · Updated 2025-04-21 08:56:05 +02:00 · 4211 behind | 3652 ahead
3e41c56a8a · Minor · Updated 2025-04-16 16:15:05 +02:00 · 4211 behind | 3641 ahead
3164fa3310 · Better gemm/gemv on AVX2 fr q4_0_r8 · Updated 2025-04-15 17:12:22 +02:00 · 4211 behind | 3638 ahead
a164a50a36 · We need also these · Updated 2025-04-15 12:56:29 +02:00 · 4211 behind | 3638 ahead
8bff04c9d6 · Use stripped tensor name, not src0->name · Updated 2025-04-14 18:00:06 +02:00 · 4211 behind | 3638 ahead
4ed6076940 · Add ability to hide imatrix details in llama-quantize · Updated 2025-04-14 15:36:57 +02:00 · 4211 behind | 3635 ahead
4291d7e1e6 · Minor · Updated 2025-04-13 07:34:43 +02:00 · 4211 behind | 3636 ahead
9b24ae7fc6 · Fix KLD precision · Updated 2025-04-12 09:01:20 +02:00 · 4211 behind | 3633 ahead
c5f1a0ad25 · Correct L4 rms_norm · Updated 2025-04-11 10:45:33 +02:00 · 4211 behind | 3632 ahead
b51661bbff · llama4: this seems to be working · Updated 2025-04-09 11:02:22 +02:00 · 4211 behind | 3632 ahead
80846bb2c9 · WIP · Updated 2025-04-08 16:16:16 +02:00 · 4211 behind | 3632 ahead
5ec2cb63ae · Guard against attempts to use MLA for non-MLA models · Updated 2025-04-08 08:45:08 +02:00 · 4211 behind | 3630 ahead
ae7cf9a766 · More · Updated 2025-04-07 16:58:09 +02:00 · 4211 behind | 3629 ahead
8b9be1a048 · Add copyright notices · Updated 2025-04-07 10:19:54 +02:00 · 4211 behind | 3623 ahead
0dbcd57267 · Try not repacking q8_0 for FA computations · Updated 2025-04-06 08:49:59 +02:00 · 4211 behind | 3623 ahead
c2bab6cee5 · We need to synchronize before using device to host async memcpy · Updated 2025-04-05 14:28:20 +02:00 · 4211 behind | 3623 ahead
fe157dee95 · Better iq2_xs quantization · Updated 2025-04-05 10:51:26 +02:00 · 4211 behind | 3623 ahead