Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

Each entry: commit · message · last updated · commits behind / ahead of the default branch.

df226f38c4 · Fixed compilation after revert · Updated 2025-02-07 10:44:52 +01:00 · 4211 / 3549

38f2270a15 · Add additional checks for iq1_s_r4 quantization · Updated 2025-02-07 07:19:58 +01:00 · 4211 / 3546

9ac82537dc · cuda: non-contiguous rms norm · Updated 2025-02-06 18:41:17 +01:00 · 4211 / 3546

5c37edf98e · Rename iq4_xs_r4 to iq4_xs_r8 to reflect actual row interleaving · Updated 2025-02-06 15:46:44 +01:00 · 4211 / 3547

54585d6946 · iq1_m_r4: rename mul_mat_iq1_m_r4_q8_1 to mul_mat_iq1_m_r4_q8_0 · Updated 2025-02-06 08:56:18 +01:00 · 4211 / 3548

f3c6937fe5 · iq1_s_r4: slightly faster NEON gemm/gemv · Updated 2025-02-05 13:22:22 +01:00 · 4211 / 3543

3c9b116600 · Compiler warnings · Updated 2025-02-05 10:12:00 +01:00 · 4211 / 3551

b8966277c0 · Make q5,6_0_r4, iq4_nl_e4 work with row sizes that are not a multiple of 128 · Updated 2025-01-30 17:29:04 +01:00 · 4211 / 3548

195d7efc8e · Cleanup · Updated 2025-01-30 08:24:52 +01:00 · 4211 / 3546

23e90dc325 · Make q4_0_r4 work with tensor row sizes that are not a multiple of 128 · Updated 2025-01-29 08:55:10 +01:00 · 4211 / 3545

b22ed8bc66 · Be able to load Deepseek-v2-Lite · Updated 2025-01-27 16:47:24 +01:00 · 4211 / 3556

56ca4c3ba9 · FA: repack Q8_0 to Q8_0_R8 (NEON) · Updated 2025-01-26 11:24:38 +01:00 · 4211 / 3546

bb23d014ab · Removing missed conflict marker · Updated 2025-01-23 18:31:49 +01:00 · 4211 / 3537

d868ca149a · Disable mul_mat_Qx_Qy_Mx1 on AVX2 · Updated 2025-01-23 10:58:42 +01:00 · 4211 / 3535

cc7642c757 · Slightly faster fp16/bf16 gemv on AVX2 · Updated 2025-01-22 08:03:57 +01:00 · 4211 / 3534

ef2b0066b9 · On Zen4 repack fp16 models to bf16_r16 when run-time-repacking is requested · Updated 2025-01-21 18:14:57 +01:00 · 4211 / 3532

31d7424afb · FA: turn off performance timer · Updated 2025-01-19 17:37:46 +01:00 · 4211 / 3543

3e7d5c180c · On Zen4 it is also better not to use large Q steps for fp16 K-cache · Updated 2025-01-15 17:09:07 +01:00 · 4211 / 3539

983e86805e · Fix the strange FA behavior with odd/even batch sizes · Updated 2025-01-12 15:49:25 +01:00 · 4211 / 3529

e2f8747555 · Make sure rows per thread is a multiple of 4 also for MoE when using _r4 quants · Updated 2025-01-12 10:39:52 +01:00 · 4211 / 3529