ik_llama.cpp/ggml
Kawrakow 46968d4ab1
Sanitize imatrix (#735)
* sanitize importance matrix: WIP

* sanitize importance matrix: iq4_k

* sanitize importance matrix: iq5_k, iq6_k

* sanitize imatrix: iq4_ks

* sanitize imatrix: iq4_kss

* sanitize imatrix: iq2_ks and iq2_kl

* sanitize imatrix: iq5_ks

* sanitize imatrix: iq4_nl_r4

* sanitize imatrix: q4_0_r8

* sanitize imatrix: q6_0_r4

* sanitize imatrix: iq4_xs_r8

* sanitize imatrix: iq4_xs_r8 and q3_k_r4 with a template

* sanitize imatrix: q2_k_r4, q4_k_r4, q5_k_r4, q6_k_r4

* sanitize imatrix: repacked i-quants

* Minor

* Add more checks for iq3_k, iq3_ks

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-08-29 09:08:15 +03:00
cmake           Merge mainline llama.cpp (#3)                                                     2024-07-27 07:55:01 +02:00
include         AVX512+AVXVNNI GEMM implementation for quants using Q8_K for activations (#710)   2025-08-22 06:27:07 +03:00
src             Sanitize imatrix (#735)                                                           2025-08-29 09:08:15 +03:00
.gitignore      Merge mainline llama.cpp (#3)                                                     2024-07-27 07:55:01 +02:00
CMakeLists.txt  Enable CUDA graphs for MoE models + GPT-OSS support (#689)                        2025-08-15 09:18:07 +03:00