ik_llama.cpp/ggml
Kawrakow 4819257ce6
Quantization improvements (#295)
* Better make_qx_quants

Tested with q4_0 and q3_K (pure, imatrix), and the improvement is
quite significant.

* Same for iq4_nl, iq4_xs

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-03-29 08:09:52 +01:00
cmake Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
include Use bf16 instead of fp16 block scales for q8_1 (#292) 2025-03-27 05:49:16 +01:00
src Quantization improvements (#295) 2025-03-29 08:09:52 +01:00
.gitignore Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
CMakeLists.txt Compile time option to use bf16 for quants without MMQ kernels (#261) 2025-03-18 07:37:10 +01:00