ik_llama.cpp

Iwan Kawrakow 36e9c922b8 iq2_kt - this is better

Using blocks of 32 and 16 bits per group of 8 weights, it beats iq2_xxs in terms of PPL by a significant margin. It is 0.0625 bpw larger, but even if we go to 15 bits per group of 8 (so 0.0625 bpw less than iq2_xxs), PPL is still lower.

2024-11-21 08:16:41 +02:00
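
The bits-per-weight (bpw) figures in the commit message can be checked with a little arithmetic. The sketch below is not code from the repository: the helper `bits_per_weight` and the constant names are hypothetical, and the 4 bits of overhead per block of 32 is an assumption inferred from the stated 0.0625 bpw differences (the commit message does not say what the per-block data is). Any per-row or per-tensor data is ignored. The iq2_xxs value of 2.0625 bpw comes from its 66-byte block covering 256 weights.

```cpp
#include <cassert>

// Bits per weight for a block-quantized layout: each group of `group_size`
// weights stores `bits_per_group` bits, and each block of `block_size`
// weights carries `block_overhead_bits` of extra data (for example a scale).
constexpr double bits_per_weight(int bits_per_group, int group_size,
                                 int block_overhead_bits, int block_size) {
    return static_cast<double>(bits_per_group) / group_size
         + static_cast<double>(block_overhead_bits) / block_size;
}

// iq2_xxs: 66 bytes per 256 weights = 528 / 256 = 2.0625 bpw.
constexpr double kIq2XxsBpw = 66.0 * 8.0 / 256.0;

// ASSUMPTION: 4 bits of overhead per block of 32 weights. This value is
// inferred from the bpw differences quoted in the commit message; it is
// not stated there.
constexpr int kAssumedBlockOverheadBits = 4;

// 16 bits per group of 8, blocks of 32: 2.0 + 0.125 = 2.125 bpw.
constexpr double kBpw16 = bits_per_weight(16, 8, kAssumedBlockOverheadBits, 32);

// 15 bits per group of 8, blocks of 32: 1.875 + 0.125 = 2.0 bpw.
constexpr double kBpw15 = bits_per_weight(15, 8, kAssumedBlockOverheadBits, 32);
```

Under that assumption, the 16-bit variant comes out 0.0625 bpw above iq2_xxs and the 15-bit variant 0.0625 bpw below it, matching the commit message.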
CMakeLists.txt	WIP	2024-11-21 08:16:40 +02:00
quantize-stats.cpp	iq2_kt - this is better	2024-11-21 08:16:41 +02:00
