For LLaMA-3.1 models:
* Quantizing all of attn_v with iq3_k is better than quantizing
  half of attn_v with iq4_k
* Quantizing attn_output with iq3_k yields a larger PPL decrease
  than the added bpw alone would suggest
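The mix above can be sketched as a per-tensor type override. This is a minimal illustration only: the enum, the `pick_quant` helper, and the tensor-name matching are assumptions for the sketch, not the actual llama.cpp quantization code.

```cpp
#include <string>

// Illustrative quant-type tags; names mirror the types mentioned above.
enum class Quant { IQ3_K, IQ4_K, DEFAULT };

// Hypothetical per-tensor override: use iq3_k for every attn_v tensor
// (instead of bumping half of them to iq4_k) and also for attn_output.
Quant pick_quant(const std::string & name) {
    if (name.find("attn_v") != std::string::npos)      return Quant::IQ3_K;
    if (name.find("attn_output") != std::string::npos) return Quant::IQ3_K;
    return Quant::DEFAULT;
}
```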
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>