From 6cf0c21b094771237e9ba9da7853d6f7bfca90f9 Mon Sep 17 00:00:00 2001
From: Georgi Gerganov
Date: Tue, 7 Oct 2025 08:22:35 +0300
Subject: [PATCH] tests : add -INF blocks to the KQ mask in the FA tests (llama/16380)

* tests : add -INF blocks to the KQ mask in the FA tests

* cont : bump -INF block size to 64

Co-authored-by: Jeff Bolz

* ggml : prevent division by zero in FA CPU op

---------

Co-authored-by: Jeff Bolz
---
 ggml/src/ggml-cpu/ops.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ggml/src/ggml-cpu/ops.cpp b/ggml/src/ggml-cpu/ops.cpp
index 6275c8305..8e1a2de14 100644
--- a/ggml/src/ggml-cpu/ops.cpp
+++ b/ggml/src/ggml-cpu/ops.cpp
@@ -8135,7 +8135,7 @@ static void ggml_compute_forward_flash_attn_ext_f16(
         }
 
         // V /= S
-        const float S_inv = 1.0f/S;
+        const float S_inv = S == 0.0f ? 0.0f : 1.0f/S;
         ggml_vec_scale_f32(DV, VKQ32, S_inv);
 
         // dst indices