llama.cpp/.github

Latest commit: ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052)
Author: Reese Levine (commit 45cac7ca70)

Squashed commit messages:
* Update workflows to remove dependence on llvmpipe

* Try setting Dawn_DIR

* remove c++20 initializers

* Move to proper guid

* Try avoiding segfaults on vulkan backend process exit

* Remove compiler warnings on parameter casting

* Fix soft_max and update reg_tile accumulation to f32 for better precision

* Refactor flash_attn a bit

* remove c++20 initializers and format

* Increase div precision for NVIDIA

* revert div precision and comment out ggml-ci node for now

* Formatting

* Try debugging on a failing CI node

* Revert "Try debugging on a failing CI node"

This reverts commit 1971e33cba.
Committed: 2026-04-17 09:17:11 -07:00
Contents:

  actions/                   ggml : add OpenVINO backend (#15307)                                          2026-03-14 07:56:55 +02:00
  ISSUE_TEMPLATE/            issues: add openvino backends (#20932)                                        2026-03-24 14:41:10 +08:00
  workflows/                 ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052)  2026-04-17 09:17:11 -07:00
  labeler.yml                ci: drop v5 all: composition from labeler.yml (#21627)                        2026-04-09 08:20:19 +02:00
  pull_request_template.md   contrib: add "Requirements" section to PR template (#20841)                   2026-03-23 16:59:02 +01:00