mirror of
https://github.com/ggerganov/llama.cpp
synced 2026-03-03 13:50:01 +01:00
* webgpu : pipeline flash_attn Q/K loads in WGSL
* ggml-webgpu: unroll Q*K accumulation inner loop
* ggml-webgpu: vectorization
* ggml-webgpu: unrolling
* ggml-webgpu: remove redundant unrolling
* ggml-webgpu: restore the config
* ggml-webgpu: remove redundant comments
* ggml-webgpu: formatting
* ggml-webgpu: formatting and remove vectorization
* ggml-webgpu: remove unnecessary constants
* ggml-webgpu: change QKV buffer to read_write to pass validation
* ggml-webgpu: add explanation for the additional bracket around Q K accumulate
* Indentation and for -> if for tail
* Kick off CI on wgsl only commits

---------

Co-authored-by: Reese Levine <reeselevine1@gmail.com>

| File |
|---|
| bench.yml.disabled |
| build-cache.yml |
| build-cmake-pkg.yml |
| build-linux-cross.yml |
| build.yml |
| check-vendor.yml |
| close-issue.yml |
| copilot-setup-steps.yml |
| docker.yml |
| editorconfig.yml |
| gguf-publish.yml |
| labeler.yml |
| pre-tokenizer-hashes.yml |
| python-check-requirements.yml |
| python-lint.yml |
| python-type-check.yml |
| release.yml |
| server-webui.yml |
| server.yml |
| update-ops-docs.yml |
| winget.yml |