Mirror of https://github.com/ggerganov/llama.cpp, synced 2026-03-29 03:15:32 +02:00.
Latest commit: **CPU/CUDA: Gemma 2 FlashAttention support**

* apply logit_softcap to scale in kernel
* disable logit softcapping tests on Metal
* remove metal check
| File |
|---|
| ggml-alloc.h |
| ggml-backend.h |
| ggml-blas.h |
| ggml-cann.h |
| ggml-cuda.h |
| ggml-kompute.h |
| ggml-metal.h |
| ggml-rpc.h |
| ggml-sycl.h |
| ggml-vulkan.h |
| ggml.h |
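The logit soft-capping named in the commit message above can be sketched as follows. This is a minimal illustration of the `cap * tanh(x / cap)` squashing that Gemma 2 applies to attention logits, not the actual CUDA/FlashAttention kernel; the function name `softcap` and the cap value are illustrative assumptions (in the kernel, the cap is folded into the attention scale factor rather than applied as a separate pass).

```python
import math

def softcap(logits, cap):
    # Squash each logit into (-cap, cap): values near zero pass through
    # almost unchanged, large values saturate smoothly just below the cap.
    return [cap * math.tanh(x / cap) for x in logits]

scores = [0.5, 10.0, 100.0]
capped = softcap(scores, cap=50.0)
```

Because `tanh(x / cap) ≈ x / cap` for small `x`, a logit of 0.5 is nearly unchanged, while a logit of 100.0 is compressed to just under the cap of 50.0, keeping the subsequent softmax numerically well-behaved.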