whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp synced 2026-03-07 07:29:21 +01:00

Georgi Gerganov 27533e7f63 metal : improve FA + improve MoE (llama/12612)	2025-03-28 21:47:42 +02:00

* ggml : FA with different K, V head sizes (CPU)
  ggml-ci
* metal : add FA with HS=192
* metal : extend FA to support different K and V head sizes
  ggml-ci
* metal : add FA vector kernels for heads K 192 and V 128
  ggml-ci
* ggml : restrict op on other backends to equal head sizes
  ggml-ci
* metal : optimize FA-vec kernel
  ggml-ci
* metal : FA remove mq registers
* metal : improve MoE mul_mat_id condition
  ggml-ci
* metal : fix comments + remove unnecessary addition
  ggml-ci
* metal : avoid too much shared memory usage with mul_mat_id
  ggml-ci
ggml-alloc.h	ggml : upgrade init_tensor API to return a ggml_status (llama/11854)	2025-03-08 15:13:01 +02:00
ggml-backend.h	ggml : upgrade init_tensor API to return a ggml_status (llama/11854)	2025-03-08 15:13:01 +02:00
ggml-blas.h	ggml : build backends as libraries (llama/10256)	2024-11-20 21:00:08 +02:00
ggml-cann.h	ggml : build backends as libraries (llama/10256)	2024-11-20 21:00:08 +02:00
ggml-cpp.h	GGUF: C++ refactor, backend support, misc fixes (llama/11030)	2025-01-14 10:38:01 +02:00
ggml-cpu.h	ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154)	2025-03-08 15:13:01 +02:00
ggml-cuda.h	ggml : build backends as libraries (llama/10256)	2024-11-20 21:00:08 +02:00
ggml-kompute.h	ggml : build backends as libraries (llama/10256)	2024-11-20 21:00:08 +02:00
ggml-metal.h	repo : update links to new url (llama/11886)	2025-02-27 08:55:36 +02:00
ggml-opencl.h	Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693)	2024-12-18 12:52:16 +02:00
ggml-opt.h	ggml: new optimization interface (ggml/988)	2024-11-20 21:00:08 +02:00
ggml-rpc.h	rpc : send hash when tensor data is above some fixed threshold (llama/12496)	2025-03-28 21:47:42 +02:00
ggml-sycl.h	ggml : build backends as libraries (llama/10256)	2024-11-20 21:00:08 +02:00
ggml-vulkan.h	vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)	2025-02-27 08:55:36 +02:00
ggml.h	metal : improve FA + improve MoE (llama/12612)	2025-03-28 21:47:42 +02:00
gguf.h	GGUF: C++ refactor, backend support, misc fixes (skip) (llama/11030)	2025-01-14 10:38:01 +02:00