ik_llama.cpp

Kawrakow fbb67fa2bd Fused norm (#1086)	2025-12-24 15:22:43 +01:00
* Adding fused_norm - same idea as fused_rms_norm
* Avoid computing the attention reduce op for cohere2
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
ggml-alloc.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
ggml-backend.h	Better PP performance with split mode "graph" and 3+ GPUs (#1069)	2025-12-17 07:40:25 +01:00
ggml-blas.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
ggml-cann.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
ggml-cpp.h	Port mdmd from mainline + Qwen2/2.5-VL support (#798)	2025-09-27 08:45:29 +02:00
ggml-cuda.h	CUDA: set compute parameters via command line arguments (#910)	2025-11-07 07:11:23 +02:00
ggml-kompute.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
ggml-metal.h	Merge mainline - Aug 12 2024 (#17)	2024-08-12 15:14:32 +02:00
ggml-rpc.h	RPC: support multiple devices including cpu (#1024)	2025-11-30 18:48:02 +01:00
ggml-sycl.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
ggml-vulkan.h	Vulkan: a fresh start (#608)	2025-07-15 08:03:13 +02:00
ggml.h	Fused norm (#1086)	2025-12-24 15:22:43 +01:00