ik_llama.cpp/ggml/src

Latest commit: 1f96fc97c6 "Faster tensor name formatting" (Iwan Kawrakow, 2025-10-24 07:42:04 +03:00)
We gain ~1% for Ling-mini-2.0 when running on CUDA.
| Name | Last commit | Date |
| --- | --- | --- |
| cmake | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-cann | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-cuda | Fused mul + multi_add op (#858) | 2025-10-24 07:40:35 +03:00 |
| ggml-sycl | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| iqk | Fused mul + multi_add op (#858) | 2025-10-24 07:40:35 +03:00 |
| kompute@4565194ed7 | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| kompute-shaders | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| llamafile | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| vulkan-shaders | Vulkan: a fresh start (#608) | 2025-07-15 08:03:13 +02:00 |
| CMakeLists.txt | Better argsort (CPU) (#835) | 2025-10-16 11:31:03 +03:00 |
| ggml-aarch64.c | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-aarch64.h | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| ggml-alloc.c | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| ggml-backend-impl.h | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-backend.cpp | gpt-oss: duplicate experts biases when necessary (#829) | 2025-10-14 14:38:40 +03:00 |
| ggml-blas.cpp | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-cann.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-common.h | AVX512+AVXVNNI GEMM implementation for quants using Q8_K for activations (#710) | 2025-08-22 06:27:07 +03:00 |
| ggml-cuda.cu | Fused mul + multi_add op (#858) | 2025-10-24 07:40:35 +03:00 |
| ggml-impl.h | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-kompute.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-metal.m | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-metal.metal | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-quants.c | Fix avx2 GEMM mess (v2) (#724) | 2025-08-27 08:03:47 +03:00 |
| ggml-quants.h | IQ1_M_R4: better 1.75 bpw quants (#187) | 2025-02-06 14:08:52 +02:00 |
| ggml-rpc.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-sycl.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-vulkan.cpp | Vulkan: a fresh start (#608) | 2025-07-15 08:03:13 +02:00 |
| ggml.c | Faster tensor name formatting | 2025-10-24 07:42:04 +03:00 |