Directory listing: `ik_llama.cpp/ggml/src` (last updated 2026-01-26 12:54:21 +00:00)
| Name | Last commit | Date |
| --- | --- | --- |
| cmake | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-cann | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-cuda | Remove forgotten unused code | 2026-01-26 12:54:21 +00:00 |
| ggml-sycl | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| iqk | Faster adaptive_p sampling (#1165) | 2026-01-19 16:03:09 +02:00 |
| kompute@4565194ed7 | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| kompute-shaders | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| llamafile | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| vulkan-shaders | Port of Qwen3-VL support from mainline (#883) | 2025-11-04 19:20:54 +02:00 |
| CMakeLists.txt | Remove llamafile remnants (#1179) | 2026-01-22 13:20:23 +02:00 |
| ggml-aarch64.c | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-aarch64.h | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| ggml-alloc.c | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| ggml-backend-impl.h | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-backend.cpp | Fix build failure when OpenMP is not available (#1171) | 2026-01-22 12:26:23 +02:00 |
| ggml-blas.cpp | Merge mainline - Aug 12 2024 (#17) | 2024-08-12 15:14:32 +02:00 |
| ggml-cann.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-common.h | AVX512+AVXVNNI GEMM implementation for quants using Q8_K for activations (#710) | 2025-08-22 06:27:07 +03:00 |
| ggml-cuda.cu | Fix build with GGML_CUDA_GRAPHS=OFF | 2026-01-22 10:46:57 +00:00 |
| ggml-impl.h | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-kompute.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-metal.m | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-metal.metal | MXFP4 (#682) | 2025-08-09 08:40:18 +03:00 |
| ggml-quants.c | Fix avx2 GEMM mess (v2) (#724) | 2025-08-27 08:03:47 +03:00 |
| ggml-quants.h | IQ1_M_R4: better 1.75 bpw quants (#187) | 2025-02-06 14:08:52 +02:00 |
| ggml-rpc.cpp | server: improve speed of speculative decoding (#1119) | 2026-01-10 08:01:22 +02:00 |
| ggml-sycl.cpp | Merge vulkan code from mainline up to commit of 6/28/2025 (#563) | 2025-07-02 08:49:42 +02:00 |
| ggml-vulkan.cpp | Port of Qwen3-VL support from mainline (#883) | 2025-11-04 19:20:54 +02:00 |
| ggml.c | Remove llamafile remnants (#1179) | 2026-01-22 13:20:23 +02:00 |