ik_llama.cpp/src
2026-01-24 05:03:52 +00:00
CMakeLists.txt Enable and clean up compiler warnings in src (#824) 2025-10-11 16:01:13 +03:00
llama-arch.cpp Mimo-V2-Flash support (#1096) 2026-01-05 08:00:01 +02:00
llama-arch.h Mimo-V2-Flash support (#1096) 2026-01-05 08:00:01 +02:00
llama-build-context.cpp Disable when the KV cache is not f16 2026-01-24 05:03:52 +00:00
llama-build-context.h Avoid ggml_get_rows if not necessary (#1160) 2026-01-20 15:38:21 +02:00
llama-context.h POC: CUDA tensor parallel (MoE models) (#1022) 2025-12-01 19:25:40 +01:00
llama-cparams.h Additional graph reduce types for split mode graph (#1154) 2026-01-18 08:02:49 +02:00
llama-grammar.cpp Update grammar (#1023) 2025-11-30 18:45:38 +01:00
llama-grammar.h Update grammar (#1023) 2025-11-30 18:45:38 +01:00
llama-hparams.cpp Make comments more precise when experts gating function is missing (#1175) 2026-01-21 09:12:40 +02:00
llama-hparams.h Mimo-V2-Flash support (#1096) 2026-01-05 08:00:01 +02:00
llama-impl.h server: stop processing the prompt when client disconnects (#1134) 2026-01-13 07:56:59 +02:00
llama-load-tensors.cpp Avoid ggml_get_rows if not necessary (#1160) 2026-01-20 15:38:21 +02:00
llama-mmap.cpp Enable CUDA graphs for MoE models + GPT-OSS support (#689) 2025-08-15 09:18:07 +03:00
llama-mmap.h Enable CUDA graphs for MoE models + GPT-OSS support (#689) 2025-08-15 09:18:07 +03:00
llama-model-loader.cpp Merge ffn_up and ffn_gate experts tensors (#1137) 2026-01-12 18:30:53 +02:00
llama-model-loader.h Merge ffn_up and ffn_gate experts tensors (#1137) 2026-01-12 18:30:53 +02:00
llama-model.cpp Mimo-V2-Flash support (#1096) 2026-01-05 08:00:01 +02:00
llama-model.h Merge ffn_up and ffn_gate experts tensors (#1137) 2026-01-12 18:30:53 +02:00
llama-quantize.cpp Merge ffn_up and ffn_gate experts tensors (#1137) 2026-01-12 18:30:53 +02:00
llama-sampling.cpp sampling: refactor sorting (#1166) 2026-01-19 16:48:54 +02:00
llama-sampling.h Faster adaptive_p sampling (#1165) 2026-01-19 16:03:09 +02:00
llama-vocab.cpp Server: refactor and rename functions (#1151) 2026-01-18 08:16:57 +02:00
llama-vocab.h Update mtmd to improve accuracy of M-RoPE (#993) 2025-11-29 07:27:15 +01:00
llama.cpp Remove llamafile remnants (#1179) 2026-01-22 13:20:23 +02:00
unicode-data.cpp Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
unicode-data.h Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
unicode.cpp Server: refactor and rename functions (#1151) 2026-01-18 08:16:57 +02:00
unicode.h Enable CUDA graphs for MoE models + GPT-OSS support (#689) 2025-08-15 09:18:07 +03:00