ik_llama.cpp

Iwan Kawrakow f8d511a30f Revert "CUDA: prompt processing optimizations for MoE models (#739)" This reverts commit `f22a9ef95a`.		2025-09-01 20:06:57 +03:00
CMakeLists.txt	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-arch.h	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-grammar.cpp	Tool calls support from mainline (#723)	2025-09-01 08:38:49 +03:00
llama-grammar.h	Tool calls support from mainline (#723)	2025-09-01 08:38:49 +03:00
llama-impl.h	Remove double definition of LLAMA_LOG_DEBUG	2025-09-01 08:42:04 +03:00
llama-mmap.cpp	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-mmap.h	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-model-loader.cpp	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-model-loader.h	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
llama-sampling.cpp	Tool calls support from mainline (#723)	2025-09-01 08:38:49 +03:00
llama-sampling.h	add dry sampler (#513)	2025-06-19 10:24:53 +03:00
llama-vocab.cpp	Tool calls support from mainline (#723)	2025-09-01 08:38:49 +03:00
llama-vocab.h	Tool calls support from mainline (#723)	2025-09-01 08:38:49 +03:00
llama.cpp	Revert "CUDA: prompt processing optimizations for MoE models (#739)"	2025-09-01 20:06:57 +03:00
unicode-data.cpp	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
unicode-data.h	Merge mainline llama.cpp (#3)	2024-07-27 07:55:01 +02:00
unicode.cpp	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00
unicode.h	Enable CUDA graphs for MoE models + GPT-OSS support (#689)	2025-08-15 09:18:07 +03:00