ik_llama.cpp/common

Latest commit: bb358223cd by firecoperana
server: cache prompt to host memory (#954)
* server : host-memory prompt caching

  - Change the similarity calculation and the conditions for saving prompts
  - Remove an unneeded token limit
  - Rename a variable for clarity
  - Separate the prompt save and load logic
  - Change the default values
  - Update log messages
  - Remove the prompt-truncation logic

* add description

* bug fixes

* remove token limit in init

---------

Co-authored-by: firecoperana <firecoperana>
Committed: 2025-11-14 18:40:13 +02:00
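The latest commit mentions a "similarity calculation" and "prompt save conditions" for host-memory prompt caching, without showing the mechanism. A common approach is to keep tokenized prompts in host memory and reuse the cached entry whose shared token prefix with the incoming prompt exceeds a threshold. The sketch below illustrates that idea only; `prompt_cache_entry`, `prefix_similarity`, and `best_cache_hit` are hypothetical names, not ik_llama.cpp's actual API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical token and cache-entry types (illustrative, not from ik_llama.cpp).
using llama_token = int32_t;

struct prompt_cache_entry {
    std::vector<llama_token> tokens;  // tokenized prompt kept in host memory
};

// Fraction of the new prompt covered by the shared leading tokens of a cached prompt.
static float prefix_similarity(const std::vector<llama_token> & cached,
                               const std::vector<llama_token> & prompt) {
    size_t n = 0;
    const size_t lim = std::min(cached.size(), prompt.size());
    while (n < lim && cached[n] == prompt[n]) {
        ++n;
    }
    return prompt.empty() ? 0.0f : float(n) / float(prompt.size());
}

// Return the index of the cached entry whose prefix overlap with `prompt`
// exceeds `threshold`, or -1 if no entry qualifies (the "save/load condition").
static int best_cache_hit(const std::vector<prompt_cache_entry> & cache,
                          const std::vector<llama_token> & prompt,
                          float threshold) {
    int   best     = -1;
    float best_sim = threshold;
    for (size_t i = 0; i < cache.size(); ++i) {
        const float sim = prefix_similarity(cache[i].tokens, prompt);
        if (sim > best_sim) {
            best_sim = sim;
            best     = int(i);
        }
    }
    return best;
}
```

A server would load the matched entry's saved KV state and only evaluate the non-matching suffix; a miss (return value -1) means the prompt is evaluated from scratch and may be saved as a new entry.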
File                          Last commit                                                                Date
cmake                         Merge mainline llama.cpp (#3)                                              2024-07-27 07:55:01 +02:00
base64.hpp                    llava : expose as a shared library for downstream projects (#3613)         2023-11-07 00:36:23 +03:00
build-info.cpp.in             build : link against build info instead of compiling against it (#3879)   2023-11-02 08:50:16 +02:00
chat-parser.cpp               Add --webui arg to launch llama.cpp new webui (#786)                       2025-10-27 14:22:02 +02:00
chat-parser.h                 Move minja and nlohmann/json to vendor (#802)                              2025-09-27 09:12:35 +02:00
chat.cpp                      Add --webui arg to launch llama.cpp new webui (#786)                       2025-10-27 14:22:02 +02:00
chat.h                        Port mdmd from mainline + Qwen2/2.5-VL support (#798)                      2025-09-27 08:45:29 +02:00
CMakeLists.txt                Add vision support in llama-server (#901)                                  2025-11-05 10:43:46 +02:00
common.cpp                    server: cache prompt to host memory (#954)                                 2025-11-14 18:40:13 +02:00
common.h                      server: cache prompt to host memory (#954)                                 2025-11-14 18:40:13 +02:00
console.cpp                   check C++ code with -Wmissing-declarations (#3184)                         2023-09-15 15:38:27 -04:00
console.h                     gguf : new file format with flexible meta data (beta) (#2398)              2023-08-21 23:07:43 +03:00
grammar-parser.cpp            Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
grammar-parser.h              Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
json-partial.cpp              Move minja and nlohmann/json to vendor (#802)                              2025-09-27 09:12:35 +02:00
json-partial.h                Move minja and nlohmann/json to vendor (#802)                              2025-09-27 09:12:35 +02:00
json-schema-to-grammar.cpp    Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
json-schema-to-grammar.h      Move minja and nlohmann/json to vendor (#802)                              2025-09-27 09:12:35 +02:00
llguidance.cpp                Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
log.h                         Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
ngram-cache.cpp               Fixed lookup compilation issues on Windows (#6273)                         2024-03-24 14:21:17 +01:00
ngram-cache.h                 Merge mainline llama.cpp (#3)                                              2024-07-27 07:55:01 +02:00
regex-partial.cpp             Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
regex-partial.h               Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
sampling.cpp                  Move minja and nlohmann/json to vendor (#802)                              2025-09-27 09:12:35 +02:00
sampling.h                    Tool calls support from mainline (#723)                                    2025-09-01 08:38:49 +03:00
speculative.cpp               Support --device and --device-draft parameter (#866)                      2025-10-27 18:13:28 +02:00
speculative.h                 Port universal assisted decoding to llama-server (#699)                    2025-08-18 09:22:23 +03:00
train.cpp                     train : change default FA argument (#7528)                                 2024-05-25 15:22:35 +03:00
train.h                       sync : ggml (backend v2) (#3912)                                           2023-11-13 14:16:23 +02:00