llama.cpp

mirror of https://github.com/ggerganov/llama.cpp synced 2026-03-02 13:19:27 +01:00


Daniel Bevenius 25f40ca65f	completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00

* completion : simplify batch (embd) processing

This commit simplifies the processing of embd by removing the for loop that currently exists which uses params.n_batch as its increment. This commit also removes the clamping of n_eval as the size of embd is always at most the size of params.n_batch.

The motivation is to clarify the code as it is currently a little confusing when looking at this for loop in isolation and thinking that it can process multiple batches.

* add an assert to verify n_eval is not greater than n_batch
batched-bench	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
cli	common : use two decimal places for float arg help messages (#19048)	2026-01-25 07:31:42 +01:00
completion	completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00
cvector-generator	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
export-lora	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
fit-params	llama-fit-params: keep explicit --ctx-size 0 (#19070)	2026-01-24 22:13:08 +01:00
gguf-split	cli: new CLI experience (#17824)	2025-12-10 15:28:59 +01:00
imatrix	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
llama-bench	Setting mmap and direct_io to false as default in llama-bench.cpp (#18841)	2026-01-16 09:46:51 +01:00
mtmd	mtmd: add min/max pixels gguf metadata (#19273)	2026-02-02 20:59:06 +01:00
perplexity	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
quantize	quantize: add option --tensor-type-file to llama-quantize (#18572)	2026-01-31 11:39:21 +08:00
rpc	Install rpc-server when GGML_RPC is ON. (#17149)	2025-11-11 10:53:59 +00:00
server	server: print actual model name in 'model not found" error (#19117)	2026-02-02 16:55:27 +01:00
tokenize	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
tts	refactor : remove libcurl, use OpenSSL when available (#18828)	2026-01-14 18:02:47 +01:00
CMakeLists.txt	cmake: only build cli when server is enabled (#18670)	2026-01-09 16:43:26 +01:00