llama.cpp

mirror of https://github.com/ggerganov/llama.cpp synced 2026-03-15 11:40:50 +01:00

Georgi Gerganov 6562e5a4d6	2025-05-08 14:28:33 +03:00

context : allow cache-less context for embeddings (#13108)

* context : allow cache-less context for embeddings

ggml-ci

* context : enable reranking with encode()

ggml-ci

* context : encode() clears embd_seq

ggml-ci

* examples : use llama_encode() when appropriate

ggml-ci

* models : nomic bert moe does not require KV cache

* llama : update comments for llama_decode/llama_encode

ggml-ci

* context : update warning log [no ci]
llama-cpp.h	llama : add `llama_vocab`, functions -> methods, naming (#11110)	2025-01-12 11:32:42 +02:00
llama.h	context : allow cache-less context for embeddings (#13108)	2025-05-08 14:28:33 +03:00
