Mirror of https://github.com/ggerganov/llama.cpp (synced 2026-03-10 00:59:32 +01:00)
* json: rename python schema converter to make import easier
* server: skip null json_schema / grammar fields
* json: deps management for primitive rules (+ allow null values)
* json: optimize repetitions for minItems/maxItems and regexps: `a{,3}` goes from `"a"? "a"? "a"?` (explosive combos) to `(a (a (a)?)?)?`
* grammars: add troubleshooting section to readme
* json: cap length of numbers to 15 digits before/after decimal point
(avoids infinite gen, e.g. "one third" -> `0.333333333333...`)
* json: unify all repetition code (w/ or w/o sep)
* json: support string minLength/maxLength
* server+json: update server/README w/ result_format
* nits
* json: fix type error w/ python 3.8
* json: fix server/README (json_schema in /completion vs. result_format in /v1/chat/completions)
* json: simplify DOT `{"type": "string", "pattern": "^.$"}`
* json: remove recursion in opt_repetitions (avoids Python stack overflow)
* json: rm dead code
* json: rm useless assert & ggml.h import
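
The repetition change described above (`a{,3}` going from a flat run of optionals to nested optionals) can be sketched as follows. This is a minimal illustration, not the converter's actual code: the function names `naive_repetition` and `nested_repetition` are hypothetical, and the sketch only builds grammar strings. The flat form is costly because a short input such as `a` can be matched by any one of the three optionals, so a parser has to track every combination; the nested form leaves a single way to match each length. The nested string is built with a loop rather than recursion, in the spirit of the "remove recursion" item in the list.

```python
def naive_repetition(item: str, max_items: int) -> str:
    """Flat form: every occurrence is independently optional."""
    return " ".join(f"{item}?" for _ in range(max_items))


def nested_repetition(item: str, min_items: int, max_items: int) -> str:
    """Nested form: required copies first, then a chain of nested optionals.

    Hypothetical helper for illustration. The optional chain is built
    iteratively from the innermost group outward, so no recursion is used.
    """
    if min_items < 0 or max_items < min_items:
        raise ValueError("need 0 <= min_items <= max_items")

    optional = ""
    for _ in range(max_items - min_items):
        # Wrap what has been built so far in one more optional group.
        optional = f"({item} {optional})?" if optional else f"({item})?"

    required = " ".join([item] * min_items)
    return " ".join(part for part in (required, optional) if part)


if __name__ == "__main__":
    print(naive_repetition('"a"', 3))      # flat form
    print(nested_repetition('"a"', 0, 3))  # nested form for a{,3}
    print(nested_repetition('"a"', 1, 3))  # nested form for a{1,3}
```

The item is passed as an already-quoted grammar literal (`'"a"'`), so the same helper would work for rule names as well as literals.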
| File |
|---|
| .gitignore |
| CMakeLists.txt |
| get-model.cpp |
| get-model.h |
| run-json-schema-to-grammar.mjs |
| test-autorelease.cpp |
| test-backend-ops.cpp |
| test-c.c |
| test-chat-template.cpp |
| test-double-float.cpp |
| test-grad0.cpp |
| test-grammar-integration.cpp |
| test-grammar-parser.cpp |
| test-json-schema-to-grammar.cpp |
| test-llama-grammar.cpp |
| test-model-load-cancel.cpp |
| test-opt.cpp |
| test-quantize-fns.cpp |
| test-quantize-perf.cpp |
| test-rope.cpp |
| test-sampling.cpp |
| test-tokenizer-0-falcon.cpp |
| test-tokenizer-0-falcon.py |
| test-tokenizer-0-llama.cpp |
| test-tokenizer-0-llama.py |
| test-tokenizer-1-bpe.cpp |
| test-tokenizer-1-llama.cpp |