llama.cpp/tests/peg-parser/testing.h
Aldehir Rojas 0a8026e768
common : introduce composable PEG parser combinators for chat parsing (#17136)
* common : implement parser combinators to simplify chat parsing

* add virtual destructor to parser_base

* fix memory leak from circular references of rules

* implement gbnf grammar building

* remove unused private variable

* create a base visitor and implement id assignment as a visitor

* fix const ref for grammar builder

* clean up types, friend classes, and class declarations

* remove builder usage from until_parser

* Use a counter class to help assign rule ids

* cache everything

* add short description for each parser

* create a type for the root parser

* implement repetition parser

* Make optional, one_or_more, and zero_or_more subclasses of repetition

* improve context constructor

* improve until parsing and add benchmarks

* remove cached() pattern, cache in parser_base with specialized parsing functions for each parser

* improve json parsing performance to better match legacy parsing

* fix const auto * it for windows

* move id assignment to classes instead of using a visitor

* create named rules in the command r7b example

* use '.' for any in GBNF

* fix parens around choices in gbnf grammar

* add convenience operators to turn strings to literals

* add free-form operators for const char * to simplify defining literals

* simplify test case parser

* implement semantic actions

* remove groups in favor of actions and a scratchpad

* add built in actions for common operations

* add actions to command r7b example

* use std::default_searcher for platforms that don't have bm

* improve parser_type handling and add cast helper

* add partial result type to better control when to run actions

* fix bug in until()

* run actions on partial results by default

* use common_chat_msg for result

* add qwen3 example wip

* trash partial idea and simplify

* move action arguments to a struct

* implement aho-corasick matcher for until_parser and to build exclusion grammars

* use std::string for input, since std::string_view is incompatible with std::regex

* Refactor tests

* improve qwen3 example

* implement sax-style parsing and refactor

* fix json string in test

* rename classes to use common_chat_ prefix

* remove is_ suffix from functions

* rename from id_counter to just counter

* Final refactored tests

* Fix executable name and editorconfig-checker

* Third time's the charm...

* add trigger parser to begin lazy grammar rule generation

* working lazy grammar

* refactor json rules now that we check for reachability

* reduce pointer usage

* print out grammars in example

* rename to chat-peg-parser* and common_chat_peg_parser*

* Revert unrelated changes

* New macros for CMakeLists to enable multi-file compilations

* starting unicode support

* add unicode support to char_parser

* use unparsed args as additional sources

* Refactor tests to new harness

* Fix CMakeLists

* fix rate calculation

* add unicode tests

* fix trailing whitespace and line endings

skip-checks: true

* Helpers + rewrite qwen3 with helpers

* Fix whitespace

* extract unicode functions to separate file

* refactor parse unicode function

* fix compiler error

* improve construction of sequence/choice parsers

* be less clever

* add make_parser helper function

* expand usage of make_parser, alias common_chat_msg_peg_parser_builder to builder in source

* lower bench iterations

* add unicode support to until_parser

* add unicode support to json_string_parser

* clean up unicode tests

* reduce unicode details to match src/unicode.cpp

* simplify even further

* remove unused functions

* fix type

* reformat char class parsing

* clean up json string parser

* clean up + fix diagnostics

* reorder includes

* compact builder functions

* replace action_parser with capture_parser, rename env to semantics

* rename env to semantics

* clean up common_chat_parse_context

* move type() to below constant

* use default constructor for common_chat_peg_parser

* make all operators functions for consistency

* fix compilation errors in test-optional.cpp

* simplify result values

* rename json_string_unquoted to json_string_content

* Move helper to separate class, add separate explicit and helper classes

* Whitespace

* Change + to append()

* Reformat

* Add extra helpers, tests and Minimax example

* Add some extra optional debugging prints + real example of how to use them

* fix bug in repetitions when min_count = 0 reports failures

* dump rule in debug

* fix token accumulation and assert parsing never fails

* indent debug by depth

* use LOG_* in tests so logs sync up with test logs

* - Add selective testing
- Refactor all messaging to use LOG_ERR
- Fix lack of argument / tool name capturing
- Temporary fix for double event capture

* refactor rule() and introduce ref()

* clean up visitor

* clean up indirection in root parser w.r.t rules

* store shared ptr directly in parser classes

* replace aho-corasick automation with a simple trie

* Reset prev for qwen3 helper example variant

* refactor to use value semantics with std::variant/std::visit

* simplify trie_matcher result

* fix linting issues

* add annotations to rules

* revert test workaround

* implement serializing the parser

* remove redundant parsers

* remove tests

* gbnf generation fixes

* remove LOG_* use in tests

* update gbnf tests to test entire grammar

* clean up gbnf generation and fix a few bugs

* fix typo in test output

* remove implicit conversion rules

* improve test output

* rename trie_matcher to trie

* simplify trie to just know if a node is the end of a word

* remove common_chat_ prefix and ensure a common_peg_ prefix to all types

* rename chat-peg-parser -> peg-parser

* promote chat-peg-parser-helper to chat-peg-parser

* checkpoint

* use a static_assert to ensure we handle every branch

* inline trivial peg parser builders

* use json strings for now

* implement basic and native chat peg parser builders/extractors

* resolve refs to their rules

* remove packrat caching (for now)

* update tests

* compare parsers with incremental input

* benchmark both complete and incremental parsing

* add raw string generation from json schema

* add support for string schemas in gbnf generation

* fix qwen example to include \n

* tidy up example

* rename extractor to mapper

* rename ast_arena to ast

* place basic tests into one

* use gbnf_format_literal from json-schema-to-grammar

* integrate parser with common/chat and server

* clean up schema and serialization

* add json-schema raw string tests

* clean up json creation and remove capture parser

* trim spaces from reasoning and content

* clean up redundant rules and comments

* rename input_is_complete to is_partial to match rest of project

* simplify json rules

* remove extraneous file

* remove comment

* implement += and |= operators

* add comments to qwen3 implementation

* reorder arguments to common_chat_peg_parse

* remove commented outdated tests

* add explicit copy constructor

* fix operators and constness

* wip: update test-chat for qwen3-coder

* bring json parser closer to json-schema-to-grammar rules

* trim trailing space for most things

* fix qwen3 coder rules w.r.t. trailing spaces

* group rules

* do not trim trailing space from string args

* tweak spacing of qwen3 grammar

* update qwen3-coder tests

* qwen3-coder small fixes

* place parser in common_chat_syntax to simplify invocation

* use std::set to collect rules to keep order predictable for tests

* initialize parser to make certain platforms happy

* revert back to std::unordered_set, sort rule names at the end instead

* uncomment rest of chat tests

* define explicit default constructor

* improve arena init and server integration

* fix chat test

* add json_member()

* add a comprehensive native example

* clean up example qwen test and add response_format example to native test

* make build_peg_parser accept std::function instead of template

* change peg parser parameters into const ref

* push tool call on tool open for constructed parser

* add parsing documentation

* clean up some comments

* add json schema support to qwen3-coder

* add id initializer in tests

* remove grammar debug line from qwen3-coder

* refactor qwen3-coder to use sequence over operators

* only call common_chat_peg_parse if appropriate format

* simplify qwen3-coder space handling

* revert qwen3-coder implementation

* revert json-schema-to-grammar changes

* remove unnecessary forward declaration

* small adjustment to until_parser

* rename C/C++ files to use dashes

* codeowners : add aldehir to peg-parser and related files

---------

Co-authored-by: Piotr Wilkin <piotr.wilkin@syndatis.com>
2025-12-03 12:45:32 +02:00

244 lines
6.6 KiB
C++

#pragma once
#include "common.h"
#include <chrono>
#include <exception>
#include <iostream>
#include <string>
#include <regex>
#include <vector>
struct testing {
std::ostream &out;
std::vector<std::string> stack;
std::regex filter;
bool filter_tests = false;
bool throw_exception = false;
bool verbose = false;
int tests = 0;
int assertions = 0;
int failures = 0;
int unnamed = 0;
int exceptions = 0;
static constexpr std::size_t status_column = 80;
explicit testing(std::ostream &os = std::cout) : out(os) {}
std::string indent() const {
if (stack.empty()) {
return "";
}
return std::string((stack.size() - 1) * 2, ' ');
}
std::string full_name() const {
return string_join(stack, ".");
}
void log(const std::string & msg) {
if (verbose) {
out << indent() << " " << msg << "\n";
}
}
void set_filter(const std::string & re) {
filter = std::regex(re);
filter_tests = true;
}
bool should_run() const {
if (filter_tests) {
if (!std::regex_match(full_name(), filter)) {
return false;
}
}
return true;
}
template <typename F>
void run_with_exceptions(F &&f, const char *ctx) {
try {
f();
} catch (const std::exception &e) {
++failures;
++exceptions;
out << indent() << "UNHANDLED EXCEPTION (" << ctx << "): " << e.what() << "\n";
if (throw_exception) {
throw;
}
} catch (...) {
++failures;
++exceptions;
out << indent() << "UNHANDLED EXCEPTION (" << ctx << "): unknown\n";
if (throw_exception) {
throw;
}
}
}
void print_result(const std::string &label, int new_failures, int new_assertions, const std::string &extra = "") const {
std::string line = indent() + label;
std::string details;
if (new_assertions > 0) {
if (new_failures == 0) {
details = std::to_string(new_assertions) + " assertion(s)";
} else {
details = std::to_string(new_failures) + " of " +
std::to_string(new_assertions) + " assertion(s) failed";
}
}
if (!extra.empty()) {
if (!details.empty()) {
details += ", ";
}
details += extra;
}
if (!details.empty()) {
line += " (" + details + ")";
}
std::string status = (new_failures == 0) ? "[PASS]" : "[FAIL]";
if (line.size() + 1 < status_column) {
line.append(status_column - line.size(), ' ');
} else {
line.push_back(' ');
}
out << line << status << "\n";
}
template <typename F>
void test(const std::string &name, F f) {
stack.push_back(name);
if (!should_run()) {
stack.pop_back();
return;
}
++tests;
out << indent() << name << "\n";
int before_failures = failures;
int before_assertions = assertions;
run_with_exceptions([&] { f(*this); }, "test");
int new_failures = failures - before_failures;
int new_assertions = assertions - before_assertions;
print_result(name, new_failures, new_assertions);
stack.pop_back();
}
template <typename F>
void test(F f) {
test("test #" + std::to_string(++unnamed), f);
}
template <typename F>
void bench(const std::string &name, F f, int iterations = 100) {
stack.push_back(name);
if (!should_run()) {
stack.pop_back();
return;
}
++tests;
out << indent() << "[bench] " << name << "\n";
int before_failures = failures;
int before_assertions = assertions;
using clock = std::chrono::high_resolution_clock;
std::chrono::microseconds duration(0);
run_with_exceptions([&] {
for (auto i = 0; i < iterations; i++) {
auto start = clock::now();
f();
duration += std::chrono::duration_cast<std::chrono::microseconds>(clock::now() - start);
}
}, "bench");
auto avg_elapsed = duration.count() / iterations;
auto avg_elapsed_s = std::chrono::duration_cast<std::chrono::duration<double>>(duration).count() / iterations;
auto rate = (avg_elapsed_s > 0.0) ? (1.0 / avg_elapsed_s) : 0.0;
int new_failures = failures - before_failures;
int new_assertions = assertions - before_assertions;
std::string extra =
"n=" + std::to_string(iterations) +
" avg=" + std::to_string(avg_elapsed) + "us" +
" rate=" + std::to_string(int(rate)) + "/s";
print_result("[bench] " + name, new_failures, new_assertions, extra);
stack.pop_back();
}
template <typename F>
void bench(F f, int iterations = 100) {
bench("bench #" + std::to_string(++unnamed), f, iterations);
}
// Assertions
bool assert_true(bool cond) {
return assert_true("", cond);
}
bool assert_true(const std::string &msg, bool cond) {
++assertions;
if (!cond) {
++failures;
out << indent() << "ASSERT TRUE FAILED";
if (!msg.empty()) {
out << " : " << msg;
}
out << "\n";
return false;
}
return true;
}
template <typename A, typename B>
bool assert_equal(const A &expected, const B &actual) {
return assert_equal("", expected, actual);
}
template <typename A, typename B>
bool assert_equal(const std::string &msg, const A &expected, const B &actual) {
++assertions;
if (!(actual == expected)) {
++failures;
out << indent() << "ASSERT EQUAL FAILED";
if (!msg.empty()) {
out << " : " << msg;
}
out << "\n";
out << indent() << " expected: " << expected << "\n";
out << indent() << " actual : " << actual << "\n";
return false;
}
return true;
}
// Print summary and return an exit code
int summary() const {
out << "\n";
out << "tests : " << tests << "\n";
out << "assertions : " << assertions << "\n";
out << "failures : " << failures << "\n";
out << "exceptions : " << exceptions << "\n";
return failures == 0 ? 0 : 1;
}
};