Default Branch

c96f608d98 · common: consolidate PEG string parsers (#20263) · Updated 2026-03-10 00:29:21 +01:00

Branches

8282d74692 · bench : handle decode errors · Updated 2025-05-14 21:36:29 +02:00 · 2806 · 1

237acc7cd5 · server : update readme + return json for "meta" field · Updated 2025-05-14 14:30:12 +02:00 · 2815 · 2

78d70223c3 · metal : use FA-vec kernel up to batch size 20 · Updated 2025-05-13 09:38:06 +02:00 · 2832 · 3

6107303ab0 · llama : remove logits_all flag + reorder llama_context_params · Updated 2025-05-08 12:01:41 +02:00 · 2885 · 2

16843dba33 · metal : pad mm results · Updated 2025-05-04 08:13:52 +02:00 · 2921 · 1

15dea7bbdf · opt : remove print [no ci] · Updated 2025-05-02 20:25:29 +02:00 · 2997 · 4

65202d2985 · sync : ggml · Updated 2025-05-01 08:59:02 +02:00 · 3028 · 3

b710758323 · readme : update hot topics · Updated 2025-04-28 10:04:28 +02:00 · 3062 · 1

37ae6a281a · Fixes Qwen2.5VL segfault during inference with https://github.com/ggml-org/llama.cpp/pull/12402 as has_qwen2vl_merger migration was incomplete · Updated 2025-04-27 12:36:57 +02:00 · 3069 · 1

ed68474f76 · wip · Updated 2025-04-25 18:07:09 +02:00 · 3092 · 2

3fe362fe49 · gguf-py : use ThreadPoolExecutor when writing tensors · Updated 2025-04-12 06:00:51 +02:00 · 3148 · 4

098f0e5eea · test · Updated 2025-04-10 11:35:16 +02:00 · 3168 · 1

e9e1882d2d · rm tail space · Updated 2025-04-08 07:43:11 +02:00 · 3191 · 4

da140da72a · gguf-py : fix flake8 lint · Updated 2025-04-08 01:38:35 +02:00 · 3191 · 2

ced26486ff · cont · Updated 2025-04-07 14:24:01 +02:00 · 3203 · 6

fe564b0dfb · ci : rename job MSVC -> MinGW · Updated 2025-04-04 12:51:48 +02:00 · 3218 · 1

43ab09b85d · ci : testing (wip) · Updated 2025-04-04 12:43:43 +02:00 · 3218 · 1

7a73e861a7 · cont · Updated 2025-04-04 11:02:20 +02:00 · 3227 · 4

c875e03f96 · rpc : update README for cache usage · Updated 2025-03-28 08:41:47 +01:00 · 3284 · 1

efe0222130 · media : add SVG logo [no ci] · Updated 2025-03-27 22:07:46 +01:00 · 3287 · 1