Default Branch

319146247e · vulkan: improve partial offloading performance on AMD (#19976) · Updated 2026-03-01 17:32:14 +01:00

Branches

c1c42f1544 · webui : send both backend_sampling == false/true · Updated 2026-01-12 14:19:23 +01:00 · 471 behind, 1 ahead

08b5d956fc · minor : std::unordered_set over std::set · Updated 2026-01-12 12:35:25 +01:00 · 601 behind, 3 ahead

4a2751258a · server : simplify prompt state transition branches · Updated 2026-01-09 16:46:03 +01:00 · 500 behind, 11 ahead

0fca4308f7 · Initial plan · Updated 2026-01-08 16:16:59 +01:00 · 510 behind, 2 ahead

091d98e2c5 · rpc : use std::unique_ptr for the message_queue · Updated 2026-01-06 14:32:01 +01:00 · 549 behind, 2 ahead

54ccf2476b · ci : require editor config · Updated 2026-01-06 12:04:35 +01:00 · 541 behind, 1 ahead

4a95b44864 · alloc : skip unassigned leafs · Updated 2026-01-06 10:24:56 +01:00 · 541 behind, 1 ahead

bf3f12df4c · graph : constant topology for tokens/embeddings inputs · Updated 2026-01-02 14:46:45 +01:00 · 574 behind, 2 ahead

6ecba0d0d0 · fix 5 · Updated 2025-12-30 13:53:52 +01:00 · 607 behind, 170 ahead

42c40819ca · handle case done === 0 · Updated 2025-12-29 21:07:10 +01:00 · 608 behind, 2 ahead

eaa639af65 · update · Updated 2025-12-29 12:41:48 +01:00 · 642 behind, 17 ahead

3b54531ead · ci : disable mmap · Updated 2025-12-28 08:26:51 +01:00 · 628 behind, 1 ahead

5f14aa8e43 · gguf-py : do not align the data start offset · Updated 2025-12-22 15:49:54 +01:00 · 683 behind, 1 ahead

a95df75322 · add test model · Updated 2025-12-18 23:44:01 +01:00 · 747 behind, 13 ahead

6b1394ed74 · prof: fix tensor dims formatter · Updated 2025-12-18 02:11:21 +01:00 · 720 behind, 3 ahead

e47a082fc9 · security : add collaborator guidance · Updated 2025-12-16 09:16:46 +01:00 · 761 behind, 1 ahead

4574ab6f40 · preset: handle negated arg, reverse the meaning if needed · Updated 2025-12-14 21:44:41 +01:00 · 781 behind, 1 ahead

357f999381 · graph: add f_attn_temp_offset · Updated 2025-12-14 12:12:12 +01:00 · 785 behind, 1 ahead

292f8e231c · model-conversion : cast logits to float32 · Updated 2025-12-13 21:24:21 +01:00 · 797 behind, 1 ahead

2a615b27e4 · ggml : remove redundant src in ggml_cast · Updated 2025-12-09 10:16:15 +01:00 · 854 behind, 1 ahead