Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

172f9dad4c · WIP: fix sm layer (MoE) · Updated 2025-12-21 17:12:03 +01:00

139
9

706341d15b · nccl: second attempt, not working · Updated 2025-12-21 06:58:50 +01:00

139
7

e28148d401 · WIP · Updated 2025-12-20 07:50:58 +01:00

139
6

64908da772 · cuda: set device to src device before p2p copy · Updated 2025-12-17 12:43:36 +01:00

140
1

0864655a72 · Disable split scheduling with tensor overrides · Updated 2025-12-17 07:38:18 +01:00

4211
4076

5a731064e6 · Much better TG speed with split mode "graph" · Updated 2025-12-15 14:53:35 +01:00

145
1

664a529332 · Use actual active number of layers when preparing splits · Updated 2025-12-14 07:41:41 +01:00

147
1

f81c0b7fa0 · WIP · Updated 2025-12-13 18:43:17 +01:00

4211
4071

d82ed383ce · Fix sync logic · Updated 2025-12-13 18:39:42 +01:00

4211
4063

72af525c9f · Undo sync reduction · Updated 2025-12-13 16:57:07 +01:00

4211
4062

082545b3f0 · Do not use split mode graph scheduling if there are tensor overrides · Updated 2025-12-12 14:36:02 +01:00

4211
4061

50fbde85dc · Fix overflow in offset calculation in mmq · Updated 2025-12-12 14:22:02 +01:00

4211
4060

643cccd2c8 · This is better · Updated 2025-12-12 07:23:39 +01:00

4211
4060

ca1e7070f6 · Be able to enable or disable P2P via command line argument · Updated 2025-12-11 18:46:54 +01:00

4211
4058

e094f32467 · Fix #1055 · Updated 2025-12-11 14:26:41 +01:00

4211
4057

b41b17943d · Fix the fix · Updated 2025-12-11 08:03:52 +01:00

4211
4054

c953b47266 · Be able to set a max. number of GPUs to be used in split mode graph · Updated 2025-12-11 07:21:42 +01:00

4211
4054

b37fafdc39 · Fix llama-bench - missing buffer override comparison operator · Updated 2025-12-11 07:18:45 +01:00

4211
4053

b0cc63bcdf · Another attempt for sm graph · Updated 2025-12-09 20:30:06 +01:00

161
3

a2f5614529 · Try to split offloaded MoE up/gate up · Updated 2025-12-09 11:09:04 +01:00

161
3