Default Branch

b2cb4512c5 · Create parameters overview (#1269) · Updated 2026-02-20 07:20:56 +01:00

Branches

f4b750a430 · Attempt to fix AVX2 FA · Updated 2025-09-29 12:20:11 +02:00

4211
3900

a438096765 · Remove unnecessary assert in im2col (CPU) · Updated 2025-09-27 11:11:33 +02:00

4211
3900

54375a5587 · Move minja and nlohmann/json to vendor · Updated 2025-09-27 09:11:03 +02:00

4211
3898

40c05459c3 · Remove stb_image.h copy in common - it is now in vendor · Updated 2025-09-27 08:52:58 +02:00

4211
3897

be7eb79d44 · Add mtmd: this fixes gibberish on second image · Updated 2025-09-26 17:25:47 +02:00

4211
3915

11621fe433 · Avoid computing FA chunks where the mask is -infinity also for f16/bf16 · Updated 2025-09-24 16:53:11 +02:00

4211
3894

08080356ab · Fix dequantization when requantizing · Updated 2025-09-24 10:44:33 +02:00

4211
3891

15dfadccae · Revert timing on committed by mistake · Updated 2025-09-24 10:00:37 +02:00

4211
3895

08d116cd02 · Cleanup · Updated 2025-09-24 07:30:26 +02:00

4211
3890

0a70ca0bc0 · Fix #772 · Updated 2025-09-23 16:25:47 +02:00

4211
3887

3132dd368f · cuda: fused top_k+softmax as used in most MoE models · Updated 2025-09-23 12:59:44 +02:00

4211
3886

8794a2fecd · Fix compiler warnings · Updated 2025-09-23 10:28:52 +02:00

4211
3885

3a6ebc7764 · This is very slightly better · Updated 2025-09-05 19:06:17 +02:00

4211
3881

4b66e9234c · Fix ggml_is_contiguously_allocated · Updated 2025-09-05 19:01:27 +02:00

4211
3880

910a27ab9b · Better CPU SWA · Updated 2025-09-04 10:08:42 +02:00

4211
3876

b02e137f60 · This is slightly better · Updated 2025-09-04 08:06:31 +02:00

4211
3882

3c43f9dc7d · Add a command line argument · Updated 2025-09-02 17:58:34 +02:00

4211
3879

27e8ed6454 · This seems very slightly better · Updated 2025-09-02 17:24:45 +02:00

4211
3878

8c67621b4b · Set default value of GGML_SCHED_MAX_COPIES to 1 · Updated 2025-09-02 07:01:42 +02:00

4211
3873

f8d511a30f · Revert "CUDA: prompt processing optimizations for MoE models (#739)" · Updated 2025-09-01 19:06:57 +02:00

4211
3872