| .. |
|
| File | Last commit | Date |
|---|---|---|
| CMakeLists.txt | Enable and clean up compiler warnings in src (#824) | 2025-10-11 16:01:13 +03:00 |
| llama-arch.cpp | Step-3.5: llama.cpp compatibility changes | 2026-02-06 07:02:55 +00:00 |
| llama-arch.h | Step-3.5: llama.cpp compatibility changes | 2026-02-06 07:02:55 +00:00 |
| llama-build-context.cpp | Graph parallel for Step-3.5-Flash (#1236) | 2026-02-06 06:56:51 +02:00 |
| llama-build-context.h | Step-3.5-Flash support (#1231) | 2026-02-05 08:13:22 +02:00 |
| llama-context.h | POC: CUDA tensor parallel (MoE models) (#1022) | 2025-12-01 19:25:40 +01:00 |
| llama-cparams.h | Additional graph reduce types for split mode graph (#1154) | 2026-01-18 08:02:49 +02:00 |
| llama-grammar.cpp | llama : add token matching support to llama-grammar (#1220) | 2026-02-03 07:57:17 +02:00 |
| llama-grammar.h | llama : add token matching support to llama-grammar (#1220) | 2026-02-03 07:57:17 +02:00 |
| llama-hparams.cpp | Step-3.5: llama.cpp compatibility changes | 2026-02-06 07:02:55 +00:00 |
| llama-hparams.h | Step-3.5: llama.cpp compatibility changes | 2026-02-06 07:02:55 +00:00 |
| llama-impl.h | server: stop processing the prompt when client disconnects (#1134) | 2026-01-13 07:56:59 +02:00 |
| llama-load-tensors.cpp | Graph parallel for Step-3.5-Flash (#1236) | 2026-02-06 06:56:51 +02:00 |
| llama-mmap.cpp | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| llama-mmap.h | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| llama-model-loader.cpp | Step-3.5-Flash support (#1231) | 2026-02-05 08:13:22 +02:00 |
| llama-model-loader.h | Merge ffn_up and ffn_gate experts tensors (#1137) | 2026-01-12 18:30:53 +02:00 |
| llama-model.cpp | Step-3.5-Flash support (#1231) | 2026-02-05 08:13:22 +02:00 |
| llama-model.h | Graph parallel for Step-3.5-Flash (#1236) | 2026-02-06 06:56:51 +02:00 |
| llama-quantize.cpp | Merge ffn_up and ffn_gate experts tensors (#1137) | 2026-01-12 18:30:53 +02:00 |
| llama-sampling.cpp | llama : add token matching support to llama-grammar (#1220) | 2026-02-03 07:57:17 +02:00 |
| llama-sampling.h | Adaptive p: history update fix + temp as flag (#1213) | 2026-02-03 07:36:12 +02:00 |
| llama-vocab.cpp | Server: refactor and rename functions (#1151) | 2026-01-18 08:16:57 +02:00 |
| llama-vocab.h | Update mtmd to improve accuracy of M-RoPE (#993) | 2025-11-29 07:27:15 +01:00 |
| llama.cpp | Graph parallel for Step-3.5-Flash (#1236) | 2026-02-06 06:56:51 +02:00 |
| unicode-data.cpp | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| unicode-data.h | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| unicode.cpp | Server: refactor and rename functions (#1151) | 2026-01-18 08:16:57 +02:00 |
| unicode.h | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |