| File | Last commit | Date |
| --- | --- | --- |
| CMakeLists.txt | Enable and clean up compiler warnings in src (#824) | 2025-10-11 16:01:13 +03:00 |
| llama-arch.cpp | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-arch.h | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-build-context.cpp | Fuse Q, K, V gemv+add | 2025-10-26 17:13:11 +02:00 |
| llama-build-context.h | Fused mul + multi_add op (#858) | 2025-10-24 07:40:35 +03:00 |
| llama-context.h | Refactor file llama.cpp (#823) | 2025-10-11 11:35:20 +03:00 |
| llama-cparams.h | Fused mul + multi_add op (#858) | 2025-10-24 07:40:35 +03:00 |
| llama-grammar.cpp | Tool calls support from mainline (#723) | 2025-09-01 08:38:49 +03:00 |
| llama-grammar.h | Tool calls support from mainline (#723) | 2025-09-01 08:38:49 +03:00 |
| llama-hparams.cpp | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-hparams.h | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-impl.h | Remove double definition of LLAMA_LOG_DEBUG | 2025-09-01 08:42:04 +03:00 |
| llama-load-tensors.cpp | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-mmap.cpp | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| llama-mmap.h | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| llama-model-loader.cpp | Refactor file llama.cpp (#823) | 2025-10-11 11:35:20 +03:00 |
| llama-model-loader.h | Refactor file llama.cpp (#823) | 2025-10-11 11:35:20 +03:00 |
| llama-model.cpp | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-model.h | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-quantize.cpp | Fix PATH_MAX not defined on Windows (#828) | 2025-10-13 09:25:57 +03:00 |
| llama-sampling.cpp | Enable and clean up compiler warnings in src (#824) | 2025-10-11 16:01:13 +03:00 |
| llama-sampling.h | add dry sampler (#513) | 2025-06-19 10:24:53 +03:00 |
| llama-vocab.cpp | Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) | 2025-10-15 14:20:40 +03:00 |
| llama-vocab.h | model : add grok-2 support (#782) | 2025-09-23 16:31:01 +02:00 |
| llama.cpp | Change flash attention and fmoe to be on by default (#863) | 2025-10-25 09:37:28 +03:00 |
| unicode-data.cpp | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| unicode-data.h | Merge mainline llama.cpp (#3) | 2024-07-27 07:55:01 +02:00 |
| unicode.cpp | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |
| unicode.h | Enable CUDA graphs for MoE models + GPT-OSS support (#689) | 2025-08-15 09:18:07 +03:00 |