CMakeLists.txt - Enable and clean up compiler warnings in src (#824) - 2025-10-11 16:01:13 +03:00
llama-arch.cpp - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-arch.h - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-build-context.cpp - Do not allocate KV cache for unused layers (#843) - 2025-10-20 10:09:39 +03:00
llama-build-context.h - Grouped expert routing (CPU only) (#836) - 2025-10-16 14:57:02 +03:00
llama-context.h - Refactor file llama.cpp (#823) - 2025-10-11 11:35:20 +03:00
llama-cparams.h - Grouped expert routing (CPU only) (#836) - 2025-10-16 14:57:02 +03:00
llama-grammar.cpp - Tool calls support from mainline (#723) - 2025-09-01 08:38:49 +03:00
llama-grammar.h - Tool calls support from mainline (#723) - 2025-09-01 08:38:49 +03:00
llama-hparams.cpp - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-hparams.h - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-impl.h - Remove double definition of LLAMA_LOG_DEBUG - 2025-09-01 08:42:04 +03:00
llama-load-tensors.cpp - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-mmap.cpp - Enable CUDA graphs for MoE models + GPT-OSS support (#689) - 2025-08-15 09:18:07 +03:00
llama-mmap.h - Enable CUDA graphs for MoE models + GPT-OSS support (#689) - 2025-08-15 09:18:07 +03:00
llama-model-loader.cpp - Refactor file llama.cpp (#823) - 2025-10-11 11:35:20 +03:00
llama-model-loader.h - Refactor file llama.cpp (#823) - 2025-10-11 11:35:20 +03:00
llama-model.cpp - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-model.h - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-quantize.cpp - Fix PATH_MAX not defined on Windows (#828) - 2025-10-13 09:25:57 +03:00
llama-sampling.cpp - Enable and clean up compiler warnings in src (#824) - 2025-10-11 16:01:13 +03:00
llama-sampling.h - add dry sampler (#513) - 2025-06-19 10:24:53 +03:00
llama-vocab.cpp - Adding Ling/Ring (a.k.a., Bailing-MoE2) support (#833) - 2025-10-15 14:20:40 +03:00
llama-vocab.h - model : add grok-2 support (#782) - 2025-09-23 16:31:01 +02:00
llama.cpp - Do not allocate KV cache for unused layers (#843) - 2025-10-20 10:09:39 +03:00
unicode-data.cpp - Merge mainline llama.cpp (#3) - 2024-07-27 07:55:01 +02:00
unicode-data.h - Merge mainline llama.cpp (#3) - 2024-07-27 07:55:01 +02:00
unicode.cpp - Enable CUDA graphs for MoE models + GPT-OSS support (#689) - 2025-08-15 09:18:07 +03:00
unicode.h - Enable CUDA graphs for MoE models + GPT-OSS support (#689) - 2025-08-15 09:18:07 +03:00