ik_llama.cpp/ggml
Kawrakow 0e1d33ca4a
Fuse add+add+fused_rms (#853)
* Fuse add+add+fused_rms

* Try this

* Macro to easily enable/disable fusion

* Various:

* Check that all tensors involved are on the same device before applying fusion
* Fuse sigmoid+scale+sum_rows+div
* Fix the fused bailingmoe2 expert selection

The issue there was that the bias was not per row but per
expert group, so only the first n_per_group biases were used
for all experts.
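The bias-indexing bug described above can be sketched as follows. This is a toy Python illustration of the indexing error, not the actual ggml kernel; the names (`n_per_group`, `scores`, `bias`) are illustrative.

```python
# Toy sketch of the grouped-bias indexing bug: the bias tensor holds one
# value per expert, stored group by group, but the fused kernel indexed
# it as if it were per row.
n_groups, n_per_group = 2, 4
n_experts = n_groups * n_per_group

scores = [0.0] * n_experts
# One bias value per expert, laid out group by group.
bias = [float(i) for i in range(n_experts)]

# Buggy: every expert read from the first group's segment, so only
# bias[0:n_per_group] was ever used.
buggy = [scores[e] + bias[e % n_per_group] for e in range(n_experts)]

# Fixed: expert e in group e // n_per_group reads its own group's segment.
fixed = [scores[e] + bias[(e // n_per_group) * n_per_group + (e % n_per_group)]
         for e in range(n_experts)]

print(buggy)  # [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
print(fixed)  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```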
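For reference, the sigmoid+scale+sum_rows+div chain fused by this commit computes normalized gate weights. A minimal unfused Python sketch (the function name and signature are illustrative, not the ggml API):

```python
import math

def gate_weights(logits, scale):
    """Unfused reference for the sigmoid -> scale -> sum_rows -> div chain."""
    # sigmoid over the row of logits
    g = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # scale each gate
    g = [scale * x for x in g]
    # sum_rows followed by div: normalize so the row sums to 1
    s = sum(g)
    return [x / s for x in g]

w = gate_weights([0.0, 0.0, 0.0, 0.0], 2.0)
print(w)  # [0.25, 0.25, 0.25, 0.25]
```

Fusing these four ops into one kernel avoids materializing three intermediate tensors on the device.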

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2025-10-22 16:18:11 +03:00
cmake Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
include Grouped expert routing (CPU only) (#836) 2025-10-16 14:57:02 +03:00
src Fuse add+add+fused_rms (#853) 2025-10-22 16:18:11 +03:00
.gitignore Merge mainline llama.cpp (#3) 2024-07-27 07:55:01 +02:00
CMakeLists.txt Set default value of GGML_SCHED_MAX_COPIES to 1 (#751) 2025-09-02 07:04:39 +02:00