Commit Graph

3 Commits

Author SHA1 Message Date
Georgi Gerganov
5a79c1900f eagle3 : improve naming 2025-12-17 15:49:03 +02:00
ruixiangw
ac5667dcc6 fix eagle3 logits sync bug & remove ggml_set_sync() 2025-12-16 16:53:28 +00:00
ruixiangw
8fac4b1cc8 feat: add EAGLE3 speculative decoding support
EAGLE3 is an encoder-decoder-based speculative decoding method:
- Extracts features from target model at specific layers
- Uses feature fusion layer to compress target features
- Generates draft tokens with single-layer decoder
- Maps draft vocabulary to target vocabulary via d2t tensor

Key changes:
- Add LLM_ARCH_EAGLE3 architecture
- Add EAGLE3 encoder/decoder graph (src/models/eagle3.cpp)
- Add feature extraction from target model layers
- Add g_embeddings handling for decoder input
- Add GGML_TENSOR_FLAG_SYNC for GPU synchronization
- Add --eagle3 flag for speculative-simple example
- Add EAGLE3 model conversion in convert_hf_to_gguf.py
2025-12-14 18:12:33 +00:00
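
The draft-to-target vocabulary mapping named in commit 8fac4b1cc8 can be illustrated with a minimal sketch. The commit message does not state how the d2t tensor is laid out, so this example assumes that d2t holds one integer offset per draft token, added to the draft id to obtain the target id. The function name and the toy values are hypothetical and are not taken from the repository.

```python
def map_draft_to_target(draft_ids, d2t):
    """Map draft-vocabulary token ids to target-vocabulary token ids.

    Assumption (not confirmed by the commit message): d2t[i] is the
    offset to add to draft id i to get the matching target id.
    """
    return [i + d2t[i] for i in draft_ids]


# Toy offset table for a 4-token draft vocabulary (hypothetical values):
# draft 0 -> target 0, draft 1 -> target 3, draft 2 -> target 7, draft 3 -> target 8
d2t = [0, 2, 5, 5]

if __name__ == "__main__":
    print(map_draft_to_target([1, 3, 0], d2t))  # [3, 8, 0]
```

If d2t instead stored target ids directly, the lookup would be `d2t[i]` with no addition. In both cases the draft model can keep a smaller vocabulary than the target model.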