Mirror of https://github.com/ggerganov/llama.cpp, synced 2026-04-18 21:26:07 +02:00
* Introduced NVFP4 generic MMQ kernel
* Added extra FP8 guard, hope to solve ci HIP failure
* Rename tiles and use HIP_FP8_AVAILABLE
* Removed remaining FP8 straggler and added const int
* Const
* Removed DECL_MMQ_CASE artifact
* Removed newline
* Removed space after else
* Changed HIP FP8 NVFP4 conversion gate
* Added new line to bottom of mmq.cu 270
* Removed extra spaces
* Removed single space in front of else on line 814
* Added NVFP4 to generate cu script so HIP can see it, further tightened logic
* Include generated mmq-instance-nvfp4.cu
* Added NVFP4 mmq to HIP Check ignore list
* Update ggml/src/ggml-cuda/mmq.cuh: Changed to Q3_K tile to read MMQ_MMA_TILE_X_K_NVFP4 (Co-authored-by: Johannes Gäßler <johannesg@5d6.de>)
* Update ggml/src/ggml-cuda/mmq.cuh: Changed to Q3_K tile to read MMQ_MMA_TILE_X_K_NVFP4 in tile assert (Co-authored-by: Johannes Gäßler <johannesg@5d6.de>)
* Update ggml/src/ggml-cuda/mmq.cuh: Added function name ending for end if (Co-authored-by: Johannes Gäßler <johannesg@5d6.de>)
* Added function names to closing endif (Co-authored-by: Johannes Gäßler <johannesg@5d6.de>)

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

| | | |
|---|---|---|
| .. | ||
| apple | ||
| hip | ||
| jinja | ||
| snapdragon | ||
| bench-models.sh | ||
| build-info.sh | ||
| check-requirements.sh | ||
| compare-commits.sh | ||
| compare-llama-bench.py | ||
| compare-logprobs.py | ||
| create_ops_docs.py | ||
| debug-test.sh | ||
| fetch_server_test_models.py | ||
| gen-authors.sh | ||
| gen-unicode-data.py | ||
| get_chat_template.py | ||
| get-flags.mk | ||
| get-hellaswag.sh | ||
| get-pg.sh | ||
| get-wikitext-2.sh | ||
| get-winogrande.sh | ||
| git-bisect-run.sh | ||
| git-bisect.sh | ||
| hf.sh | ||
| install-oneapi.bat | ||
| pr2wt.sh | ||
| serve-static.js | ||
| server-bench.py | ||
| server-test-model.py | ||
| sync_vendor.py | ||
| sync-ggml-am.sh | ||
| sync-ggml.last | ||
| sync-ggml.sh | ||
| tool_bench.py | ||
| tool_bench.sh | ||
| verify-checksum-models.py | ||
| xxd.cmake | ||