| 8 - New quantization types IQ2_K_ IQ3_K_ IQ4_K_ IQ5_K.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 15 - Will LQER improve k- and i-quants_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 18 - CPU beating GPU in token generation speed.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 25 - CPU prompt processing speed for large contexts.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 63 - LLaMA-3.2 quantization evaluation.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 82 - 4bpw GGML TYPE_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 95 - Bitnet.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 100 - New argument _ env variable for GGML_SCHED_MAX_COPIES_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 104 - Convenience improvements for llama-quantize.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 140 - Questions about weight_j_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 164 - Latest CPU performance comparison with llama.cpp.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 165 - Norm RMS Epsilon.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 166 - Learning more LLM quantization.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 201 - What is the NUMA situation _.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 211 - help me create an importance matrix primer.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 223 - Recent performance testing with DeepSeek R1.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 242 - Switching from llama.cpp_ktransformers_ seeking advice_guidance.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 256 - Diverging from llama.cpp.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 258 - Quick-start Guide coming over from llama.cpp and ktransformers_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 266 - Benchmarking DeepSeek R1 - 16x3090.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 286 - Testing _deepseek-ai_DeepSeek-V3-0324_ model support..md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 288 - On _compilade_s PR 12557 and _jukofyork_s quantization ideas.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 316 - Mainline is now copying stuff from ik_llama.cpp.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 319 - KTransformers copying ik_llama.cpp.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 323 - Is there an easy way to repack an existing GGUF so it could be used wit.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 334 - _iq4_ks_ performs great on gemma-3-27b-it-qat-q4_0-unquantized.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 350 - Maverick slow prompt with gpu.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 354 - Not all MLAs are born equal.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 357 - Qwen3 - early performance comparisons.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 359 - Qwen3 quantization experiments.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 372 - multy gpu.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 384 - ik_llama.cpp issues on an old workstation.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 385 - Qwen3 235B performance on Intel Xeon Scalable processor.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 393 - Creating quantized models.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 395 - Why does imatrix not tokenize special tokens_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 396 - Best settings for Maverick - Dual CPU Xeon 8480_ - RTX 3090.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 397 - KV split while using _-sm row_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 399 - Qwen 30b.A3b IK_LCPP comparisons on lowspec machine.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 401 - install bitnet _or other cpu models_ on a fresh termux aarch64.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 403 - Tool Calling and Structured Response _Json Mode_ support.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 434 - Quant Cookers Basic Guide.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 451 - Context reuse _ context shift for long prompts.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 459 - qwen3 metrics on ancient hardware _2x xeon Vs 2x P100_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 466 - A curiosity..md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 477 - DeepSeek-R1-0528 ik quants_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 491 - -rtr actually hurts prompt t_s for large ubatch_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 519 - Android Build.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 526 - Partial requant feature to save compute and time during tests..md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 532 - Guidance on GPU Layer Offloading Strategy in ik_llama.cpp for Multi GPU.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 543 - dots.llm1 support and thanks.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 545 - Vulkan support_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 548 - Poor performance with bf16 model on Qwen3 30B-A3B.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 556 - ik_llama.cpp for Armv8.0.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 562 - AMD GPU Vulkan _ ROCm_HIP Discussion.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 564 - Maybe an interesting CUDA PR here..md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 586 - Slow KV cache rm operation.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 590 - How important is Vulkan back-end development_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 591 - I dont see any speed improvement in generation_ so want to understand i.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 594 - Is AVX2 a hard requirement on x64_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 599 - mla matrix absorbtion.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 613 - Pathological Quant_CUDA combinations -- How to know what works_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 619 - gpu p2p utilization.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 621 - Deepseek v3_r1 poisoned prompt_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |
| 623 - Quantizing panels_bundles instead of blocks_.md | Add GitHub data: filename sanitization (#640) | 2025-07-23 13:31:53 +02:00 |