# Diffusion Text Generation

This directory contains implementations for Diffusion LLMs (DLLMs).

More Info:
- https://github.com/ggml-org/llama.cpp/pull/14644
- https://github.com/ggml-org/llama.cpp/pull/14771

## Parameters
The diffusion CLI supports various parameters to control the generation process:

### Core Diffusion Parameters
- `--diffusion-steps`: Number of diffusion steps (default: 256)
- `--diffusion-algorithm`: Algorithm for token selection
  - `0`: ORIGIN - Tokens are generated in a purely random order, as described in https://arxiv.org/abs/2107.03006.
  - `1`: ENTROPY_BASED - Entropy-based selection
  - `2`: MARGIN_BASED - Margin-based selection
  - `3`: RANDOM - Random selection
  - `4`: CONFIDENCE_BASED - Confidence-based selection (default)
  - More documentation: https://github.com/DreamLM/Dream
- `--diffusion-visual`: Enable live visualization during generation
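
To illustrate how the score-based selection algorithms listed above can differ, here is a minimal Python sketch. It is not taken from llama.cpp: the function names, the `dists` data, and the exact scoring formulas are assumptions for illustration, loosely following the ideas in the Dream repository linked above. Each function scores a masked position from its predicted token distribution, and the positions with the highest scores would be unmasked first.

```python
import math

def confidence_score(probs):
    """Probability of the most likely token (assumed CONFIDENCE_BASED-style score)."""
    return max(probs)

def margin_score(probs):
    """Gap between the two most likely tokens (assumed MARGIN_BASED-style score)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1] if len(top) > 1 else top[0]

def neg_entropy_score(probs, eps=1e-10):
    """Negative Shannon entropy: closer to 0 means a more peaked distribution."""
    return sum(p * math.log(p + eps) for p in probs)

def pick_positions(dists, score_fn, k):
    """Return the k masked positions whose distributions score highest."""
    ranked = sorted(dists, key=lambda pos: score_fn(dists[pos]), reverse=True)
    return sorted(ranked[:k])

# Hypothetical per-position token distributions for three masked positions.
dists = {
    0: [0.90, 0.05, 0.05],
    1: [0.50, 0.45, 0.05],
    2: [0.60, 0.20, 0.20],
}

for name, fn in [("confidence", confidence_score),
                 ("margin", margin_score),
                 ("entropy", neg_entropy_score)]:
    print(name, pick_positions(dists, fn, 2))
```

With these made-up distributions, the confidence and margin scores both prefer positions 0 and 2, while the entropy score prefers positions 0 and 1, which shows how the choice of `--diffusion-algorithm` could change the order in which tokens are committed.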

### Scheduling Parameters
Choose one of the following scheduling methods:

**Timestep-based scheduling:**
- `--diffusion-eps`: Epsilon value for timestep scheduling (e.g., 0.001)
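
A rough sketch of what a timestep-based schedule could look like, assuming a linear timestep that runs from 1 down to `eps` over the configured number of steps, in the style of the Dream reference sampler. The function name `transfer_counts` and the rounding behaviour are assumptions for illustration, and the actual CLI may compute this differently.

```python
def transfer_counts(num_masked, steps, eps):
    """Number of masked tokens to reveal at each step (illustrative only)."""
    counts = []
    remaining = num_masked
    for step in range(steps):
        t = 1.0 - (step / steps) * (1.0 - eps)        # current timestep
        s = 1.0 - ((step + 1) / steps) * (1.0 - eps)  # next timestep
        # Reveal a fraction (1 - s/t) of what is still masked;
        # the final step reveals everything that is left.
        fraction = 1.0 - s / t if step < steps - 1 else 1.0
        n = int(remaining * fraction)
        counts.append(n)
        remaining -= n
    return counts

counts = transfer_counts(num_masked=64, steps=8, eps=0.001)
print(counts, sum(counts))
```

Under this assumption, `eps` is the timestep value reached after the last step, so the schedule ends just above zero rather than at zero.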

**Block-based scheduling:**
- `--diffusion-block-length`: Block size for block-based scheduling (e.g., 32)
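
For comparison, a block-based schedule can be sketched as follows. This is an assumption modeled on the general idea of generating left to right in fixed-size blocks, not code from llama.cpp; `block_schedule` and `gen_length` are hypothetical names. The total number of steps is divided evenly among the blocks, and each block's tokens are spread as evenly as possible over its share of steps.

```python
def block_schedule(gen_length, block_length, steps):
    """Return (start, end, tokens_per_step) for each block (illustrative only)."""
    if gen_length % block_length != 0:
        raise ValueError("gen_length must be a multiple of block_length")
    num_blocks = gen_length // block_length
    if steps % num_blocks != 0:
        raise ValueError("steps must be a multiple of the number of blocks")
    steps_per_block = steps // num_blocks
    # Spread block_length tokens over the block's steps; earlier steps
    # absorb the remainder when the division is not exact.
    base, extra = divmod(block_length, steps_per_block)
    per_step = [base + (1 if i < extra else 0) for i in range(steps_per_block)]
    return [(b * block_length, (b + 1) * block_length, per_step)
            for b in range(num_blocks)]

for start, end, per_step in block_schedule(gen_length=128, block_length=32, steps=16):
    print(start, end, per_step)
```

For example, `block_schedule(64, 32, 6)` yields two blocks of 32 positions, each filled over 3 steps with 11, 11, and 10 tokens.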

### Sampling Parameters
- `--temp`: Temperature for sampling (0.0 = greedy/deterministic, higher = more random)
- `--top-k`: Top-k filtering for sampling
- `--top-p`: Top-p (nucleus) filtering for sampling
- `--seed`: Random seed for reproducibility
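
The interaction of these sampling parameters can be sketched in plain Python. This is a generic illustration of temperature, top-k, and top-p sampling, not the sampler used by the CLI; the function name `sample` and the order in which the filters are applied are assumptions.

```python
import math
import random

def sample(logits, temp=1.0, top_k=0, top_p=1.0, seed=None):
    """Pick a token index from raw logits (illustrative only)."""
    if temp <= 0.0:
        # Greedy: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax (shifted by the max for numerical stability).
    m = max(logits)
    weights = [math.exp((x - m) / temp) for x in logits]
    total = sum(weights)
    probs = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)
    # Top-k: keep only the k most likely tokens, then renormalize.
    if top_k > 0:
        probs = probs[:top_k]
        mass = sum(p for p, _ in probs)
        probs = [(p / mass, i) for p, i in probs]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.choices([i for _, i in kept], weights=[p for p, _ in kept])[0]

logits = [1.0, 3.0, 2.0]
print(sample(logits, temp=0.0))
print(sample(logits, temp=0.8, top_k=2, top_p=0.95, seed=42))
```

Calling `sample` twice with the same seed and settings returns the same index, which mirrors the reproducibility that `--seed` is meant to provide.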

### Model Parameters
- `-m`: Path to the GGUF model file
- `-p`: Input prompt text
- `-ub`: Maximum sequence length (ubatch size)
- `-c`: Context size
- `-b`: Batch size

### Examples
#### Dream architecture:
```
llama-diffusion-cli -m dream7b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-eps 0.001 --diffusion-algorithm 3 --diffusion-steps 256 --diffusion-visual
```

#### LLaDA architecture:
```
llama-diffusion-cli -m llada-8b.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-block-length 32 --diffusion-steps 256 --diffusion-visual
```

#### RND1 architecture:
```
llama-diffusion-cli -m RND1-Base-0910.gguf -p "write code to train MNIST in pytorch" -ub 512 --diffusion-algorithm 1 --diffusion-steps 256 --diffusion-visual --temp 0.5 --diffusion-eps 0.001
```