llama.cpp/docs/ops.md
Reese Levine a89002f07b
ggml webgpu: support for backend sampling (#18880)
* ggml webgpu: add SOFTPLUS unary operator

Implements SOFTPLUS (log(1 + exp(x))) with f16/f32 support. Uses f32
precision for intermediate calculations to prevent f16 overflow.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* Follow Vulkan backend numerical stability pattern

* ggml webgpu: add EXPM1 unary operator

Implements EXPM1 (exp(x) - 1) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add FLOOR unary operator

Implements FLOOR (rounds down to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add CEIL unary operator

Implements CEIL (rounds up to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add ROUND unary operator

Implements ROUND (rounds to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add TRUNC unary operator

Implements TRUNC (truncates towards zero) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* docs : update WebGPU support for unary operators (FLOOR, CEIL, ROUND, TRUNC, EXPM1, SOFTPLUS)

* Updates to webgpu get_memory

* Add argmax

* Add argmax, cumsum, sum, sum_rows

* Add necessary CPY/GET_ROWS operators

* Support for argsort using multi-pass strategy

* Update set_rows for i32 indices, move to pre-wgsl

* Port unary operators to pre-wgsl and support FILL

* Implement PAD

* Add support for top-k

* clean up, scope pipeline init mutex

* fix newline

* Add support for log

* Update LOG for better precision, and ops doc

---------

Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com>
2026-01-16 16:12:43 -08:00


GGML Operations

List of GGML operations and backend support status.

How to add a backend to this table:

  1. Run `test-backend-ops support --output csv` with your backend name and redirect the output to a CSV file in `docs/ops/` (e.g., `docs/ops/CUDA.csv`)
  2. Regenerate `docs/ops.md` via `./scripts/create_ops_docs.py`
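Concretely, the two steps above might look like the following; the `build/bin` binary path assumes a standard CMake build and CUDA is used only as an example backend name:

```shell
# Step 1: dump the backend's support matrix to CSV
# (adjust the binary path to your local build directory).
./build/bin/test-backend-ops support --output csv > docs/ops/CUDA.csv

# Step 2: regenerate docs/ops.md from the CSV files in docs/ops/.
./scripts/create_ops_docs.py
```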

Legend:

  • ✅ Fully supported by this backend
  • 🟡 Partially supported by this backend
  • ❌ Not supported by this backend
Operation BLAS CANN CPU CUDA Metal OpenCL SYCL Vulkan WebGPU ZenDNN zDNN
ABS 🟡 🟡 🟡
ACC
ADD 🟡
ADD1
ADD_ID
ARANGE
ARGMAX
ARGSORT 🟡 🟡
CEIL 🟡 🟡 🟡
CLAMP 🟡 🟡 🟡
CONCAT 🟡 🟡
CONT 🟡 🟡 🟡 🟡
CONV_2D
CONV_2D_DW
CONV_3D
CONV_TRANSPOSE_1D
CONV_TRANSPOSE_2D
COS 🟡 🟡
COUNT_EQUAL
CPY 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
CROSS_ENTROPY_LOSS
CROSS_ENTROPY_LOSS_BACK
CUMSUM
DIAG
DIAG_MASK_INF 🟡
DIV 🟡
DUP 🟡 🟡 🟡
ELU 🟡 🟡
EXP 🟡 🟡 🟡
EXPM1 🟡 🟡
FILL
FLASH_ATTN_EXT 🟡 🟡 🟡 🟡 🟡 🟡
FLOOR 🟡 🟡 🟡
GATED_LINEAR_ATTN
GEGLU 🟡 🟡
GEGLU_ERF 🟡 🟡
GEGLU_QUICK 🟡 🟡
GELU 🟡 🟡 🟡 🟡
GELU_ERF 🟡 🟡 🟡 🟡
GELU_QUICK 🟡 🟡 🟡 🟡
GET_ROWS 🟡 🟡 🟡 🟡 🟡 🟡
GET_ROWS_BACK 🟡 🟡
GROUP_NORM
HARDSIGMOID 🟡 🟡 🟡
HARDSWISH 🟡 🟡 🟡
IM2COL
IM2COL_3D
L2_NORM
LEAKY_RELU 🟡 🟡
LOG 🟡
MEAN
MUL 🟡
MUL_MAT 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
MUL_MAT_ID 🟡 🟡 🟡
NEG 🟡 🟡 🟡
NORM 🟡
OPT_STEP_ADAMW
OPT_STEP_SGD
OUT_PROD 🟡 🟡 🟡 🟡 🟡 🟡
PAD 🟡 🟡 🟡 🟡 🟡
PAD_REFLECT_1D
POOL_1D
POOL_2D 🟡
REGLU 🟡 🟡
RELU 🟡 🟡 🟡 🟡
REPEAT 🟡 🟡 🟡
REPEAT_BACK
RMS_NORM
RMS_NORM_BACK
ROLL
ROPE
ROPE_BACK
ROUND 🟡 🟡 🟡
RWKV_WKV6
RWKV_WKV7
SCALE 🟡
SET 🟡
SET_ROWS 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
SGN 🟡 🟡
SIGMOID 🟡 🟡 🟡 🟡
SILU 🟡 🟡 🟡 🟡
SILU_BACK
SIN 🟡 🟡
SOFTPLUS 🟡 🟡 🟡
SOFT_MAX 🟡
SOFT_MAX_BACK 🟡 🟡 🟡
SOLVE_TRI 🟡 🟡
SQR 🟡 🟡
SQRT 🟡 🟡
SSM_CONV
SSM_SCAN 🟡
STEP 🟡 🟡 🟡
SUB 🟡
SUM 🟡 🟡 🟡 🟡 🟡 🟡
SUM_ROWS 🟡 🟡 🟡
SWIGLU 🟡 🟡
SWIGLU_OAI 🟡
TANH 🟡 🟡 🟡
TIMESTEP_EMBEDDING
TOP_K 🟡
TRI
TRUNC 🟡 🟡 🟡
UPSCALE 🟡 🟡 🟡 🟡 🟡
XIELU