llama.cpp/docs/ops.md
Reese Levine a89002f07b
ggml webgpu: support for backend sampling (#18880)
* ggml webgpu: add SOFTPLUS unary operator

Implements SOFTPLUS (log(1 + exp(x))) with f16/f32 support. Uses f32
precision for intermediate calculations to prevent f16 overflow.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* Follow Vulkan backend numerical stability pattern

* ggml webgpu: add EXPM1 unary operator

Implements EXPM1 (exp(x) - 1) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add FLOOR unary operator

Implements FLOOR (rounds down to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add CEIL unary operator

Implements CEIL (rounds up to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add ROUND unary operator

Implements ROUND (rounds to nearest integer) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* ggml webgpu: add TRUNC unary operator

Implements TRUNC (truncates towards zero) with f16/f32 support.

* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support

* docs : update WebGPU support for unary operators (FLOOR, CEIL, ROUND, TRUNC, EXPM1, SOFTPLUS)

* Updates to webgpu get_memory

* Add argmax

* Add argmax, cumsum, sum, sum_rows

* Add necessary CPY/GET_ROWS operators

* Support for argsort using multi-pass strategy

* Update set_rows for i32 indices, move to pre-wgsl

* Port unary operators to pre-wgsl and support FILL

* Implement PAD

* Add support for top-k

* clean up, scope pipeline init mutex

* fix newline

* Add support for log

* Update LOG for better precision, and ops doc

---------

Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com>
2026-01-16 16:12:43 -08:00


GGML Operations

List of GGML operations and backend support status.

How to add a backend to this table:

  1. Run `test-backend-ops support --output csv` with your backend name and redirect the output to a CSV file in `docs/ops/` (e.g., `docs/ops/CUDA.csv`)
  2. Regenerate `docs/ops.md` via `./scripts/create_ops_docs.py`
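Concretely, the two steps above might look like the following; the `build/bin` binary path assumes a standard CMake build and CUDA is used only as an example backend name:

```shell
# Step 1: dump the backend's support matrix to CSV
# (adjust the binary path to your local build directory).
./build/bin/test-backend-ops support --output csv > docs/ops/CUDA.csv

# Step 2: regenerate docs/ops.md from the CSV files in docs/ops/.
./scripts/create_ops_docs.py
```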

Legend:

  • ✅ Fully supported by this backend
  • 🟡 Partially supported by this backend
  • ❌ Not supported by this backend
Operation BLAS CANN CPU CUDA Metal OpenCL SYCL Vulkan WebGPU ZenDNN zDNN
ABS 🟡 🟡 🟡
ACC
ADD 🟡
ADD1
ADD_ID
ARANGE
ARGMAX
ARGSORT 🟡 🟡
CEIL 🟡 🟡 🟡
CLAMP 🟡 🟡 🟡
CONCAT 🟡 🟡
CONT 🟡 🟡 🟡 🟡
CONV_2D
CONV_2D_DW
CONV_3D
CONV_TRANSPOSE_1D
CONV_TRANSPOSE_2D
COS 🟡 🟡
COUNT_EQUAL
CPY 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
CROSS_ENTROPY_LOSS
CROSS_ENTROPY_LOSS_BACK
CUMSUM
DIAG
DIAG_MASK_INF 🟡
DIV 🟡
DUP 🟡 🟡 🟡
ELU 🟡 🟡
EXP 🟡 🟡 🟡
EXPM1 🟡 🟡
FILL
FLASH_ATTN_EXT 🟡 🟡 🟡 🟡 🟡 🟡
FLOOR 🟡 🟡 🟡
GATED_LINEAR_ATTN
GEGLU 🟡 🟡
GEGLU_ERF 🟡 🟡
GEGLU_QUICK 🟡 🟡
GELU 🟡 🟡 🟡 🟡
GELU_ERF 🟡 🟡 🟡 🟡
GELU_QUICK 🟡 🟡 🟡 🟡
GET_ROWS 🟡 🟡 🟡 🟡 🟡 🟡
GET_ROWS_BACK 🟡 🟡
GROUP_NORM
HARDSIGMOID 🟡 🟡 🟡
HARDSWISH 🟡 🟡 🟡
IM2COL
IM2COL_3D
L2_NORM
LEAKY_RELU 🟡 🟡
LOG 🟡
MEAN
MUL 🟡
MUL_MAT 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
MUL_MAT_ID 🟡 🟡 🟡
NEG 🟡 🟡 🟡
NORM 🟡
OPT_STEP_ADAMW
OPT_STEP_SGD
OUT_PROD 🟡 🟡 🟡 🟡 🟡 🟡
PAD 🟡 🟡 🟡 🟡 🟡
PAD_REFLECT_1D
POOL_1D
POOL_2D 🟡
REGLU 🟡 🟡
RELU 🟡 🟡 🟡 🟡
REPEAT 🟡 🟡 🟡
REPEAT_BACK
RMS_NORM
RMS_NORM_BACK
ROLL
ROPE
ROPE_BACK
ROUND 🟡 🟡 🟡
RWKV_WKV6
RWKV_WKV7
SCALE 🟡
SET 🟡
SET_ROWS 🟡 🟡 🟡 🟡 🟡 🟡 🟡 🟡
SGN 🟡 🟡
SIGMOID 🟡 🟡 🟡 🟡
SILU 🟡 🟡 🟡 🟡
SILU_BACK
SIN 🟡 🟡
SOFTPLUS 🟡 🟡 🟡
SOFT_MAX 🟡
SOFT_MAX_BACK 🟡 🟡 🟡
SOLVE_TRI 🟡 🟡
SQR 🟡 🟡
SQRT 🟡 🟡
SSM_CONV
SSM_SCAN 🟡
STEP 🟡 🟡 🟡
SUB 🟡
SUM 🟡 🟡 🟡 🟡 🟡 🟡
SUM_ROWS 🟡 🟡 🟡
SWIGLU 🟡 🟡
SWIGLU_OAI 🟡
TANH 🟡 🟡 🟡
TIMESTEP_EMBEDDING
TOP_K 🟡
TRI
TRUNC 🟡 🟡 🟡
UPSCALE 🟡 🟡 🟡 🟡 🟡
XIELU