Commit Graph

526 Commits

Author SHA1 Message Date
Sigbjørn Skjæret
037bfe38d0
ci : install spirv-headers for vulkan-cross (#22109) 2026-04-19 10:32:08 +03:00
Sigbjørn Skjæret
83d58e02fc
ci : free disk space for rocm release (#22012) 2026-04-18 09:37:30 +02:00
Reese Levine
45cac7ca70
ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052)
* Update workflows to remove dependence on llvmpipe

* Try setting Dawn_DIR

* remove c++20 initializers

* Move to proper guid

* Try avoiding segfaults on vulkan backend process exit

* Remove compiler warnings on parameter casting

* Fix soft_max and update reg_tile accumulation to f32 for better precision

* Refactor flash_attn a bit

* remove c++20 initializers and format

* Increase div precision for NVIDIA

* revert div precision and comment out ggml-ci node for now

* Formatting

* Try debugging on a failing CI node

* Revert "Try debugging on a failing CI node"

This reverts commit 1971e33cba.
2026-04-17 09:17:11 -07:00
Yuri Khrustalev
a279d0f0f4
ci : add android arm64 build and release (#21647)
* server: respect the ignore eos flag

* ci: add android arm64 build and release

* patch

* pin android-setup actions to v4

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* lf in the suggestion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-04-17 11:32:24 +02:00
Ludovic Henry
8612ed18b7
ci : Use ggml-org/ccache-action on RISC-V as well (#21632) 2026-04-16 11:11:25 +03:00
Ruben Ortlam
8dc530b86d
ci: disable test-backend-ops on Vulkan llvmpipe run and restore default timeout (#21901) 2026-04-15 10:55:21 +02:00
Jeff Bolz
1f30ac0cea
vulkan: Programmatically add RoundingModeRTE to all shaders when the device supports it (#21572)
* vulkan: Programmatically add RoundingModeRTE to all shaders when the device supports it

* use FetchContent to get SPIRV-Headers

* Fetch spirv-headers unconditionally

* remove fetchcontent, rely on installed headers

* fix ubuntu job

* Update docs/build.md
2026-04-14 15:17:45 +02:00
Georgi Gerganov
f4b5bf2f32
ci : re-enable mac workflows (#21894)
* ci : re-enable mac workflows

* vulkan : fix compile warning
2026-04-14 15:58:09 +03:00
Christian Kastner
a8bad3842e
ci: Also exempt 'security' tag from auto-close (#21844) 2026-04-14 01:18:44 +08:00
Marxist-Leninist
8a65a7a8ee
ci: drop v5 all: composition from labeler.yml (#21627)
actions/labeler@v6 removed the `all:` / `any:` composition keys.
The `server/webui` and `server` entries used `all:` to combine
`any-glob-to-any-file` with negated `all-globs-to-all-files`,
which now errors on every PR with:

    Unknown config options were under "changed-files": all

Flatten both entries to a single `any-glob-to-any-file`. PRs
touching both webui and other server files will now receive both
labels instead of only `server/webui`.

Co-authored-by: Marxist-Leninist <noreply@users.noreply.github.com>
2026-04-09 08:20:19 +02:00
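
The flattened `any-glob-to-any-file` behaviour described in the commit above can be illustrated with a minimal Python sketch. The label names come from the commit message; the glob paths and file names are hypothetical, since the commit message does not show them. Python's `fnmatch` is only an approximation of the glob matching that actions/labeler performs.

```python
from fnmatch import fnmatchcase

# Label names are taken from the commit message.
# The glob paths below are HYPOTHETICAL placeholders, not the real labeler.yml contents.
LABEL_GLOBS = {
    "server/webui": ["tools/server/webui/**"],
    "server": ["tools/server/**"],
}


def labels_for(changed_files):
    """Return every label for which at least one glob matches at least one
    changed file (the any-glob-to-any-file rule, approximated with fnmatch)."""
    return sorted(
        label
        for label, globs in LABEL_GLOBS.items()
        if any(fnmatchcase(path, glob) for path in changed_files for glob in globs)
    )


# A PR touching both webui files and other server files (hypothetical paths).
mixed_pr = [
    "tools/server/webui/src/App.svelte",
    "tools/server/server.cpp",
]
print(labels_for(mixed_pr))
```

With each entry reduced to a single independent rule, a mixed PR matches both entries, which is the "both labels" outcome the commit message describes.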
Aleksander Grygier
3bd9aa1f92
chore: Update labeler to have separate labels for server/webui and server changes (#21567) 2026-04-08 10:35:31 +02:00
Martin Klacer
5c4aae66e1
devops: kleidiai: provide KleidiAI-Enabled ARM Release Artifact (#21259)
* Unified macOS release setup with strategy-matrix block
* Added KleidiAI arm64 macOS release definition

Change-Id: I05520889ffc646488a178d06817a17f29274465a

Signed-off-by: Martin Klacer <martin.klacer@arm.com>
2026-04-08 13:06:12 +08:00
Ludovic Henry
761797ffdf
ci : use default RISE RISC-V Runners (#21263) 2026-04-05 20:29:48 +02:00
M1DNYT3
c08d28d088
ci: lower cuda12 floor to 12.8.1 for broader host compatibility (#21438)
Co-authored-by: M1DNYT3 <m1dnyt3@MacBookPro.lan>
2026-04-05 09:04:00 +08:00
Nicholas Sparks
661e9acb36
ci: fix vulkan workflow referencing non-existent action (#21442) 2026-04-05 08:59:51 +08:00
Masato Nakasaka
e439700992
ci: Add Windows Vulkan backend testing on Intel (#21292)
* experimenting CI

* Experimenting CI fix for MinGW

* experimenting CI on Windows

* modified script for integration with VisualStudio

* added proxy handling

* adding python version for Windows execution

* fix iterator::end() dereference

* fixed proxy handling

* Fix errors occurring on Windows

* fixed ci script

* Reverted to master

* Stripping test items to simplify Windows test

* adjusting script for windows testing

* Changed shell

* Fixed shell

* Fixed shell

* Fix CI setting

* Fix CI setting

* Fix CI setting

* Experimenting ci fix

* Experimenting ci fix

* Experimenting ci fix

* Experimenting ci fix

* experimenting fix for unit test error

* Changed to use BUILD_LOW_PERF to skip python tests

* Fix CI

* Added option to specify Ninja generator

* Reverted proxy related changes
2026-04-03 20:16:44 +03:00
M1DNYT3
277ff5fff7
docker : bump cuda12 to 12.9.1 (#20920)
Co-authored-by: M1DNYT3 <m1dnyt3@MacBookPro.lan>
Co-authored-by: CISC <CISC@users.noreply.github.com>
2026-04-03 15:06:45 +02:00
uvos
43a4ee4a2c
HIP: build each ci build test for a different architecture (#21337)
This helps improve our chances of finding build failures before the release workflow
builds for all architectures.
2026-04-03 11:38:22 +02:00
Vishal Singh
f49e917876
ci : add AMD ZenDNN label to PR labeler (#21345)
* ci : add AMD CPU label to PR labeler
Add automatic labeling for PRs that modify AMD CPU (ZenDNN) backend files

* ci : rename label AMD CPU to AMD ZenDNN in labeler config

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

---------

Co-authored-by: Aaron Teo <taronaeo@gmail.com>
2026-04-03 10:35:15 +08:00
Slobodan Josic
7c7d6ce5c7
[HIP] Bump ROCm version to 7.2.1 (#21066)
Bump ROCm version on Linux from 7.2 to 7.2.1
Add gfx1102 target
Delete LLVM workaround since ROCm 7.2.1 has fix for ROCm 7.2 perf regression https://github.com/ROCm/rocm-systems/issues/2865

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-04-03 00:59:20 +02:00
Nikhil Jain
5a0ed5150a
Update Dawn version in WebGPU CI (#20784)
* Pin Dawn version

* Update docs with new Dawn commit hash
2026-04-01 09:53:05 -07:00
Seungmin Kim
eec6f85d7b
CI: Enable CPU and Vulkan ARM64 Release (#21207) 2026-03-31 19:02:56 +08:00
Seungmin Kim
84ae8434d0
CI : Enable CUDA and Vulkan ARM64 runners and fix CI/CD (#21122)
* CI: Enable CUDA and Vulkan ARM64 runners and fix CI/CD

Co-authored-by: Ts-sound <44093942+Ts-sound@users.noreply.github.com>

* Obtain source tag name from git tag

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Ts-sound <44093942+Ts-sound@users.noreply.github.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-30 20:24:37 +02:00
Sigbjørn Skjæret
e2eb39e81c
ci : bump ty to 0.0.26 (#21156)
* fix incorrect type ignore comments

* bump ty to 0.0.26
2026-03-30 09:29:15 +02:00
Ts-sound
bf934f28db
docker : fix and enable ARM64 image build (#20929)
* CI: fix ARM64 image build error & enable compilation

* Update .github/workflows/docker.yml

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* CI: revert ggml/src/ggml-cpu/CMakeLists.txt

* Update .github/workflows/docker.yml

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* CI: update runs-on to ubuntu24.04, and update ARM64 build image (ubuntu_version: "24.04")

* CI: change cpu.Dockerfile gcc to 14

* CI : cpu.Dockerfile , update pip install .

* Update .github/workflows/docker.yml

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

---------

Co-authored-by: Aaron Teo <taronaeo@gmail.com>
2026-03-28 01:45:09 +01:00
KokerZhou
6861f6509a
CANN: update docker images to 8.5.0 and improve CANN.md (#20801)
* cann: update docker images to 8.5.0

- bump CANN base image from 8.3.rc2 to 8.5.0
- bump ASCEND_VERSION from 8.1.RC1.alpha001 to 8.5.0

Move to newer stable releases.

* cann: update CANN.md

* Update CANN.md to include BF16 support

Added BF16 support information to the CANN documentation and corrected formatting for the installation instructions.

* Fix formatting issues in CANN.md

Fix 234: Trailing whitespace
2026-03-27 08:53:00 +08:00
Xuan-Son Nguyen
8c60b8a2be
ci: pin external actions to exact commit SHA (#21033) 2026-03-26 20:44:00 +01:00
uvos
ec54ac13a8
ci : fix parsing of vgpr counts in hip-quality-check (#20987)
* scripts: hip: gcn-cdna-vgpr-check: fix parsing of vgpr counts when an amdclang Remark block is interleaved with another from a different process

* Return warning ignore

* obey pep8 double space before inline comments

* add # noqa: NP100 for other prints too

* Add script changes to cause autotrigger
2026-03-25 19:00:37 +01:00
Shreya Jain
345de3cd87
Use docker in build-android.yml (#20928)
* use docker instead of SDK separately

* fix whitespaces

* Update .github/workflows/build-android.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-25 09:36:27 -07:00
Masato Nakasaka
b2704f9028
ci: Allow ninja to be used during unit test (#20742)
* Remove make dependency

* Added option to specify Ninja generator

* use ninja-build as default for several CI

* Revert "use ninja-build as default for several CI"

This reverts commit f552c4559b.

* changed to use plain strings rather than arrays

* Enabled ninja build by default for experimentation

* ci: add run.sh to test conditions to trigger GitHub CI and self-hosted runners

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Enabled ninja build by default on self-hosted envs for experimentation

* ci: revert generator to ninja instead of ninja multi-config

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ci: install ninja-build for self-hosted workflows

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ci: revert ninja from self-hosted runners

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ci: missed one self-hosted step

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ci: fix windows ci errors from an erroneous revert

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Added explicit build types for Ninja

Also reverted some needless changes

* ci: use ninja multi-config for vulkan-x64 build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* added time command to measure build time

* Keeping some configs to use Ninja which show improvement

* minor fix based on review

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* ci: rm `time` from custom containers

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

---------

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
2026-03-25 21:00:49 +08:00
Georgi Gerganov
3fab96cd04
ci : disable self-hosted mac jobs (#20985) 2026-03-25 14:46:40 +02:00
Sigbjørn Skjæret
403c9c9cef
ci : bump gguf publish python version (#20982) 2026-03-25 11:04:59 +02:00
Sigbjørn Skjæret
8fc85db9d2
ci : limit requirements versions (#20980)
* set requests version

* limit versions outside requirements
2026-03-25 10:55:37 +02:00
Aaron Teo
c2e224d829
issues: add openvino backends (#20932)
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2026-03-24 14:41:10 +08:00
Xuan-Son Nguyen
bd6992180b
contrib: add "Requirements" section to PR template (#20841)
* contrib: add "Requirements" section to PR template

* typo [no ci]

* use h2, add "Additional information"

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-03-23 16:59:02 +01:00
Georgi Gerganov
e32d243849
ai : update gh permissions (#20895) 2026-03-23 13:21:41 +02:00
Sigbjørn Skjæret
29b28a9824
ci : switch from pyright to ty (#20826)
* type fixes

* switch to ty

* tweak rules

* tweak more rules

* more tweaks

* final tweak

* use common import-not-found rule
2026-03-21 08:54:34 +01:00
Georgi Gerganov
4cb7e0bd61
ai : limit runtime of the agent (#20816) 2026-03-20 20:31:25 +02:00
Georgi Gerganov
b31b30f31d
ai : do not run bash commands in the prompt (#20810) 2026-03-20 19:06:33 +02:00
Georgi Gerganov
464fd0e71f
ai : update find-related action (#20790)
* ai : update "related issues" prompt

* cont

* cont

* cont
2026-03-20 10:28:14 +02:00
Georgi Gerganov
6c72646a61
ci : improve action for duplicate issue (#20772)
* ci : show thinking traces of the agent

* cont : increase thinking

* cont : remove agent files

* cont : move the model selection to the provider
2026-03-19 21:11:53 +02:00
Georgi Gerganov
900efd531d
ci : clarify gh command for viewing issues (#20766) 2026-03-19 18:43:54 +02:00
uvos
b49d8b8757
ci : add hip quality check (#20430)
* CI: add hip quality check

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update .github/workflows/hip-quality-check.yml

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update scripts/hip/gcn-cdna-vgpr-check.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Revert "Update .github/workflows/hip-quality-check.yml"

This reverts commit efa0bfcdb01dfac0feee674987a0482d50f46145.

* scripts: gcn-cdna-vgpr-check.py: enforce int type for total_vgprs

* scripts: gcn-cdna-vgpr-check.py: add flash attention instances to ignore list

* Bump ccache version

* Add missing separators to list

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-19 17:05:44 +01:00
Georgi Gerganov
f071ce67c9
ci : add action for finding duplicate issues (#20756)
* ci : add action for finding duplicate issues

* cont : gen info

* cont : formatting

* cont : fix

* cont : instructions

* cont : bump checkout action

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-19 16:17:37 +02:00
Sigbjørn Skjæret
ab0bb93748
ci : bump ccache [no ci] (#20679)
* bump ccache

* forgotten

* disable for s390x

* disable also for ppc64le
2026-03-17 14:54:31 +01:00
Georgi Gerganov
45172df4d6
ci : disable AMX jobs (#20654)
[no ci]
2026-03-16 22:38:59 +02:00
Sigbjørn Skjæret
0ed992973b
ci : update labeler (#20629) 2026-03-16 20:24:20 +01:00
Sigbjørn Skjæret
b91d7dfe5b
ci : only save openvino caches on github-hosted master (#20593)
* only save openvino ccache on master

* disable toolkit cache if self-hosted

* only cache on github-hosted runners

* remove toolkit cache [no ci]
2026-03-15 18:58:13 +01:00
Georgi Gerganov
9cd4ebcfb1
ci : split build.yml + server.yml (#20546)
* ci : split build.yml

* cont : split server.yml

* cont : reduce paths

* cont : split build-android.yml + update paths

* ci : make msys workflows manual (#20588)

* ci : make cross-build workflows manual (#20585)

* cont : fix release paths

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-15 15:11:17 +02:00
Georgi Gerganov
b4768955c4
ci : move self-hosted workflows to separate files (#20540) 2026-03-14 23:15:35 +02:00