* ci : add github release job
This commit adds a GitHub Actions workflow to automate the release
process. Currently this will only create an archive of the sources for
ggml when a tag is pushed.
The motivation for this is that once we start releasing versions of ggml
using semantic versioning, it is useful to have the ggml sources
published as a GitHub release. This enables CMake users that use
`FetchContent` to specify the zip file directly instead of cloning the
repository.
Example usage with `FetchContent`:
```cmake
cmake_minimum_required(VERSION 3.14)
project(ggml_example)
set(CMAKE_CXX_STANDARD 17)
include(FetchContent)
FetchContent_Declare(ggml
    URL https://github.com/danbev/ggml/archive/refs/tags/v1.1.5-test.zip
    DOWNLOAD_EXTRACT_TIMESTAMP TRUE
)
FetchContent_MakeAvailable(ggml)
add_executable(ggml_example main.cpp)
target_link_libraries(ggml_example ggml)
```
And with the following `main.cpp` file:
```c++
#include <iostream>
#include <ggml.h>

int main() {
    std::cout << "GGML Version: " << ggml_version() << std::endl;
    return 0;
}
```
This could then be built using:
```console
$ cmake -S . -B build
$ cmake --build build
$ ./build/ggml_example
GGML Version: 0.0.2472
```
* scripts : update sync [no ci]
* ggml : move headers one up [no ci]
* files : reorganize + update CMake
ggml-ci
* cmake : build normal ggml library
ggml-ci
* cmake : link math library to test + remove ci for code cov
ggml-ci
* files : move public headers to include
ggml-ci
* ggml : support broadcasting in dim 0 in add and mul
* add cuda add/mul broadcast impl
add configurable eps to cuda norm
* add metal impl
ggml-ci
* deduplicate code in cuda impl
* try to optimize cuda impl
* ggml : support broadcasting in ggml_div
* test-backend-ops : allow filtering by op and backend
* ggml-cuda : add ggml_div impl
* ggml : add ggml_mul_mat_id, ggml_sort, ggml_top_k (CPU only)
* fix ggml_div threads
* fix ggml_div with accelerate
* ggml_sort -> ggml_argsort
* whatever
* actually fix accelerate div
* disable opencl ci
* ci : disable ctest error check temporarily until we fix backend ops test
* cmake : propagate GGML_USE_xxx compile flags with ggml target
* whisper : utilize new ggml_add broadcast for dim 0
* cmake : addendum to ee666ae9
* ggml_backend_graph_copy : fix leak
* ggml_cuda : add ggml_sum_rows impl
* metal : add ggml_div
* metal : add ggml_sum_rows
* ggml_cuda : add ggml_argsort impl
* move kernel
* metal : add ggml_argsort
* mul_mat_id : fix missing init task
* cuda/metal: fix argsort synchronization
* metal : add ggml_mul_mat_id
* ggml-cuda : add mul_mat_id for f16 + tensor cores
* test-backend-ops : add tests for quants mat mul
* ggml : fix q5_0 and q5_1 hist stats
* test-backend-ops : use smaller matrices to avoid automatic offloading, add mat-vec tests
* metal : fix alibi to match the CPU behavior
* metal : check dimensions in supports_op
* test-backend-ops : reduce error threshold for mat muls
* ggml-cuda : simplify dequantize funs, add supports_op by type for mul_mat_id
* ggml-cuda : support quantized types in mul_mat_id with cublas
* ggml-cuda : add fallback over CPU for mul_mat_id
* test-backend-ops : increase mul mat error threshold
* cleanup
ggml-ci
* test-backend-ops : fix usage
* cleanup
* ci : re-enable tests
* metal : fix compile warnings
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* ggml-backend update
* update metal backend
* show metal logs with ggml-backend
* move buffer types to functions
* cuda: add per-device backends
* cuda: add host buffer type
* fix metal build
* ggml_backend_alloc_ctx_tensors : ignore allocated tensors
* ggml_backend_compare_graph_backend fixes
* ci : try to fix metal build
* metal : first print device info, then build kernels
* ci : disable GGML_METAL on GitHub Actions
* test-backend-ops initial impl (unary and get_rows)
* more op tests
* cleanup
* print test params, add more tests cases for add and mul
* add tests for im2col
* better f16 init
* metal : add basic impl of supports_op
* add test for ggml_concat
* update im2col test params, show callstack with GGML_ASSERT on CUDA failures
* add more rope tests
* add more rope and mul_mat test cases
* add more get_rows test cases
ggml-ci
* add more norm and rms_norm test cases with different eps
* ci : fix metal resource path
ggml-ci
* tests : silence warning
* add ggml_backend_tensor_alloc and ggml_backend_view_init for initializing tensors without ggml-alloc
* add mul_mat test cases without dims 3 and 4
ggml-ci
* check for nans and infs
ggml-ci
* add diag_mask_inf test cases without dims 3 and 4
ggml-ci
* fix cuda leak while backend reg
* fix msvc issues
* remove backend_sched debug causes by default
* gpt-2 : increase graph size
ggml-ci
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Introduce GitHub Actions CI workflow for the ggml repo
This commit integrates a GitHub Actions CI workflow that compiles and tests the codebase on both Ubuntu 22.04 and macOS 12 Monterey. The workflow is triggered on pull requests against the main branch and on every push to the main branch.
To accommodate the resource constraints of the GitHub-hosted runners, a `GGML_NITER` environment variable is introduced so that tests can run within a reasonable time frame. `test-grad0.c` is modified to use this variable instead of `GGML_NLOOP`.
The workflow file includes:
- A build strategy for both Ubuntu and macOS.
- An environment setup with variables `GGML_NLOOP` and `GGML_NITER`.
- A step to limit the number of threads used by `test2.c` for efficient execution.
- A typical build process with steps for environment creation, CMake configuration, building, and verbose testing with a timeout.
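As a rough illustration of how a test could pick up an iteration count such as `GGML_NITER` from the environment, here is a minimal sketch. The helper names `parse_iterations` and `iterations_from_env`, the fallback behavior, and the upper bound of 1000000 are assumptions made for this example; they are not taken from `test-grad0.c`.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper, not part of ggml: parse an iteration count from the
// text of an environment variable. Returns `fallback` when the text is
// missing, empty, not a plain decimal number, or outside 1..1000000
// (the upper bound is an arbitrary choice for this sketch).
int parse_iterations(const char * text, int fallback) {
    if (text == nullptr || *text == '\0') {
        return fallback;
    }
    errno = 0;
    char * end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0 || value > 1000000) {
        return fallback;
    }
    return static_cast<int>(value);
}

// Look the variable up by name; std::getenv returns nullptr when it is unset.
int iterations_from_env(const char * name, int fallback) {
    return parse_iterations(std::getenv(name), fallback);
}
```

A test could then call something like `iterations_from_env("GGML_NITER", 1000)` once at startup (the default of 1000 is also only an example), so that CI can lower the count through the workflow environment without changing the test source.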
* main to master