Commit Graph

13 Commits

Author SHA1 Message Date
Georgi Gerganov
d3eec12d8f ci : fix workflow name 2025-02-27 13:12:11 +02:00
Georgi Gerganov
5094b69577 ci : remove opencl workflow 2024-12-03 21:05:37 +02:00
Georgi Gerganov
5378ea0d3c
ggml : reorganize source code + improve CMake (#865)
* scripts : update sync [no ci]

* ggml : move headers one up [no ci]

* files : reorganize + update CMake

ggml-ci

* cmake : build normal ggml library

ggml-ci

* cmake : link math library to test + remove ci for code cov

ggml-ci

* files : move public headers to include

ggml-ci
2024-06-26 19:33:53 +03:00
slaren
169738dc66 move BLAS to a separate backend (cont) (llama/6210)
ggml-ci
2024-06-16 18:33:49 +03:00
Georgi Gerganov
c57aa8e905
sync : llama.cpp (fused soft max, gpu cpy ops, etc.) (#640)
* sync : llama.cpp (fused soft max, gpu cpy ops, etc.)

ggml-ci

* cuda : restore accidentally deleted changes

ggml-ci

* cuda : fix rope + disable device-side dequantize

ggml-ci

* test-backend-ops : enable stablelm rope test

* cuda : remove rope assert

* sync.sh : add test-backend-ops

* ggml : fix ggml_concat + ggml_get_n_tasks logic

* sync : whisper.cpp

ggml-ci

* metal : fix assert

* ci : fix Metal path to shaders

ggml-ci

* whisper : fix bug if metal init fails

---------

Co-authored-by: slaren <slarengh@gmail.com>
2023-12-07 22:26:34 +02:00
slaren
703825ffab
ggml : full broadcast in mul, add, div + ggml_mul_mat_id, ggml_argsort, ggml_top_k (#625)
* ggml : support broadcasting in dim 0 in add and mul

* add cuda add/mul broadcast impl
add configurable eps to cuda norm

* add metal impl
ggml-ci

* deduplicate code in cuda impl

* try to optimize cuda impl

* ggml : support broadcasting in ggml_div

* test-backend-ops : allow filtering by op and backend

* ggml-cuda : add ggml_div impl

* ggml : add ggml_mul_mat_id, ggml_sort, ggml_top_k (CPU only)

* fix ggml_div threads

* fix ggml_div with accelerate

* ggml_sort -> ggml_argsort

* whatever

* actually fix accelerate div

* disable opencl ci

* ci : disable ctest error check temporarily until we fix backend ops test

* cmake : propagete GGML_USE_xxx compile flags with ggml target

* whisper : utlize new ggml_add broadcast for dim 0

* cmake : adendum to ee666ae9

* ggml_backend_graph_copy : fix leak

* ggml_cuda : add ggml_sum_rows impl

* metal : add ggml_div

* metal : add ggml_sum_rows

* ggml_cuda : add ggml_argsort impl

* move kernel

* metal : add ggml_argsort

* mul_mat_id : fix missing init task

* cuda/metal: fix argsort synchronization

* metal : add ggml_mul_mat_id

* ggml-cuda : add mul_mat_id for f16 + tensor cores

* test-backend-ops : add tests for quants mat mul

* ggml : fix q5_0 and q5_1 hist stats

* test-backend-ops : use smaller matrices to avoid automatic offloading, add mat-vec tests

* metal : fix alibi to match the CPU behavior

* metal : check dimensions in supports_op

* test-backend-ops : reduce error threshold for mat muls

* ggml-cuda : simplify dequantize funs, add supports_op by type for mul_mat_id

* ggml-cuda : support quantized types in mul_mat_id with cublas

* ggml-cuda : add fallback over CPU for mul_mat_id

* test-backend-ops : increase mul mat error threshold

* cleanup
ggml-ci

* test-backend-ops : fix usage

* cleanup

* ci : re-enable tests

* metal : fix compile warnings

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-12-05 13:56:07 +01:00
slaren
38f46afdf2
ggml-backend update: buffer types, backend registry, graph compare, tests (#620)
* ggml-backend update

* update metal backend

* show metal logs with ggml-backend

* move buffer types to functions

* cuda: add per-device backends

* cuda: add host buffer type

* fix metal build

* ggml_backend_alloc_ctx_tensors : ignore allocated tensors

* ggml_backend_compare_graph_backend fixes

* ci : try to fix metal build

* metal : first print device info, then build kernels

* ci : disable GGML_METAL on Github Actions

* test-backend-ops initial impl (unary and get_rows)

* more op tests

* cleanup

* print test params, add more tests cases for add and mul

* add tests for im2col

* better f16 init

* metal : add basic impl of supports_op

* add test for ggml_concat

* update im2col test params, show callstack with GGML_ASSERT on CUDA failures

* add more rope tests

* add more rope and mul_mat test cases

* add more get_rows test cases
ggml-ci

* add more norm and rms_norm test cases with different eps

* ci : fix metal resource path

ggml-ci

* tests : silence warning

* add ggml_backend_tensor_alloc and ggml_backend_view_init for initializing tensors without ggml-alloc

* add mul_mat test cases without dims 3 and 4
ggml-ci

* check for nans and infs
ggml-ci

* add diag_mask_inf test cases without dims 3 and 4
ggml-ci

* fix cuda leak while backend reg

* fix msvc issues

* remove backend_sched debug causes by default

* gpt-2 : increase graph size

ggml-ci

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-30 19:03:03 +01:00
Diogo
dd1d575956
ci : add Metal build (#514)
* metal on mac

* remove apt-get

* added xcrun prefix
2023-09-08 19:54:30 +03:00
Diogo
e77653a9d1
ci : add CLBlast build (#513)
* added clblast test to ci

* moved threads to env

* changed name

* upgraded checkout to v3
2023-09-08 18:07:53 +03:00
goerch
7b55e124e3
ggml : add coverage measurement for Clang, increase test coverage, F16 ggml_sum (#377)
* First shot at adding clang/llvm coverage analysis

* Fix for compiler dependency

* Reducing dimensions in test-opt

* cmake : try to fix test coverage build + CI

* cmake : fix CMAKE option + CI

* Adding some tests for half precision floating point tests

* Adding missing tests for unary operations

* Some more tests for unary operations

* Fix syntax error.

* Fix bug in relu derivative computation

* Revert testing change

* ggml : style fixes

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-23 19:35:43 +03:00
Georgi Gerganov
b10834c90e
tests : allow to set threads to test-grad0 2023-06-24 19:39:32 +03:00
Georgi Gerganov
0a63fc0f6c
ci : reduce GGML_NLOOP to 3 2023-06-19 21:28:16 +03:00
Adam Tazi
886f1c830b
ci : introduce Github Actions CI workflow (#247)
* Introduce Github Actions CI workflow for the ggml repo

This commit integrates a Github Actions CI workflow that compiles and tests the codebase on both Ubuntu 22.04 and macOS 12 Monterey. The workflow is triggered on pull requests against the main branch and on every push to the main branch.

To accommodate the resource constraints of the Github-hosted runners, a `GGML_NITER` environment variable is introduced, allowing tests to run within a reasonable time frame. `test-grad0.c` is modified to use this variable instead of `GGML_NLOOP`.

The workflow file includes:

- A build strategy for both Ubuntu and MacOS.
- An environment setup with variables `GGML_NLOOP` and `GGML_NITER`.
- A step to limit the number of threads used by `test2.c` for efficient execution.
- A typical build process with steps for environment creation, CMake configuration, building, and verbose testing with a timeout.

* main to master
2023-06-18 11:15:58 +03:00