git/ggml

mirror of https://github.com/ggerganov/ggml synced 2026-03-01 20:50:26 +01:00

Author	SHA1	Message	Date
Georgi Gerganov	d3eec12d8f	ci : fix workflow name	2025-02-27 13:12:11 +02:00
Georgi Gerganov	5094b69577	ci : remove opencl workflow	2024-12-03 21:05:37 +02:00
Georgi Gerganov	5378ea0d3c	ggml : reorganize source code + improve CMake (#865 ) * scripts : update sync [no ci] * ggml : move headers one up [no ci] * files : reorganize + update CMake ggml-ci * cmake : build normal ggml library ggml-ci * cmake : link math library to test + remove ci for code cov ggml-ci * files : move public headers to include ggml-ci	2024-06-26 19:33:53 +03:00
slaren	169738dc66	move BLAS to a separate backend (cont) (llama/6210) ggml-ci	2024-06-16 18:33:49 +03:00
Georgi Gerganov	c57aa8e905	sync : llama.cpp (fused soft max, gpu cpy ops, etc.) (#640 ) * sync : llama.cpp (fused soft max, gpu cpy ops, etc.) ggml-ci * cuda : restore accidentally deleted changes ggml-ci * cuda : fix rope + disable device-side dequantize ggml-ci * test-backend-ops : enable stablelm rope test * cuda : remove rope assert * sync.sh : add test-backend-ops * ggml : fix ggml_concat + ggml_get_n_tasks logic * sync : whisper.cpp ggml-ci * metal : fix assert * ci : fix Metal path to shaders ggml-ci * whisper : fix bug if metal init fails --------- Co-authored-by: slaren <slarengh@gmail.com>	2023-12-07 22:26:34 +02:00
slaren	703825ffab	ggml : full broadcast in mul, add, div + ggml_mul_mat_id, ggml_argsort, ggml_top_k (#625 ) * ggml : support broadcasting in dim 0 in add and mul * add cuda add/mul broadcast impl add configurable eps to cuda norm * add metal impl ggml-ci * deduplicate code in cuda impl * try to optimize cuda impl * ggml : support broadcasting in ggml_div * test-backend-ops : allow filtering by op and backend * ggml-cuda : add ggml_div impl * ggml : add ggml_mul_mat_id, ggml_sort, ggml_top_k (CPU only) * fix ggml_div threads * fix ggml_div with accelerate * ggml_sort -> ggml_argsort * whatever * actually fix accelerate div * disable opencl ci * ci : disable ctest error check temporarily until we fix backend ops test * cmake : propagete GGML_USE_xxx compile flags with ggml target * whisper : utlize new ggml_add broadcast for dim 0 * cmake : adendum to ee666ae9 * ggml_backend_graph_copy : fix leak * ggml_cuda : add ggml_sum_rows impl * metal : add ggml_div * metal : add ggml_sum_rows * ggml_cuda : add ggml_argsort impl * move kernel * metal : add ggml_argsort * mul_mat_id : fix missing init task * cuda/metal: fix argsort synchronization * metal : add ggml_mul_mat_id * ggml-cuda : add mul_mat_id for f16 + tensor cores * test-backend-ops : add tests for quants mat mul * ggml : fix q5_0 and q5_1 hist stats * test-backend-ops : use smaller matrices to avoid automatic offloading, add mat-vec tests * metal : fix alibi to match the CPU behavior * metal : check dimensions in supports_op * test-backend-ops : reduce error threshold for mat muls * ggml-cuda : simplify dequantize funs, add supports_op by type for mul_mat_id * ggml-cuda : support quantized types in mul_mat_id with cublas * ggml-cuda : add fallback over CPU for mul_mat_id * test-backend-ops : increase mul mat error threshold * cleanup ggml-ci * test-backend-ops : fix usage * cleanup * ci : re-enable tests * metal : fix compile warnings --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-12-05 13:56:07 +01:00
slaren	38f46afdf2	ggml-backend update: buffer types, backend registry, graph compare, tests (#620 ) * ggml-backend update * update metal backend * show metal logs with ggml-backend * move buffer types to functions * cuda: add per-device backends * cuda: add host buffer type * fix metal build * ggml_backend_alloc_ctx_tensors : ignore allocated tensors * ggml_backend_compare_graph_backend fixes * ci : try to fix metal build * metal : first print device info, then build kernels * ci : disable GGML_METAL on Github Actions * test-backend-ops initial impl (unary and get_rows) * more op tests * cleanup * print test params, add more tests cases for add and mul * add tests for im2col * better f16 init * metal : add basic impl of supports_op * add test for ggml_concat * update im2col test params, show callstack with GGML_ASSERT on CUDA failures * add more rope tests * add more rope and mul_mat test cases * add more get_rows test cases ggml-ci * add more norm and rms_norm test cases with different eps * ci : fix metal resource path ggml-ci * tests : silence warning * add ggml_backend_tensor_alloc and ggml_backend_view_init for initializing tensors without ggml-alloc * add mul_mat test cases without dims 3 and 4 ggml-ci * check for nans and infs ggml-ci * add diag_mask_inf test cases without dims 3 and 4 ggml-ci * fix cuda leak while backend reg * fix msvc issues * remove backend_sched debug causes by default * gpt-2 : increase graph size ggml-ci --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-11-30 19:03:03 +01:00
Diogo	dd1d575956	ci : add Metal build (#514 ) * metal on mac * remove apt-get * added xcrun prefix	2023-09-08 19:54:30 +03:00
Diogo	e77653a9d1	ci : add CLBlast build (#513 ) * added clblast test to ci * moved threads to env * changed name * upgraded checkout to v3	2023-09-08 18:07:53 +03:00
goerch	7b55e124e3	ggml : add coverage measurement for Clang, increase test coverage, F16 ggml_sum (#377 ) * First shot at adding clang/llvm coverage analysis * Fix for compiler dependency * Reducing dimensions in test-opt * cmake : try to fix test coverage build + CI * cmake : fix CMAKE option + CI * Adding some tests for half precision floating point tests * Adding missing tests for unary operations * Some more tests for unary operations * Fix syntax error. * Fix bug in relu derivative computation * Revert testing change * ggml : style fixes --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-07-23 19:35:43 +03:00
Georgi Gerganov	b10834c90e	tests : allow to set threads to test-grad0	2023-06-24 19:39:32 +03:00
Georgi Gerganov	0a63fc0f6c	ci : reduce GGML_NLOOP to 3	2023-06-19 21:28:16 +03:00
Adam Tazi	886f1c830b	ci : introduce Github Actions CI workflow (#247 ) * Introduce Github Actions CI workflow for the ggml repo This commit integrates a Github Actions CI workflow that compiles and tests the codebase on both Ubuntu 22.04 and macOS 12 Monterey. The workflow is triggered on pull requests against the main branch and on every push to the main branch. To accommodate the resource constraints of the Github-hosted runners, a `GGML_NITER` environment variable is introduced, allowing tests to run within a reasonable time frame. `test-grad0.c` is modified to use this variable instead of `GGML_NLOOP`. The workflow file includes: - A build strategy for both Ubuntu and MacOS. - An environment setup with variables `GGML_NLOOP` and `GGML_NITER`. - A step to limit the number of threads used by `test2.c` for efficient execution. - A typical build process with steps for environment creation, CMake configuration, building, and verbose testing with a timeout. * main to master	2023-06-18 11:15:58 +03:00

13 Commits