Running the tests under the sanitizers is annoying
because of all the uninteresting reports about memory
leaked by the tests themselves.
Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
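One way to silence such end-of-process leak reports is LeakSanitizer's default-options hook; whether this change uses that mechanism is an assumption, but when the tests are built with -fsanitize=address it looks like this:

```c
#include <string.h>

// Sketch (assumption, not necessarily what this change does): opt the
// test binaries out of LeakSanitizer's end-of-process leak reports
// while keeping the other sanitizer checks. LSan calls this hook at
// startup; returning "detect_leaks=0" disables only leak detection.
const char *__lsan_default_options(void) {
    return "detect_leaks=0";
}
```

The same effect can be had per-run without rebuilding, via the ASAN_OPTIONS=detect_leaks=0 environment variable.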
* scripts : update sync [no ci]
* ggml : move headers one up [no ci]
* files : reorganize + update CMake
ggml-ci
* cmake : build normal ggml library
ggml-ci
* cmake : link math library to test + remove ci for code cov
ggml-ci
* files : move public headers to include
ggml-ci
* add new cuda kernels and new op ggml_pad
* add ggml_tanh cuda kernel
* remove old broadcast impl
* restore some changes
* cuda: optimized im2col + group_norm kernels
* extend ggml_leaky -> ggml_leaky_relu
* fix some code issues
* cuda: concat supports 4 dims
* cuda: fix ggml_acc + add backends ops test
* restore ggml_pad + add backend op test
* metal : implement GGML_OP_ACC
* ggml : fix bug in ggml_upscale
* metal : add ggml_upscale
* metal : add ggml_tanh
* metal : add ggml_gelu_quick
* ggml : make ggml_pad more general purpose
* metal : add ggml_pad
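What the pad op computes can be sketched for the 2D case as copying the source into the top-left corner of a zero-filled larger destination (illustrative names and contiguous layout, not the ggml implementation):

```c
#include <assert.h>

// Illustrative sketch of a 2D pad: dst is dw x dh, src is sw x sh,
// and every destination element outside the source region is zero.
static void pad_2d(float *dst, int dw, int dh,
                   const float *src, int sw, int sh) {
    for (int y = 0; y < dh; y++) {
        for (int x = 0; x < dw; x++) {
            dst[y * dw + x] = (x < sw && y < sh) ? src[y * sw + x] : 0.0f;
        }
    }
}
```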
* ggml_leaky_relu as regular op + fix indentation
* cuda: ggml_acc admits all op_params
* pass negative_slope as an op param
* metal : add ggml_leaky_relu
* metal : add ggml_group_norm
* cuda : minor
* ggml : add GGML_OP_LEAKY_RELU to ggml_compute_backward
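As a reference for the forward and backward bullets above, a minimal sketch of leaky_relu and its local gradient (illustrative, not the ggml kernels):

```c
#include <assert.h>

// Illustrative sketch: negative_slope controls the slope for x < 0,
// and the local derivative used in the backward pass is either 1
// or negative_slope depending on the sign of x.
static float leaky_relu(float x, float negative_slope) {
    return x >= 0.0f ? x : negative_slope * x;
}

static float leaky_relu_grad(float x, float negative_slope) {
    return x >= 0.0f ? 1.0f : negative_slope;
}
```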
* metal : soft max, tanh, supports_op fixes
* test-backend-ops : add sentinels between tensors to detect overflows
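The sentinel idea can be sketched as guard regions of a known byte pattern around each tensor buffer, checked after the ops run (hypothetical helper names, not the actual test-backend-ops code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helpers sketching the sentinel technique: surround each
// tensor buffer with guard bytes of a known pattern, then verify the
// pattern afterwards to catch out-of-bounds writes by the ops.
#define SENTINEL_SIZE 64
#define SENTINEL_BYTE 0x5A

// Allocate `size` usable bytes with a guard region on each side.
static unsigned char *alloc_with_sentinels(size_t size) {
    unsigned char *base = malloc(size + 2 * SENTINEL_SIZE);
    memset(base, SENTINEL_BYTE, SENTINEL_SIZE);                        // front guard
    memset(base + SENTINEL_SIZE + size, SENTINEL_BYTE, SENTINEL_SIZE); // back guard
    return base + SENTINEL_SIZE; // callers see only the usable region
}

// Return 1 if both guard regions are intact, 0 if something overflowed.
static int check_sentinels(const unsigned char *data, size_t size) {
    const unsigned char *base = data - SENTINEL_SIZE;
    for (size_t i = 0; i < SENTINEL_SIZE; i++) {
        if (base[i] != SENTINEL_BYTE) return 0;
        if (base[SENTINEL_SIZE + size + i] != SENTINEL_BYTE) return 0;
    }
    return 1;
}
```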
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
* ggml : support broadcasting in dim 0 in add and mul
* add cuda add/mul broadcast impl
add configurable eps to cuda norm
* add metal impl
ggml-ci
* deduplicate code in cuda impl
* try to optimize cuda impl
* ggml : support broadcasting in ggml_div
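The dim-0 broadcasting added above can be sketched on the CPU as an index-modulo repeat of src1 (illustrative; the real kernels handle all four dims and non-contiguous strides):

```c
#include <assert.h>
#include <stddef.h>

// Illustrative sketch of dim-0 broadcasting for an elementwise add:
// src0 has ne0 elements, src1 has ne1, ne0 is a multiple of ne1, and
// src1 is repeated along dim 0 via an index modulo.
static void add_bcast_dim0(float *dst, const float *src0, const float *src1,
                           size_t ne0, size_t ne1) {
    for (size_t i = 0; i < ne0; i++) {
        dst[i] = src0[i] + src1[i % ne1];
    }
}
```

mul and div follow the same indexing with the operator swapped.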
* test-backend-ops : allow filtering by op and backend
* ggml-cuda : add ggml_div impl
* ggml : add ggml_mul_mat_id, ggml_sort, ggml_top_k (CPU only)
* fix ggml_div threads
* fix ggml_div with accelerate
* ggml_sort -> ggml_argsort
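The rename reflects the semantics: argsort returns the permutation of indices that orders the values rather than the sorted values themselves. A minimal sketch (selection sort over indices, purely illustrative; the backends use their own sorting kernels):

```c
#include <assert.h>

// Illustrative sketch of argsort semantics: fill idx with 0..n-1, then
// order the indices so that values[idx[0]] <= values[idx[1]] <= ...
static void argsort_asc(const float *values, int *idx, int n) {
    for (int i = 0; i < n; i++) idx[i] = i;
    for (int i = 0; i < n; i++) {
        int best = i;
        for (int j = i + 1; j < n; j++) {
            if (values[idx[j]] < values[idx[best]]) best = j;
        }
        int tmp = idx[i]; idx[i] = idx[best]; idx[best] = tmp;
    }
}
```

ggml_top_k can then be viewed as taking the first k indices of a descending argsort.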
* whatever
* actually fix accelerate div
* disable opencl ci
* ci : disable ctest error check temporarily until we fix backend ops test
* cmake : propagate GGML_USE_xxx compile flags with ggml target
* whisper : utilize new ggml_add broadcast for dim 0
* cmake : addendum to ee666ae9
* ggml_backend_graph_copy : fix leak
* ggml_cuda : add ggml_sum_rows impl
* metal : add ggml_div
* metal : add ggml_sum_rows
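sum_rows reduces each row of a matrix to a scalar; a minimal sketch for a contiguous [rows x cols] layout (illustrative, not the backend kernels):

```c
#include <assert.h>

// Illustrative sketch of sum_rows: reduce each row of a rows x cols
// matrix to a single value, producing a rows x 1 result.
static void sum_rows(float *dst, const float *src, int rows, int cols) {
    for (int r = 0; r < rows; r++) {
        float s = 0.0f;
        for (int c = 0; c < cols; c++) s += src[r * cols + c];
        dst[r] = s;
    }
}
```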
* ggml_cuda : add ggml_argsort impl
* move kernel
* metal : add ggml_argsort
* mul_mat_id : fix missing init task
* cuda/metal: fix argsort synchronization
* metal : add ggml_mul_mat_id
* ggml-cuda : add mul_mat_id for f16 + tensor cores
* test-backend-ops : add tests for quants mat mul
* ggml : fix q5_0 and q5_1 hist stats
* test-backend-ops : use smaller matrices to avoid automatic offloading, add mat-vec tests
* metal : fix alibi to match the CPU behavior
* metal : check dimensions in supports_op
* test-backend-ops : reduce error threshold for mat muls
* ggml-cuda : simplify dequantize functions, add supports_op by type for mul_mat_id
* ggml-cuda : support quantized types in mul_mat_id with cublas
* ggml-cuda : add fallback over CPU for mul_mat_id
* test-backend-ops : increase mul mat error threshold
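The mul_mat_id op at the center of these bullets routes each input row to one of several "expert" matrices by id, the core operation of mixture-of-experts models. A minimal CPU sketch with an assumed row-major layout (hypothetical signature, not the ggml API):

```c
#include <assert.h>
#include <stddef.h>

// Illustrative sketch of mul_mat_id: n_expert weight matrices, each
// n_out x n_in, stored back to back in `experts`; for each input row,
// ids[r] selects the expert whose matrix multiplies that row.
static void mul_mat_id(float *dst, const float *experts, const int *ids,
                       const float *src, int n_rows, int n_in, int n_out) {
    for (int r = 0; r < n_rows; r++) {
        const float *w = experts + (size_t)ids[r] * n_out * n_in;
        for (int o = 0; o < n_out; o++) {
            float s = 0.0f;
            for (int i = 0; i < n_in; i++) {
                s += w[o * n_in + i] * src[r * n_in + i];
            }
            dst[r * n_out + o] = s;
        }
    }
}
```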
* cleanup
ggml-ci
* test-backend-ops : fix usage
* cleanup
* ci : re-enable tests
* metal : fix compile warnings
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* ggml-backend update
* update metal backend
* show metal logs with ggml-backend
* move buffer types to functions
* cuda: add per-device backends
* cuda: add host buffer type
* fix metal build
* ggml_backend_alloc_ctx_tensors : ignore allocated tensors
* ggml_backend_compare_graph_backend fixes
* ci : try to fix metal build
* metal : first print device info, then build kernels
* ci : disable GGML_METAL on Github Actions
* test-backend-ops initial impl (unary and get_rows)
* more op tests
* cleanup
* print test params, add more tests cases for add and mul
* add tests for im2col
* better f16 init
* metal : add basic impl of supports_op
* add test for ggml_concat
* update im2col test params, show callstack with GGML_ASSERT on CUDA failures
* add more rope tests
* add more rope and mul_mat test cases
* add more get_rows test cases
ggml-ci
* add more norm and rms_norm test cases with different eps
* ci : fix metal resource path
ggml-ci
* tests : silence warning
* add ggml_backend_tensor_alloc and ggml_backend_view_init for initializing tensors without ggml-alloc
* add mul_mat test cases without dims 3 and 4
ggml-ci
* check for nans and infs
ggml-ci
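The nan/inf check can be sketched as a finiteness scan over each output buffer, so a backend bug that produces NaN or Inf fails the test instead of comparing garbage (illustrative, not the actual test-backend-ops code):

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Illustrative sketch: return 1 iff every element is a finite float.
static int all_finite(const float *data, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (isnan(data[i]) || isinf(data[i])) return 0;
    }
    return 1;
}
```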
* add diag_mask_inf test cases without dims 3 and 4
ggml-ci
* fix cuda leak during backend registration
* fix msvc issues
* remove backend_sched debug output by default
* gpt-2 : increase graph size
ggml-ci
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* sync : whisper.cpp (whisper full GPU, fix warnings)
ggml-ci
* ci : enable CUDA / Metal
ggml-ci
* cuda : fall back to CPU for mul mat when ne03 != ne13 (fix SAM + CUDA)
ggml-ci
* add conv2d stage 0-1 cuda kernels
* add im2col + refactor conv1d and conv2d
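im2col rewrites convolution as a matrix multiply by gathering each window the kernel would read into a column; a minimal 1D sketch with stride 1 and no padding (illustrative, not the ggml implementation):

```c
#include <assert.h>

// Illustrative sketch of im2col for 1D convolution: each of the
// n_out = n_in - k + 1 output positions gets its own k-element window
// of the input, so the convolution reduces to one matrix multiply
// against the flattened kernel.
static void im2col_1d(float *cols, const float *src, int n_in, int k) {
    int n_out = n_in - k + 1;
    for (int i = 0; i < n_out; i++) {
        for (int j = 0; j < k; j++) {
            cols[i * k + j] = src[i + j];
        }
    }
}
```

The 2D case follows the same pattern with windows gathered across both spatial dims and channels.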
* fix params invalid index
* add conv1d and conv2d unit tests
* resolve wrong values and fix mul_mat validation
* improve tests + reduce code duplication
* add cuda kernels
* more data test
* fix ggml_op_count to 70
* add temp test - gemm != mul_mat
* tests : fix test-mul-mat matrix multiplication
* test-mul-mat match gemm == ggml_mul_mat with conv2d op
* replace gemm with ggml_mul_mat
* ggml_mul_mat cpu backend support fp16 src1
* ggml_mul_mat cuda backend fp16 fixed
* remove unnecessary ggml_cont and deprecated conv1d/conv2d functions
* some fixes
* explain conv1d reshapes
* ggml : fix tests on Arm + do not use BLAS for F16 data
* tests : fix FP16 handling on Arm
* ggml : avoid ggml_cont and ggml_transpose in ggml_conv_xd
* ci : switch back to release
* cuda : fix wrong pointer usage
* ggml : add metal support for im2col and f16xf16 mul mat
* ggml : im2col opts
* Update src/ggml-cuda.cu
Co-authored-by: slaren <slarengh@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>