mirror of
https://github.com/ggerganov/llama.cpp
synced 2026-04-09 00:35:42 +02:00
* New Feature:
1. Sum_Rows:
fix cuda kernel overflow
fix block shape error when nrows too big
2. Im2Col:
Support Batch in cuda
Support f32 to f32 on both CPU and CUDA
3. DepthWiseConv:
Supported via Im2Col and MulMat
4. Pool_2d:
Support avg pooling in CUDA
5. HardSigmoid:
Implement in CUDA
6. HardSwish:
Implement in CUDA
* fix tabs instead of spaces
* code clean
* CUDA POOL2D
* ADD POOL2D test case in test-backend-ops.cpp
* code clean
* fix pool2d_kernel
nits
* fix bug in pool2d kernel
* fix avg pooling, count_include_pad
nits
* test-backend-ops : add more pool_2d tests
* cuda : fix warnings and formatting
* ggml : check types in release builds too in pool_2d
* test-backend-ops : remove f16 pool_2d tests
* cuda : more style fixes
* Add assert in ggml_cuda_op_pool2d
* pool2d float padding fallback
* test-backend-ops : add dst_type to im2col
---------
Co-authored-by: slaren <slarengh@gmail.com>
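
The commit message mentions a fix to average pooling involving count_include_pad but does not spell out the semantics. The following is a minimal CPU reference sketch, not the code from this commit. It assumes count_include_pad has the meaning it has in other frameworks (when true, the divisor is always the full kernel area; when false, only in-bounds elements are counted) and it assumes the usual floor-based output size formula. The names pool_out_size and avg_pool_2d are hypothetical and do not exist in ggml.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a ggml function): output length along one axis
// for kernel size k, stride s and symmetric padding p, using floor division.
static int pool_out_size(int in, int k, int s, int p) {
    return (in + 2 * p - k) / s + 1;
}

// Reference average pooling over a single row-major h x w channel.
// Assumed semantics of count_include_pad:
//   true  -> divide each window sum by kh * kw (padding counts as zeros)
//   false -> divide by the number of in-bounds elements in the window
static std::vector<float> avg_pool_2d(const std::vector<float> & src, int h, int w,
                                      int kh, int kw, int sh, int sw, int ph, int pw,
                                      bool count_include_pad) {
    const int oh = pool_out_size(h, kh, sh, ph);
    const int ow = pool_out_size(w, kw, sw, pw);
    std::vector<float> dst(static_cast<std::size_t>(oh) * ow, 0.0f);

    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            float sum   = 0.0f;
            int   valid = 0;
            for (int ky = 0; ky < kh; ++ky) {
                const int iy = oy * sh - ph + ky;
                if (iy < 0 || iy >= h) continue;      // row lies in the padding
                for (int kx = 0; kx < kw; ++kx) {
                    const int ix = ox * sw - pw + kx;
                    if (ix < 0 || ix >= w) continue;  // column lies in the padding
                    sum += src[iy * w + ix];
                    ++valid;
                }
            }
            const int div = count_include_pad ? kh * kw : valid;
            dst[oy * ow + ox] = div > 0 ? sum / div : 0.0f;
        }
    }
    return dst;
}
```

For a 2x2 input {1, 2, 3, 4} with a 2x2 kernel, stride 1 and padding 1, the corner window contains a single in-bounds element, so the two modes differ there (1/4 versus 1/1), while the fully in-bounds center window gives the same result in both. A reference of this kind is one way to cross-check a GPU kernel against the CPU path, which is the role test-backend-ops.cpp plays in this directory.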
| Name |
|---|
| .gitignore |
| CMakeLists.txt |
| get-model.cpp |
| get-model.h |
| test-autorelease.cpp |
| test-backend-ops.cpp |
| test-c.c |
| test-double-float.cpp |
| test-grad0.cpp |
| test-grammar-parser.cpp |
| test-llama-grammar.cpp |
| test-model-load-cancel.cpp |
| test-opt.cpp |
| test-quantize-fns.cpp |
| test-quantize-perf.cpp |
| test-rope.cpp |
| test-sampling.cpp |
| test-tokenizer-0-falcon.cpp |
| test-tokenizer-0-falcon.py |
| test-tokenizer-0-llama.cpp |
| test-tokenizer-0-llama.py |
| test-tokenizer-1-bpe.cpp |
| test-tokenizer-1-llama.cpp |