Default Branch

c96f608d98 · common: consolidate PEG string parsers (#20263) · Updated 2026-03-10 00:29:21 +01:00

Branches

f8e9f11428 · common : add -dkvc arg for enabling kv cache dumps · Updated 2023-11-23 17:47:56 +01:00

6715
4

f824902623 · YaRN : correction to GPT-NeoX implementation · Updated 2023-11-15 23:10:52 +01:00

6747
1

d0445a2eff · better documentation · Updated 2023-11-10 01:38:20 +01:00

6764
3

47d604fa2d · fix issues · Updated 2023-11-05 13:20:22 +01:00

6745
3

3ef358fffd · Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)" · Updated 2023-11-04 21:26:51 +01:00

6749
2

46868a499e · metal : multi-simd softmax · Updated 2023-11-01 20:16:34 +01:00

6774
1

a8796f9609 · llm : cleanup + comments · Updated 2023-11-01 19:08:02 +01:00

6783
4

7420bef83e · wip wip wip · Updated 2023-11-01 07:51:43 +01:00

6783
1

afb3929279 · Merge branch 'master' into llama-refactor · Updated 2023-10-31 19:35:31 +01:00

6785
21

29fe516913 · wip · Updated 2023-10-31 17:36:37 +01:00

6786
1

dab42893c9 · scripts : working curl pipe · Updated 2023-10-31 16:03:56 +01:00

6786
3

7923b70cb8 · llama : add llm_build_inp_embd helper · Updated 2023-10-31 15:43:08 +01:00

6791
37

4b3cb98d46 · ggml-impl : move extern "C" to start of file · Updated 2023-10-30 18:05:58 +01:00

6787
7

lto
bc28aaa8c2 · make : use -lfto=auto to avoid warnings and maintain perf · Updated 2023-10-30 15:00:53 +01:00

6787
5

15267192c0 · llama : refactor tensor offloading as callback · Updated 2023-10-29 12:04:36 +01:00

6791
15

8a86b95e87 · quantize : --pure option for disabling k-quant mixtures · Updated 2023-10-28 22:37:03 +02:00

6792
3

de7e0912b6 · convert : ignore tokens if their IDs are within [0, vocab_size) · Updated 2023-10-28 14:01:36 +02:00

6795
1

bbfc62ac2f · sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs · Updated 2023-10-28 13:04:57 +02:00

6803
3

cd3e20fb50 · cuda : fix multi-gpu with tensor cores · Updated 2023-10-27 22:11:50 +02:00

6802
3

49af767fad · build : add compile option to force use of MMQ kernels · Updated 2023-10-27 12:21:04 +02:00

6804
7