16 Commits

Author SHA1 Message Date
StyMaar
978f6e1993
Update gguf specification to synchronize the ggml_types declaration shown in the doc with the actual one. (#1342)
BF16, TQ1_0, TQ2_0 and MXFP4 were missing from the enum declaration in the spec.
2025-09-16 13:42:24 +02:00
Georgi Gerganov
d50c1fff85
media : rm logos (#1203)
* media : add

* media : experiment with cards

* media : add common.sh

* cards : fix frame and shadow

* cards : adjustments

* cards : add qwen3

* media : rm
2025-04-30 08:05:22 +03:00
Brian
f18a111977
gguf.md: naming convention synced to llama.cpp (#896)
It is now updated to this form

`<BaseName><SizeLabel><FineTune><Version><Encoding><Type><Shard>.gguf`
2024-07-22 13:25:01 +03:00
Brian
a44ac4e48e
gguf.md: kv store has new authorship metadata keys (#897)
2024-07-21 11:20:30 +03:00
compilade
8d6b703887
gguf : use Qn_K for k-quants instead of KQn (#837)
2024-05-24 23:58:29 +03:00
Brian
0cbb7c0e05
gguf.md: add sharding to naming convention (#826)
* gguf.md: add sharding to naming convention [no ci]

* gguf.md: Add note on using gguf metadata for model name, version and expert count [no ci]

* gguf.md: Tighten up wording and add regex example [no ci]

* gguf.md: json output for expertcount and shard is numerical [no ci]
2024-05-19 11:05:26 +03:00
Brian
9988298d9a
gguf.md: Add GGUF Naming Convention Section (#822)
* gguf.md: Add GGUF Naming Convention Section

* gguf.md: add BF16

* gguf.md: GGUF Filename Parsing Strategy

* gguf.md: include tensor type table and historical context

* gguf.md: minor corrections

* gguf.md: more detailed breakdown of tensor type mapping

* gguf.md: use Encoding Scheme name instead

* gguf.md: minor correction to overall naming convention

* gguf.md: simplify GGUF Naming Convention
2024-05-17 09:09:01 +03:00
Daniel Bevenius
e1daebbf9d
spec : fix typo in gguf.md (#798)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-04-18 19:47:17 +03:00
Georgi Gerganov
48d64d932c
logo : add files (#782)
2024-04-03 22:59:55 +03:00
JacobLinCool
795de7bc58
gguf : update type enum (#775)
* spec: add missing semicolons in GGUF structs

Co-Authored-By: 郝東彥 Arthur Hao <41247050s@gapps.ntnu.edu.tw>

* spec: update GGUF tensor types

---------

Co-authored-by: 郝東彥 Arthur Hao <41247050s@gapps.ntnu.edu.tw>
2024-03-27 19:48:56 +02:00
Georgi Gerganov
ef085b5a8c
spec : add GGUF diagram (#765)
2024-03-15 14:10:35 +02:00
compilade
9c2adc4962
gguf : add Mamba keys and tensors (#763)
2024-03-13 16:33:19 +02:00
postmasters
9e221033f3
gguf : add keys for kv sizes to spec (#676)
* Add keys for kv sizes to GGUF spec

* Fix types of key_length and value_length
2024-01-05 17:25:38 +02:00
ariez-xyz
a027a92c1d
gguf : document Mixtral changes in spec (#646)
* add new tensor names

* add new keys

* fix tensor names

* gguf : change wording a bit

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-12-13 14:01:31 +02:00
slaren
57c468b865
gguf : add tokenizer.chat_template documentation (#616)
2023-11-19 10:37:08 +02:00
Philpax
54cec9f619
gguf : add file format specification (#302)
* docs: gguf spec first pass

* docs(gguf): update with review comments

* docs(gguf): update with review comments

* docs(gguf): quant version optional for unquant

* docs(gguf): normalize naming, add whisper

* docs(gguf): more review updates

* docs(gguf): add norm eps and added_tokens

* docs(gguf): move padding

* docs(gguf): remove migration tool

* docs(gguf): make offset base explicit

* docs(gguf): fix replace oops

* docs(gguf): alignment metadata+tensor name len max

* docs(gguf): clarification, fixes, tensor names

* docs(gguf): clarify license

* docs(gguf): minor tweaks

* docs(gguf): data layout, GQA eq, no ft, LE GGUF

* docs(gguf): fix magic order

* docs(gguf): match impl

* docs(gguf): specify fallback alignment

* docs(gguf): remove TensorInfo::n_elements

* docs(gguf): filetype, rope base/linear scale

* docs(gguf): v2 - uint64 all the things

* docs(gguf): tweak extensibility wording

* docs(gguf): fix spec discrepancies

* docs(gguf): v3 + other fixes

* fix(editorconfig): use 2-space tabs for markdown

* docs(gguf): clarify big-endian
2023-11-01 19:01:49 +02:00