* feat(mm): add UnknownModelConfig
* refactor(ui): move model categorisation-ish logic to central location, simplify model manager models list
* refactor(ui): more cleanup of model categories
* refactor(ui): remove unused excludeSubmodels
  I can't remember what this was for and don't see any reference to it. Maybe it's just remnants from a previous implementation?
* feat(nodes): add unknown as model base
* chore(ui): typegen
* feat(ui): add unknown model base support in ui
* feat(ui): allow changing model type in MM, fix up base and variant selects
* feat(mm): omit model description instead of making it "base type filename model"
* feat(app): add setting to allow unknown models
* feat(ui): allow changing model format in MM
* feat(app): add the installed model config to install complete events
* chore(ui): typegen
* feat(ui): toast warning when installed model is unidentified
* docs: update config docstrings
* chore(ui): typegen
* tests(mm): fix test for MM, leave the UnknownModelConfig class in the list of configs
* tidy(ui): prefer types from zod schemas for model attrs
* chore(ui): lint
* fix(ui): wrong translation string
* feat(mm): normalized model storage
  Store models in a flat directory structure. Each model is in a dir named its unique key (a UUID). Inside that dir is either the model file or the model dir.
* feat(mm): add migration to flat model storage
* fix(mm): normalized multi-file/diffusers model installation no worky now worky
* refactor: port MM probes to new api
  - Add concept of match certainty to new probe
  - Port CLIP Embed models to new API
  - Fiddle with stuff
* feat(mm): port TIs to new API
* tidy(mm): remove unused probes
* feat(mm): port spandrel to new API
* fix(mm): parsing for spandrel
* fix(mm): loader for clip embed
* fix(mm): tis use existing weight_files method
* feat(mm): port vae to new API
* fix(mm): vae class inheritance and config_path
* tidy(mm): patcher types and import paths
* feat(mm): better errors when invalid model config found in db
* feat(mm): port t5 to new API
* feat(mm): make config_path optional
* refactor(mm): simplify model classification process
  Previously, we had a multi-phase strategy to identify models from their files on disk:
  1. Run each model config class's `matches()` method on the files. It checks if the model could possibly be identified as the candidate model type. This was intended to be a quick check. Break on the first match.
  2. If we have a match, run the config class's `parse()` method. It derives some additional model config attrs from the model files. This was intended to encapsulate heavier operations that may require loading the model into memory.
  3. Derive the common model config attrs, like name, description, calculate the hash, etc. Some of these are also heavier operations.
  This strategy has some issues:
  - It is not clear how the pieces fit together. There is some back-and-forth between different methods and the config base class. It is hard to trace the flow of logic until you fully wrap your head around the system, and therefore difficult to add a model architecture to the probe.
  - The assumption that we could do quick, lightweight checks before heavier checks is incorrect. We often _must_ load the model state dict in the `matches()` method. So there is no practical perf benefit to splitting up the responsibility of `matches()` and `parse()`.
  - Sometimes we need to do the same checks in `matches()` and `parse()`. In these cases, splitting the logic has a negative perf impact because we are doing the same work twice.
  - As we introduce the concept of an "unknown" model config (i.e. a model that we cannot identify, but still record in the db; see #8582), we will _always_ run _all_ the checks for every model. Therefore we need not try to defer heavier checks or resource-intensive ops like hashing. We are going to do them anyways.
  - There are situations where a model may match multiple configs. One known case is SD pipeline models with merged LoRAs. In the old probe API, we relied on the implicit order of checks to know that if a model matched for pipeline _and_ LoRA, we prefer the pipeline match. But, in the new API, we do not have this implicit ordering of checks. To resolve this in a resilient way, we need to get all matches up front, then use tie-breaker logic to figure out which should win (or add "differential diagnosis" logic to the matchers).
  - Field overrides weren't handled well by this strategy. They were only applied at the very end, if a model matched successfully. This means we cannot tell the system "Hey, this model is type X with base Y. Trust me bro.". We cannot override the match logic. As we move towards letting users correct mis-identified models (see #8582), this is a requirement.
  We can simplify the process significantly and better support "unknown" models.
  Firstly, model config classes now have a single `from_model_on_disk()` method that attempts to construct an instance of the class from the model files. This replaces the `matches()` and `parse()` methods. If we fail to create the config instance, a special exception is raised that indicates why we think the files cannot be identified as the given model config class.
  Next, the flow for model identification is a bit simpler:
  - Derive all the common fields up-front (name, desc, hash, etc).
  - Merge in overrides.
  - Call `from_model_on_disk()` for every config class, passing in the fields. Overrides are handled in this method.
  - Record the results for each config class and choose the best one.
  The identification logic is a bit more verbose, with the special exceptions and handling of overrides, but it is very clear what is happening.
  The one downside I can think of for this strategy is we do need to check every model type, instead of stopping at the first match. It's a bit less efficient. In practice, however, this isn't a hot code path, and the improved clarity is worth far more than perf optimizations that the end user will likely never notice.
* refactor(mm): remove unused methods in config.py
* refactor(mm): add model config parsing utils
* fix(mm): abstractmethod bork
* tidy(mm): clarify that model id utils are private
* fix(mm): fall back to UnknownModelConfig correctly
* feat(mm): port CLIPVisionDiffusersConfig to new api
* feat(mm): port SigLIPDiffusersConfig to new api
* feat(mm): make match helpers more succinct
* feat(mm): port flux redux to new api
* feat(mm): port ip adapter to new api
* tidy(mm): skip optimistic override handling for now
* refactor(mm): continue iterating on config
* feat(mm): port flux "control lora" and t2i adapter to new api
* tidy(ui): use Extract to get model config types
* fix(mm): t2i base determination
* feat(mm): port cnet to new api
* refactor(mm): add config validation utils, make it all consistent and clean
* feat(mm): wip port of main models to new api
* feat(mm): wip port of main models to new api
* feat(mm): wip port of main models to new api
* docs(mm): add todos
* tidy(mm): removed unused model merge class
* feat(mm): wip port main models to new api
* tidy(mm): clean up model heuristic utils
* tidy(mm): clean up ModelOnDisk caching
* tidy(mm): flux lora format util
* refactor(mm): make config classes narrow
  Simpler logic to identify, less complexity to add new model, fewer useless attrs that do not relate to the model arch, etc
* refactor(mm): diffusers loras w
* feat(mm): consistent naming for all model config classes
* fix(mm): tag generation & scattered probe fixes
* tidy(mm): consistent class names
* refactor(mm): split configs into separate files
* docs(mm): add comments for identification utils
* chore(ui): typegen
* refactor(mm): remove legacy probe, new configs dir structure, update imports
* fix(mm): inverted condition
* docs(mm): update docstrings in factory.py
* docs(mm): document flux variant attr
* feat(mm): add helper method for legacy configs
* feat(mm): satisfy type checker in flux denoise
* docs(mm): remove extraneous comment
* fix(mm): ensure unknown model configs get unknown attrs
* fix(mm): t5 identification
* fix(mm): sdxl ip adapter identification
* feat(mm): more flexible config matching utils
* fix(mm): clip vision identification
* feat(mm): add sanity checks before probing paths
* docs(mm): add reminder for self for field migrations
* feat(mm): clearer naming for main config class hierarchy
* feat(mm): fix clip vision starter model bases, add ref to actual models
* feat(mm): add model config schema migration logic
* fix(mm): duplicate import
* refactor(mm): split big migration into 3
  Split the big migration that did all of these things into 3:
  - Migration 22: Remove unique constraint on base/name/type in models table
  - Migration 23: Migrate configs to v6.8.0 schemas
  - Migration 24: Normalize file storage
* fix(mm): pop base/type/format when creating unknown model config
* fix(db): migration 22 insert only real cols
* fix(db): migration 23 fall back to unknown model when config change fails
* feat(db): run migrations 23 and 24
* fix(mm): false negative on flux lora
* fix(mm): vae checkpoint probe checking for dir instead of file
* fix(mm): ModelOnDisk skips dirs when looking for weights
  Previously a path w/ any of the known weights suffixes would be seen as a weights file, even if it was a directory. We now check to ensure the candidate path is actually a file before adding it to the list of weights.
* feat(mm): add method to get main model defaults from a base
* feat(mm): do not log when multiple non-unknown model matches
* refactor(mm): continued iteration on model identification
* tests(mm): refactor model identification tests
  Overhaul of model identification (probing) tests. Previously we didn't test the correctness of probing except in a few narrow cases - now we do.
  See tests/model_identification/README.md for a detailed overview of the new test setup. It includes instructions for adding a new test case. In brief:
  - Download the model you want to add as a test case
  - Run a script against it to generate the test model files
  - Fill in the expected model type/format/base/etc in the generated test metadata JSON file
  Included test cases:
  - All starter models
  - A handful of other models that I had installed
  - Models present in the previous test cases as smoke tests, now also tested for correctness
* fix(mm): omit type/format/base when creating unknown config instance
* feat(mm): use ValueError for model id sanity checks
* feat(mm): add flag for updating models to allow class changes
* tests(mm): fix remaining MM tests
* feat: allow users to edit models freely
* feat(ui): add warning for model settings edit
* tests(mm): flux state dict tests
* tidy: remove unused file
* fix(mm): lora state dict loading in model id
* feat(ui): use translation string for model edit warning
* docs(db): update version numbers in migration comments
* chore: bump version to v6.9.0a1
* docs: update model id readme
* tests(mm): attempt to fix windows model id tests
* fix(mm): issue with deleting single file models
* feat(mm): just delete the dir w/ rmtree when deleting model
* tests(mm): windows CI issue
* fix(ui): typegen schema sync
* fix(mm): fixes for migration 23
  - Handle CLIP Embed and Main SD models missing variant field
  - Handle errors when calling the discriminator function, previously only handled ValidationError but it could be a ValueError or something else
  - Better logging for config migration
* chore: bump version to v6.9.0a2
* chore: bump version to v6.9.0a3
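The identification flow described in the commit message above (derive common fields, merge overrides, try every config class, pick the best result) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the `NotAMatchError` exception, the `priority` tie-breaker and the key checks are all hypothetical, and only the method name `from_model_on_disk()` comes from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Set


class NotAMatchError(Exception):
    """Hypothetical exception: explains why the files do not fit a config class."""


@dataclass
class ConfigBase:
    name: str
    hash: str

    # Hypothetical tie-breaker (a plain class attribute, not a dataclass field):
    # the highest value wins when several classes match.
    priority = 0

    @classmethod
    def from_model_on_disk(cls, keys: Set[str], fields: Dict[str, Any]) -> "ConfigBase":
        raise NotImplementedError


class LoRAConfig(ConfigBase):
    priority = 1

    @classmethod
    def from_model_on_disk(cls, keys, fields):
        if not any("lora" in key for key in keys):
            raise NotAMatchError("no LoRA keys found")
        return cls(**fields)


class MainPipelineConfig(ConfigBase):
    priority = 2  # assumed: a pipeline match beats a LoRA match

    @classmethod
    def from_model_on_disk(cls, keys, fields):
        if "unet" not in keys:
            raise NotAMatchError("no unet found")
        return cls(**fields)


class UnknownConfig(ConfigBase):
    """Fallback used when no config class matches."""


def identify(keys: Set[str], overrides: Optional[Dict[str, Any]] = None) -> ConfigBase:
    # 1. Derive the common fields up front (hard-coded here for brevity).
    fields = {"name": "my-model", "hash": "abc123"}
    # 2. Merge in overrides.
    fields.update(overrides or {})
    # 3. Try every config class and record the outcome of each attempt.
    matches, failures = [], {}
    for config_cls in (LoRAConfig, MainPipelineConfig):
        try:
            matches.append(config_cls.from_model_on_disk(keys, fields))
        except NotAMatchError as exc:
            failures[config_cls.__name__] = str(exc)
    # 4. Choose the best match, or fall back to the unknown config.
    if not matches:
        return UnknownConfig(**fields)
    return max(matches, key=lambda config: config.priority)
```

With this sketch, a key set containing both `"unet"` and a LoRA key resolves to `MainPipelineConfig`, mirroring the "pipeline with merged LoRA" case mentioned in the commit message.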
Model Probe (Identification) Testing
Invoke's model identification system is tested against example model files. Test cases are lightweight representations of real models which have been "stripped" of their tensor data.
Setup
Test cases are stored with git lfs. You must install git lfs to pull down the test cases and add to them.
# Only need to do this once
git lfs install
# Pull the actual model files down - if you just do `git pull` you'll only get pointers
git lfs pull
Running the Tests
To run the tests use:
pytest -v tests/test_model_probe/test_identification.py
Stripped Model Files
Invoke abstracts the loading of a model's state dict and metadata in a class called ModelOnDisk. This class loads real model weights. We use it to inspect models and identify them.
For testing purposes, we create a stripped-down version of model weights that contain only the model structure and metadata for each key, without the actual tensor data. The state dict structure is typically all we need to identify models; the tensors themselves are not needed. This allows us to store test cases in the repo without adding many gigabytes of data.
To see how this works, check out StrippedModelOnDisk. This class includes logic to strip models and to load these stripped models for testing.
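As a rough illustration of the idea (not the actual StrippedModelOnDisk implementation), stripping can be thought of as keeping each state dict key with its shape and dtype while dropping the tensor data. The `FakeTensor` class and `strip_state_dict` function below are hypothetical stand-ins so the example needs no ML libraries.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a real tensor."""

    shape: Tuple[int, ...]
    dtype: str
    data: bytes


def strip_state_dict(state_dict: Dict[str, FakeTensor]) -> Dict[str, dict]:
    """Keep every key with its shape and dtype; drop the tensor data."""
    return {
        key: {"shape": list(tensor.shape), "dtype": tensor.dtype}
        for key, tensor in state_dict.items()
    }


full = {"unet.conv_in.weight": FakeTensor((320, 4, 3, 3), "float16", b"\x00" * 16)}
stripped = strip_state_dict(full)
```

The stripped result still exposes the key names and tensor shapes, which is the structural information identification typically relies on.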
Some Models Cannot Be Stripped
Certain models cannot be stripped because identification relies on inspecting the actual tensor data. We have to store the full model files for these test cases.
Currently, the only models that cannot be stripped are spandrel image-to-image models. spandrel supports many model architectures but doesn't provide a way to identify or assert support for a model by its state dict structure alone.
To positively identify these models, we must attempt to load the model using spandrel. If it loads successfully, we assume it is a supported model. Therefore, we cannot strip these models and must store the full model files in the test cases. We only store one such model to keep the test suite size manageable.
StrippedModelOnDisk will simply pass through the "live" tensor data for these models when loading them to test.
Adding New Test Cases
Run the strip_model.py script to create a new test case. For example:
python strip_model.py /path/to/your/model --output_dir ./stripped_models
It supports single-file models and multi-file models (e.g. diffusers-style models). The output will be a directory named with a UUID, containing the stripped model files and a dummy __test_metadata__.json file.
Example output structure for a single-file model:
stripped_models/
└── 19fd1a40-c5b7-4734-bd3a-6e0e948cce0b/
    ├── __test_metadata__.json
    └── Standard Reference (XLabs FLUX IP-Adapter v2).safetensors
This test metadata file should contain a single JSON dict and must be filled out manually with the expected identification results.
Structure of __test_metadata__.json
This file contains a single JSON dict. Here's an example for a FLUX IP Adapter checkpoint:
{
  "source": "https://huggingface.co/XLabs-AI/flux-ip-adapter-v2/resolve/main/ip_adapter.safetensors",
  "file_name": "Standard Reference (XLabs FLUX IP-Adapter v2).safetensors",
  "expected_config_attrs": {
    "type": "ip_adapter",
    "format": "checkpoint",
    "base": "flux"
  }
}
See the details below for each field.
"source"
A string indicating the source of the model (e.g. a Hugging Face repo ID or URL). This is not used for identification, but is useful for reference so we know where the model came from. Nothing will break if this field is missing or incorrect, but it is good practice to fill it out.
- Example HF Repo ID: "RunDiffusion/Juggernaut-XL-v9"
- Example URL: "https://huggingface.co/XpucT/Deliberate/resolve/main/Deliberate_v5.safetensors"
"file_name"
If the model is a single file (e.g. a .safetensors file), this is the name of that file. The test suite will look for this file in the test case directory.
If the model is multi-file (e.g. diffusers-style), omit this key or set it to a falsey value like null or an empty string.
- Example: "model.safetensors"
The strip_model.py script will automatically fill this field in for single-file models.
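One way to picture how "file_name" selects the path that gets identified is the minimal sketch below. The helper name `resolve_model_path` is hypothetical, and returning the test case directory itself for multi-file models is an assumption rather than documented behaviour of the test suite.

```python
from pathlib import Path


def resolve_model_path(test_case_dir: Path, metadata: dict) -> Path:
    """Return the path to identify for a test case (hypothetical helper)."""
    file_name = metadata.get("file_name")
    if file_name:
        # Single-file model: the named file inside the test case directory.
        return test_case_dir / file_name
    # Multi-file model (key missing, null or empty string).
    # Assumption: the directory itself is treated as the model.
    return test_case_dir
```

Treating a missing key, null and an empty string the same way matches the rule above that any falsey value means "multi-file".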
"expected_config_attrs"
This field is a dict of expected configuration attributes for the model. It is required for all test cases.
It is used to verify that the model's configuration matches expectations. The keys and values in this dict depend on the specific model and its configuration.
These attributes must be included, as they are the primary discriminators for models:
- "type": The type of the model. This is the value of the ModelType enum.
- "format": The format of the model files. This is the value of the ModelFormat enum.
- "base": The base model pipeline architecture associated with this model. Many models do not have an associated base. For these, use "any". This is the value of the BaseModelType enum.
Depending on the kind of model, these additional keys may be useful:
- "prediction_type": The prediction type used by the model. This is the value of the SchedulerPredictionType enum.
- "variant": The variant of the model, if applicable. This is the value of the ModelVariantType enum.
To see all possible values for these enums, check out their definitions in invokeai/backend/model_manager/taxonomy.py.
For example, for an SD1.5 main (pipeline) inpainting model in diffusers format, you might have:
{
  "expected_config_attrs": {
    "type": "main",
    "format": "diffusers",
    "base": "sd-1",
    "prediction_type": "epsilon",
    "variant": "inpaint"
  }
}
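A check along these lines could catch incomplete metadata and report mismatches between expected and identified attributes. It is a sketch only: `missing_required_attrs` and `mismatched_attrs` are hypothetical helpers, and the identified config is represented as a plain dict for simplicity.

```python
from typing import Dict, List, Tuple

# The three primary discriminators that every test case must provide.
REQUIRED_ATTRS = ("type", "format", "base")


def missing_required_attrs(metadata: dict) -> List[str]:
    """List the required keys absent from "expected_config_attrs"."""
    attrs = metadata.get("expected_config_attrs") or {}
    return [key for key in REQUIRED_ATTRS if key not in attrs]


def mismatched_attrs(expected: dict, actual: dict) -> Dict[str, Tuple[object, object]]:
    """Map each differing key to an (expected, actual) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }
```

Only keys listed in the expected dict are compared, so optional attributes such as "variant" are checked only when the test case specifies them.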
"notes"
This is an optional string field where you can add any notes or comments about the test case. It can be useful for providing context or explaining any special considerations.
"override_fields"
In some rare cases, we may need to provide additional hints to the identification system to help it identify the model correctly.
Currently, the only known case where we need extra information is to differentiate between single-file SD1.x, SD2.x and SDXL VAEs. These models have identical structures, so we need to provide a hint. Though it is far from ideal, we use simple string matching on the model's name to provide this hint.
For example, when users install the taesdxl VAE from the HF repo madebyollin/taesdxl, the identification system will get the model name taesdxl. It sees "xl" in the name and infers that this is an SDXL VAE. To reproduce this in a test case, we add the following to __test_metadata__.json:
{
  "override_fields": {
    "name": "taesdxl"
  }
}
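The name-based hint can be pictured with the sketch below. Both functions are hypothetical, the "sdxl" return value is an assumed enum value, and only the "xl" rule mentioned above is implemented; how SD1.x and SD2.x names are told apart is not covered here.

```python
from typing import Optional


def apply_override_fields(derived_fields: dict, metadata: dict) -> dict:
    """Merge "override_fields" from the test metadata over the derived fields."""
    return {**derived_fields, **(metadata.get("override_fields") or {})}


def guess_vae_base_from_name(name: str) -> Optional[str]:
    """Return "sdxl" (assumed enum value) when the name contains "xl"."""
    if "xl" in name.lower():
        return "sdxl"
    return None  # no hint available from the name


# The derived name here is a made-up placeholder; the override replaces it.
fields = apply_override_fields(
    {"name": "diffusion_pytorch_model"},
    {"override_fields": {"name": "taesdxl"}},
)
base_hint = guess_vae_base_from_name(fields["name"])
```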