Jonathan
dc5007fe95
Fix/model cache Qwen/CogView4 cancel repair (#8959)
* Repair partially loaded Qwen models after cancel to avoid device mismatches
* ruff
* Repair CogView4 text encoder after canceled partial loads
* Avoid MPS CI crash in repair regression test
* Fix MPS device assertion in repair test
2026-03-15 10:04:15 -04:00
psychedelicious
a8a07598c8
chore: ruff
2025-08-18 21:14:00 +10:00
psychedelicious
23206e22e8
tests: skip excessively flaky MPS-specific tests in CI
2025-08-18 21:14:00 +10:00
Ryan Dick
c76d08d1fd
Add keep_ram_copy option to CachedModelOnlyFullLoad.
2025-01-16 15:08:23 +00:00
Ryan Dick
04087c38ce
Add keep_ram_copy option to CachedModelWithPartialLoad.
2025-01-16 14:51:44 +00:00
Ryan Dick
d7ab464176
Offload the current model when locking if it is already partially loaded and we have insufficient VRAM.
2025-01-07 02:53:44 +00:00
Ryan Dick
6d49ee839c
Switch the LayerPatcher to use 'custom modules' to manage layer patching.
2024-12-29 01:18:30 +00:00
Ryan Dick
987c9ae076
Move custom autocast modules to separate files in a custom_modules/ directory.
2024-12-24 22:21:31 +00:00
Ryan Dick
f8ab414f99
Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded.
2024-12-24 14:32:11 +00:00
Ryan Dick
c6795a1b47
Make CachedModelWithPartialLoad work with models that have non-persistent buffers.
2024-12-24 14:32:11 +00:00
Ryan Dick
0a8fc74ae9
Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules.
2024-12-24 14:32:11 +00:00