InvokeAI

mirror of https://github.com/invoke-ai/InvokeAI synced 2026-04-26 00:32:22 +02:00

Author	SHA1	Message	Date
Alexander Eichhorn	dd5758b53d	fix: SDXL DoRA LoRA fails with enable_partial_loading=true (#9063) * fix: SDXL DoRA LoRA fails with enable_partial_loading=true cast_to_device returns plain torch.Tensor instead of torch.nn.Parameter, causing _aggregate_patch_parameters to replace valid weights with meta device dummies, falsely triggering DoRA's quantization guard. Fixes invoke-ai/InvokeAI#8624 * test: regression coverage for DoRA + partial-loading + CPU→device autocast Adds targeted coverage for the bug fixed in `a0a87212` (#8624, PR #9063): - test_aggregate_patch_parameters_preserves_plain_tensor_with_dora: CPU-only unit test that feeds a plain torch.Tensor (as handed in by _cast_weight_bias_for_input) into _aggregate_patch_parameters with a DoRA patch. Pre-fix, the tensor was replaced by a meta-device dummy, tripping DoRA's quantization guard. - "single_dora" variant in the patch_under_test fixture: exercises the full CUDA/MPS autocast hot path via test_linear_sidecar_patches_with_autocast_from_cpu_to_device. --------- Co-authored-by: Jonathan <34005131+JPPhoto@users.noreply.github.com>	2026-04-25 16:33:20 +00:00
Jonathan	0d7205ff79	Handle mixed-dtype mismatches in autocast linear and conv wrappers (#9006) * Handle CustomConv2d bias dtype mismatches * Fix mixed-dtype autocast regressions * Format custom_conv2d with ruff	2026-04-20 20:31:35 +00:00
Jonathan	ee600973ed	Broaden text encoder partial-load recovery (#9034)	2026-04-09 20:09:40 -04:00
Jonathan	dc5007fe95	Fix/model cache Qwen/CogView4 cancel repair (#8959) * Repair partially loaded Qwen models after cancel to avoid device mismatches * ruff * Repair CogView4 text encoder after canceled partial loads * Avoid MPS CI crash in repair regression test * Fix MPS device assertion in repair test	2026-03-15 10:04:15 -04:00
copilot-swe-agent[bot]	b7afd9b5b3	Fix test failures caused by MagicMock TypeError Configure mock logger to return a valid log level for getEffectiveLevel() to prevent TypeError when comparing with logging.DEBUG constant. The issue was that ModelCache._log_cache_state() checks self._logger.getEffectiveLevel() > logging.DEBUG, and when the logger is a MagicMock without configuration, getEffectiveLevel() returns another MagicMock, causing a TypeError when compared with an int. Fixes all 4 test failures in test_model_cache_timeout.py Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 05:42:45 +00:00
Lincoln Stein	9d1de81fe2	(style) correct ruff formatting error	2025-12-24 00:19:25 -05:00
copilot-swe-agent[bot]	8d76b4e4d4	Fix ruff whitespace errors and improve timeout logging - Remove all trailing whitespace (W293 errors) - Add debug logging when timeout fires but activity detected - Add debug logging when timeout fires but cache is empty - Only log "Clearing model cache" message when actually clearing - Prevents misleading timeout messages during active generation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 04:05:57 +00:00
copilot-swe-agent[bot]	c3217d8a08	Address code review feedback - Remove unused variable in test - Add clarifying comment for daemon thread setting - Add detailed comment explaining cache clearing with 1000 GB value - Improve code documentation Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:27:39 +00:00
copilot-swe-agent[bot]	75a14e2a4b	Add unit tests for model cache timeout functionality - Created test_model_cache_timeout.py with comprehensive tests - Tests timeout clearing behavior - Tests activity resetting timeout - Tests no-timeout default behavior - Tests shutdown canceling timers Co-authored-by: lstein <111189+lstein@users.noreply.github.com>	2025-12-24 00:24:31 +00:00
psychedelicious	a8a07598c8	chore: ruff	2025-08-18 21:14:00 +10:00
psychedelicious	23206e22e8	tests: skip excessively flaky MPS-specific tests in CI	2025-08-18 21:14:00 +10:00
Ryan Dick	5357d6e08e	Rename ConcatenatedLoRALayer to MergedLayerPatch. And other minor cleanup.	2025-01-28 14:51:35 +00:00
Ryan Dick	28514ba59a	Update ConcatenatedLoRALayer to work with all sub-layer types.	2025-01-28 14:51:35 +00:00
Ryan Dick	e2f05d0800	Add unit tests for LoKR patch layers. The new tests trigger a bug when LoKR layers are applied to BnB-quantized layers (also impacts several other LoRA variant types).	2025-01-22 09:20:40 +11:00
Ryan Dick	c76d08d1fd	Add keep_ram_copy option to CachedModelOnlyFullLoad.	2025-01-16 15:08:23 +00:00
Ryan Dick	04087c38ce	Add keep_ram_copy option to CachedModelWithPartialLoad.	2025-01-16 14:51:44 +00:00
Ryan Dick	d7ab464176	Offload the current model when locking if it is already partially loaded and we have insufficient VRAM.	2025-01-07 02:53:44 +00:00
Ryan Dick	402dd840a1	Add seed to flaky unit test.	2025-01-07 00:31:00 +00:00
Ryan Dick	9a0a226ce1	Fix bitsandbytes imports in unit tests on MacOS.	2024-12-30 10:41:48 -05:00
Ryan Dick	52fc5a64d4	Add a unit test for a LoRA patch applied to a quantized linear layer with weights streamed from CPU to GPU.	2024-12-29 17:14:55 +00:00
Ryan Dick	a8bef59699	First pass at making custom layer patches work with weights streamed from the CPU to the GPU.	2024-12-29 17:01:37 +00:00
Ryan Dick	6d49ee839c	Switch the LayerPatcher to use 'custom modules' to manage layer patching.	2024-12-29 01:18:30 +00:00
Ryan Dick	918f541af8	Add unit test for a SetParameterLayer patch applied to a CustomFluxRMSNorm layer.	2024-12-28 20:44:48 +00:00
Ryan Dick	93e76b61d6	Add CustomFluxRMSNorm layer.	2024-12-28 20:33:38 +00:00
Ryan Dick	f2981979f9	Get custom layer patches working with all quantized linear layer types.	2024-12-27 22:00:22 +00:00
Ryan Dick	ef970a1cdc	Add support for FluxControlLoRALayer in CustomLinear layers and add a unit test for it.	2024-12-27 21:00:47 +00:00
Ryan Dick	5ee7405f97	Add more unit tests for custom module LoRA patching: multiple LoRAs and ConcatenatedLoRALayers.	2024-12-27 19:47:21 +00:00
Ryan Dick	e24e386a27	Add support for patches to CustomModuleMixin and add a single unit test (more to come).	2024-12-27 18:57:13 +00:00
Ryan Dick	b06d61e3c0	Improve custom layer wrap/unwrap logic.	2024-12-27 16:29:48 +00:00
Ryan Dick	7d6ab0ceb2	Add a CustomModuleMixin class with a flag for enabling/disabling autocasting (since it incurs some runtime speed overhead.)	2024-12-26 20:08:30 +00:00
Ryan Dick	9692a36dd6	Use a fixture to parameterize tests in test_all_custom_modules.py so that a fresh instance of the layer under test is initialized for each test.	2024-12-26 19:41:25 +00:00
Ryan Dick	b0b699a01f	Add unit test to test that isinstance(...) behaves as expected with custom module types.	2024-12-26 18:45:56 +00:00
Ryan Dick	a8b2c4c3d2	Add inference tests for all custom module types (i.e. to test autocasting from cpu to device).	2024-12-26 18:33:46 +00:00
Ryan Dick	03944191db	Split test_autocast_modules.py into separate test files to mirror the source file structure.	2024-12-24 22:29:11 +00:00
Ryan Dick	987c9ae076	Move custom autocast modules to separate files in a custom_modules/ directory.	2024-12-24 22:21:31 +00:00
Ryan Dick	0fc538734b	Skip flaky test when running on Github Actions, and further reduce peak unit test memory.	2024-12-24 14:32:11 +00:00
Ryan Dick	7214d4969b	Workaround a weird quirk of QuantState.to() and add a unit test to exercise it.	2024-12-24 14:32:11 +00:00
Ryan Dick	a83a999b79	Reduce peak memory used for unit tests.	2024-12-24 14:32:11 +00:00
Ryan Dick	f8a6accf8a	Fix bitsandbytes imports to avoid ImportErrors on MacOS.	2024-12-24 14:32:11 +00:00
Ryan Dick	f8ab414f99	Add CachedModelOnlyFullLoad to mirror the CachedModelWithPartialLoad for models that cannot or should not be partially loaded.	2024-12-24 14:32:11 +00:00
Ryan Dick	c6795a1b47	Make CachedModelWithPartialLoad work with models that have non-persistent buffers.	2024-12-24 14:32:11 +00:00
Ryan Dick	0a8fc74ae9	Add CachedModelWithPartialLoad to manage partially-loaded models using the new autocast modules.	2024-12-24 14:32:11 +00:00
Ryan Dick	dc54e8763b	Add CustomInvokeLinearNF4 to enable CPU -> GPU streaming for InvokeLinearNF4 layers.	2024-12-24 14:32:11 +00:00
Ryan Dick	1b56020876	Add CustomInvokeLinear8bitLt layer for device streaming with InvokeLinear8bitLt layers.	2024-12-24 14:32:11 +00:00
Ryan Dick	97d56f7dc9	Add torch module autocast unit test for GGUF-quantized models.	2024-12-24 14:32:11 +00:00
Ryan Dick	fe0ef2c27c	Add torch module autocast utilities.	2024-12-24 14:32:11 +00:00

46 Commits
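
The MagicMock failure described in commit b7afd9b5b3 can be reproduced outside the repository. The sketch below is illustrative only: `skips_debug_logging` and `comparison_raises` are hypothetical helpers written for this example, not InvokeAI code. They mirror the `getEffectiveLevel() > logging.DEBUG` comparison quoted in the commit message and show why configuring the mock's return value avoids the TypeError.

```python
import logging
from unittest.mock import MagicMock


def skips_debug_logging(logger) -> bool:
    """Hypothetical stand-in for the level check quoted in the commit message."""
    return logger.getEffectiveLevel() > logging.DEBUG


def comparison_raises(logger) -> bool:
    """Return True if the level check raises TypeError for this logger."""
    try:
        skips_debug_logging(logger)
    except TypeError:
        return True
    return False


# An unconfigured MagicMock returns another MagicMock from getEffectiveLevel(),
# and comparing a MagicMock with an int is not supported.
bare_logger = MagicMock()

# Giving the mocked method a real log level makes the comparison well defined.
configured_logger = MagicMock()
configured_logger.getEffectiveLevel.return_value = logging.INFO

if __name__ == "__main__":
    print("bare mock raises TypeError:", comparison_raises(bare_logger))
    print("configured mock raises TypeError:", comparison_raises(configured_logger))
```

Setting `return_value` on the mocked method is one common way to do this; the actual test fixture in the repository may configure the logger differently.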