Two fixes for Z-Image LoRA support:
1. Override _validate_looks_like_lora in LoRA_LyCORIS_ZImage_Config to
recognize Z-Image specific LoRA formats that use different key patterns
than SD/SDXL LoRAs. Z-Image LoRAs use lora_down.weight/lora_up.weight
and dora_scale suffixes instead of lora_A.weight/lora_B.weight.
2. Fix _group_by_layer in z_image_lora_conversion_utils.py to correctly
group LoRA keys by layer name. The previous logic used rsplit with
maxsplit=2 which incorrectly grouped keys like:
- "to_k.alpha" -> layer "diffusion_model.layers.17.attention"
- "lora_down.weight" -> layer "diffusion_model.layers.17.attention.to_k"
Now uses suffix matching to ensure all keys for a layer are grouped
together (alpha, dora_scale, lora_down.weight, lora_up.weight).
* feat: Add Regional Guidance support for Z-Image model
Implements regional prompting for Z-Image (S3-DiT Transformer) allowing
different prompts to affect different image regions using attention masks.
Backend changes:
- Add ZImageRegionalPromptingExtension for mask preparation
- Add ZImageTextConditioning and ZImageRegionalTextConditioning data classes
- Patch transformer forward to inject 4D regional attention masks
- Use additive float mask (0.0 attend, -inf block) in bfloat16 for compatibility
- Alternate regional/full attention layers for global coherence
Frontend changes:
- Update buildZImageGraph to support regional conditioning collectors
- Update addRegions to create z_image_text_encoder nodes for regions
- Update addZImageLoRAs to handle optional negCond when guidance_scale=0
- Add Z-Image validation (no IP adapters, no autoNegative)
* @Pfannkuchensack
Fix windows path again
* ruff check fix
* ruff formating
* fix(ui): Z-Image CFG guidance_scale check uses > 1 instead of > 0
Changed the guidance_scale check from > 0 to > 1 for Z-Image models.
Since Z-Image uses guidance_scale=1.0 as "no CFG" (matching FLUX convention),
negative conditioning should only be created when guidance_scale > 1.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* (bugfix)(mm) work around Windows being unable to rmtree tmp directories after GGUF install
* (style) fix ruff error
* (fix) add workaround for Windows Permission Denied on GGUF file move() call
* (fix) perform torch copy() in GGUF reader to avoid deletion failures on Windows
* (style) fix ruff formatting issues
Add support for loading Flux LoRA models in the xlabs format, which uses
keys like `double_blocks.X.processor.{qkv|proj}_lora{1|2}.{down|up}.weight`.
The xlabs format maps:
- lora1 -> img_attn (image attention stream)
- lora2 -> txt_attn (text attention stream)
- qkv -> query/key/value projection
- proj -> output projection
Changes:
- Add FluxLoRAFormat.XLabs enum value
- Add flux_xlabs_lora_conversion_utils.py with detection and conversion
- Update formats.py to detect xlabs format
- Update lora.py loader to handle xlabs format
- Update model probe to accept recognized Flux LoRA formats
- Add unit tests for xlabs format detection and conversion
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* Feature: Add Tag System for user made Workflows
* feat(ui): display tags on workflow library tiles
Show workflow tags at the bottom of each tile in the workflow browser,
making it easier to identify workflow categories at a glance.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(nodes): add Prompt Template node
Add a new node that applies Style Preset templates to prompts in workflows.
The node takes a style preset ID and positive/negative prompts as inputs,
then replaces {prompt} placeholders in the template with the provided prompts.
This makes Style Preset templates accessible in Workflow mode, enabling
users to apply consistent styling across their workflow-based generations.
* feat(nodes): add StylePresetField for database-driven preset selection
Adds a new StylePresetField type that enables dropdown selection of
style presets from the database in the workflow editor.
Changes:
- Add StylePresetField to backend (fields.py)
- Update Prompt Template node to use StylePresetField instead of string ID
- Add frontend field type definitions (zod schemas, type guards)
- Create StylePresetFieldInputComponent with Combobox
- Register field in InputFieldRenderer and nodesSlice
- Add translations for preset selection
* fix schema.ts on windows.
* chore(api): regenerate schema.ts after merge
---------
Co-authored-by: Claude <noreply@anthropic.com>
* fix(model-install): support multi-subfolder downloads for Z-Image Qwen3 encoder
The Z-Image Qwen3 text encoder requires both text_encoder and tokenizer
subfolders from the HuggingFace repo, but the previous implementation
only downloaded the text_encoder subfolder, causing model identification
to fail.
Changes:
- Add subfolders property to HFModelSource supporting '+' separated paths
- Extend filter_files() and download_urls() to handle multiple subfolders
- Update _multifile_download() to preserve subfolder structure
- Make Qwen3Encoder probe check both nested and direct config.json paths
- Update Qwen3EncoderLoader to handle both directory structures
- Change starter model source to text_encoder+tokenizer
* ruff format
* fix schema description
* fix schema description
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
* feat(ui): add model path update for external models
Add ability to update file paths for externally managed models (models with
absolute paths). Invoke-controlled models (with relative paths in the models
directory) are excluded from this feature to prevent breaking internal
model management.
- Add ModelUpdatePathButton component with modal dialog
- Only show button for external models (absolute path check)
- Add translations for path update UI elements
* Added support for Windows UNC paths in ModelView.tsx:38-41. The isExternalModel function now detects:
Unix absolute paths: /home/user/models/...
Windows drive paths: C:\Models\... or D:/Models/...
Windows UNC paths: \\ServerName\ShareName\... or //ServerName/ShareName/...
* fix(ui): validate path format in Update Path modal to prevent invalid paths
When updating an external model's path, the new path is now validated to ensure
it follows an absolute path format (Unix, Windows drive, or UNC). This prevents
users from accidentally entering invalid paths that would cause the Update Path
button to disappear, leaving them unable to correct the mistake.
* fix(ui): extract isExternalModel to separate file to fix circular dependency
Moves the isExternalModel utility function to its own file to break the
circular dependency between ModelView.tsx and ModelUpdatePathButton.tsx.
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add higher quality Q8_0 quantization option for Z-Image Turbo (~6.6GB)
to complement existing Q4_K variant, providing better quality for users
with more VRAM.
Add dedicated Z-Image ControlNet Tile model (~6.7GB) for upscaling and
detail enhancement workflows.
* chore: localize extraction errors
* chore: rename extract masked area menu item
* chore: rename inpaint mask extract component
* fix: use mask bounds for extraction region
* Prettier format applied to InpaintMaskMenuItemsExtractMaskedArea.tsx
* Fix base64 image import bug in extracted area in InpaintMaskMenuItemsExtractMaskedArea.tsx and removed unused locales entries in en.json
* Fix formatting issue in InpaintMaskMenuItemsExtractMaskedArea.tsx
* Minor comment fix
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
GGUF Z-Image models store x_pad_token and cap_pad_token with shape [dim],
but diffusers ZImageTransformer2DModel expects [1, dim]. This caused a
RuntimeError when loading GGUF-quantized Z-Image models.
The fix dequantizes GGMLTensors first (since they don't support unsqueeze),
then reshapes to add the batch dimension.
* fix(ui): 🐛 `HotkeysModal` and `SettingsModal` initial focus
instead of using the `initialFocusRef` prop, the `Modal` component was focusing on the last available Button. This is a workaround that uses `tabIndex` instead which seems to be working.
Closes#8685
* style: 🚨 satisfy linter
---------
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Add Z-Image Turbo and related models to the starter models list:
- Z-Image Turbo (full precision, ~13GB)
- Z-Image Turbo quantized (GGUF Q4_K, ~4GB)
- Z-Image Qwen3 Text Encoder (full precision, ~8GB)
- Z-Image Qwen3 Text Encoder quantized (GGUF Q6_K, ~3.3GB)
- Z-Image ControlNet Union (Canny, HED, Depth, Pose, MLSD, Inpainting)
The quantized Turbo model includes the quantized Qwen3 encoder as a
dependency for automatic installation.
Implement Z-Image ControlNet as an Extension pattern (similar to FLUX ControlNet)
instead of merging control weights into the base transformer. This provides:
- Lower memory usage (no weight duplication)
- Flexibility to enable/disable control per step
- Cleaner architecture with separate control adapter
Key implementation details:
- ZImageControlNetExtension: computes control hints per denoising step
- z_image_forward_with_control: custom forward pass with hint injection
- patchify_control_context: utility for control image patchification
- ZImageControlAdapter: standalone adapter with control_layers and noise_refiner
Architecture matches original VideoX-Fun implementation:
- Hints computed ONCE using INITIAL unified state (before main layers)
- Hints injected at every other main transformer layer (15 control blocks)
- Control signal added after each designated layer's forward pass
V2.0 ControlNet support (control_in_dim=33):
- Channels 0-15: control image latents
- Channels 16-31: reference image (zeros for pure control)
- Channel 32: inpaint mask (1.0 = don't inpaint, use control signal)
VRAM usage is high.
- Auto-detect control_in_dim from adapter weights (16 for V1, 33 for V2.0)
- Auto-detect n_refiner_layers from state dict
- Add zero-padding for V2.0's additional channels
- Use accelerate.init_empty_weights() for efficient model creation
- Add ControlNet_Checkpoint_ZImage_Config to frontend schema
feat: Add Z-Image ControlNet support with spatial conditioning
Add comprehensive ControlNet support for Z-Image models including:
Backend:
- New ControlNet_Checkpoint_ZImage_Config for Z-Image control adapter models
- Z-Image control key detection (_has_z_image_control_keys) to identify control layers
- ZImageControlAdapter loader for standalone control models
- ZImageControlTransformer2DModel combining base transformer with control layers
- Memory-efficient model loading by building combined state dict
The previous mixed-precision optimization for FP32 mode only converted
some VAE decoder layers (post_quant_conv, conv_in, mid_block) to the
latents dtype while leaving others (up_blocks, conv_norm_out) in float32.
This caused "expected scalar type Half but found Float" errors after
recent diffusers updates.
Simplify FP32 mode to consistently use float32 for both VAE and latents,
removing the incomplete mixed-precision logic. This trades some VRAM
usage for stability and correctness.
Also removes now-unused attention processor imports.
The Z-Image denoise node outputs latents, not images, so these mixins
were unnecessary. Metadata and board handling is correctly done in the
L2I (latents-to-image) node. This aligns with how FLUX denoise works.
- Add CustomDiffusersRMSNorm for diffusers.models.normalization.RMSNorm
- Add CustomLayerNorm for torch.nn.LayerNorm
- Register both in AUTOCAST_MODULE_TYPE_MAPPING
Enables partial loading (enable_partial_loading: true) for Z-Image models
by wrapping their normalization layers with device autocast support
The FLUX Dev license warning in model pickers used isCheckpointMainModelConfig
incorrectly:
```
isCheckpointMainModelConfig(config) && config.variant === 'dev'
```
This caused a TypeScript error because CheckpointModelConfig type doesn't
include the 'variant' property (it's extracted as `{ type: 'main'; format:
'checkpoint' }` which doesn't narrow to include variant).
Changes:
- Add isFluxDevMainModelConfig type guard that properly checks
base='flux' AND variant='dev', returning MainModelConfig
- Update MainModelPicker and InitialStateMainModelPicker to use new guard
- Remove isCheckpointMainModelConfig as it had no other usages
The function was removed because:
1. It was only used for detecting FLUX Dev models (incorrect use case)
2. No other code needs a generic "is checkpoint format" check
3. The pattern in this codebase is specific type guards per model variant
(isFluxFillMainModelModelConfig, isRefinerMainModelModelConfig, etc.)
Add robust device capability detection for bfloat16, replacing hardcoded
dtype with runtime checks that fallback to float16/float32 on unsupported
hardware. This prevents runtime failures on GPUs and CPUs without bfloat16.
Key changes:
- Add TorchDevice.choose_bfloat16_safe_dtype() helper for safe dtype selection
- Fix LoRA device mismatch in layer_patcher.py (add device= to .to() call)
- Replace all assert statements with descriptive exceptions (TypeError/ValueError)
- Add hidden_states bounds check and apply_chat_template fallback in text encoder
- Add GGUF QKV tensor validation (divisible by 3 check)
- Fix CPU noise generation to use float32 for compatibility
- Remove verbose debug logging from LoRA conversion utils
Add support for saving and recalling Z-Image component models (VAE and
Qwen3 Encoder) in image metadata.
Backend:
- Add qwen3_encoder field to CoreMetadataInvocation (version 2.1.0)
Frontend:
- Add vae and qwen3_encoder to Z-Image graph metadata
- Add Qwen3EncoderModel metadata handler for recall
- Add ZImageVAEModel metadata handler (uses zImageVaeModelSelected
instead of vaeSelected to set Z-Image-specific VAE state)
- Add qwen3Encoder translation key
This enables "Recall Parameters" / "Remix Image" to restore the VAE
and Qwen3 Encoder settings used for Z-Image generations.
Add support for loading Z-Image transformer and Qwen3 encoder models
from single-file safetensors format (in addition to existing diffusers
directory format).
Changes:
- Add Main_Checkpoint_ZImage_Config and Main_GGUF_ZImage_Config for
single-file Z-Image transformer models
- Add Qwen3Encoder_Checkpoint_Config for single-file Qwen3 text encoder
- Add ZImageCheckpointModel and ZImageGGUFCheckpointModel loaders with
automatic key conversion from original to diffusers format
- Add Qwen3EncoderCheckpointLoader using Qwen3ForCausalLM with fast
loading via init_empty_weights and proper weight tying for lm_head
- Update z_image_denoise to accept Checkpoint format models