mirror of https://github.com/invoke-ai/InvokeAI synced 2026-04-23 07:01:33 +02:00

CypherNaugh_0x 9deb545cc1 External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00

* feat: initial external model support
* feat: support reference images for external models
* fix: sorting lint error
* chore: hide Reidentify button for external models
* review: enable auto-install/remove fro external models
* feat: show external mode name during install
* review: model descriptions
* review: implemented review comments
* review: added optional seed control for external models
* chore: fix linter warning
* review: save api keys to a seperate file
* docs: updated external model docs
* chore: fix linter errors
* fix: sync configured external starter models on startup
* feat(ui): add provider-specific external generation nodes
* feat: expose external panel schemas in model configs
* feat(ui): drive external panels from panel schema
* docs: sync app config docstring order
* feat: add gemini 3.1 flash image preview starter model
* feat: update gemini image model limits
* fix: resolve TypeScript errors and move external provider config to api_keys.yaml

  Add 'external', 'external_image_generator', and 'external_api' to Zod enum schemas (zBaseModelType, zModelType, zModelFormat) to match the generated OpenAPI types. Remove redundant union workarounds from component prop types and Record definitions. Fix type errors in ModelEdit (react-hook-form Control invariance), parsing.tsx (model identifier narrowing), buildExternalGraph (edge typing), and ModelSettings import/export buttons.

  Move external_gemini_base_url and external_openai_base_url into api_keys.yaml alongside the API keys so all external provider config lives in one dedicated file, separate from invokeai.yaml.
* feat: add resolution presets and imageConfig support for Gemini 3 models

  Add combined resolution preset selector for external models that maps aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash now send imageConfig (aspectRatio + imageSize) via generationConfig instead of text-based aspect ratio hints used by Gemini 2.5 Flash.

  Backend: ExternalResolutionPreset model, resolution_presets capability field, image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic.

  Frontend: ExternalSettingsAccordion with combo resolution select, dimension slider disabling for fixed-size models, and panel schema constraint wiring for Steps/Guidance/Seed controls.
* Remove unused external model fields and add provider-specific parameters
  - Remove negative_prompt, steps, guidance, reference_image_weights, reference_image_modes from external model nodes (unused by any provider)
  - Remove supports_negative_prompt, supports_steps, supports_guidance from ExternalModelCapabilities
  - Add provider_options dict to ExternalGenerationRequest for provider-specific parameters
  - Add OpenAI-specific fields: quality, background, input_fidelity
  - Add Gemini-specific fields: temperature, thinking_level
  - Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini, DALL-E 3, DALL-E 2
  - Fix OpenAI provider to use output_format (GPT Image) vs response_format (DALL-E) and send model ID in requests
  - Add fixed aspect ratio sizes for OpenAI models (bucketing)
  - Add ExternalProviderRateLimitError with retry logic for 429 responses
  - Add provider-specific UI components in ExternalSettingsAccordion
  - Simplify ParamSteps/ParamGuidance by removing dead external overrides
  - Update all backend and frontend tests
* Chore Ruff check & format
* Chore typegen
* feat: full canvas workflow integration for external models
  - Add missing aspect ratios (4:5, 5:4, 8:1, 4:1, 1:4, 1:8) to type system for external model support
  - Sync canvas bbox when external model resolution preset is selected
  - Use params preset dimensions in buildExternalGraph to prevent "unsupported aspect ratio" errors
  - Lock all bbox controls (resize handles, aspect ratio select, width/height sliders, swap/optimal buttons) for external models with fixed dimension presets
  - Disable denoise strength slider for external models (not applicable)
  - Sync bbox aspect ratio changes back to paramsSlice for external models
  - Initialize bbox dimensions when switching to an external model
* Chore typegen Linux seperator
* feat: full canvas workflow integration for external models
  - Update buildExternalGraph test to include dimensions in mock params
* Merge remote-tracking branch 'upstream/main' into external-models
* Chore pnpm fix
* add missing parameter
* docs: add External Models guide with Gemini and OpenAI provider pages
* fix(external-models): address PR review feedback
  - Gemini recall: write temperature, thinking_level, image_size to image metadata; wire external graph as metadata receiver; add recall handlers.
  - Canvas: gate regional guidance, inpaint mask, and control layer for external models.
  - Canvas: throw a clear error on outpainting for external models (was falling back to inpaint and hitting an API-side mask/image size mismatch).
  - Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only list their own provider's models.
  - Workflow editor: silently drop seed when the selected model does not support it instead of raising a capability error.
  - Remove the legacy external_image_generation invocation and the graph-builder fallback; providers must register a dedicated node.
  - Regenerate schema.ts.
  - remove Gemini debug dumps to outputs/external_debug
* fix(external-models): resolve TSC errors in metadata parsing and external graph
  - Export imageSizeChanged from paramsSlice (required by the new ImageSize recall handler).
  - Emit the external graph's metadata model entry via zModelIdentifierField since ExternalApiModelConfig is not part of the AnyModelConfig union.
* chore: prettier format ModelIdentifierFieldInputComponent
* fix: remove unsupported thinkingConfig from Gemini image models and restrict GPT Image models to txt2img
* chore typegen
* chore(docs): regenerate settings.json for external provider fields
* fix(external): fix mask handling and mode support for external providers
  - Remove img2img and inpaint modes from Gemini models (Gemini has no bitmap mask or dedicated edit API; image editing works via reference images in the UI)
  - Fix DALL-E 2 inpainting: convert grayscale mask to RGBA with alpha channel transparency (OpenAI expects transparent=edit area) and convert init image to RGBA when mask is present
* fix(external): update mode support and UI for external providers
  - Remove DALL-E 2 from starter models (deprecated, shutdown May 12 2026)
  - Enable img2img for GPT Image 1/1.5/1-mini (supports edits endpoint)
  - Set Gemini models to txt2img only (no mask/edit API; editing via ref images)
  - Hide mode/init_image/mask_image fields on Gemini node (not usable)
  - Hide mask_image field on OpenAI node (no model supports inpaint)
* Chore typegen
* fix(external): improve OpenAI node UX and disable cache by default
  - Hide OpenAI node's mode and init_image fields: OpenAI's API has no img2img/inpaint distinction (the edits endpoint is invoked automatically when reference images are provided). init_image is functionally identical to a reference image and was misleading users.
  - Default use_cache to False for external image generation nodes: external API calls are non-deterministic and incur usage costs. Cache hits returned stale image references that did not produce new gallery entries on repeat invokes.
* fix(external): duplicate cached images on cache hit instead of skipping

  External image generation nodes use the standard invocation cache, but returning the cached output (with stale image_name references) on cache hits resulted in no new gallery entries — the Invoke button would spin indefinitely on repeat invokes with identical parameters.

  Override invoke_internal so that on cache hit, the cached images are loaded and re-saved as new gallery entries. The expensive API call is still skipped (cost saving), but the user sees a new image as expected.
* Chore typegen + ruff
* CHore ruff format
* fix(external): restore OpenAI advanced settings on Remix recall

  Remix recall iterates through ImageMetadataHandlers but only Gemini's temperature handler was wired up — OpenAI's quality, background, and input_fidelity were stored in image metadata but never parsed back into the params slice. Add the three missing handlers so Remix restores these settings as expected.

---------

Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Alexander Eichhorn <alex@code-with.us>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>

.dev_scripts	Apply black	2023-07-27 10:54:01 -04:00
.github	fix(docs): anticipate more redirects and update more links (#9076)	2026-04-19 20:13:30 -04:00
coverage	combine pytest.ini with pyproject.toml	2023-03-05 17:00:08 +00:00
docker	fix(docs): anticipate more redirects and update more links (#9076)	2026-04-19 20:13:30 -04:00
docs	External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00
docs-old	External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00
invokeai	External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00
scripts	Revert "Revert "New Documentation Fixes (#9061)" (#9065)" (#9066)	2026-04-18 16:45:32 -04:00
tests	External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00
.dockerignore	refactor Dockerfile; get rid of multi-stage build; upgrade to python 3.12	2025-04-04 18:42:13 +11:00
.editorconfig	Merge dev into main for 2.2.0 (#1642)	2022-11-30 16:12:23 -05:00
.git-blame-ignore-revs	Git blame ignore revs	2025-03-26 12:56:04 +11:00
.gitattributes	refactor: model manager v3 (#8607)	2025-10-15 10:18:53 +11:00
.gitignore	Docs Overhaul (#8896)	2026-04-16 22:03:05 -04:00
.gitmodules	remove src directory, which is gumming up conda installs; addresses issue #77	2022-08-25 10:43:05 -04:00
.nvmrc	update nodes schema / typegen	2025-04-04 18:42:13 +11:00
.pre-commit-config.yaml	chore: update pre-commit syntax; add check for uv.lock needing an update	2025-04-15 07:41:32 +10:00
.prettierrc.yaml	feat: automated releases via github action	2024-02-29 21:57:20 -05:00
flake.lock	update flake (#7032)	2024-10-08 10:55:49 +11:00
flake.nix	update flake (#7032)	2024-10-08 10:55:49 +11:00
InvokeAI_Statement_of_Values.md	Add @ebr to Contributors (#2095)	2022-12-21 14:33:08 -05:00
LICENSE	Update LICENSE	2023-07-05 23:46:27 -04:00
LICENSE-SD1+SD2.txt	updated LICENSE files and added information about watermarking	2023-07-26 17:27:33 -04:00
LICENSE-SDXL.txt	updated LICENSE files and added information about watermarking	2023-07-26 17:27:33 -04:00
Makefile	Run vitest during frontend build (#9022)	2026-04-05 19:18:24 -04:00
mkdocs.yml	External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)	2026-04-20 17:13:26 +00:00
pins.json	chore: bump torch to 2.7.0	2025-05-19 12:29:51 +10:00
pyproject.toml	fix(docs): anticipate more redirects and update more links (#9076)	2026-04-19 20:13:30 -04:00
README.md	fix(docs): anticipate more redirects and update more links (#9076)	2026-04-19 20:13:30 -04:00
SECURITY.md	Create SECURITY.md	2024-11-25 04:10:03 -08:00
Stable_Diffusion_v1_Model_Card.md	Global replace [ \t]+$, add "GB" (#1751)	2022-12-19 16:36:39 +00:00
USER_ISOLATION_IMPLEMENTATION.md	feat(multiuser mode): Support multiple isolated users on same backend (#8822)	2026-02-26 23:47:25 -05:00
uv.lock	Feat[model support]: Qwen Image — full pipeline with edit, generate LoRA, GGUF, quantization, and UI (#9000)	2026-04-12 14:39:13 +02:00

README.md

Invoke - Professional Creative AI Tools for Visual Media

Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry-leading web-based UI and serves as the foundation for multiple commercial products.

Free to use under a commercially-friendly license
Download and install on compatible hardware
Generate, refine, iterate on images, and build workflows

📣 Are you a new or returning InvokeAI user?

Take our first annual User's Survey

Documentation

Quick Links
Installation and Updates - Documentation and Tutorials - Bug Reports - Contributing

Installation

To get started with Invoke, download the Launcher.

Troubleshooting, FAQ and Support

Please review our FAQ for solutions to common installation problems and other issues.

For more help, please join our Discord.

Features

Full details on features can be found in our documentation.

Web Server & UI

Invoke runs a locally hosted web server & React UI with an industry-leading user experience.

Unified Canvas

The Unified Canvas is a fully integrated canvas implementation with support for all core generation capabilities, in/out-painting, brush tools, and more. This creative tool unlocks the capability for artists to create with AI as a creative collaborator, and can be used to augment AI-generated imagery, sketches, photography, renders, and more.

Workflows & Nodes

Invoke offers a fully featured workflow management solution that combines the power of node-based workflows with the ease of a UI. Users can develop and share customizable generation pipelines tailored to their production use cases.

Board & Gallery Management

Invoke features an organized gallery system for easily storing, accessing, and remixing your content in the Invoke workspace. Images can be dragged and dropped onto any image-based UI element in the application, and rich metadata within each image allows for easy recall of key prompts or settings used in your workflow.

Model Support

SD 1.5
SD 2.0
SDXL
SD 3.5 Medium
SD 3.5 Large
CogView 4
Flux.1 Dev
Flux.1 Schnell
Flux.1 Kontext
Flux.1 Krea
Flux Redux
Flux Fill
Flux.2 Klein 4B
Flux.2 Klein 9B
Z-Image Turbo
Z-Image Base
Anima
Qwen Image
Qwen Image Edit
Nano Banana (API Only)
GPT Image (API Only)
Wan (API Only)

Other features

Support for ckpt, diffusers, and some gguf models
Upscaling Tools
Embedding Manager & Support
Model Manager & Support
Workflow creation & management
Node-Based Architecture
Object Segmentation & Selection Models (SAM / SAM2)

Contributing

Anyone who wishes to contribute to this project - whether documentation, features, bug fixes, code cleanup, testing, or code reviews - is very much encouraged to do so.

Get started with contributing by reading our contribution documentation and joining #dev-chat or the GitHub discussion board.

We hope you enjoy using Invoke as much as we enjoy creating it, and we hope you will elect to become part of our community.

Thanks

Invoke is a combined effort of passionate and talented people from across the world. We thank them for their time, hard work and effort.