Stable Diffusion XL
The Stable Diffusion XL (SDXL) integration in ACT3 AI enables high-quality image generation for set backgrounds, concept art, and character reference imagery. SDXL is an image model, not a video model: it produces still images that you use for planning and previsualization, and as inputs to video generation tools like Flux.
What Stable Diffusion XL Does
SDXL generates high-resolution, photorealistic or stylized still images from text descriptions. You can use the outputs as:
- Set backgrounds for scene environments
- Character concept art to guide digital actor creation
- Texture references for Blender 3D modeling
- Pitch deck imagery for presentations
- Visual planning references before committing to video renders
Key Capabilities
- High-Resolution Image Generation — Up to 2048×2048 pixels for detailed visual concepts
- Prompt-Based Styling — Describe the look, mood, and subject; SDXL generates matching visuals
- Style Consistency — Apply custom models or LoRA files to maintain a consistent aesthetic
- Image-to-Image Mode — Transform existing images into variations while preserving composition
- Batch Output — Generate multiple variations from a single prompt for faster visual development
How to Use
1. In the Editor, click AI → Stable Diffusion XL
2. Enter your prompt, including style, setting, lighting, and subject details
3. Optionally upload a base image for Image-to-Image transformation
4. Adjust resolution, seed, and model settings as needed
5. Click Generate and review results in the Asset Library
6. Use accepted images as set backgrounds or concept references
Image-to-Image Mode
Image-to-Image mode transforms an existing reference image into a new variation:
- Upload a sketch, reference photo, or previous generation
- Describe what the new version should look like
- SDXL preserves the composition while applying the new style and content description
- Use this to iterate on a visual concept without starting from scratch
Credit Usage
SDXL image generation uses credits based on resolution and number of variations:
- 1024×1024 image: approximately 0.5 credits
- 2048×2048 image: approximately 1 credit
- Image-to-Image costs the same as Text-to-Image at equivalent resolution
- Batch generation is charged per image, so each additional variation adds the full per-image cost at that resolution
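The arithmetic behind these rates can be sketched in a few lines of Python. This is a rough planning aid, not part of ACT3 AI: the function name `estimate_credits` and the lookup table are hypothetical, and the rates are the approximate figures listed above. Only the two documented resolutions are included, so the sketch rejects any other size rather than guessing a rate.

```python
# Approximate per-image rates taken from the Credit Usage list (credits per image).
# Only these two resolutions are documented, so nothing else is assumed here.
APPROX_CREDITS_PER_IMAGE = {
    (1024, 1024): 0.5,
    (2048, 2048): 1.0,
}


def estimate_credits(width: int, height: int, variations: int = 1) -> float:
    """Return the approximate credit cost of a batch at one resolution."""
    if variations < 1:
        raise ValueError("variations must be at least 1")
    try:
        per_image = APPROX_CREDITS_PER_IMAGE[(width, height)]
    except KeyError:
        raise ValueError(f"no documented rate for {width}x{height}") from None
    # Batches are charged per image, and Image-to-Image costs the same as
    # Text-to-Image, so the generation mode does not enter the calculation.
    return per_image * variations


if __name__ == "__main__":
    print(estimate_credits(1024, 1024, variations=5))  # 2.5
    print(estimate_credits(2048, 2048, variations=3))  # 3.0
```

For example, five 1024×1024 variations of one shot come to roughly 2.5 credits, which is less than three 2048×2048 images at roughly 3 credits. Because the published rates are approximate, treat the result as an estimate.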
Best Use Cases
- Visual Planning — Generate reference images quickly before committing to video renders
- Set Design — Create background imagery for fantasy, sci-fi, or period environments
- Character Concepts — Visualize character appearance options before building digital actors
- Blender Textures — Generate texture images for 3D surface materials
- Animatics — Use SDXL images as the visual input for Flux's Image-to-Video workflow
Combining SDXL with Other Tools
SDXL works particularly well in combination with other ACT3 AI features:
- Generate concept images with SDXL, then animate them with Flux Image-to-Video
- Create set backgrounds with SDXL, then import them into Blender as environment textures
- Build character concept art with SDXL, then use it as a reference when creating Digital Actors
- Run complex SDXL pipelines through ComfyUI for multi-pass compositing
Prompt Tips
- Be descriptive — include camera style, lighting conditions, color palette, and mood
- Use consistent keywords across multiple prompts to maintain visual coherence within a project
- For scene matching, start with an Image-to-Image workflow using a reference frame from your existing shots
- Save prompts that produce good results for reuse across similar shots
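One way to keep keywords consistent across prompts is to store the project's style wording once and append it to each shot description. The sketch below is an illustration only: `StyleGuide`, `build_prompt`, and the `NOIR` preset are hypothetical names, and the comma-separated layout is a common prompting convention rather than a format ACT3 AI requires. The resulting string is what you would paste into the prompt field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleGuide:
    """Project-wide look, reused verbatim in every prompt (hypothetical helper)."""
    camera: str
    lighting: str
    palette: str
    mood: str


def build_prompt(subject: str, style: StyleGuide) -> str:
    """Join the shot-specific subject with the shared style keywords."""
    parts = [subject, style.camera, style.lighting, style.palette, style.mood]
    # Skip empty fields so the prompt never contains stray commas.
    return ", ".join(p.strip() for p in parts if p.strip())


# Example preset: the wording is illustrative, not a recommended recipe.
NOIR = StyleGuide(
    camera="35mm cinematic photography",
    lighting="low-key lighting",
    palette="desaturated teal and amber palette",
    mood="tense mood",
)

if __name__ == "__main__":
    print(build_prompt("rain-soaked alley at night", NOIR))
    print(build_prompt("detective's office interior", NOIR))
```

Both prompts end with the same style phrases, so only the subject changes from shot to shot. Saving the preset alongside the project also covers the last tip: a prompt that worked can be rebuilt exactly for a similar shot.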
Best Practices
- Use SDXL early in production for cheap visual exploration before committing to video renders
- Generate 3–5 variations per key shot and choose the best as a visual reference
- Keep consistent style prompts across related shots for visual coherence
- Save accepted images to the Asset Library with descriptive tags for easy retrieval
Troubleshooting
Images don't match the style you described — Be more explicit about art style keywords (e.g., "cinematic photography," "oil painting," "digital art," "photorealistic"). Vague style descriptions produce inconsistent results.
Composition is wrong — Use Image-to-Image mode with a rough sketch or reference image to constrain the composition.
Batch generation consuming too many credits — Reduce the number of variations per prompt and select only what you need.