Flux ComfyUI: The Complete Guide (July 2026)

Q: What Is the Flux AI Image Generator?

The Flux AI image generator is a text-to-image model from Black Forest Labs. It is available through several interfaces, but ComfyUI is the most popular choice for users who want full control over the generation pipeline, custom node extensions, and the ability to save and share workflows as portable JSON files.

Carl Peterson | July 21, 2026 | 11 min read

What Is Flux AI?

Flux is an open-weight text-to-image model family. It's a go-to choice for artists, developers, and researchers who want precise control over AI image generation. It runs natively in ComfyUI, making the combination especially powerful for building repeatable, customizable workflows.

The Flux text-to-image and image-editing models are built on a rectified flow transformer architecture. This architecture learns straighter paths between noise and clean images, which results in faster convergence, fewer inference steps, and strong prompt adherence. The models support resolutions up to 4 megapixels, making them suitable for everything from social media graphics to print-ready artwork.

The model family includes variants for different tasks and a range of GPUs, from the small Flux 2 Klein series, which runs on consumer GPUs, to Flux.2 Dev for workloads requiring photorealism and multi-reference composition.

Who Made Flux AI?

Flux was created by Black Forest Labs (BFL), a German AI research company headquartered in Freiburg. It was founded in 2024 by Robin Rombach, Andreas Blattmann, and Patrick Esser, former Stability AI researchers who previously built Stable Diffusion at LMU Munich. The original Flux.1 family launched in August 2024, followed by the Flux.2 series in November 2025.

Black Forest Labs releases key models under permissive licenses. The Flux 2 Klein 4B model is available under Apache 2.0, allowing unrestricted commercial use. That open licensing, combined with strong quality-to-speed ratios, has made Flux models popular.

Flux AI Models Explained

The Flux family has grown significantly since its debut. Choosing the right variant can save both time and compute costs. The most relevant models for local ComfyUI use are Flux.1 Dev, the Flux.2 generation, and particularly the Flux 2 Klein series.

Flux.1 vs Flux 2 Klein: What's the Difference?

Flux.1 was the original model family, released by Black Forest Labs in August 2024. It came in three variants: Schnell (fast, Apache 2.0), Dev (quality-focused, non-commercial), and Pro (proprietary API). Flux.1 Dev is a 12-billion parameter rectified flow transformer and became the community standard for high-quality local generation, though it demands significant VRAM and longer generation times than smaller models.

Flux.2 expands the lineup with Pro, Dev, Max, Flex, and the new Klein series. Klein was designed for local users, greatly reducing hardware requirements and latency. Where Flux.1 Dev prioritizes maximum quality, Flux 2 Klein targets real-time interactivity.

Flux 2 Klein 4B vs 9B: Which Model Should You Use?

Flux 2 Klein launched on January 15, 2026, and comes in two parameter counts: 4B and 9B. The 4B distilled variant generates images in under a second on capable hardware, requires around 13 GB of VRAM at full precision (roughly 8 GB with FP8 quantization), and is released under Apache 2.0 for unrestricted commercial use. The 4B base variant is the undistilled version, intended for fine-tuning and LoRA training rather than direct inference.

The 9B variant offers noticeably better detail and coherence on complex prompts. However, it requires approximately 29 GB of VRAM at FP16 precision and operates under a non-commercial license. The 4B distilled model is a better choice for most users, while the 9B should be reserved for quality-critical work.

Model	Parameters	Min VRAM¹	Inference Steps	License	Best For
Flux 2 Klein 4B (distilled)	4B	8GB	4	Apache 2.0	Fast iteration, commercial use
Flux 2 Klein 4B Base	4B	13GB	20–50	Apache 2.0	LoRA training, fine-tuning
Flux 2 Klein 9B	9B	29GB	4	Non-commercial	Quality-critical work
Flux.2 Dev	32B	80GB+	20–28	Non-commercial	Maximum quality, API use
¹VRAM needed for full FP16/BF16 precision. FP8 and NVFP4 quantization can reduce requirements by up to 40% and 55% respectively.

Flux 2 Klein VRAM Requirements: Can Your GPU Handle It?

Flux 2 Klein VRAM requirements depend on your chosen variant and quantization level. The 4B distilled model needs at least 8 GB, putting it within reach of mid-range cards. The 9B model needs roughly 29 GB at FP16, but drops to around 15 GB with FP8 quantization, making it feasible on a 24 GB card.

Black Forest Labs offers official FP8 and NVFP4 quantized checkpoints that cut VRAM usage by up to 40% and 55% respectively.
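As a rough worked example, the sketch below applies those best-case percentages to the full-precision figures quoted in this guide. The function and dictionary names are illustrative, and because "up to" figures are upper bounds on the savings, the results are estimates, not measured requirements.

```python
# Best-case VRAM reductions quoted for the official quantized checkpoints.
REDUCTION = {"fp16": 0.0, "fp8": 0.40, "nvfp4": 0.55}


def estimate_vram_gb(full_precision_gb: float, precision: str) -> float:
    """Scale a full-precision VRAM figure by the quoted best-case reduction."""
    return round(full_precision_gb * (1.0 - REDUCTION[precision]), 1)


if __name__ == "__main__":
    # Full-precision figures for the Klein variants, taken from the text.
    for name, full_gb in {"Klein 4B": 13, "Klein 9B": 29}.items():
        for precision in REDUCTION:
            estimate = estimate_vram_gb(full_gb, precision)
            print(f"{name} @ {precision}: ~{estimate} GB")
```

Applying 40% to the 9B model's 29 GB gives about 17.4 GB, somewhat above the roughly 15 GB quoted above, so actual usage depends on the specific checkpoint and runtime and not only on the headline percentage.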

If your GPU still falls short, cloud GPUs are the practical alternative. Thunder Compute's RTX A6000 instances provide 48GB of VRAM at $0.35/hr.

Setting Up Flux in ComfyUI

ComfyUI is the standard interface for running Flux models locally. Its node-based graph exposes every step of the generation pipeline, giving you full control over models, samplers, text encoders, and VAEs. For a crash course, read the full ComfyUI setup guide.

There are two ways to start using ComfyUI locally:

Download an installation file
Clone the official repository and install the Python dependencies

You will need Python 3.10 or higher, an NVIDIA GPU, and PyTorch for your CUDA version. Once set up, ComfyUI Manager handles installing and updating custom node packs without touching configuration files manually.
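Before installing, it can help to confirm those prerequisites from Python itself. The following is a minimal sketch, not part of ComfyUI: the helper names are made up for illustration, and the CUDA check only runs if PyTorch is already installed.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in this guide


def python_ok(version=sys.version_info) -> bool:
    """True when the interpreter meets the stated Python minimum."""
    return tuple(version[:2]) >= MIN_PYTHON


def cuda_status() -> str:
    """Report whether PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch  # imported lazily so the script also runs without PyTorch

    if not torch.cuda.is_available():
        return "PyTorch is installed but no CUDA device is visible"
    return f"CUDA device: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print(cuda_status())
```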

Start using ComfyUI in the cloud with a Thunder Compute template and pay by the minute. Pick out the hardware, connect using VSCode, and start generating in minutes.

Downloading and Loading the Flux AI Model

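Model weights for Flux 2 Klein are hosted on Hugging Face. Download the diffusion model checkpoint and place it in ComfyUI/models/diffusion_models/. You will also need the text encoder that matches your model and the matching Flux VAE, placed in ComfyUI/models/text_encoders/ and ComfyUI/models/vae/ respectively. Flux.1 uses the T5-XXL text encoder (approximately 9.8 GB at FP16, or 4.9 GB at FP8), while the Flux 2 Klein checkpoints use a Qwen3-based encoder; check the model card for the exact files.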
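A quick way to confirm the files landed where ComfyUI looks for them is to scan the model folders. This is a hedged sketch: the function name is hypothetical, the folder list assumes the standard diffusion_models, text_encoders, and vae subfolders, and it only checks that each folder holds at least one weights file, not that the file is the right one.

```python
from pathlib import Path

# Subfolders of ComfyUI/models/ that a Flux workflow reads from (assumed layout).
REQUIRED_SUBDIRS = ("diffusion_models", "text_encoders", "vae")
WEIGHT_SUFFIXES = {".safetensors", ".gguf"}


def missing_model_files(comfy_root: str) -> list[str]:
    """Return the required subfolders that contain no weights file."""
    models = Path(comfy_root) / "models"
    missing = []
    for sub in REQUIRED_SUBDIRS:
        folder = models / sub
        found = folder.is_dir() and any(
            p.suffix in WEIGHT_SUFFIXES for p in folder.iterdir()
        )
        if not found:
            missing.append(sub)
    return missing
```

Calling missing_model_files("ComfyUI") from the directory that contains your install returns an empty list once all three folders are populated.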

Building Your First Flux Workflow in ComfyUI

You can start from a ComfyUI template, but it helps to know that a basic Flux text-to-image workflow requires six core nodes:

Load Diffusion Model - loads the Flux checkpoint
CLIP Text Encode (Flux) - encodes your prompt with the loaded text encoder
Empty Latent Image - defines canvas dimensions
KSampler - runs the sampling process
VAE Decode - converts the latent output to a visible image
Save Image - writes the output to disk

These connect in sequence: the model loader and text encoder feed the KSampler, the latent image provides the starting noise, and the KSampler output flows into VAE Decode before a Save Image node.

Queue a test generation after wiring the nodes to confirm file paths and connections are correct.
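The same wiring can be expressed in ComfyUI's API format, where each node is a dictionary entry and each link is a [source_node_id, output_index] pair. The sketch below is illustrative only: the file names are placeholders, the clip_type default is an assumption to check against the loader's dropdown in your install, and the bundled templates remain the reference for the exact nodes a given Flux release expects. It also adds explicit text encoder and VAE loader nodes, plus an empty negative prompt, because KSampler requires that input.

```python
import json
import urllib.request


def build_flux_graph(prompt: str, unet_name: str = "flux-model.safetensors",
                     clip_name: str = "text-encoder.safetensors",
                     vae_name: str = "flux-vae.safetensors",
                     clip_type: str = "flux2", seed: int = 0) -> dict:
    """Build a text-to-image graph in ComfyUI's API (prompt) format."""
    return {
        # UNETLoader appears in the UI as "Load Diffusion Model".
        "1": {"class_type": "UNETLoader",
              "inputs": {"unet_name": unet_name, "weight_dtype": "default"}},
        "2": {"class_type": "CLIPLoader",
              "inputs": {"clip_name": clip_name, "type": clip_type}},
        "3": {"class_type": "VAELoader", "inputs": {"vae_name": vae_name}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt, "clip": ["2", 0]}},
        # Empty prompt that only satisfies KSampler's required negative input.
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["2", 0]}},
        "6": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["6", 0],
                         "seed": seed, "steps": 4, "cfg": 1.0,
                         "sampler_name": "euler", "scheduler": "simple",
                         "denoise": 1.0}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"images": ["8", 0], "filename_prefix": "flux_test"}},
    }


def queue_prompt(graph: dict, server: str = "http://127.0.0.1:8188") -> dict:
    """POST the graph to the /prompt endpoint of a running ComfyUI server."""
    body = json.dumps({"prompt": graph}).encode("utf-8")
    req = urllib.request.Request(f"{server}/prompt", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With ComfyUI running locally on its default port, queue_prompt(build_flux_graph("a lighthouse at dawn")) submits one test generation; a validation error in the response usually points at a missing file or an unconnected input.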

Flux workflows differ from Stable Diffusion in one key way: Flux does not use a negative prompt, and its CFG scale behaves differently than SDXL or SD 1.5. Setting CFG too high produces blown-out, oversaturated images.

Learn how to run ComfyUI in the cloud, or try Forge Neo for a simpler web UI alternative.

Getting the Best Results with Flux in ComfyUI

Running Flux is straightforward once the pipeline is wired. However, sampler settings and prompts make a substantial difference in output quality. A few targeted adjustments separate mediocre results from consistently strong ones.

Recommended Settings and Samplers for Flux

For the Flux 2 Klein 4B distilled model, use exactly 4 inference steps with a CFG scale of 1.0 to 1.5. Running more than 4 steps or raising CFG higher degrades quality rather than improving it.

The Euler sampler with the Simple scheduler produces the cleanest results; avoid Euler Ancestral, as it does not converge cleanly on distilled models.

For the 4B or 9B base models, increase steps to 20–24 and set CFG to 3.5–5.0, with the same Euler plus Simple scheduler. For Flux.1 Dev, keep CFG below 6 for naturalistic images; higher values push toward oversaturation. Save these as workflow presets so you can switch between variants without reconfiguring each time.
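If you drive ComfyUI from scripts, the same presets can live in code. This is a small sketch with made-up preset labels; the values are picked from the ranges recommended in this guide (the Flux.1 Dev step count comes from the FAQ range of 20 to 28), so adjust them to taste.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerPreset:
    steps: int
    cfg: float
    sampler_name: str = "euler"
    scheduler: str = "simple"


# Keys are illustrative labels, not official model identifiers.
PRESETS = {
    "klein-4b-distilled": SamplerPreset(steps=4, cfg=1.0),
    "klein-base": SamplerPreset(steps=20, cfg=3.5),
    "flux1-dev": SamplerPreset(steps=20, cfg=3.5),
}


def ksampler_inputs(variant: str) -> dict:
    """Return the KSampler fields for a variant, ready to merge into a workflow."""
    preset = PRESETS[variant]
    return {
        "steps": preset.steps,
        "cfg": preset.cfg,
        "sampler_name": preset.sampler_name,
        "scheduler": preset.scheduler,
    }
```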

Prompting Tips for the Flux AI Image Generation Tool

Flux does not apply automatic prompt enhancement, so the exact text you write is what the model interprets. Generic keyword-based prompts that work in Stable Diffusion tend to produce poor results in Flux. Instead, write in descriptive, flowing prose: subject, then setting, then lighting, then camera perspective.

A prompt like "A medium close-up of a woman in a rain-soaked alley at dusk, warm amber streetlights reflecting on wet cobblestones, 50mm lens, shallow depth of field" will consistently outperform a short keyword list. The more specific your lighting, angle, and environmental detail, the more control you have over the final image. This approach works for concept art, product mockups, and photorealistic portraits alike.
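For batch generation, the subject, setting, lighting, camera order can be kept consistent with a small helper. This is only a sketch of one way to assemble such prompts; the function name and its four parameters are illustrative.

```python
def build_prompt(subject: str, setting: str, lighting: str, camera: str) -> str:
    """Assemble a prose prompt in subject, setting, lighting, camera order."""
    scene = f"{subject.strip()} {setting.strip()}"
    return ", ".join([scene, lighting.strip(), camera.strip()])


prompt = build_prompt(
    subject="A medium close-up of a woman",
    setting="in a rain-soaked alley at dusk",
    lighting="warm amber streetlights reflecting on wet cobblestones",
    camera="50mm lens, shallow depth of field",
)
```

Here prompt reproduces the example sentence above, and swapping a single argument changes one aspect of the scene while the rest stays fixed.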

Troubleshooting Flux in ComfyUI

Even with a correct setup, Flux workflows can hit issues from memory limits, version mismatches, or misconfigured sampler settings. Knowing the most common failure modes saves significant time when something goes wrong.

Common Errors and How to Fix Them

The most frequent issue is a tensor size mismatch error, which usually means the text encoder and model checkpoint are mismatched. Check that your CLIP loader is set to the text encoder and type your checkpoint expects (T5-XXL for Flux.1), and that the model path points to a Flux-compatible checkpoint, not an SDXL or SD 1.5 file.

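If ComfyUI crashes during model loading, the checkpoint may be a partial or corrupted download. Re-download it with the Hugging Face CLI; recent versions resume interrupted downloads automatically, while older versions need the --resume-download flag.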
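One way to spot a truncated .safetensors file before re-downloading is to compare its size on disk with the size its own header declares. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then the tensor data); the function name is hypothetical, and it checks length only, not checksums.

```python
import json
import os
import struct


def safetensors_looks_complete(path: str) -> bool:
    """Check that a .safetensors file is as long as its header says it should be."""
    size = os.path.getsize(path)
    if size < 8:
        return False
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        if 8 + header_len > size:
            return False
        try:
            header = json.loads(f.read(header_len))
        except ValueError:
            return False
    if not isinstance(header, dict):
        return False
    # Tensor entries carry data_offsets [begin, end] relative to the data section.
    data_end = max(
        (entry["data_offsets"][1] for name, entry in header.items()
         if name != "__metadata__"),
        default=0,
    )
    return size == 8 + header_len + data_end
```

A False result suggests the download was cut short; a True result only means the length is consistent, so a file can still be the wrong model.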

Blown-out or washed-out outputs from the distilled Klein models are almost always a CFG issue. Confirm CFG is set to 1.0–1.5 and steps are at exactly 4. For custom node instability, install one pack at a time and take a ComfyUI Manager snapshot after each successful install.

Optimizing Performance on Low VRAM GPUs

If your GPU falls below the 8 GB minimum for Flux 2 Klein 4B, a few strategies can help. Launch ComfyUI with --lowvram to enable sequential component processing, which reduces peak VRAM usage at a 20–30% speed penalty. Adding --cpu-vae offloads VAE decoding to system RAM, freeing 1–2 GB of VRAM with a moderate slowdown on the final decode step.
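As a sketch of how those flags might be chosen automatically, the helper below builds a launch command from the available VRAM. The function name and the 8 GB default threshold are illustrative choices based on the figures above; main.py is ComfyUI's entry point when run from a cloned repository.

```python
def launch_command(vram_gb: float, min_vram_gb: float = 8.0) -> list[str]:
    """Build a ComfyUI launch command, adding low-memory flags below the minimum."""
    cmd = ["python", "main.py"]
    if vram_gb < min_vram_gb:
        # Trade speed for lower peak VRAM, then move VAE decoding off the GPU.
        cmd += ["--lowvram", "--cpu-vae"]
    return cmd
```

The returned list can be passed to subprocess.run from inside the ComfyUI directory.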

GGUF quantization is the most effective technique for GPUs with 6–12GB of VRAM. Q4 and Q5 checkpoints reduce the memory footprint substantially while maintaining strong output quality. Load them using the Unet Loader (GGUF) node from the ComfyUI-GGUF pack, and restart ComfyUI every 10–15 generations to clear accumulated memory fragmentation.

Not sure which model to run? See how Flux compares to other open-source image generation models.

Last Thoughts on Flux ComfyUI

Flux runs cleanly in ComfyUI across the full range of hardware. Flux.2 Klein 4B distilled fits on 8GB cards and generates in 4 steps, while Flux.1 Dev remains the community standard for maximum quality. Match your model and precision to your VRAM, and use a cloud GPU when you need full-precision output without a hardware upgrade.

Run Flux in ComfyUI on Thunder Compute from $0.35/hr.

FAQ

What Is the Flux AI Image Generator?

The Flux AI image generator is a text-to-image model from Black Forest Labs. It is available through several interfaces, but ComfyUI is the most popular choice for users who want full control over the generation pipeline, custom node extensions, and the ability to save and share workflows as portable JSON files.

What are the Flux.1 Dev VRAM requirements?

Flux.1 Dev requires approximately 24GB of VRAM at FP16/BF16 precision. At FP8, it fits in around 12GB with minimal quality loss. GGUF Q4 quantization reduces requirements to 6-8GB, with moderate quality loss in fine detail and text rendering.
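These figures track a simple rule of thumb: parameter count times bits per weight. The sketch below reproduces that arithmetic for a 12B model. The 4.5 bits per weight used for Q4 is an assumption (4-bit GGUF formats carry some per-block overhead), and the estimate covers the model weights only, not the text encoder, VAE, or activations.

```python
# Approximate storage cost per weight. The Q4 value is an assumption.
BITS_PER_WEIGHT = {"fp16": 16.0, "fp8": 8.0, "q4": 4.5}


def weights_gb(params_billions: float, precision: str) -> float:
    """Estimate the size of the model weights alone, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * BITS_PER_WEIGHT[precision] / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    for precision in BITS_PER_WEIGHT:
        print(f"12B @ {precision}: ~{weights_gb(12, precision):.2f} GB")
```

For 12B parameters this gives about 24 GB at FP16, 12 GB at FP8, and just under 7 GB at Q4, in line with the ranges above.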

Can I run Flux on 8GB VRAM in ComfyUI?

Yes. Flux.2 Klein 4B GGUF is the cleanest option: the Q4_K_M build is around 2.6GB and generates in 4 steps. For Flux.1, use a GGUF Q4_K_S checkpoint with the --lowvram flag and 32GB+ of system RAM for offloading.

What is the difference between Flux.1 Dev and Flux.2 Klein?

Flux.1 Dev is a 12B parameter model with the deepest LoRA and ControlNet ecosystem. Flux.2 Klein 4B is a 4B distilled model that generates in 4 steps with less VRAM. Klein is faster and more accessible; Flux.1 Dev has a higher quality ceiling.

Which Flux model is best for LoRA training?

The Flux.2 Klein 4B base (undistilled) variant is designed for fine-tuning and LoRA training. Flux.1 Dev also has a large community of training resources. The 9B Klein model offers higher quality, but its non-commercial license limits LoRA redistribution.

What ComfyUI sampler should I use for Flux?

For Flux.2 Klein 4B distilled, use the Euler sampler with the Simple scheduler, 4 steps, CFG 1.0-1.5. For Flux.1 Dev, use Euler with Simple, 20-28 steps, CFG 3.5-5.0. Avoid Euler Ancestral on distilled models.