Run Stable Diffusion, SDXL, FLUX, and video generation models at any scale. Velar provisions GPUs on demand, handles model loading, and scales to zero between requests — so you only pay for actual inference time.
Image generation workloads are bursty by nature — you might need to generate 10 images per hour or 10,000. Velar's serverless model means you never over-provision or under-provision GPU capacity.
Velar works with the full Hugging Face diffusers ecosystem as well as custom pipelines and ComfyUI workflows.
Text-to-image (SDXL, SD 1.5)
Generate images from text prompts. Most common use case.
Image-to-image
Transform or edit existing images with diffusion.
ControlNet
Guided generation with depth maps, pose, or edge inputs.
Video generation (AnimateDiff, SVD)
Generate short video clips from text or image prompts.
Single image generation or bulk parallel workloads.
Single image — SDXL
import velar
app = velar.App("image-gen")
image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("diffusers", "transformers", "accelerate")

@app.function(gpu="L40S", image=image)
def generate_image(prompt: str, steps: int = 30):
    from diffusers import StableDiffusionXLPipeline
    import torch, io

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
    ).to("cuda")
    result = pipe(prompt, num_inference_steps=steps)

    buf = io.BytesIO()
    result.images[0].save(buf, format="PNG")
    return buf.getvalue()
app.deploy()
png_bytes = generate_image.remote("a cyberpunk city at sunset")

Bulk generation — parallel GPUs
import velar
app = velar.App("bulk-generation")
image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("diffusers", "transformers", "accelerate")

@app.function(gpu="L4", image=image)
def generate_batch(prompts: list[str]):
    from diffusers import StableDiffusionPipeline
    import torch, io

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")
    # The pipeline accepts a list of prompts and returns one image per prompt.
    images = pipe(prompts).images

    encoded = []
    for img in images:
        buf = io.BytesIO()
        img.save(buf, format="PNG")
        encoded.append(buf.getvalue())
    return encoded
# Generate 500 images across multiple GPUs in parallel
prompts = load_prompts() # your list
chunks = [prompts[i:i+4] for i in range(0, len(prompts), 4)]
results = [generate_batch.remote(chunk) for chunk in chunks]

Diffusion models are VRAM-intensive. Use the smallest GPU that fits your model.
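As a rough rule of thumb, an fp16 checkpoint occupies about two bytes per parameter, plus headroom for activations and intermediate latents. A back-of-the-envelope sketch (the 1.5× overhead factor is our assumption, not a published Velar figure):

```python
def min_vram_gb(num_params: float, bytes_per_param: int = 2, overhead: float = 1.5) -> float:
    """Rough VRAM floor for inference: weight bytes times an activation-headroom factor."""
    return num_params * bytes_per_param * overhead / 1e9

# SDXL's UNet, text encoders, and VAE total roughly 3.5B parameters:
print(min_vram_gb(3.5e9))  # 10.5 — comfortably inside an L40S's 48 GB
```

Batch size, resolution, and ControlNet adapters all push the real requirement above this floor, which is why the table below leaves generous margins.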
| Model | Recommended GPU | Notes |
|---|---|---|
| Stable Diffusion XL (SDXL) | L40S 48GB | Best quality, higher VRAM |
| Stable Diffusion 1.5 / 2.1 | L4 24GB | Fast, cost-effective |
| FLUX.1 | H100 80GB | State of the art quality |
| AnimateDiff / SVD | A100 80GB | Video generation |
| ControlNet variants | L4 24GB | Guided image generation |
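If a service routes requests across several models, the table above can be encoded as a lookup for the `gpu=` argument. A minimal sketch — the mapping keys and fallback choice are illustrative, not a Velar API:

```python
# GPU picks from the table above, keyed by Hugging Face model ID (illustrative).
GPU_FOR_MODEL = {
    "stabilityai/stable-diffusion-xl-base-1.0": "L40S",
    "runwayml/stable-diffusion-v1-5": "L4",
    "black-forest-labs/FLUX.1-dev": "H100",
    "stabilityai/stable-video-diffusion-img2vid": "A100",
}

def recommended_gpu(model_id: str, default: str = "L4") -> str:
    """Fall back to the cheapest listed GPU for models not in the table."""
    return GPU_FOR_MODEL.get(model_id, default)

print(recommended_gpu("stabilityai/stable-diffusion-xl-base-1.0"))  # L40S
```

The returned string can be passed straight to the `gpu=` parameter of `@app.function`, so the routing logic stays in one place.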