Use Cases/Image & Video Generation

Image & video generation on GPU

Run Stable Diffusion, SDXL, FLUX, and video generation models at any scale. Velar provisions GPUs on demand, handles model loading, and scales to zero between requests — so you only pay for actual inference time.

Built for high-throughput generation

Image generation workloads are bursty by nature — you might need to generate 10 images per hour or 10,000. Velar's serverless model means you never over-provision or under-provision GPU capacity.

  • Automatic GPU scaling based on queue depth
  • Persistent model caching across invocations
  • Batch generation across multiple GPUs in parallel
  • Scale to zero when no requests are pending
  • Return images as bytes, URLs, or base64
  • Works with any diffusers-compatible model
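Velar manages scaling for you, but the queue-depth heuristic is easy to picture. The sketch below is an illustration of that idea, not Velar's actual algorithm: provision enough workers to keep each worker's backlog near a target, and drop to zero when the queue is empty.

```python
import math

def desired_workers(queue_depth: int, target_per_worker: int = 4,
                    max_workers: int = 32) -> int:
    """Illustrative queue-depth autoscaling rule (not Velar's internals)."""
    # Scale to zero when no requests are pending
    if queue_depth == 0:
        return 0
    # Otherwise keep per-worker backlog near target_per_worker,
    # capped at a maximum fleet size
    return min(max_workers, math.ceil(queue_depth / target_per_worker))
```

With these defaults, 10 queued prompts would run on 3 GPUs, while a burst of 1,000 saturates the 32-worker cap.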

Common architectures

Velar works with the full Hugging Face diffusers ecosystem as well as custom pipelines and ComfyUI workflows.

Text-to-image (SDXL, SD 1.5)

Generate images from text prompts. Most common use case.

Image-to-image

Transform or edit existing images with diffusion.

ControlNet

Guided generation with depth maps, pose, or edge inputs.

Video generation (AnimateDiff, SVD)

Generate short video clips from text or image prompts.

Code examples

Single image generation or bulk parallel workloads.

Single image — SDXL

generate.py
import velar

app = velar.App("image-gen")

image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("diffusers", "transformers", "accelerate")

@app.function(gpu="L40S", image=image)
def generate_image(prompt: str, steps: int = 30):
    from diffusers import StableDiffusionXLPipeline
    import torch, io

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
    ).to("cuda")

    result = pipe(prompt, num_inference_steps=steps)
    buf = io.BytesIO()
    result.images[0].save(buf, format="PNG")
    return buf.getvalue()

app.deploy()
png_bytes = generate_image.remote("a cyberpunk city at sunset")
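The function returns raw PNG bytes. To return the image through a JSON API instead, base64-encode it with the standard library; this step is plain Python and independent of Velar:

```python
import base64

# stand-in for the bytes returned by generate_image.remote(...)
png_bytes = b"\x89PNG\r\n\x1a\n..."

b64 = base64.b64encode(png_bytes).decode("ascii")
payload = {"image_base64": b64}

# clients recover the original bytes with a single decode
assert base64.b64decode(b64) == png_bytes
```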

Bulk generation — parallel GPUs

batch.py
import velar

app = velar.App("bulk-generation")

image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("diffusers", "transformers", "accelerate")

@app.function(gpu="L4", image=image)
def generate_batch(prompts: list[str]):
    from diffusers import StableDiffusionPipeline
    import torch, io

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    ).to("cuda")

    images = pipe(prompts).images
    encoded = []
    for img in images:
        buf = io.BytesIO()
        img.save(buf, format="PNG")
        encoded.append(buf.getvalue())
    return encoded

# Generate 500 images across multiple GPUs in parallel
prompts = load_prompts()  # your list
chunks = [prompts[i:i+4] for i in range(0, len(prompts), 4)]
results = [generate_batch.remote(chunk) for chunk in chunks]
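Each call returns one list of PNG byte strings per chunk. Flattening the per-chunk results back into a single list is plain Python, independent of Velar:

```python
def flatten(batches):
    """Merge per-chunk results (one list per GPU call) into one flat list."""
    return [img for batch in batches for img in batch]

# dummy stand-ins for the byte strings returned by generate_batch
results = [[b"png-0", b"png-1"], [b"png-2"]]
all_images = flatten(results)
```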

GPU recommendations

Diffusion models are VRAM-intensive. Use the smallest GPU that fits your model.

Model                         Recommended GPU    Notes
Stable Diffusion XL (SDXL)    L40S (48 GB)       Best quality, higher VRAM
Stable Diffusion 1.5 / 2.1    L4 (24 GB)         Fast, cost-effective
FLUX.1                        H100 (80 GB)       State-of-the-art quality
AnimateDiff / SVD             A100 (80 GB)       Video generation
ControlNet variants           L4 (24 GB)         Guided image generation
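A quick way to sanity-check GPU fit is to estimate weight memory from parameter count: fp16 stores 2 bytes per parameter, and activations and intermediate buffers add significant overhead on top. The parameter counts below are approximate published figures for each model's core network:

```python
def fp16_weight_gb(params_billion: float) -> float:
    """Rough fp16 weight footprint in GiB (2 bytes per parameter)."""
    return params_billion * 1e9 * 2 / 1024**3

# approximate published parameter counts; activations, VAE, and text
# encoders consume additional VRAM beyond the figures computed here
models = {"SD 1.5 UNet": 0.86, "SDXL UNet": 2.6, "FLUX.1": 12.0}
for name, billions in models.items():
    print(f"{name}: ~{fp16_weight_gb(billions):.1f} GiB of fp16 weights")
```

This is why FLUX.1's roughly 12B parameters push it toward an 80 GB card while SD 1.5 runs comfortably on an L4.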

Start generating images on GPU

$10 in free GPU credits. No credit card required.