No Dockerfile. No Kubernetes. No YAML. Decorate your function — Velar builds the container, provisions the GPU, and runs your workload at per-second cost.
Why it matters
This is the actual difference between setting up GPU infra yourself and using Velar.
Features
Every design decision in Velar optimizes for shipping models fast, not managing infrastructure.
Decorate your class with @app.cls. __enter__ runs once when the container starts — your model stays in GPU memory for every subsequent call. No cold model loads. No wasted compute.
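The load-once pattern described above can be sketched in plain Python. This is a local stand-in, not Velar's API: the `Embedder` class, its placeholder "model", and the `load_count` counter are hypothetical, and the `@app.cls` decorator is left out so the example runs without the Velar SDK.

```python
class Embedder:
    """Local stand-in for a class you would decorate with @app.cls."""

    load_count = 0  # counts how many times the expensive load runs

    def __enter__(self):
        # Runs once at startup. A real class would load model weights
        # onto the GPU here; this sketch uses a trivial placeholder.
        Embedder.load_count += 1
        self.model = lambda text: [float(len(text))]
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release the model when the worker shuts down.
        self.model = None
        return False

    def embed(self, text):
        # Every call reuses the model that __enter__ loaded.
        return self.model(text)


with Embedder() as worker:
    results = [worker.embed(t) for t in ["a", "bb", "ccc"]]
```

Three calls are served by a single load, which is the behavior the feature describes for a container that stays warm.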
Call .map() on any function with a list of inputs. Velar provisions one GPU per item and runs them all in parallel. Embed 50k documents, transcribe 1000 audio files, or generate 10k images — same code, any scale.
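As a rough local analogy for the fan-out above, the sketch below maps one function over a list of inputs concurrently using Python's standard library. It only illustrates the call shape: threads stand in for GPUs, and `transcribe`, `parallel_map`, and the file names are hypothetical, not part of Velar.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe(path):
    # Placeholder for the per-item work that would run on a GPU.
    return f"transcript of {path}"


def parallel_map(fn, inputs, max_workers=8):
    # Run fn on every input concurrently and return results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, inputs))


files = [f"audio_{i}.wav" for i in range(4)]
transcripts = parallel_map(transcribe, files)
```

The function body does not change with the number of inputs, which matches the "same code, any scale" claim; only the size of the input list grows.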
Declare a Volume and mount it on any function. HuggingFace models, datasets, checkpoints — download once to persistent storage, reused on every run. Never wait 5 minutes for a model download again.
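The download-once behavior can be sketched as a simple cache check against a mounted directory. This is a minimal sketch under assumptions: a temporary directory stands in for the Volume mount path, and `fetch_weights` and `load_cached` are hypothetical helpers rather than Velar functions.

```python
import tempfile
from pathlib import Path

download_calls = 0  # counts simulated downloads


def fetch_weights(name):
    # Placeholder for a slow download, for example from a model hub.
    global download_calls
    download_calls += 1
    return f"weights for {name}".encode()


def load_cached(volume_dir, name):
    # Download only if the file is not already on persistent storage.
    path = Path(volume_dir) / name
    if not path.exists():
        path.write_bytes(fetch_weights(name))
    return path.read_bytes()


volume_dir = tempfile.mkdtemp()  # stand-in for a mounted Volume path
first = load_cached(volume_dir, "model.bin")
second = load_cached(volume_dir, "model.bin")
```

The second call reads from disk instead of downloading again, which is what persistent storage buys you across runs.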
Store HF_TOKEN, API keys, or any credential in Velar's encrypted vault. Reference by name in your function — the value is injected at runtime and never appears in your code, image, or git history.
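One common way to consume a runtime-injected credential is to read it from the process environment by name. The sketch below assumes that mechanism; the page does not say how Velar exposes the value, and `get_secret` is a hypothetical helper. The dummy value is set only so the example runs locally.

```python
import os


def get_secret(name):
    # Look the credential up by name at call time, so the value
    # never has to be written into source code.
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"secret {name!r} is not set")
    return value


# Local testing only: a platform would inject the real value at runtime.
os.environ.setdefault("HF_TOKEN", "dummy-token-for-local-testing")
token = get_secret("HF_TOKEN")
```

Failing loudly when a secret is missing makes a misconfigured deployment visible at startup rather than at the first authenticated request.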
Use cases
Batch Whisper jobs on L4. Process 8 hours of audio in 4 minutes.
Run sentence-transformers across millions of documents in parallel.
Stable Diffusion batch jobs for product mockups and synthetic datasets.
LoRA fine-tuning on your custom dataset. Checkpoints to persistent Volume.
Deploy your fine-tuned model as a persistent endpoint with a stable URL.
Run ablations in parallel with .map(). No cluster setup.
Early days
Velar is new — and that works in your favor. You get founder-level support, direct access to the team, and per-second pricing built to win you over, not to maximize margin.
Pricing
No minimums. No reserved capacity. The price shown is what you pay.
FAQ
$10 in GPU credits when you sign up. Cancel at any time.