Process millions of records with GPU-accelerated batch jobs. Velar automatically parallelizes your work across multiple GPU instances, handles retries, and scales back to zero when the job is done. Pay only for the compute time used.
The pattern is simple: split your dataset into chunks, call your GPU function with .remote() for each chunk, and Velar runs all of them concurrently across as many GPU instances as needed. Results are returned when all chunks complete.
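As a plain-Python sketch of the chunking step (the `chunk_list` helper and the sizes here are illustrative, not part of the Velar API — each chunk would then be passed to a GPU function via `.remote()`):

```python
def chunk_list(items: list, chunk_size: int) -> list[list]:
    """Split a dataset into fixed-size chunks for fan-out with .remote()."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]

# 10 records with chunks of 4 produce three chunks: 4 + 4 + 2
chunks = chunk_list(list(range(10)), 4)
```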
Semantic embeddings
Generate vector embeddings for large document corpora to power search, RAG pipelines, and recommendation systems.
Models: BGE, E5, all-MiniLM, OpenCLIP
Audio transcription
Transcribe large audio or video archives with Whisper large-v3. Process hours of audio per minute with parallelized GPU jobs.
Models: Whisper large-v3, Whisper medium
AI data labeling
Use vision models or LLMs to label, classify, or augment training data at scale before a fine-tuning run.
Models: LLaVA, InternVL, custom classifiers
Re-ranking & scoring
Run cross-encoders or reward models over large candidate sets for retrieval augmentation or RLHF data preparation.
Models: Cross-encoders, DeBERTa, custom
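A minimal sketch of the fan-out shape for re-ranking, assuming a hypothetical `score_batch` function (the token-overlap stub below merely stands in for a cross-encoder; on Velar it would be a `@app.function(gpu=...)` invoked with `score_batch.remote(chunk)` — only the chunk-and-merge pattern mirrors the examples that follow):

```python
def score_batch(pairs: list[tuple[str, str]]) -> list[float]:
    # Stand-in scorer: token overlap between query and document.
    # A real deployment would run a cross-encoder on GPU here.
    return [float(len(set(q.split()) & set(d.split()))) for q, d in pairs]

def rerank(query: str, candidates: list[str], top_k: int = 3, chunk_size: int = 2) -> list[str]:
    pairs = [(query, doc) for doc in candidates]
    # Split candidate pairs into chunks, score each chunk, then merge
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]
    scores = [s for chunk in chunks for s in score_batch(chunk)]
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```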
Two worked examples follow: an embeddings pipeline and an audio transcription pipeline at scale.
Embedding 100k documents
import velar

app = velar.App("batch-embed")

image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("sentence-transformers")

@app.function(gpu="L4", image=image)
def embed_batch(texts: list[str]) -> list[list[float]]:
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("BAAI/bge-large-en-v1.5")
    return model.encode(texts, batch_size=64).tolist()

# Embed 100k documents: Velar runs chunks in parallel across GPUs
def embed_all(documents: list[str]):
    app.deploy()
    chunk_size = 256
    chunks = [documents[i:i+chunk_size] for i in range(0, len(documents), chunk_size)]
    # All chunks run concurrently on separate GPU instances
    results = [embed_batch.remote(chunk) for chunk in chunks]
    return [vec for batch in results for vec in batch]

Whisper transcription pipeline
import velar

app = velar.App("transcription")

image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("openai-whisper", "ffmpeg-python")

@app.function(gpu="L4", image=image)
def transcribe(audio_bytes: bytes, language: str = "en") -> str:
    import whisper, tempfile, os
    model = whisper.load_model("large-v3")
    # Whisper expects a file path, so write the bytes to a temp file first
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as f:
        f.write(audio_bytes)
        tmp_path = f.name
    result = model.transcribe(tmp_path, language=language)
    os.unlink(tmp_path)
    return result["text"]

app.deploy()

# Transcribe 1000 audio files in parallel
audio_files = load_audio_files()
transcripts = [transcribe.remote(audio) for audio in audio_files]