NVIDIA RTX 4090 24GB on-demand from $1.00/hr. Per-second billing, scale to zero, no idle costs. Deploy your first workload in under 60 seconds.
With 16,384 CUDA cores and fourth-generation Tensor Cores, the 4090 is excellent for fast token generation. Run 7–13B models with low latency at $1.00/hr.
1 TB/s+ memory bandwidth enables fast Stable Diffusion and SDXL generation. Ideal for high-throughput image workloads.
24 GB VRAM is enough for fine-tuning 7–13B models with quantization. Pay per second for training runs — no idle cost.
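A back-of-envelope check on that VRAM claim (a rough sketch, not a Velar API example): weight memory is roughly parameter count × bytes per parameter, so a 13B model quantized to 4 bits needs about 6.5 GB for weights, leaving room in 24 GB for adapters, activations, and optimizer state — while full fp16 weights for the same model would not fit.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate model weight memory in GB: parameters x bytes per parameter."""
    return params_billions * 1e9 * (bits_per_param / 8) / 1e9

# 13B model quantized to 4-bit: ~6.5 GB of weights, well under 24 GB
print(weight_memory_gb(13, 4))   # 6.5

# The same model in fp16: ~26 GB of weights alone -- exceeds 24 GB VRAM
print(weight_memory_gb(13, 16))  # 26.0
```

This ignores activation and optimizer memory, which is why quantized fine-tuning (e.g. LoRA-style adapters on 4-bit weights) is the practical path at this VRAM size.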
| Spec | Value |
| --- | --- |
| GPU | NVIDIA RTX 4090 24GB |
| VRAM | 24 GB |
| Architecture | Ada Lovelace |
| CUDA Cores | 16,384 |
| Tensor Cores | 512 |
| Memory Bandwidth | 1,008 GB/s |
| Plan | Billing | Price |
| --- | --- | --- |
| Serverless Jobs | Scale to zero · per-second billing | $1.00/hr |
| Persistent Endpoint | Always-on · flat monthly rate | $600/mo |
Pro plan required · See plans
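The two pricing modes imply a simple break-even (using the prices above): at $1.00/hr billed per second, the $600/mo persistent endpoint only becomes cheaper once you run more than 600 GPU-hours a month. A quick sketch:

```python
SERVERLESS_PER_HOUR = 1.00    # $/hr, billed per second
ENDPOINT_PER_MONTH = 600.00   # $/mo, flat rate

# Monthly GPU-hours at which the flat endpoint becomes cheaper
break_even_hours = ENDPOINT_PER_MONTH / SERVERLESS_PER_HOUR
print(break_even_hours)  # 600.0

# Per-second billing in practice: a 90-second inference job
job_seconds = 90
job_cost = job_seconds / 3600 * SERVERLESS_PER_HOUR
print(round(job_cost, 4))  # 0.025 -> 2.5 cents
```

Below roughly 600 hours/month of actual GPU time, serverless jobs with scale-to-zero are the cheaper option; sustained 24/7 traffic favors the persistent endpoint.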
No Dockerfile. No Kubernetes. Just a Python decorator and a GPU string.
import velar

app = velar.App("rtx-inference")

# Container image: official PyTorch CUDA runtime plus Hugging Face libraries
image = velar.Image.from_registry(
    "pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime"
).pip_install("transformers", "accelerate")

@app.function(gpu="RTX4090", image=image)
def generate(prompt: str) -> str:
    from transformers import pipeline

    pipe = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")
    return pipe(prompt, max_new_tokens=256)[0]["generated_text"]

app.deploy()
print(generate.remote("Explain transformers in one paragraph"))

Start with $10 in free GPU credits. No credit card required. First workload live in under 60 seconds.