H100 GPU Rental Price Comparison: Best Cloud Deals in 2026
2026-05-20 · 7 min read
Why H100 Pricing Varies So Dramatically
The NVIDIA H100 GPU is the workhorse of modern AI training — powering everything from LLM fine-tuning to diffusion model training. Yet rental prices across cloud providers can differ by 3–5× for identical hardware. Understanding why requires looking at three factors: spot vs. on-demand availability, datacenter region, and cluster topology (SXM5 interconnect vs. PCIe).
H100 80GB SXM vs PCIe: Which Should You Rent?
The H100 comes in two variants that cloud providers offer:
- H100 SXM5 — Uses NVLink/NVSwitch interconnect. Ideal for multi-GPU training runs. Typically 20–30% faster for distributed workloads. Costs more.
- H100 PCIe — Standard PCIe slot. Better for single-GPU inference or small batch training. More widely available at lower prices.
For inference-only workloads, H100 PCIe offers better cost-efficiency. For training runs that need 8+ GPUs, SXM is worth the premium.
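One way to sanity-check the SXM premium is to compare cost per job rather than cost per hour. The sketch below is illustrative only: the function name and both hourly prices are hypothetical, and the 1.25 speedup is simply a point inside the 20–30% range quoted above.

```python
def cost_per_job(price_per_hr: float, baseline_hours: float, speedup: float = 1.0) -> float:
    """Total cost of a job that takes `baseline_hours` on the baseline GPU.

    `speedup` is relative throughput: 1.25 means 25% faster than baseline.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return price_per_hr * baseline_hours / speedup


# Hypothetical prices, chosen for illustration rather than taken from any platform.
pcie_cost = cost_per_job(price_per_hr=2.40, baseline_hours=100)
sxm_cost = cost_per_job(price_per_hr=2.90, baseline_hours=100, speedup=1.25)

print(f"PCIe: ${pcie_cost:.2f}  SXM: ${sxm_cost:.2f}")
```

Under these assumed numbers SXM comes out ahead only because its price premium (about 21%) is smaller than its speedup (25%). If the premium exceeds the speedup your workload actually sees, PCIe is the cheaper choice.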
Current H100 Rental Price Ranges (2026)
Based on data aggregated from 15+ platforms on ComputeUnion:
| Tier | Price Range ($/hr) | Typical Platform Type |
|---|---|---|
| Budget | $1.99 – $2.50 | Spot/preemptible, community clouds |
| Mid-range | $2.50 – $3.50 | Dedicated on-demand, smaller clouds |
| Premium | $3.50 – $5.00+ | Hyperscalers (AWS, Azure, GCP) |
The cheapest H100 rentals come from platforms like RunPod, Vast.ai, and Lambda Labs — typically 40–60% below AWS/Azure list prices for equivalent specs.
Hidden Costs to Watch For
Raw hourly GPU price isn't the full story. Before committing to a platform, check:
- Storage costs — Some platforms charge $0.10–0.25/GB/month for persistent storage. A 500GB dataset adds up fast.
- Egress bandwidth — Moving model checkpoints out can cost $0.05–0.09/GB on some clouds.
- Idle billing — Spot instances may bill for the full hour even if preempted after 5 minutes.
- Minimum reservations — A few platforms require 7-day or 30-day minimums for their lowest-priced tiers.
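The items above can be folded into a rough all-in estimate. This is a minimal sketch, not a billing model for any specific platform: the function and parameter names are made up for illustration, and the sample rates are picked from within the ranges listed above.

```python
import math


def total_run_cost(
    gpu_price_per_hr: float,
    run_minutes: float,
    storage_gb: float = 0.0,
    storage_price_gb_month: float = 0.0,
    storage_months: float = 1.0,
    egress_gb: float = 0.0,
    egress_price_gb: float = 0.0,
    billing_increment_min: int = 1,
) -> float:
    """Estimate the all-in cost of one run.

    GPU time is rounded up to the next billing increment (1 = per-minute,
    60 = per-hour), then storage and egress charges are added.
    """
    increments = math.ceil(run_minutes / billing_increment_min)
    billed_minutes = increments * billing_increment_min
    gpu = gpu_price_per_hr * billed_minutes / 60
    storage = storage_gb * storage_price_gb_month * storage_months
    egress = egress_gb * egress_price_gb
    return gpu + storage + egress


# A spot instance preempted after 5 minutes, billed hourly vs. per minute.
hourly_billed = total_run_cost(2.20, run_minutes=5, billing_increment_min=60)
minute_billed = total_run_cost(2.20, run_minutes=5)

# A 10-hour run with a 500 GB dataset kept for a month and 100 GB of checkpoints moved out.
all_in = total_run_cost(
    2.50,
    run_minutes=600,
    storage_gb=500,
    storage_price_gb_month=0.15,
    egress_gb=100,
    egress_price_gb=0.09,
)

print(f"hourly: ${hourly_billed:.2f}  per-minute: ${minute_billed:.2f}  all-in: ${all_in:.2f}")
```

With these assumed inputs the 10-hour run costs $25 in GPU time but about $109 all-in, and the preempted job costs $2.20 under hourly billing versus roughly $0.18 under per-minute billing.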
Best H100 Platforms by Use Case
For Short Experiments (<4 hours)
Prioritize platforms with per-minute billing and no minimum reservation. RunPod and Vast.ai both offer this with H100 availability. Expect to pay $2.20–$2.80/hr for reliable spot capacity.
For Long Training Runs (days–weeks)
Lambda Labs and CoreWeave offer reserved H100 instances with predictable pricing. Reserving 30 days in advance can drop effective cost to ~$2.00/hr — comparable to spot pricing without the interruption risk.
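Whether a reservation beats paying as you go depends on how much of the committed time you actually use. The sketch below assumes the reservation is a 30-day commitment billed in full regardless of usage, and compares it with a hypothetical $2.80/hr on-demand rate; both assumptions should be checked against the provider's actual terms.

```python
def reserved_break_even_hours(
    reserved_price_per_hr: float,
    on_demand_price_per_hr: float,
    commitment_days: int = 30,
) -> float:
    """GPU-hours of real use at which a fully billed reservation matches on-demand cost."""
    committed_cost = reserved_price_per_hr * commitment_days * 24
    return committed_cost / on_demand_price_per_hr


# $2.00/hr reserved (from the text) vs. an assumed $2.80/hr on-demand rate.
hours = reserved_break_even_hours(2.00, 2.80)
utilization = hours / (30 * 24)

print(f"break-even: {hours:.0f} GPU-hours ({utilization:.0%} of the 30 days)")
```

Under these assumptions the reservation pays off only if the GPU is busy for more than about 71% of the month, roughly 514 of the 720 hours.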
For Production Inference
Hyperscalers (AWS p5, Azure ND H100 v5) offer better SLAs and regional redundancy, which can justify the 2–3× price premium. At low-to-medium traffic, however, an LLM API (via platforms like Groq or Together AI) will be cheaper than a self-hosted H100 for most inference workloads.
H100 vs API: The Break-Even Point
If you're running inference, compare against token-based API pricing. Per GPU, an H100 at $2.50/hr can serve roughly:
- ~2–4M tokens/hr for a 7–8B model (Llama-3.1-8B-class)
- ~400–800K tokens/hr for a 70B model (quantized to fit in 80 GB)
- ~80–150K tokens/hr for a 405B model (per-GPU share of a multi-GPU node, since the model does not fit on one H100)
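At Groq's Llama 3.3 70B price of $0.59/1M input + $0.79/1M output, $2.50 buys roughly 3.2–4.2M tokens depending on the input/output mix, so an H100 rental at $2.50/hr becomes cheaper only once you sustain more than that every hour. That is several times the 400–800K tokens/hr estimated for a 70B model, so API pricing wins for most individual developers. Self-hosting pays off only with consistently high traffic and throughput well beyond those estimates.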
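The break-even point can be checked directly by dividing the hourly GPU price by the API price per token. The sketch below uses the $2.50/hr rental and the Groq prices quoted above; the 50/50 input/output split and the function names are assumptions for illustration.

```python
def blended_api_price_per_m(input_price: float, output_price: float, output_share: float) -> float:
    """Blended $/1M tokens, where `output_share` is the fraction of tokens that are output."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return input_price * (1 - output_share) + output_price * output_share


def break_even_tokens_per_hr(gpu_price_per_hr: float, api_price_per_m: float) -> float:
    """Tokens per hour at which the API bill equals the hourly GPU rental."""
    return gpu_price_per_hr / api_price_per_m * 1_000_000


# Assumed traffic mix: half input tokens, half output tokens.
blended = blended_api_price_per_m(input_price=0.59, output_price=0.79, output_share=0.5)
break_even = break_even_tokens_per_hr(gpu_price_per_hr=2.50, api_price_per_m=blended)

print(f"blended: ${blended:.2f}/1M  break-even: {break_even:,.0f} tokens/hr")
```

With these inputs the blended price is $0.69 per 1M tokens and the break-even lands at roughly 3.6M tokens/hr. Shifting the mix toward output tokens lowers the threshold, and shifting it toward input tokens raises it.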
How to Track H100 Prices in Real Time
GPU cloud prices change frequently — sometimes daily. ComputeUnion aggregates H100 (and A100, RTX 4090) prices from 20+ platforms, updated every 6 hours. Use the comparison table to spot price drops or set a mental benchmark before negotiating reserved capacity.