H100 GPU Rental Price Comparison: Best Cloud Deals in 2026

2026-05-20

Why H100 Pricing Varies So Dramatically

The NVIDIA H100 GPU is the workhorse of modern AI training — powering everything from LLM fine-tuning to diffusion model training. Yet rental prices across cloud providers can differ by 3–5× for identical hardware. Understanding why requires looking at three factors: spot vs. on-demand availability, datacenter region, and cluster topology (SXM5 interconnect vs. PCIe).

H100 80GB SXM vs PCIe: Which Should You Rent?

The H100 comes in two variants that cloud providers offer:

  • H100 SXM5 — Uses NVLink/NVSwitch interconnect. Ideal for multi-GPU training runs. Typically 20–30% faster for distributed workloads. Costs more.
  • H100 PCIe — Standard PCIe slot. Better for single-GPU inference or small batch training. More widely available at lower prices.

For inference-only workloads, H100 PCIe offers better cost-efficiency. For training runs that need 8+ GPUs, SXM is worth the premium.
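The trade-off above comes down to cost per job, not cost per hour. The following is a minimal sketch, assuming that "20–30% faster" means a throughput multiplier of 1.2–1.3 and using hypothetical hourly prices; the function names are illustrative and not part of any platform's API.

```python
def job_cost(price_per_hr: float, baseline_hours: float, speedup: float = 1.0) -> float:
    """Rental cost of a job that takes `baseline_hours` at speedup 1.0."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return price_per_hr * baseline_hours / speedup


def sxm_is_cheaper(pcie_price: float, sxm_price: float, sxm_speedup: float) -> bool:
    """SXM costs less per job when its price ratio is below its speedup."""
    return sxm_price / pcie_price < sxm_speedup


# Hypothetical prices; 1.25 is the midpoint of the 20-30% range quoted above.
pcie_cost = job_cost(2.20, baseline_hours=100)
sxm_cost = job_cost(2.60, baseline_hours=100, speedup=1.25)
print(f"PCIe: ${pcie_cost:.2f}  SXM: ${sxm_cost:.2f}")
```

Under these assumed numbers the SXM run finishes sooner and costs slightly less overall. If the SXM premium exceeds the speedup (for example, a 40% higher price for a 25% speedup), PCIe remains the cheaper option per job.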

Current H100 Rental Price Ranges (2026)

Based on data aggregated from 15+ platforms on ComputeUnion:

Tier       | Price Range ($/hr) | Typical Platform Type
Budget     | $1.99 – $2.50      | Spot/preemptible, community clouds
Mid-range  | $2.50 – $3.50      | Dedicated on-demand, smaller clouds
Premium    | $3.50 – $5.00+     | Hyperscalers (AWS, Azure, GCP)

The cheapest H100 rentals come from platforms like RunPod, Vast.ai, and Lambda Labs — typically 40–60% below AWS/Azure list prices for equivalent specs.

Hidden Costs to Watch For

Raw hourly GPU price isn't the full story. Before committing to a platform, check:

  • Storage costs — Some platforms charge $0.10–0.25/GB/month for persistent storage. A 500GB dataset adds up fast.
  • Egress bandwidth — Moving model checkpoints out can cost $0.05–0.09/GB on some clouds.
  • Idle billing — Spot instances may bill for the full hour even if preempted after 5 minutes.
  • Minimum reservations — A few platforms require 7-day or 30-day minimums for their lowest-priced tiers.
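To see how these items change the bill, here is a rough sketch of an all-in cost estimate. It assumes compute is rounded up to the platform's billing increment and that storage and egress are billed linearly; real platforms may tier or bundle these charges, and all names and example figures are illustrative.

```python
import math


def billed_hours(runtime_minutes: float, granularity_minutes: int) -> float:
    """Round runtime up to the billing increment and return it in hours."""
    increments = math.ceil(runtime_minutes / granularity_minutes)
    return increments * granularity_minutes / 60


def total_cost(
    gpu_price_per_hr: float,
    runtime_minutes: float,
    granularity_minutes: int,
    storage_gb: float = 0.0,
    storage_price_gb_month: float = 0.0,
    storage_months: float = 0.0,
    egress_gb: float = 0.0,
    egress_price_gb: float = 0.0,
) -> dict:
    """Break an estimated bill into compute, storage, and egress."""
    compute = gpu_price_per_hr * billed_hours(runtime_minutes, granularity_minutes)
    storage = storage_gb * storage_price_gb_month * storage_months
    egress = egress_gb * egress_price_gb
    return {
        "compute": compute,
        "storage": storage,
        "egress": egress,
        "total": compute + storage + egress,
    }


# A 5-minute run billed hourly, a 500 GB dataset kept for one month,
# and 100 GB of checkpoints moved out (assumed figures within the ranges above).
estimate = total_cost(2.50, 5, 60, 500, 0.10, 1, 100, 0.09)
print({name: round(value, 2) for name, value in estimate.items()})
```

In this example the storage line alone is about twenty times the compute charge, which is why short experiments on platforms with hourly billing and paid persistent storage can cost far more than the headline GPU rate suggests.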

Best H100 Platforms by Use Case

For Short Experiments (<4 hours)

Prioritize platforms with per-minute billing and no minimum reservation. RunPod and Vast.ai both offer this with H100 availability. Expect to pay $2.20–$2.80/hr for reliable spot capacity.

For Long Training Runs (days–weeks)

Lambda Labs and CoreWeave offer reserved H100 instances with predictable pricing. Reserving 30 days in advance can drop effective cost to ~$2.00/hr — comparable to spot pricing without the interruption risk.
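A reservation is billed whether or not the GPU is busy, so the useful comparison is against what you would pay for only the hours you actually use. The sketch below assumes a flat reserved rate charged for every hour of the term and a flat pay-as-you-go rate; the function names are illustrative.

```python
HOURS_PER_DAY = 24


def reserved_term_cost(reserved_price_per_hr: float, term_days: int, gpus: int = 1) -> float:
    """Total commitment for a reservation billed for every hour of the term."""
    return reserved_price_per_hr * HOURS_PER_DAY * term_days * gpus


def breakeven_utilization(reserved_price_per_hr: float, pay_as_you_go_price_per_hr: float) -> float:
    """Fraction of the term the GPU must be busy for the reservation to pay off."""
    return reserved_price_per_hr / pay_as_you_go_price_per_hr


# ~$2.00/hr reserved for 30 days versus an assumed $2.50/hr pay-as-you-go rate.
print(reserved_term_cost(2.00, term_days=30))        # commitment per GPU
print(reserved_term_cost(2.00, term_days=30, gpus=8))
print(breakeven_utilization(2.00, 2.50))
```

With these assumed rates the reservation only wins if the GPU is in use at least 80% of the term, which fits multi-day training runs but rarely fits intermittent experimentation.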

For Production Inference

Hyperscalers (AWS P5, Azure ND H100 v5) offer better SLAs and regional redundancy, which can justify the 2–3× price premium. For most inference workloads at low-to-medium traffic, however, an LLM API (via platforms like Groq or Together AI) will be cheaper than a self-hosted H100.

H100 vs API: The Break-Even Point

If you're running inference, compare against token-based API pricing. A single H100 at $2.50/hr can serve roughly:

  • ~2–4M tokens/hr for a 7–8B model (Llama-3.1-8B-class)
  • ~400–800K tokens/hr for a 70B model
  • ~80–150K tokens/hr for a 405B model

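At Groq's Llama 3.3 70B price of $0.59/1M input + $0.79/1M output, $2.50 buys roughly 3.2–4.2M tokens, depending on the input/output mix. An H100 rental at $2.50/hr therefore only becomes cheaper once sustained throughput exceeds that level, which is well above the 400–800K tokens/hr estimated above for a 70B model on a single GPU. For most individual developers, API pricing wins unless traffic is consistently high.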
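The break-even arithmetic can be sketched in a few lines. This assumes a flat hourly GPU price, a flat per-token API price, and a GPU that is kept fully busy; `output_fraction` (the share of tokens that are output) is an illustrative parameter, not something the API providers publish.

```python
def blended_price_per_million(input_price: float, output_price: float, output_fraction: float) -> float:
    """API price per 1M tokens for a given share of output tokens."""
    return input_price * (1 - output_fraction) + output_price * output_fraction


def breakeven_tokens_per_hour(gpu_price_per_hr: float, api_price_per_million: float) -> float:
    """Tokens/hr at which an hour of API usage costs the same as the GPU rental."""
    return gpu_price_per_hr / api_price_per_million * 1_000_000


def self_hosted_price_per_million(gpu_price_per_hr: float, tokens_per_hour: float) -> float:
    """Effective $/1M tokens when the GPU sustains `tokens_per_hour`."""
    return gpu_price_per_hr / tokens_per_hour * 1_000_000


# Prices quoted above: $2.50/hr GPU, $0.59 input and $0.79 output per 1M tokens.
api_price = blended_price_per_million(0.59, 0.79, output_fraction=0.5)
print(f"break-even: {breakeven_tokens_per_hour(2.50, api_price):,.0f} tokens/hr")
for throughput in (400_000, 800_000):
    print(f"{throughput:,} tokens/hr -> ${self_hosted_price_per_million(2.50, throughput):.2f} per 1M")
```

With the quoted prices, the break-even falls between roughly 3.2M tokens/hr (all output) and 4.2M tokens/hr (all input). At 400–800K tokens/hr, a $2.50/hr H100 works out to about $3.13–$6.25 per 1M tokens, so the per-GPU throughput you can actually sustain is the number that decides the comparison.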

How to Track H100 Prices in Real Time

GPU cloud prices change frequently — sometimes daily. ComputeUnion aggregates H100 (and A100, RTX 4090) prices from 20+ platforms, updated every 6 hours. Use the comparison table to spot price drops or set a mental benchmark before negotiating reserved capacity.
