Cloud GPUs
Rent NVIDIA H100 NVL GPUs from $2.79/hr
A dual-GPU data center accelerator built on the Hopper architecture, with 188GB of combined HBM3 memory (94GB per GPU), designed specifically for LLM inference and deployment.

Powering the next generation of AI & high-performance computing.

Engineered for large-scale AI training, deep learning, and high-performance workloads, delivering unprecedented compute power and efficiency.

NVIDIA Hopper Architecture

Advanced architecture with fourth-generation Tensor Cores optimized for large language model workloads.

Fourth-Generation Tensor Cores

Enhanced AI acceleration with Transformer Engine delivering up to 12X better GPT-3 175B performance.

188GB HBM3 Memory

Industry-leading combined memory capacity enabling deployment of the largest language models.

Dual-GPU PCIe Design

Pre-bridged dual H100 configuration provides maximum memory capacity in standard server infrastructure.
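As a rough illustration of what 188GB of combined HBM3 makes possible, the sketch below estimates a model's weight footprint from parameter count and numeric precision. This is a simplification that ignores KV cache, activations, and runtime overhead, and the 70B parameter count is just an example, not a figure from this page:

```python
# Rough estimate of LLM weight memory vs. H100 NVL capacity.
# Ignores KV cache, activations, and framework overhead.

H100_NVL_MEMORY_GB = 188  # combined HBM3 across the dual-GPU pair

def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Example: a hypothetical 70B-parameter model
fp16_gb = weight_footprint_gb(70e9, 2.0)  # FP16: 2 bytes/param -> 140 GB
fp8_gb = weight_footprint_gb(70e9, 1.0)   # FP8:  1 byte/param  ->  70 GB

print(f"FP16: {fp16_gb:.0f} GB, fits: {fp16_gb <= H100_NVL_MEMORY_GB}")
print(f"FP8:  {fp8_gb:.0f} GB, fits: {fp8_gb <= H100_NVL_MEMORY_GB}")
```

At FP16, 70B parameters come to roughly 140GB of weights, which fits in the 188GB pool with headroom for cache and activations.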
Performance

Key specs at a glance.

Performance benchmarks that push AI, ML, and HPC workloads further.

Memory Bandwidth: 3.94 TB/s

FP16 Tensor Performance: 1.513 PFLOPS

NVLink Bandwidth: 600 GB/s
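One way to read the two compute-side numbers together is the roofline break-even point: dividing peak FP16 throughput by memory bandwidth gives the arithmetic intensity (FLOPs per byte moved) a kernel needs before it becomes compute-bound rather than bandwidth-bound. A quick sketch from the figures above:

```python
# Roofline break-even point computed from the listed specs.
peak_fp16_flops = 1.513e15   # 1.513 PFLOPS
mem_bandwidth_bps = 3.94e12  # 3.94 TB/s

# FLOPs a kernel must perform per byte moved before peak compute,
# rather than memory bandwidth, becomes the bottleneck.
break_even_intensity = peak_fp16_flops / mem_bandwidth_bps
print(f"{break_even_intensity:.0f} FLOPs/byte")  # ~384
```

Kernels below roughly 384 FLOPs/byte (typical of memory-bound LLM decoding) are limited by the 3.94 TB/s of bandwidth, which is why HBM3 throughput matters as much as raw PFLOPS for inference.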
Use Cases

Popular use cases.

Designed for demanding workloads—learn if this GPU fits your needs.
Technical Specs

Ready for your most
demanding workloads.

Essential technical specifications to help you choose the right GPU for your workload.

Specification | Details | Great for...

Memory Bandwidth | 3.94 TB/s | Feeding massive LLM weights and large datasets into HBM3 without stalls—critical for large-model inference and data-analytics pipelines.

FP16 Tensor Performance | 1.513 PFLOPS | Accelerating mixed-precision transformer training and inference, cutting fine-tuning time and boosting throughput in production deployments.

NVLink Bandwidth | 600 GB/s | Enabling high-bandwidth, low-latency GPU-to-GPU transfers across paired H100 NVL cards, so you can scale out massive models without hitting PCIe limits.
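To put the NVLink row in context, here is a back-of-the-envelope comparison of moving one GPU's worth of weights (94GB) over the 600 GB/s NVLink bridge versus a PCIe Gen5 x16 link. The ~64 GB/s PCIe figure is an assumed ballpark for illustration, not a number from this page:

```python
# Back-of-the-envelope transfer times for 94 GB of data.
PAYLOAD_GB = 94       # one GPU's HBM3 capacity
NVLINK_GBPS = 600     # NVLink bandwidth from the spec table
PCIE5_X16_GBPS = 64   # assumed PCIe Gen5 x16 effective rate (ballpark)

nvlink_s = PAYLOAD_GB / NVLINK_GBPS
pcie_s = PAYLOAD_GB / PCIE5_X16_GBPS
print(f"NVLink: {nvlink_s:.2f} s, PCIe Gen5: {pcie_s:.2f} s "
      f"({pcie_s / nvlink_s:.1f}x slower)")
```

Under these assumptions the bridge moves the same payload roughly an order of magnitude faster, which is what keeps tensor-parallel traffic between the paired cards off the PCIe bus.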
Comparison

Powerful GPUs. Globally available.
Reliability you can trust.

30+ GPUs, 31 regions, instant scale. Fine-tune or go full Skynet—we’ve got you.

Plan | Community Cloud | Secure Cloud

Price | $2.59/hr | $2.79/hr

Unique GPU Models | 25 | 19

Global Regions | 17 | 14

Also included across the platform: Network Storage, Enterprise-Grade Reliability, Savings Plans, 24/7 Support, and a Delightful Dev Experience.

7,035,265,000 requests since launch & 400k developers worldwide

Build what’s next.

The most cost-effective platform for building, training, and scaling machine learning models—ready when you are.
