May 16, 2025

How to Serve Phi-2 on a Cloud GPU with vLLM and FastAPI

Emmett Fear
Solutions Engineer