Faster-Whisper: 3x Cheaper and 4x Faster Than Whisper for Speech Transcription

Runpod's new Faster-Whisper endpoint delivers 2-4x faster transcription speeds than the original Whisper API, at a fraction of the cost.

You read the title! Whisper just got faster with Runpod's new Faster-Whisper serverless endpoint.

What is Whisper?

For those who haven't used it before, Whisper is an AI speech recognition model trained on hundreds of thousands of hours of multilingual human speech. It's great for audio captioning (things like podcasts, YouTube videos, TV shows, songs, etc.), and is capable of translating non-English audio to English as it goes, too.

What exactly is changing?

We will be deprecating our existing Whisper serverless endpoint in favor of our new Faster-Whisper endpoint. This endpoint provides the same great service as the regular Whisper endpoint in a fraction of the compute time.

What does this mean for me?

You'll get your Whisper results 2-4x faster with Faster-Whisper! Here are some sample execution times across audio clips of varying lengths, all run with the large-v2 model on Runpod's endpoints:

Audio Clip                                 | Length  | Whisper Time | Faster-Whisper Time | Speedup (×)
-------------------------------------------|---------|--------------|---------------------|------------
Football as a source of revenue            | 0:51    | 9.917s       | 3.462s              | 2.86
GoTranscript transcription test            | 3:01    | 44.309s      | 13.172s             | 3.36
Driving in the U.S.                        | 6:13    | 1:06.018     | 22.107s             | 2.99
Speech by Fiorello H. La Guardia           | 15:46   | 2:12.299     | 44.854s             | 2.95
Interview of Matthew C. Weiss              | 29:33   | 5:53.186     | 1:45.952            | 3.33
Interview of Peter A. and Sharen Gendebien | 1:28:02 | 17:35.431    | 4:39.680            | 3.77
Interview of Brock Robert McIntosh         | 3:14:34 | 40:32.268    | 11:16.872           | 3.59
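
The speedup column is just the Whisper time divided by the Faster-Whisper time. As a minimal sketch of that arithmetic, the Python below parses the timing formats used in the table and recomputes two of the rows. The helper names `to_seconds` and `speedup` are illustrative only and are not part of any Runpod API.

```python
def to_seconds(stamp: str) -> float:
    """Convert a timing such as '9.917s', '1:06.018', or '3:14:34' to seconds."""
    seconds = 0.0
    # Each colon-separated field is worth 60x the field to its right.
    for part in stamp.rstrip("s").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(whisper_time: str, faster_whisper_time: str) -> float:
    """Ratio of the original Whisper time to the Faster-Whisper time."""
    return to_seconds(whisper_time) / to_seconds(faster_whisper_time)


# Shortest and longest clips from the table.
print(round(speedup("9.917s", "3.462s"), 2))
print(round(speedup("40:32.268", "11:16.872"), 2))
```

Running the same function over the remaining rows is a quick way to check the speedup figures against the raw timings.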

Do I have to pay more?

Nope! In fact, because our serverless endpoints bill by execution time, the 2-4x faster Faster-Whisper endpoint is also 2-4x cheaper than the old Whisper endpoint!

How do your API prices stack up then?

Our serverless APIs charge only for the time it takes to execute a call, at $0.00025/s. With the Faster-Whisper endpoint, our pricing for Whisper API access is now more competitive than ever. Most other providers, such as OpenAI ($0.0001/s), charge based on the length of the audio clip being transcribed, which is dramatically longer than the time it takes to return the transcription. Check out the sample cost comparison below for each of the audio clips above:

Audio Clip                                 | Length  | OpenAI Whisper Cost ($) | Runpod Faster-Whisper Cost ($) | Times Cheaper
-------------------------------------------|---------|-------------------------|--------------------------------|--------------
Football as a source of revenue            | 0:51    | 0.0051                  | 0.0009                         | 5.89
GoTranscript transcription test            | 3:01    | 0.0181                  | 0.0033                         | 5.50
Driving in the U.S.                        | 6:13    | 0.0373                  | 0.0055                         | 6.75
Speech by Fiorello H. La Guardia           | 15:46   | 0.0946                  | 0.0112                         | 8.44
Interview of Matthew C. Weiss              | 29:33   | 0.1773                  | 0.0265                         | 6.69
Interview of Peter A. and Sharen Gendebien | 1:28:02 | 0.5282                  | 0.0699                         | 7.55
Interview of Brock Robert McIntosh         | 3:14:34 | 1.1674                  | 0.1692                         | 6.90
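
For anyone who wants to run the numbers on their own clips, here is a rough sketch of the comparison in Python. It assumes only the two per-second rates quoted above (Runpod billed on execution time, OpenAI billed on audio length); the function name `clip_costs` is made up for illustration, and your real execution times will vary with the clip and model.

```python
RUNPOD_RATE = 0.00025  # $ per second of execution time, as quoted above
OPENAI_RATE = 0.0001   # $ per second of audio length, as quoted above


def clip_costs(audio_seconds: float, execution_seconds: float):
    """Return (openai_cost, runpod_cost, times_cheaper) for a single clip."""
    openai_cost = audio_seconds * OPENAI_RATE
    runpod_cost = execution_seconds * RUNPOD_RATE
    return openai_cost, runpod_cost, openai_cost / runpod_cost


# First row of the table: a 0:51 clip (51 s) transcribed in 3.462 s.
openai_cost, runpod_cost, ratio = clip_costs(51, 3.462)
print(f"OpenAI ${openai_cost:.4f}  Runpod ${runpod_cost:.4f}  {ratio:.2f}x cheaper")
```

At these two rates, the execution-time pricing comes out cheaper whenever a clip is transcribed in less than 40% of its own length (0.0001 / 0.00025 = 0.4), which every clip in the table clears comfortably.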

What if I'm still confused?

We'd be happy to answer your questions or concerns on our Discord server or via help@runpod.io!

Author profile: Brandon Ikeler
