Question 1

How does the subtitle generator work?

Accepted Answer

We use OpenAI Whisper (Base model) compiled to WebAssembly and running in your browser. Your audio is processed locally during transcription and is not uploaded to our servers.

Question 2

What audio/video formats are supported?

Accepted Answer

MP3, MP4, WAV, WebM, OGG, M4A, FLAC, and most other common audio and video formats. For video files, the tool extracts the audio track automatically.

Question 3

How many languages does it support?

Accepted Answer

Whisper supports 99 languages including English, Spanish, French, German, Chinese, Japanese, Korean, Arabic, Hindi, and more. Select your language or use auto-detect.

Question 4

What subtitle formats can I download?

Accepted Answer

SRT (most common, works everywhere) and VTT (WebVTT, for web video players). Both include timestamps and segmented text.
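To make the difference between the two formats concrete, here is a minimal sketch of how timestamped segments can be written out as SRT and as WebVTT. It is not the tool's actual export code. The `Segment` class and the function names are hypothetical, chosen only for illustration. The format details themselves are standard: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed segment (hypothetical structure for this sketch)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg.start, ",")
        end = format_timestamp(seg.end, ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg.text}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[Segment]) -> str:
    """WebVTT: 'WEBVTT' header, period before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg.start, ".")
        end = format_timestamp(seg.end, ".")
        blocks.append(f"{start} --> {end}\n{seg.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "Hello"), Segment(2.5, 4.0, "world")]
    print(to_srt(demo))
    print(to_vtt(demo))
```

Both outputs carry the same text and timing; only the header, cue numbering, and millisecond separator differ.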

Question 5

How accurate are the subtitles?

Accepted Answer

Whisper Base provides good accuracy for clear speech in supported languages. Results are usually stronger with audio that has minimal background noise. Professional-grade accuracy requires the Large model (not available in-browser).

Question 6

Why does the first transcription take longer?

Accepted Answer

The Whisper Base model (~57MB) downloads on first use. After that it is cached in your browser. Subsequent transcriptions start immediately.

Question 7

Is my audio uploaded to a server?

Accepted Answer

No. Whisper runs in your browser via WebAssembly. Your audio is processed locally during transcription and is not uploaded to our servers.

Question 8

How long does transcription take?

Accepted Answer

Roughly 1 to 3 times the length of the clip on modern devices, so a 5-minute clip takes about 5-15 minutes. Desktop browsers are significantly faster than mobile.
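The arithmetic behind that estimate can be sketched as follows. The function name is hypothetical, and the default factors of 1 and 3 simply restate the rough range quoted above; actual speed depends on your device.

```python
def estimate_transcription_minutes(
    clip_minutes: float,
    low_factor: float = 1.0,   # best case: about as long as the clip itself
    high_factor: float = 3.0,  # slower case: about three times the clip length
) -> tuple[float, float]:
    """Return a (fastest, slowest) estimate of processing time in minutes."""
    if clip_minutes < 0:
        raise ValueError("clip_minutes must be non-negative")
    return (clip_minutes * low_factor, clip_minutes * high_factor)


if __name__ == "__main__":
    fastest, slowest = estimate_transcription_minutes(5)
    print(f"A 5-minute clip: roughly {fastest:g} to {slowest:g} minutes")
```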

Question 9

Can I edit the subtitles before downloading?

Accepted Answer

Yes. The transcribed text appears in an editable area. Fix any errors, adjust timing, then download.

Question 10

Does it work on mobile?

Accepted Answer

Yes, but transcription is CPU-intensive. Short clips (under 2 minutes) work well on phones. For longer audio, use a desktop browser.

Question 11

What is the file size limit?

Accepted Answer

It depends on your device's memory. Audio is processed in chunks, and most devices handle files up to about 100 MB.
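As a rough illustration of chunked processing, the sketch below splits decoded audio samples into fixed-length windows so that only one window needs to be handed to the model at a time. The 16 kHz sample rate and 30-second window match what Whisper models work with natively, but the chunk size this tool actually uses is an assumption here, and `iter_chunks` is a hypothetical helper, not part of the tool.

```python
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30    # Whisper's native window; the tool's real chunk size is assumed


def iter_chunks(
    samples: Sequence[float],
    sample_rate: int = SAMPLE_RATE,
    chunk_seconds: int = CHUNK_SECONDS,
) -> Iterator[tuple[float, Sequence[float]]]:
    """Yield (offset_in_seconds, chunk) pairs covering the whole recording."""
    chunk_len = sample_rate * chunk_seconds
    for start in range(0, len(samples), chunk_len):
        # The final chunk may be shorter than chunk_len.
        yield start / sample_rate, samples[start:start + chunk_len]


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 70)  # 70 seconds of silence
    for offset, chunk in iter_chunks(audio):
        print(f"chunk at {offset:.0f}s: {len(chunk) / SAMPLE_RATE:.0f}s long")
```

The offset of each chunk is what lets per-chunk timestamps be shifted back onto the timeline of the full file.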

Question 12

Can I transcribe a YouTube video?

Accepted Answer

Not directly. Download the video first, then upload the file. Alternatively, use our YouTube Summarizer if you only need a text summary.

Question 13

How does this compare to Otter.ai or Rev?

Accepted Answer

Otter and Rev use cloud-based models and may reach higher accuracy in some workflows. Our difference is the browser-local approach: transcription runs without uploading your source audio to our servers. Accuracy is solid for clear speech but not broadcast-grade.

Question 14

Is it really free?

Accepted Answer

Yes. The tool is free to use in the browser and does not require signup to start.

Question 15

Can I pay with cryptocurrency?

Accepted Answer

Video tools are free. For AI writing tools, we accept USDT, USDC, BTC, and ETH. Plans start at $9.99/month.

Subtitle Generator

Run this tool in three short steps.

1. Upload audio or video
2. Whisper transcribes locally
3. Download SRT or VTT
