Drop audio file here or click to browse
MP3, WAV, M4A, WebM, OGG, FLAC · Max 25MB
Start with a single file · Powered by Whisper AI
Need more transcription capacity?
Upgrade for longer files, higher daily capacity, and priority processing.
View Plans
How It Works
1. Upload Your Audio
Drag and drop or click to upload. Supports MP3, WAV, M4A, WebM, OGG, and FLAC files up to 25MB.
2. Choose Your Mode
Transcribe keeps the original language. Translate converts any language to English text.
3. Get Timestamped Text
Your transcript appears with clickable timestamps synced to the audio player. Download as TXT, SRT, or VTT.
Use Cases
Meeting recordings
Turn recorded meetings into searchable, shareable text with timestamps for key decisions.
Podcast episodes
Create full transcripts for show notes, SEO, and accessibility.
Interview transcripts
Transcribe research interviews with timestamps for easy reference and citation.
Lecture notes
Convert classroom recordings into study-ready notes with time references.
Frequently Asked Questions
What audio formats are supported?
MP3, WAV, M4A, WebM, OGG, FLAC, and MP4 audio tracks. Most common audio formats work.
Is there a file size limit?
Yes, 25MB maximum per file. This limit comes from the upload cap of OpenAI's Whisper API rather than from the model itself. For larger files, trim the audio or compress it to a lower bitrate first.
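If you want to check a file before uploading, the format and size limits above can be sketched as a small pre-flight check. This is only an illustration, not the tool's actual validation code: the function name `check_upload`, the extension list, and the choice to treat 25MB as 25 × 1024 × 1024 bytes are all assumptions.

```python
from __future__ import annotations

from pathlib import Path

# Assumed limits, taken from the figures stated on this page.
MAX_BYTES = 25 * 1024 * 1024  # assumption: "25MB" means 25 MiB
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".webm", ".ogg", ".flac", ".mp4"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"file is {size_mb:.1f} MB; limit is 25 MB")
    return problems
```

For example, `check_upload("interview.flac", 30 * 1024 * 1024)` reports one problem (the size), which is the signal to trim or compress the file first.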
How accurate is the transcription?
Transcription is powered by OpenAI Whisper, a widely used speech recognition model. Accuracy is highest for clear English speech and drops with heavy accents, background noise, or overlapping speakers.
Can it transcribe non-English audio?
Yes. Whisper supports 90+ languages. In Transcribe mode, it outputs text in the original language. In Translate mode, it converts any language to English.
What is the Translate mode?
Translate mode transcribes audio in any language and outputs the text in English. Useful for understanding foreign-language content.
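For the curious, the two modes map naturally onto the two audio endpoints that OpenAI's API exposes for Whisper. The sketch below only builds the request description and sends nothing over the network; the helper name `build_request` and the mode strings are hypothetical, and this is an assumption about how such a server could route requests, not a description of this tool's backend.

```python
# Base path of OpenAI's audio endpoints.
OPENAI_AUDIO_BASE = "https://api.openai.com/v1/audio"

# Hypothetical mode names mirroring the two buttons described on this page.
ENDPOINT_BY_MODE = {
    "transcribe": "transcriptions",  # text stays in the spoken language
    "translate": "translations",     # text comes back in English
}


def build_request(mode: str) -> dict:
    """Describe the HTTP request for a given mode (nothing is sent here)."""
    if mode not in ENDPOINT_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        "url": f"{OPENAI_AUDIO_BASE}/{ENDPOINT_BY_MODE[mode]}",
        "data": {
            "model": "whisper-1",
            # verbose_json includes per-segment start/end times
            "response_format": "verbose_json",
        },
    }
```

The audio file itself would be attached as multipart form data alongside these fields.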
Is my audio file uploaded to a server?
Yes, your audio is sent to our secure server and forwarded to OpenAI's Whisper API for processing. Files are not stored after transcription.
Can I get timestamps with the transcription?
Yes. Every transcript includes segment-level timestamps. Click any timestamp to jump to that point in the audio player.
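Keeping the transcript in sync with the player comes down to finding which segment contains the current playback time. A minimal sketch, assuming the segment start times are available as a sorted list of seconds; the function name `active_segment_index` is hypothetical, and the real player may work differently.

```python
from __future__ import annotations

from bisect import bisect_right


def active_segment_index(starts: list[float], playback_time: float) -> int | None:
    """Index of the segment playing at playback_time, or None before the first one.

    `starts` must be sorted ascending (segment start times in seconds).
    """
    i = bisect_right(starts, playback_time) - 1
    return i if i >= 0 else None
```

Clicking a timestamp is the reverse operation: set the player's position to that segment's start time.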
What subtitle formats can I export?
TXT (plain text), SRT (SubRip — compatible with most video editors), and VTT (WebVTT — for web video players).
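The main difference between SRT and VTT is small: SRT numbers each cue and uses a comma before the milliseconds, while VTT starts with a `WEBVTT` header and uses a period. Here is a rough sketch of both conversions, assuming each segment is a dict with `start`, `end` (seconds), and `text` keys, modeled on Whisper's verbose JSON output. The function names are illustrative, not this tool's export code.

```python
def format_timestamp(seconds: float, decimal_sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (sep is ',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_sep}{ms:03d}"


def to_srt(segments: list) -> str:
    """Build SubRip text: numbered cues separated by blank lines."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{i}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: list) -> str:
    """Build WebVTT text: a WEBVTT header followed by unnumbered cues."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        lines += [f"{start} --> {end}", seg["text"].strip(), ""]
    return "\n".join(lines)
```

A plain TXT export is simply the segment texts joined together without the timing lines.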
How does this compare to Otter.ai or Rev?
Otter and Rev offer live transcription, speaker diarization, and broader collaboration or managed-service workflows. This tool focuses on direct single-file transcription with timestamps and subtitle export in the web app. If you just need a quick transcript of one recording, this workflow is a strong fit. If you need live notes, team features, or managed services, those tools may fit better.
Can I transcribe a YouTube video with this?
This tool works with uploaded audio files. For YouTube videos, use our <a href="/youtube-summarizer">YouTube Summarizer</a> which extracts captions directly from the video URL and generates structured summaries with timestamps.
Does it work on mobile?
Yes. Upload files from your phone gallery or Files app. The interface is fully responsive. Processing happens on our server, so device performance does not affect transcription speed.
What happens to my audio after transcription?
Your audio file is sent to our server, forwarded to OpenAI Whisper for processing, and immediately discarded. We do not store audio files. The transcript is returned to your browser and never saved on our end.
Can I turn the transcript into speech?
Yes. Copy the transcript and paste it into our <a href="/text-to-speech">Text to Speech</a> tool to generate audio in a different voice or language. Useful for creating voiceovers from interview transcripts or meeting notes.
How long does transcription take?
Typically 10-30 seconds depending on file length. A 5-minute audio clip usually finishes in about 15 seconds. The elapsed timer shows real-time progress during processing.
Coda One's Audio to Text tool transcribes audio files into accurate text with segment-level timestamps. Powered by OpenAI Whisper, it supports MP3, WAV, M4A, WebM, OGG, FLAC, and 90+ languages. Transcribe in the original language or translate any audio to English. Click timestamps to sync with the built-in audio player. Export as TXT, SRT subtitles, or VTT for web video. It is designed for a direct single-file workflow in the web app.