Convert audio and video to accurate text transcriptions with speaker identification, timestamps, and multi-language support.
Authorization: Bearer your_api_key
Authorization: Bearer your_jwt_token
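Both credentials travel in the same `Authorization` header using the Bearer scheme. The following is a minimal sketch of building that header in Python. The helper name `build_auth_headers` is hypothetical and not part of the API, and no request is sent because this section does not state the endpoint URL.

```python
def build_auth_headers(credential: str) -> dict[str, str]:
    """Return HTTP headers carrying an API key or a JWT as a Bearer token.

    Hypothetical helper: the name and signature are illustrative only.
    """
    credential = credential.strip()
    if not credential:
        raise ValueError("credential must be a non-empty API key or JWT")
    return {"Authorization": f"Bearer {credential}"}


# Works the same way for either credential type.
print(build_auth_headers("your_api_key"))
print(build_auth_headers("your_jwt_token"))
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.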
Language | Code | Quality | Notes |
---|---|---|---|
English | en | Excellent | Best supported language |
Spanish | es | Excellent | High accuracy |
French | fr | Excellent | Good speaker diarization |
German | de | Excellent | Technical content support |
Portuguese | pt | Very Good | Brazilian and European |
Italian | it | Very Good | Good word timestamps |
Russian | ru | Very Good | Cyrillic text support |
Japanese | ja | Good | Hiragana/Katakana/Kanji |
Chinese | zh | Good | Simplified and Traditional |
Arabic | ar | Good | RTL text support |
Hindi | hi | Good | Devanagari script |
Korean | ko | Good | Hangul script |
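A client can check a language code against the table above before submitting a job. This is a sketch only: the function name `normalize_language` is an assumption, and the API may accept codes or behave differently from what is shown here.

```python
# Language codes copied from the table above.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "pt": "Portuguese", "it": "Italian", "ru": "Russian", "ja": "Japanese",
    "zh": "Chinese", "ar": "Arabic", "hi": "Hindi", "ko": "Korean",
}


def normalize_language(code: str) -> str:
    """Lower-case and validate a language code (hypothetical helper)."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized


print(normalize_language("EN"))  # en
```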
Model | Speed | Accuracy | Speaker Diarization | Best For |
---|---|---|---|---|
whisperx | Medium | Highest | Excellent | Production transcription |
faster-whisper | Fastest | High | Good | Real-time applications |
whisper-timestamped | Slow | High | Good | Detailed analysis |
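The "Best For" column can be turned into a simple lookup when the model is chosen in code. In this sketch the model names come from the table, while the use-case keys and the `pick_model` helper are assumptions made for illustration.

```python
# Model names from the table above; the use-case keys are hypothetical labels.
MODEL_BY_USE_CASE = {
    "production": "whisperx",                   # highest accuracy, excellent diarization
    "real-time": "faster-whisper",              # fastest
    "detailed-analysis": "whisper-timestamped", # slow, suited to detailed analysis
}


def pick_model(use_case: str = "production") -> str:
    """Return the model the table recommends for a use case."""
    try:
        return MODEL_BY_USE_CASE[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None


print(pick_model("real-time"))  # faster-whisper
```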
400 Bad Request - Invalid Audio
413 Payload Too Large
422 Processing Error
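Client code can map these status codes to a single exception type so callers handle failures in one place. The sketch below assumes the status code has already been read from the HTTP response. `TranscriptionError` and `raise_for_status` are hypothetical names, and the response body format is not assumed.

```python
# Status codes and labels listed above.
TRANSCRIPTION_ERRORS = {
    400: "Bad Request - Invalid Audio",
    413: "Payload Too Large",
    422: "Processing Error",
}


class TranscriptionError(Exception):
    """Raised for any 4xx/5xx response (hypothetical client-side type)."""

    def __init__(self, status_code: int):
        self.status_code = status_code
        label = TRANSCRIPTION_ERRORS.get(status_code, "Unexpected Error")
        super().__init__(f"{status_code} {label}")


def raise_for_status(status_code: int) -> None:
    """Raise TranscriptionError for error statuses; do nothing otherwise."""
    if status_code >= 400:
        raise TranscriptionError(status_code)


try:
    raise_for_status(413)
except TranscriptionError as err:
    print(err)  # 413 Payload Too Large
```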
Service | Cost | Description |
---|---|---|
Basic Transcription | 660 credits/minute | Text-only transcription |
With Speaker Diarization | 660 credits/minute | Speaker identification included |
With Word Timestamps | 660 credits/minute | Word-level timing data |
Transcript Editing | Free | No additional cost for edits |
Duration | Features | Credits | USD Cost |
---|---|---|---|
10 minutes | Basic transcription | 6600 | $0.88 |
30 minutes | With speakers + timestamps | 19800 | $2.64 |
1 hour | Full features | 39600 | $5.28 |
2 hours | Full features | 79200 | $10.56 |
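The figures in both tables follow from a flat rate of 660 credits per minute. The dollar amounts imply a rate of 7,500 credits per USD (6600 credits for $0.88); that rate is inferred from the examples rather than stated, so treat it as an assumption. Rounding partial minutes up to a whole credit is also an assumption in this sketch.

```python
import math

CREDITS_PER_MINUTE = 660  # from the pricing table
CREDITS_PER_USD = 7500    # inferred from the examples: 6600 credits = $0.88


def estimate_cost(minutes: float) -> tuple[int, float]:
    """Return (credits, usd) for a recording of the given length.

    Assumes fractional credits are rounded up; the actual billing
    granularity is not stated in the pricing tables.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    credits = math.ceil(minutes * CREDITS_PER_MINUTE)
    usd = round(credits / CREDITS_PER_USD, 2)
    return credits, usd


for length in (10, 30, 60, 120):
    print(length, estimate_cost(length))
# 10 (6600, 0.88)
# 30 (19800, 2.64)
# 60 (39600, 5.28)
# 120 (79200, 10.56)
```

Because every feature tier is billed at the same per-minute rate, the estimate depends only on duration.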