Overview
The Audiobook API exposes the complete production pipeline end to end, so you can drop AI audiobook generation directly into your own platform or control panel: a file goes in, a finished, ACX-compliant package comes back, with no human in the loop. It is the same API surface our own Audiobook Studio runs on; nothing here is a private endpoint.
Key features
- Inputs: upload PDF, EPUB, DOCX, or TXT; paste raw text; or generate a book from a prompt. Chapters are detected automatically.
- Voices: 100+ languages with standard narration voices; custom voice cloning across 20+ languages.
- ACX-compliant export: per-chapter MP3 at 192 kbps CBR / 44.1 kHz, a retail sample, optional narrated opening/closing credits, a cover slot, and a metadata.json manifest, packaged as a single ZIP via a presigned URL.
- Automatic compliance check: file format, audio levels, silence, and per-chapter length are validated on export; a non-compliant book is rejected with the exact reason rather than shipped silently.
- Asynchronous + webhooks: job.completed / job.failed callbacks for backlog-scale automation.
- White-label output: delivered files carry the publisher’s own title/author/narrator metadata; nothing points back at AudioPod.
- Prepaid usage pricing: $0.04 per minute of finished output (about $24 for a 10-hour book). See API Wallet.
Authentication
All endpoints require authentication. Use an API key (recommended for server-to-server integrations):
- API Key (Recommended): X-API-Key: your_api_key header; create one in the dashboard.
- JWT Token: Authorization: Bearer your_jwt_token (session-based auth).
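As a rough sketch of the API-key option, the helper below attaches the X-API-Key header to a request using only the Python standard library. The host in BASE_URL is a placeholder assumption, since this page does not state the base URL, and build_request is an illustrative name. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not given on this page (assumption).
BASE_URL = "https://api.example.com"


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request, with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Header name taken from the Authentication section above.
    req.add_header("X-API-Key", api_key)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


req = build_request("GET", "/audiobook/projects", "your_api_key")
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would send it. Note that urllib normalizes the stored header name to X-api-key; HTTP header names are case-insensitive, so this should not matter to the server.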
API Endpoints Quick Reference
| Operation | Method | Endpoint |
|---|---|---|
| Create project | POST | /audiobook/projects |
| List projects | GET | /audiobook/projects |
| Get / update / delete project | GET·PUT·DELETE | /audiobook/projects/{project_id} |
| Upload manuscript | POST | /audiobook/projects/{project_id}/manuscript/upload |
| Parse manuscript | POST | /audiobook/projects/{project_id}/manuscript/parse |
| Paste text | POST | /audiobook/projects/{project_id}/manuscript/paste |
| Generate book | POST | /audiobook/projects/{project_id}/manuscript/generate |
| List / get chapters | GET | /audiobook/projects/{project_id}/chapters |
| List voices | GET | /audiobook/voices/available |
| List languages | GET | /audiobook/voices/languages |
| Cost estimate | GET | /audiobook/projects/{project_id}/cost-estimate |
| Start narration | POST | /audiobook/projects/{project_id}/narration/start |
| Batch narration | POST | /audiobook/projects/{project_id}/narration/batch |
| Narration progress | GET | /audiobook/projects/{project_id}/narration/progress |
| Regenerate a chapter | POST | /audiobook/projects/{project_id}/chapters/{chapter_id}/regenerate |
| Production music/credits | GET·PUT | /audiobook/projects/{project_id}/music-settings |
| Upload media (cover/music) | POST | /audiobook/projects/{project_id}/media/upload |
| Start ACX export | POST | /audiobook/projects/{project_id}/export |
| Export status | GET | /audiobook/projects/{project_id}/export/{job_id}/status |
| Download package | GET | /audiobook/projects/{project_id}/export/{job_id}/download |
| Dashboard stats | GET | /audiobook/dashboard |
Quick Start (cURL)
1. Create a project
A project is the container for a single audiobook. Create it first, then attach a manuscript.
Request body fields:
- Audiobook title.
- Author name (written into the package metadata).
- Project description.
- ISBN, if available.
- Book genre.
- Default narration voice for the project (can also be set per narration request).
- Speech speed multiplier.
- Narration style.
- Export format. acx produces an Audible/ACX-ready package.
- Target distribution platform.
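A minimal Python sketch of the create-project call follows. The field names in the payload (title, author, export_format) are illustrative guesses, because the request schema's field names are not shown on this page; only the acx export value and the POST /audiobook/projects route come from the text above. The host is likewise a placeholder.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host (assumption)

# Hypothetical field names: confirm them against the request schema before use.
payload = {
    "title": "My Audiobook",
    "author": "Jane Doe",
    "export_format": "acx",  # "acx" is the value named in the field list above
}


def create_project_request(api_key, payload):
    """Build the POST /audiobook/projects request without sending it."""
    return urllib.request.Request(
        BASE_URL + "/audiobook/projects",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


request = create_project_request("your_api_key", payload)
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen should return the project object, whose UUID is used in every later path.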
Response (AudiobookProjectResponse):
- Project UUID.
- Lifecycle state (see Status values).
- Number of chapters once parsed.
- Estimated finished runtime.
- Uploaded filename, once attached.
- Uploaded file type: pdf · epub · docx · txt.
- Total words after parsing.
List projects with GET /audiobook/projects?skip=0&limit=20 (paginated).
2. Add the manuscript
Three ways to get text into a project; pick one.
Upload a file
POST /audiobook/projects/{project_id}/manuscript/upload — multipart form field file (PDF, EPUB, DOCX, or TXT). Returns a file_key to hand to the parser.
The response includes:
- file_key: S3 key of the stored manuscript.
- Reference URL for the stored file.
- Maximum accepted size in bytes.
Parse the uploaded file into chapters
POST /audiobook/projects/{project_id}/manuscript/parse — form fields file_key and filename. Parsing is asynchronous and returns a job_id; poll GET .../chapters until chapters appear.
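The upload and parse steps can be sketched as two body encoders. This assumes the upload expects standard multipart/form-data with the file field, and that the parse step accepts URL-encoded form fields (the page says "form fields" without naming the encoding). The function names and the sample file_key value are hypothetical.

```python
import uuid
from urllib.parse import urlencode


def encode_multipart_file(field_name, filename, content,
                          content_type="application/octet-stream"):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def encode_parse_form(file_key, filename):
    """Form body for the parse step: file_key (from the upload response) and filename."""
    return urlencode({"file_key": file_key, "filename": filename}).encode("utf-8")


# Upload body: the form field must be named "file".
body, content_type = encode_multipart_file("file", "book.txt", b"Chapter 1\nOnce upon a time...")

# Parse body: the file_key value here is made up for illustration.
form = encode_parse_form("manuscripts/abc123/book.txt", "book.txt")
print(content_type.split(";")[0], form.decode("ascii"))
```

Each body would be sent with POST to the matching path, together with the X-API-Key header and the Content-Type value returned by the encoder.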
Paste text directly
POST /audiobook/projects/{project_id}/manuscript/paste — JSON. Best for content you already have as text.
Request fields:
- Manuscript or chapter text.
- Source label.
Generate a book from a prompt
POST /audiobook/projects/{project_id}/manuscript/generate — JSON. Drafts an original manuscript, then chapters it.
Request fields:
- What the book should be about.
- Optional working title.
- Intended audience.
- Writing tone.
- Number of chapters.
- Target words per chapter.
The paste and generate endpoints return a ManuscriptIntakeResponse with total_chapters, total_words, and estimated_duration_minutes.
3. Chapters
List chapters with GET /audiobook/projects/{project_id}/chapters?skip=0&limit=50, or fetch one with GET .../chapters/{chapter_id}.
Chapter (AudiobookChapterResponse):
- Chapter UUID.
- Order in the book.
- Chapter title.
- Chapter text.
- Words in the chapter.
- Characters in the chapter.
- Narrated length, once generated.
- Status: pending · processing · completed · failed.
- Stored audio key, once narrated.
- error_message: reason a chapter failed (e.g. content too short for narration).
- skip_narration: if true, the chapter is excluded from narration and export.
- voice_id: per-chapter voice override.
Update chapter settings (skip_narration, per-chapter voice_id, narration notes) with PUT .../chapters/{chapter_id} before narrating.
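For books with more chapters than one page holds, the skip/limit parameters of the list endpoint can be walked in a loop. The sketch below assumes the endpoint returns a plain list of chapter objects per page; fetch_page is a caller-supplied function (hypothetical) that performs the actual GET, and the id key in the fake data is illustrative only.

```python
def iter_chapters(fetch_page, page_size=50):
    """Yield every chapter by requesting skip/limit pages until a short page comes back."""
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        skip += page_size


# Stand-in for the HTTP call so the sketch runs offline.
_fake_chapters = [{"id": f"ch-{i}"} for i in range(5)]


def fake_fetch(skip, limit):
    return _fake_chapters[skip:skip + limit]


chapters = list(iter_chapters(fake_fetch, page_size=2))
print(len(chapters))
```

In a real integration, fetch_page would call GET .../chapters?skip={skip}&limit={limit} and return the decoded JSON.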
4. Choose a voice
GET /audiobook/voices/available returns a VoiceSelectionResponse:
- All standard voices you can narrate with (each has an integer id).
- A curated shortlist.
- Your own cloned voices (see Voice Management).
GET /audiobook/voices/languages lists every supported narration locale (100+). Before committing, GET /audiobook/projects/{project_id}/cost-estimate returns a per-paragraph breakdown with word_count, estimated_seconds, and credits_to_charge.
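To turn the per-paragraph cost estimate into totals, something like the following could work. It assumes the breakdown is available as a list of objects carrying the three fields named above; the key that wraps that list in the real response is not shown on this page, and the sample numbers are invented for illustration.

```python
def summarize_estimate(paragraphs):
    """Add up the per-paragraph figures from a cost-estimate breakdown."""
    total_seconds = sum(p["estimated_seconds"] for p in paragraphs)
    return {
        "words": sum(p["word_count"] for p in paragraphs),
        "minutes": total_seconds / 60,
        "credits": sum(p["credits_to_charge"] for p in paragraphs),
    }


# Invented sample rows using the documented field names.
sample = [
    {"word_count": 150, "estimated_seconds": 60, "credits_to_charge": 4},
    {"word_count": 300, "estimated_seconds": 120, "credits_to_charge": 8},
]

summary = summarize_estimate(sample)
print(summary)
```

Comparing the credits total with the wallet balance before starting narration is one way to avoid a 402 partway through a book.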
5. Narrate
Narrate every chapter
POST /audiobook/projects/{project_id}/narration/start queues narration for all chapters with the selected voice.
Request fields:
- Voice to narrate with (from voices/available).
- Optional: narrate only these chapters; omit for all.
- Speed multiplier.
- Delivery style.
Track progress
GET /audiobook/projects/{project_id}/narration/progress returns a NarrationProgress:
- Total chapters.
- Finished narrations.
- Failed narrations.
- Currently rendering.
- Per-chapter status detail.
Poll until completed_chapters + failed_chapters >= total_chapters. (Or skip polling and wait for the completion webhook.)
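The polling condition above can be sketched as a small loop. The three counter names come from the text; get_progress is a caller-supplied function (hypothetical) that performs the GET, and the interval and timeout defaults are arbitrary choices rather than documented recommendations.

```python
import time


def narration_finished(progress):
    """True once every chapter has either completed or failed."""
    done = progress["completed_chapters"] + progress["failed_chapters"]
    return done >= progress["total_chapters"]


def wait_for_narration(get_progress, interval=10.0, timeout=3600.0, sleep=time.sleep):
    """Poll get_progress() until narration finishes; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        progress = get_progress()
        if narration_finished(progress):
            return progress
        if time.monotonic() >= deadline:
            raise TimeoutError("narration did not finish in time")
        sleep(interval)


# Offline demo: two canned progress payloads stand in for the HTTP calls.
_states = iter([
    {"total_chapters": 3, "completed_chapters": 1, "failed_chapters": 0},
    {"total_chapters": 3, "completed_chapters": 2, "failed_chapters": 1},
])
final = wait_for_narration(lambda: next(_states), interval=0, sleep=lambda seconds: None)
print(final["completed_chapters"], final["failed_chapters"])
```

A finished run can still contain failed chapters, so it is worth checking failed_chapters before moving on to export.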
Paragraph-level & regeneration
For fine-grained control, POST .../narration/batch narrates specific paragraphs (BatchNarrationRequest: optional voice_id, chapter_ids, paragraph_ids, include_completed/locked/skipped). To re-render a single chapter, e.g. after an edit, use POST .../chapters/{chapter_id}/regenerate.
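A hedged sketch of building a BatchNarrationRequest body follows. It reads include_completed/locked/skipped as three separate boolean fields (include_completed, include_locked, include_skipped), which is an assumption, and it omits anything the caller did not set so that server-side defaults apply. The paragraph IDs are made up.

```python
def batch_narration_payload(voice_id=None, chapter_ids=None, paragraph_ids=None,
                            **include_flags):
    """Build a JSON-serialisable body, leaving out fields the caller did not set."""
    allowed = {"include_completed", "include_locked", "include_skipped"}
    unknown = set(include_flags) - allowed
    if unknown:
        raise ValueError(f"unsupported flags: {sorted(unknown)}")
    payload = {name: bool(value) for name, value in include_flags.items()}
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if chapter_ids:
        payload["chapter_ids"] = list(chapter_ids)
    if paragraph_ids:
        payload["paragraph_ids"] = list(paragraph_ids)
    return payload


# Re-narrate two specific paragraphs, including ones already completed.
payload = batch_narration_payload(paragraph_ids=["p-1", "p-2"], include_completed=True)
print(payload)
```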
6. Production extras (optional)
Add intro/outro music, narrated opening/closing credits, or a cover before export:
- GET·PUT /audiobook/projects/{project_id}/music-settings: ProductionMusicSettings (intro, outro, credits, cover). Narrated Chapter_00/99_*_Credits.mp3 files are produced only when opening/closing credit speech is configured here.
- POST /audiobook/projects/{project_id}/media/upload: upload a cover image or custom music track to reference from the settings above.
7. Export the ACX package
POST /audiobook/projects/{project_id}/export runs an automatic ACX compliance check, then assembles the package.
Request fields:
- Optional: export only these chapters.
- Optional: override export settings.
Response (ACXExportResponse):
- Export job ID.
- An ACXComplianceCheck (see below).
- Any additional TTS cost incurred by the export (0 when audio already exists).
ACXComplianceCheck:
- Whether the book passes ACX requirements.
- Blocking problems (export is refused when non-empty).
- Non-blocking advisories.
- Container/bitrate compliant.
- RMS/peak within ACX bounds.
- Head/tail room present.
- File names follow the ACX convention.
Poll GET .../export/{job_id}/status until the status is COMPLETED, then fetch the download.
GET /audiobook/projects/{project_id}/export/{job_id}/download returns a JSON manifest with a 7-day presigned download_url to the ZIP.
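The export hand-off can be sketched as two small decisions: what to do with a status payload, and how to pull the link out of the download manifest. COMPLETED and download_url come from the text above; the status key name and the FAILED value are assumptions, since the status response shape is not shown here.

```python
def next_export_action(status_payload):
    """Map an export status payload to 'download', 'abort', or 'wait'."""
    status = str(status_payload.get("status", "")).upper()  # key name is assumed
    if status == "COMPLETED":
        return "download"
    if status == "FAILED":  # assumed failure value
        return "abort"
    return "wait"


def extract_download_url(manifest):
    """Return the presigned ZIP link from the download manifest."""
    url = manifest.get("download_url")
    if not url:
        raise ValueError("manifest has no download_url")
    return url


action = next_export_action({"status": "COMPLETED"})
url = extract_download_url({"download_url": "https://example.com/package.zip"})
print(action, url)
```

Because the link is presigned and valid for 7 days, it is probably safest to fetch the ZIP promptly and store it yourself rather than keep the URL.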
The ACX package
| File | Contents |
|---|---|
| Chapter_NN.mp3 | One per chapter: MP3, 192 kbps CBR, 44.1 kHz |
| 00_retail_sample.mp3 | Retail / audition sample |
| Chapter_00/99_*_Credits.mp3 | Optional narrated opening/closing credits (when configured) |
| metadata.json | Title, author, narrator, per-chapter durations, format spec |
| manifest.csv | Flat file list |
| provenance.json | Generation provenance |
Completion webhooks
Rather than poll, register an endpoint and receive a signed event the moment a job finishes. This is the recommended pattern for backlog-scale automation.
| Operation | Method | Endpoint |
|---|---|---|
| Register endpoint | POST | /webhooks/endpoints |
| List endpoints | GET | /webhooks/endpoints |
| Send a test event | POST | /webhooks/endpoints/{endpoint_id}/test |
| Delivery log | GET | /webhooks/endpoints/{endpoint_id}/deliveries |
| Redeliver | POST | /webhooks/deliveries/{delivery_id}/redeliver |

Each delivery carries these headers:
| Header | Meaning |
|---|---|
| X-AudioPod-Signature | sha256=<hex> HMAC of the payload (see below) |
| X-AudioPod-Timestamp | Unix seconds; part of the signed material (replay protection) |
| X-AudioPod-Event | job.completed or job.failed |
| X-AudioPod-Event-Id | Stable UUID per event; dedupe on this (delivery is at-least-once) |

To verify a delivery, compute an HMAC-SHA256 over "<timestamp>.<raw-body>" with your endpoint secret as the key, and compare the result with the X-AudioPod-Signature value.
Status values
- Project: draft → parsing → narrating → producing → completed, or failed if a stage cannot complete.
- Chapter: pending → processing → completed, or failed with an error_message.
A job.failed webhook fires if a job ultimately can’t complete, so an automated integrator can react deterministically.
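Before acting on a job.completed or job.failed event, a receiver should check the signature described under Completion webhooks. The sketch below follows that description (HMAC-SHA256 over the timestamp, a dot, and the raw body, keyed with the endpoint secret, sent as sha256=<hex>). The 300-second tolerance, the function names, the sample secret, and the sample payload are assumptions, not documented values.

```python
import hashlib
import hmac
import time


def expected_signature(secret, timestamp, raw_body):
    """Compute 'sha256=<hex>' over '<timestamp>.<raw-body>' keyed with the endpoint secret."""
    signed = timestamp.encode("utf-8") + b"." + raw_body
    digest = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret, timestamp, raw_body, signature_header,
                     tolerance=300, now=None):
    """Return True only for a fresh timestamp and a matching signature."""
    if not signature_header or not timestamp:
        return False
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance:  # assumed replay window
        return False
    expected = expected_signature(secret, timestamp, raw_body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


secret = "whsec_example"               # hypothetical secret
body = b'{"event": "job.completed"}'   # hypothetical payload
ts = "1700000000"
header = expected_signature(secret, ts, body)

ok = verify_signature(secret, ts, body, header, now=1700000010)
tampered = verify_signature(secret, ts, body + b" ", header, now=1700000010)
stale = verify_signature(secret, ts, body, header, now=1700009999)
print(ok, tampered, stale)
```

The check must run on the raw request bytes, before any JSON parsing or re-serialisation. Since delivery is at-least-once, the handler should also record each X-AudioPod-Event-Id and ignore repeats.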
Error Handling
400: Project is not ACX-compliant
The export found blocking issues (e.g. a chapter under the 30-second minimum). The response lists each issue under detail.issues. Fix the flagged chapters (lengthen, merge, or skip_narration) and re-export.
A chapter failed narration (content too short)
Chapters under ~50 characters can’t be narrated (often a stray title-page fragment from PDF parsing). Mark it skip_narration: true via PUT .../chapters/{id}, or merge it into an adjacent chapter, then continue.
402: Insufficient balance
Narration/export draws from your prepaid balance. Top up the API Wallet and retry; reserved credits for a failed job are released automatically.
401: Not authenticated
Send a valid X-API-Key header. Keys are created in the dashboard and can expire; check the key’s status if previously working calls start returning 401.
Pricing
Usage-based on a prepaid wallet: you pay per minute of finished output, with no per-seat fee or monthly minimum.
| Service | Rate |
|---|---|
| Narration (text-to-speech) | $0.04 / min of output |
| Voice cloning | $0.04 / min of output |
| Transcription (if used) | $0.01 / min |
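
As a worked example of the per-minute rate, the snippet below multiplies finished minutes by the narration rate from the table. Rounding to whole cents is an assumption about how billing is presented, and narration_cost is an illustrative name.

```python
from decimal import Decimal

NARRATION_RATE_PER_MIN = Decimal("0.04")  # USD per minute of output, from the table above


def narration_cost(minutes, rate=NARRATION_RATE_PER_MIN):
    """Cost in USD for a given number of finished minutes, rounded to cents (assumed)."""
    return (Decimal(str(minutes)) * rate).quantize(Decimal("0.01"))


# A 10-hour book is 600 minutes of output.
print(narration_cost(10 * 60))  # prints 24.00
```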
Next Steps
- Voice Management: Create custom narrator voices from a short sample, then narrate with them.
- API Wallet: Top up, check your balance, and see per-minute pricing.
- Authentication: API keys, scopes, and how to authenticate every request.
- Quickstart: The shortest path from zero to a generated audiobook.
