
Overview

The Audiobook API exposes the complete production pipeline end to end, so you can drop AI audiobook generation directly into your own platform or control panel: a file goes in, a finished, ACX-compliant package comes back, with no human in the loop. It is the same API surface our own Audiobook Studio runs on — nothing here is a private endpoint.
create project → add manuscript → parse into chapters → narrate → export ACX package → download
                                                              └──── completion webhook ────┘
Every endpoint is API-key authenticated and asynchronous: submit a job, get an ID back immediately, and receive a signed webhook when it finishes — so your control panel never blocks on a long render.

Key features

  • Inputs: upload PDF, EPUB, DOCX, or TXT; paste raw text; or generate a book from a prompt. Chapters are detected automatically.
  • Voices: 100+ languages with standard narration voices; custom voice cloning across 20+ languages.
  • ACX-compliant export: per-chapter MP3 at 192 kbps CBR / 44.1 kHz, a retail sample, optional narrated opening/closing credits, a cover slot, and a metadata.json manifest — packaged as a single ZIP via a presigned URL.
  • Automatic compliance check: file format, audio levels, silence, and per-chapter length are validated on export; a non-compliant book is rejected with the exact reason rather than shipped silently.
  • Asynchronous + webhooks: job.completed / job.failed callbacks for backlog-scale automation.
  • White-label output: delivered files carry the publisher’s own title/author/narrator metadata — nothing points back at AudioPod.
  • Prepaid usage pricing: $0.04 per minute of output (~$24 for a 10-hour book). See API Wallet.

Authentication

All endpoints require authentication. Use an API key (recommended for server-to-server integrations):
  • API Key (Recommended): X-API-Key: your_api_key header — create one in the dashboard.
  • JWT Token: Authorization: Bearer your_jwt_token (session-based auth).
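As a rough illustration of the API-key option, the sketch below builds an authenticated request with Python's standard library. The helper names build_request and send are hypothetical, the key value is a placeholder, and the base URL is the one used in the Quick Start; treat this as one possible client shape rather than an official SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.audiopod.ai/api/v1"  # base URL taken from the Quick Start


def build_request(method, path, api_key, payload=None):
    """Return a urllib Request carrying the X-API-Key header (nothing is sent yet)."""
    headers = {"X-API-Key": api_key}
    data = None
    if payload is not None:
        # JSON endpoints expect an encoded body plus the matching content type.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder key; a real one is created in the dashboard.
req = build_request("POST", "/audiobook/projects", "ap_example_key",
                    {"title": "The Lighthouse Keeper"})
```

Calling send(req) would issue the request; building and sending are kept separate here so the request can be inspected or logged first.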

API Endpoints Quick Reference

Operation | Method | Endpoint
Create project | POST | /audiobook/projects
List projects | GET | /audiobook/projects
Get / update / delete project | GET·PUT·DELETE | /audiobook/projects/{project_id}
Upload manuscript | POST | /audiobook/projects/{project_id}/manuscript/upload
Parse manuscript | POST | /audiobook/projects/{project_id}/manuscript/parse
Paste text | POST | /audiobook/projects/{project_id}/manuscript/paste
Generate book | POST | /audiobook/projects/{project_id}/manuscript/generate
List / get chapters | GET | /audiobook/projects/{project_id}/chapters
List voices | GET | /audiobook/voices/available
List languages | GET | /audiobook/voices/languages
Cost estimate | GET | /audiobook/projects/{project_id}/cost-estimate
Start narration | POST | /audiobook/projects/{project_id}/narration/start
Batch narration | POST | /audiobook/projects/{project_id}/narration/batch
Narration progress | GET | /audiobook/projects/{project_id}/narration/progress
Regenerate a chapter | POST | /audiobook/projects/{project_id}/chapters/{chapter_id}/regenerate
Production music/credits | GET·PUT | /audiobook/projects/{project_id}/music-settings
Upload media (cover/music) | POST | /audiobook/projects/{project_id}/media/upload
Start ACX export | POST | /audiobook/projects/{project_id}/export
Export status | GET | /audiobook/projects/{project_id}/export/{job_id}/status
Download package | GET | /audiobook/projects/{project_id}/export/{job_id}/download
Dashboard stats | GET | /audiobook/dashboard

Quick Start (cURL)

export AUDIOPOD_API_KEY="ap_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AP="https://api.audiopod.ai/api/v1"

# 1. Create a project
PID=$(curl -s -X POST "$AP/audiobook/projects" -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title":"The Lighthouse Keeper","author":"A. P. Tester"}' | jq -r .id)

# 2. Upload a manuscript (PDF / EPUB / DOCX / TXT)
FILE_KEY=$(curl -s -X POST "$AP/audiobook/projects/$PID/manuscript/upload" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -F "file=@book.epub" | jq -r .file_key)

# 3. Parse into chapters (async — poll chapters until they appear)
curl -s -X POST "$AP/audiobook/projects/$PID/manuscript/parse" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -F "file_key=$FILE_KEY" -F "filename=book.epub"

# 4. Narrate all chapters with a voice
curl -s -X POST "$AP/audiobook/projects/$PID/narration/start" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -H "Content-Type: application/json" \
  -d '{"voice_id":387}'

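# 5. Export an ACX package once narration has finished, then download it
#    (poll .../export/$JOB/status until COMPLETED before requesting the download)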
JOB=$(curl -s -X POST "$AP/audiobook/projects/$PID/export" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -H "Content-Type: application/json" -d '{}' | jq -r .job_id)
curl -s "$AP/audiobook/projects/$PID/export/$JOB/download" -H "X-API-Key: $AUDIOPOD_API_KEY"
# => { "download_url": "https://media.audiopod.ai/...zip?...", "file_size_bytes": ..., "chapter_count": ... }

1. Create a project

A project is the container for a single audiobook. Create it first, then attach a manuscript.
  • title (string, required): Audiobook title.
  • author (string): Author name (written into the package metadata).
  • description (string): Project description.
  • isbn (string): ISBN, if available.
  • genre (string): Book genre.
  • selected_voice_id (integer): Default narration voice for the project (can also be set per narration request).
  • narration_speed (number, default 1.0): Speech speed multiplier.
  • narration_style (string, default "narrative"): Narration style.
  • export_format (string, default "acx"): Export format. acx produces an Audible/ACX-ready package.
  • target_platform (string, default "audible"): Target distribution platform.
curl -X POST "$AP/audiobook/projects" -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title":"The Lighthouse Keeper","author":"A. P. Tester","genre":"Fiction"}'
Response (AudiobookProjectResponse):
  • id (string): Project UUID.
  • status (string): Lifecycle state (see Status values).
  • total_chapters (integer): Number of chapters once parsed.
  • estimated_duration_minutes (number): Estimated finished runtime.
  • manuscript_filename (string): Uploaded filename, once attached.
  • manuscript_format (string): pdf · epub · docx · txt.
  • manuscript_word_count (integer): Total words after parsing.
List existing projects with GET /audiobook/projects?skip=0&limit=20 (paginated).
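The skip/limit pagination above can be walked with a small generator. This is a sketch under one assumption not confirmed by this page: that the list endpoint returns a plain JSON array of projects. The fetch_page callable is hypothetical and would wrap the real GET call; a stand-in is used here so the example runs without a network.

```python
def iter_all(fetch_page, limit=20):
    """Yield every item by advancing `skip` until a short (or empty) page comes back."""
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing further to fetch
        skip += limit


# Stand-in for GET /audiobook/projects?skip=..&limit=.. (no network involved).
_FAKE_PROJECTS = [{"id": f"project-{n}"} for n in range(45)]


def fake_fetch(skip, limit):
    return _FAKE_PROJECTS[skip:skip + limit]


all_ids = [project["id"] for project in iter_all(fake_fetch, limit=20)]
```

The same helper would work for the chapter listing, which uses the same skip and limit query parameters.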

2. Add the manuscript

Three ways to get text into a project — pick one.

Upload a file

POST /audiobook/projects/{project_id}/manuscript/upload — multipart form field file (PDF, EPUB, DOCX, or TXT). Returns a file_key to hand to the parser.
  • file_key (string): S3 key of the stored manuscript.
  • upload_url (string): Reference URL for the stored file.
  • max_file_size (integer): Maximum accepted size in bytes.
curl -X POST "$AP/audiobook/projects/$PID/manuscript/upload" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -F "file=@book.epub"

Parse the uploaded file into chapters

POST /audiobook/projects/{project_id}/manuscript/parse — form fields file_key and filename. Parsing is asynchronous and returns a job_id; poll GET .../chapters until chapters appear.
curl -X POST "$AP/audiobook/projects/$PID/manuscript/parse" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -F "file_key=$FILE_KEY" -F "filename=book.epub"
# => { "job_id": 99, "estimated_completion_time": "under 2 minutes" }
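Polling until chapters appear can be wrapped in a generic helper. The sketch below is one way to do it; poll_until, its interval, and its timeout are illustrative choices, not values prescribed by the API. The fetch callable would normally call GET .../chapters; here a simulated sequence of responses stands in for it.

```python
import time


def poll_until(fetch, is_done, interval=5.0, timeout=600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_done(result) is true, or raise on timeout."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("gave up waiting for the job to finish")
        sleep(interval)


# Simulated chapter listing: empty twice, then two parsed chapters.
_responses = iter([[], [], [{"chapter_number": 1}, {"chapter_number": 2}]])
chapters = poll_until(lambda: next(_responses),
                      is_done=lambda listing: len(listing) > 0,
                      sleep=lambda _seconds: None)  # skip real waiting in the demo
```

Injecting sleep and clock keeps the helper easy to test; in production the defaults apply and the loop waits between requests.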

Paste text directly

POST /audiobook/projects/{project_id}/manuscript/paste — JSON. Best for content you already have as text.
  • text (string, required): Manuscript or chapter text.
  • source_type (string, default "paste"): Source label.

Generate a book from a prompt

POST /audiobook/projects/{project_id}/manuscript/generate — JSON. Drafts an original manuscript, then chapters it.
  • prompt (string, required): What the book should be about.
  • title (string): Optional working title.
  • audience (string): Intended audience.
  • tone (string, default "warm"): Writing tone.
  • chapter_count (integer, default 3): Number of chapters.
  • words_per_chapter (integer, default 600): Target words per chapter.
The paste and generate endpoints return a ManuscriptIntakeResponse with total_chapters, total_words, and estimated_duration_minutes.

3. Chapters

List chapters with GET /audiobook/projects/{project_id}/chapters?skip=0&limit=50, or fetch one with GET .../chapters/{chapter_id}. Chapter (AudiobookChapterResponse):
  • id (string): Chapter UUID.
  • chapter_number (integer): Order in the book.
  • title (string): Chapter title.
  • content (string): Chapter text.
  • word_count (integer): Words in the chapter.
  • character_count (integer): Characters in the chapter.
  • duration_seconds (number): Narrated length, once generated.
  • status (string): pending · processing · completed · failed.
  • audio_file_path (string): Stored audio key, once narrated.
  • error_message (string): Reason a chapter failed (e.g. content too short for narration).
  • skip_narration (boolean): If true, the chapter is excluded from narration and export.
  • voice_id (integer): Per-chapter voice override.
You can edit a chapter (title, content, skip_narration, per-chapter voice_id, narration notes) with PUT .../chapters/{chapter_id} before narrating.
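One use of the chapter edit call is to skip fragments that are too short to narrate before starting a run. The sketch below picks such chapters from a chapter listing and prepares the PUT bodies. The 50-character threshold mirrors the approximate figure given under Error Handling and is an assumption about the exact limit; skip_updates and the sample data are hypothetical.

```python
MIN_NARRATABLE_CHARS = 50  # approximate; the exact server-side limit is an assumption


def skip_updates(chapters, min_chars=MIN_NARRATABLE_CHARS):
    """Map chapter id -> PUT body for chapters too short to narrate."""
    return {
        chapter["id"]: {"skip_narration": True}
        for chapter in chapters
        if chapter["character_count"] < min_chars
        and not chapter.get("skip_narration", False)  # leave already-skipped chapters alone
    }


# Illustrative listing: a stray title-page fragment and a real chapter.
chapters = [
    {"id": "a1", "title": "Title page", "character_count": 18},
    {"id": "b2", "title": "Chapter One", "character_count": 5400},
]
updates = skip_updates(chapters)
```

Each entry in updates would then be sent as the JSON body of PUT .../chapters/{chapter_id}.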

4. Choose a voice

GET /audiobook/voices/available returns a VoiceSelectionResponse:
  • available_voices (array): All standard voices you can narrate with (each has an integer id).
  • user_custom_voices (array): Your own cloned voices (see Voice Management).
The response also includes a curated shortlist of voices.
GET /audiobook/voices/languages lists every supported narration locale (100+). Before committing, GET /audiobook/projects/{project_id}/cost-estimate returns a per-paragraph breakdown with word_count, estimated_seconds, and credits_to_charge.
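Before committing to a narration run, the per-paragraph breakdown can be totalled on the client. The sketch below assumes the breakdown arrives as a list of objects carrying the three fields named above; the surrounding response envelope is not documented here, and the sample numbers are purely illustrative.

```python
def summarize_estimate(paragraphs):
    """Total up a per-paragraph cost breakdown."""
    return {
        "words": sum(p["word_count"] for p in paragraphs),
        "minutes": sum(p["estimated_seconds"] for p in paragraphs) / 60,
        "credits": sum(p["credits_to_charge"] for p in paragraphs),
    }


# Illustrative values only, not real API output.
sample = [
    {"word_count": 150, "estimated_seconds": 60, "credits_to_charge": 4},
    {"word_count": 300, "estimated_seconds": 120, "credits_to_charge": 8},
]
totals = summarize_estimate(sample)
```

Comparing totals["credits"] against the wallet balance before starting narration avoids a run that stalls for lack of funds.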

5. Narrate

Narrate every chapter

POST /audiobook/projects/{project_id}/narration/start queues narration for all chapters with the selected voice.
  • voice_id (integer, required): Voice to narrate with (from voices/available).
  • chapter_ids (array): Optional. Narrate only these chapters; omit for all.
  • narration_speed (number, default 1.0): Speed multiplier.
  • narration_style (string, default "narrative"): Delivery style.
curl -X POST "$AP/audiobook/projects/$PID/narration/start" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" -H "Content-Type: application/json" \
  -d '{"voice_id":387}'
# => {"success": true, "queued_chapters": 2, "total_chapters": 2, "project_status": "narrating"}

Track progress

GET /audiobook/projects/{project_id}/narration/progress returns a NarrationProgress:
  • total_chapters (integer): Total chapters.
  • completed_chapters (integer): Finished narrations.
  • failed_chapters (integer): Failed narrations.
  • in_progress_chapters (integer): Currently rendering.
  • chapter_status (array): Per-chapter status detail.
Narration is complete when completed_chapters + failed_chapters >= total_chapters. (Or skip polling and wait for the completion webhook.)
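The completion rule can be expressed directly against a NarrationProgress payload. In this minimal sketch, the function names are hypothetical and the progress dictionaries are assumed to use the field names listed above.

```python
def narration_finished(progress):
    """True once every chapter has either completed or failed."""
    settled = progress["completed_chapters"] + progress["failed_chapters"]
    return settled >= progress["total_chapters"]


def needs_attention(progress):
    """True when narration has finished but at least one chapter failed."""
    return narration_finished(progress) and progress["failed_chapters"] > 0
```

A finished run with failures is a cue to inspect chapter_status and either regenerate the failed chapters or mark them skip_narration before exporting.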

Paragraph-level & regeneration

For fine-grained control, POST .../narration/batch narrates specific paragraphs (BatchNarrationRequest: optional voice_id, chapter_ids, paragraph_ids, include_completed/locked/skipped). To re-render a single chapter — e.g. after an edit — POST .../chapters/{chapter_id}/regenerate.

6. Production extras (optional)

Add intro/outro music, narrated opening/closing credits, or a cover before export:
  • GET·PUT /audiobook/projects/{project_id}/music-settings: reads or updates the ProductionMusicSettings (intro, outro, credits, cover). The narrated Chapter_00/99_*_Credits.mp3 files are produced only when opening/closing credit speech is configured here.
  • POST /audiobook/projects/{project_id}/media/upload — upload a cover image or custom music track to reference from the settings above.

7. Export the ACX package

POST /audiobook/projects/{project_id}/export runs an automatic ACX compliance check, then assembles the package.
  • include_chapters (array): Optional. Export only these chapters.
  • custom_settings (object): Optional. Override export settings.
Response (ACXExportResponse):
  • job_id (integer): Export job ID.
  • compliance_check (object): An ACXComplianceCheck (see below).
  • credits_tts_cost (integer): Any additional TTS cost incurred by the export (0 when audio already exists).
ACXComplianceCheck:
  • is_compliant (boolean): Whether the book passes ACX requirements.
  • issues (array): Blocking problems (export is refused when non-empty).
  • warnings (array): Non-blocking advisories.
  • file_format_ok (boolean): Container/bitrate compliant.
  • audio_levels_ok (boolean): RMS/peak within ACX bounds.
  • silence_requirements_ok (boolean): Head/tail room present.
  • file_naming_ok (boolean): File names follow the ACX convention.
If a book is non-compliant, the call returns the specific issues instead of producing an invalid package:
{ "detail": { "message": "Project is not ACX-compliant. Fix these issues first.",
              "issues": ["Chapter 3: 28s < ACX minimum 30s"], "warnings": [] } }
Poll GET .../export/{job_id}/status until COMPLETED, then fetch the download: GET /audiobook/projects/{project_id}/export/{job_id}/download returns a JSON manifest with a 7-day presigned download_url to the ZIP:
{ "download_url": "https://media.audiopod.ai/...acx_package_xxx.zip?...",
  "file_size_bytes": 2079760, "file_format": "zip", "chapter_count": 2 }

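An integration typically needs to tell a refused export from a queued one. The sketch below handles the two response shapes shown in this section: the non-compliant detail object and the ACXExportResponse. NotCompliant and export_job_id are hypothetical names, and the HTTP status code that accompanies a refusal is not assumed.

```python
class NotCompliant(Exception):
    """Raised when the export is refused by the ACX compliance check."""

    def __init__(self, issues, warnings):
        super().__init__("; ".join(issues))
        self.issues = issues
        self.warnings = warnings


def export_job_id(body):
    """Return job_id from an export response, or raise NotCompliant with the issues."""
    detail = body.get("detail")
    if isinstance(detail, dict) and detail.get("issues"):
        raise NotCompliant(detail["issues"], detail.get("warnings", []))
    return body["job_id"]
```

Catching NotCompliant gives the caller the exact issue strings, which can be surfaced to an operator or used to decide which chapters to lengthen, merge, or skip before re-exporting.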
The ACX package

File | Contents
Chapter_NN.mp3 | One per chapter: MP3, 192 kbps CBR, 44.1 kHz
00_retail_sample.mp3 | Retail / audition sample
Chapter_00/99_*_Credits.mp3 | Optional narrated opening/closing credits (when configured)
metadata.json | Title, author, narrator, per-chapter durations, format spec
manifest.csv | Flat file list
provenance.json | Generation provenance
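After downloading, a quick sanity check of the ZIP against the file list above can catch a truncated or unexpected package. This sketch assumes that NN in Chapter_NN.mp3 is a run of digits and compares base names in case the files sit inside a folder; inspect_package is a hypothetical helper, not part of the API.

```python
import io
import re
import zipfile

REQUIRED_FILES = {"00_retail_sample.mp3", "metadata.json", "manifest.csv", "provenance.json"}
CHAPTER_PATTERN = re.compile(r"Chapter_\d+\.mp3")  # assumes NN is numeric


def inspect_package(zip_bytes):
    """Return (missing required files, chapter file count) for a downloaded package."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Compare on base names in case files are nested under a top-level folder.
        names = [name.rsplit("/", 1)[-1] for name in archive.namelist()]
    missing = sorted(REQUIRED_FILES - set(names))
    # fullmatch excludes the optional credits files, which carry a suffix after the number.
    chapter_count = sum(1 for name in names if CHAPTER_PATTERN.fullmatch(name))
    return missing, chapter_count
```

The returned chapter count can be compared with the chapter_count field of the download manifest.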

Completion webhooks

Rather than poll, register an endpoint and receive a signed event the moment a job finishes — the recommended pattern for backlog-scale automation.
Operation | Method | Endpoint
Register endpoint | POST | /webhooks/endpoints
List endpoints | GET | /webhooks/endpoints
Send a test event | POST | /webhooks/endpoints/{endpoint_id}/test
Delivery log | GET | /webhooks/endpoints/{endpoint_id}/deliveries
Redeliver | POST | /webhooks/deliveries/{delivery_id}/redeliver
curl -X POST "$AP/webhooks/endpoints" -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://your-app.com/hooks/audiopod","events":["job.completed","job.failed"]}'
# => { "id": "...", "url": "...", "events": [...], "secret": "whsec_..." }   # secret shown ONCE
Every delivery carries these headers:
Header | Meaning
X-AudioPod-Signature | sha256=<hex> HMAC of the payload (see below)
X-AudioPod-Timestamp | Unix seconds; part of the signed material (replay protection)
X-AudioPod-Event | job.completed or job.failed
X-AudioPod-Event-Id | Stable UUID per event; dedupe on this (delivery is at-least-once)
Verify the signature — HMAC-SHA256 over "<timestamp>.<raw-body>" with your endpoint secret as the key:
import hmac, hashlib

def verify(secret: str, raw_body: bytes, signature_header: str, timestamp_header: str) -> bool:
    expected = hmac.new(
        secret.encode(),
        timestamp_header.encode() + b"." + raw_body,   # exact raw bytes received
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature_header)
Delivery is retried with exponential backoff and parked in a dead-letter queue after repeated failures; inspect and redeliver from the delivery log. Endpoints are SSRF-guarded (the resolved IP is re-checked at delivery time; https-only; private/internal ranges blocked).
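Because delivery is at-least-once and the timestamp is part of the signed material, a receiver will usually combine three checks: signature, freshness, and deduplication on the event ID. The sketch below shows one way to do that. The 300-second tolerance and the in-memory set are assumptions for illustration; a real deployment would pick its own window and use persistent storage for seen IDs.

```python
import hashlib
import hmac
import time


def sign(secret, timestamp, raw_body):
    """Compute the sha256=<hex> signature over "<timestamp>.<raw-body>"."""
    material = timestamp.encode() + b"." + raw_body
    return "sha256=" + hmac.new(secret.encode(), material, hashlib.sha256).hexdigest()


def accept_delivery(secret, headers, raw_body, seen_event_ids,
                    now=None, tolerance_seconds=300):
    """Return True only for a correctly signed, recent, not-yet-seen event."""
    now = time.time() if now is None else now
    timestamp = headers["X-AudioPod-Timestamp"]
    expected = sign(secret, timestamp, raw_body)
    if not hmac.compare_digest(expected, headers["X-AudioPod-Signature"]):
        return False  # signature mismatch
    if abs(now - int(timestamp)) > tolerance_seconds:
        return False  # outside the freshness window: possible replay
    event_id = headers["X-AudioPod-Event-Id"]
    if event_id in seen_event_ids:
        return False  # duplicate of an earlier delivery
    seen_event_ids.add(event_id)
    return True
```

A handler would still answer duplicates with a success status so they are not retried; the False return only signals that the event should not be processed again.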

Status values

  • Project status (string): draft → parsing → narrating → producing → completed; failed if a stage cannot complete.
  • Chapter / job status (string): pending → processing → completed, or failed with an error_message.
Jobs are prepaid: credits are reserved at job start, settled on success, and released on failure — a failed narration or export never leaves a stranded charge. Narration and export retry automatically; a job.failed webhook fires if a job ultimately can’t complete, so an automated integrator can react deterministically.

Error Handling

  • Export rejected as non-compliant: The export found blocking issues (e.g. a chapter under the 30-second minimum). The response lists each issue under detail.issues. Fix the flagged chapters (lengthen, merge, or skip_narration) and re-export.
  • Chapter too short to narrate: Chapters under ~50 characters can’t be narrated (often a stray title-page fragment from PDF parsing). Mark the chapter skip_narration: true via PUT .../chapters/{id}, or merge it into an adjacent chapter, then continue.
  • Insufficient balance: Narration and export draw from your prepaid balance. Top up the API Wallet and retry; reserved credits for a failed job are released automatically.
  • 401 Unauthorized: Send a valid X-API-Key header. Keys are created in the dashboard and can expire; check the key’s status if previously working calls start returning 401.

Pricing

Usage-based on a prepaid wallet — you pay per minute of finished output, with no per-seat fee or monthly minimum.
Service | Rate
Narration (text-to-speech) | $0.04 / min of output
Voice cloning | $0.04 / min of output
Transcription (if used) | $0.01 / min
A 10-hour audiobook (~600 minutes) costs roughly $24, versus $2,000–$4,000 for a human ACX narrator. For committed monthly volume across many titles, contact us about partner pricing below list. See API Wallet for top-ups and balance.
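The per-minute arithmetic is simple enough to check in code. This sketch uses the narration list rate quoted above and Decimal to avoid floating-point rounding in currency; narration_cost is a hypothetical helper, and partner pricing would change the rate.

```python
from decimal import Decimal

NARRATION_RATE_PER_MIN = Decimal("0.04")  # list rate for narration quoted above


def narration_cost(output_minutes, rate=NARRATION_RATE_PER_MIN):
    """Cost in dollars for a given number of minutes of finished output."""
    return (Decimal(str(output_minutes)) * rate).quantize(Decimal("0.01"))


ten_hour_book = narration_cost(10 * 60)  # 600 minutes x $0.04 = $24.00
```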

Next Steps

Voice Management

Create custom narrator voices from a short sample, then narrate with them.

API Wallet

Top up, check your balance, and see per-minute pricing.

Authentication

API keys, scopes, and how to authenticate every request.

Quickstart

The shortest path from zero to a generated audiobook.