Overview

AudioPod AI’s Voice Changer API uses OpenVoice v2 technology to convert source audio to match target voice characteristics. Transform any speech recording to sound like a different voice while preserving the original speech content, timing, and emotional expression.

Key Features

  • Voice Conversion: Transform source audio to match target voice characteristics
  • Content Preservation: Maintains original speech content, timing, and emotional expression
  • Multiple Voice Sources: Use any completed voice profile as a target voice
  • High Quality Processing: Advanced OpenVoice v2 technology for natural-sounding results
  • Public Voice Support: Access to both user-owned and public voice profiles
  • Flexible Input: Support for various audio formats (WAV, MP3, M4A, etc.)
  • Real-time Processing: Fast conversion for production workflows

Authentication

All endpoints require authentication:
  • API Key (Recommended): X-API-Key: your_api_key header
  • JWT Token: Authorization: Bearer your_jwt_token (for session-based auth)
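As a minimal sketch, the two authentication options above could be turned into request headers like this. The helper name `build_auth_headers` is hypothetical and not part of any AudioPod SDK; only the header names are taken from this page.

```python
def build_auth_headers(api_key=None, jwt_token=None):
    """Return request headers for one of the two documented auth schemes.

    Hypothetical helper for illustration; only the header names come from
    the documentation above.
    """
    if api_key:
        # Recommended scheme: API key header
        return {"X-API-Key": api_key}
    if jwt_token:
        # Session-based scheme: bearer token
        return {"Authorization": f"Bearer {jwt_token}"}
    raise ValueError("Provide either api_key or jwt_token")
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client you use.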

Voice Conversion

Convert Audio to Target Voice

Transform the voice characteristics in a source audio file to match a target voice profile.
POST /api/v1/voice/voice-convert
X-API-Key: {api_key}
Content-Type: multipart/form-data

file: (source audio file)
voice_uuid: "550e8400-e29b-41d4-a716-446655440000"
Parameters:
  • file (required): Source audio file containing speech to convert
  • voice_uuid (required): UUID of the target voice profile to match
Voice UUID Sources:
  • User’s own custom voice profiles (from voice cloning)
  • Public voice profiles available in the voice library
  • The voice must have status “COMPLETED” and an available audio file
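Since voice_uuid must be a UUID, a client-side format check can catch placeholder or mistyped values before uploading audio. The sketch below uses Python's standard uuid module; the function name is hypothetical, and the check only validates the string format. It cannot tell whether the voice exists, is accessible, or has status “COMPLETED”.

```python
import uuid

def is_valid_voice_uuid(value):
    """Return True if value parses as a UUID string (format check only)."""
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True
```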
Response:
{
  "id": 123,
  "status": "PROCESSING",
  "progress": 0,
  "source_type": "FILE",
  "original_filename": "source_speech.wav",
  "target_voice_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "target_voice_name": "Professional Speaker Voice",
  "conversion_method": "openvoice_v2",
  "created_at": "2024-01-15T10:30:00Z",
  "task_id": "celery_task_uuid_here",
  "user_id": "user_abc123def456"
}

Job Management

Get Conversion Status

Monitor the progress of voice conversion jobs.
GET /api/v1/voice/convert/{conversion_id}/status
X-API-Key: {api_key}
Response (Completed Job):
{
  "id": 123,
  "status": "COMPLETED",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:33:45Z",
  "completed_at": "2024-01-15T10:33:45Z",
  "output_path": "outputs/converted/uuid/voice_converted.wav",
  "error_message": null,
  "task_id": "celery_task_uuid_here",
  "source_path": "temp/uploads/uuid/source_speech.wav",
  "original_filename": "source_speech.wav",
  "conversion_metadata": {
    "voice_uuid": "550e8400-e29b-41d4-a716-446655440000",
    "voice_id": 456,
    "voice_file_path": "voices/456/uuid/voice.wav",
    "duration": 45.2
  }
}
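Because conversion runs as a background job, clients typically poll the status endpoint until the job finishes. The following is a sketch of such a loop with a timeout, under a few assumptions: `wait_for_conversion` is a hypothetical helper, `fetch_status` is any zero-argument callable you supply that returns the parsed status JSON, and "FAILED" is assumed to be the terminal error status, as in the examples elsewhere on this page.

```python
import time

def wait_for_conversion(fetch_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job is COMPLETED or FAILED, or until the timeout elapses.

    fetch_status: zero-argument callable returning the status response as a dict.
    sleep: injectable so the loop can be exercised without real delays.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "COMPLETED":
            return job
        if status == "FAILED":
            # error_message is a field of the status response shown above
            raise RuntimeError(job.get("error_message") or "Conversion failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Conversion still {status} after {timeout}s")
        sleep(poll_interval)
```

With the requests library, `fetch_status` could be a small lambda that issues the GET request above and returns `response.json()`. Bounding the wait avoids a loop that never exits when a job stalls.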

List Conversion Jobs

Get all voice conversion jobs for the authenticated user.
GET /api/v1/voice/convert/jobs?status=COMPLETED&limit=10
X-API-Key: {api_key}
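A small sketch of how the listing URL could be assembled in Python. Only the status and limit query parameters appear in the request above; the helper name and the `BASE_URL` constant are illustrative assumptions, and the shape of the response is not documented on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.audiopod.ai"

def build_jobs_url(status=None, limit=None):
    """Build the job-listing URL, including only the filters that are set."""
    params = {}
    if status is not None:
        params["status"] = status
    if limit is not None:
        params["limit"] = limit
    query = urlencode(params)
    url = f"{BASE_URL}/api/v1/voice/convert/jobs"
    return f"{url}?{query}" if query else url
```

Send a GET request to the resulting URL with the X-API-Key header to retrieve the jobs.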

Download Converted Audio

import requests

def download_converted_audio(conversion_id, api_key):
    """Check a completed job and return where its converted audio is stored"""
    
    # Get conversion details
    response = requests.get(
        f"https://api.audiopod.ai/api/v1/voice/convert/{conversion_id}/status",
        headers={"X-API-Key": api_key}
    )
    
    if response.status_code != 200:
        return {"error": "Conversion not found"}
    
    conversion = response.json()
    
    if conversion["status"] != "COMPLETED":
        return {"error": f"Conversion not completed. Status: {conversion['status']}"}
    
    if not conversion.get("output_path"):
        return {"error": "No output file available"}
    
    # Note: The output_path is the S3 key. You would need to use your S3 service
    # or a separate download endpoint to get the actual file.
    
    return {
        "success": True,
        "output_path": conversion["output_path"],
        "original_filename": conversion.get("original_filename"),
        "target_voice_uuid": conversion.get("conversion_metadata", {}).get("voice_uuid"),
        "status": "File ready for download"
    }

# Usage
result = download_converted_audio(123, "your_api_key")
if result.get("success"):
    print(f"Conversion completed: {result['output_path']}")
    print(f"Original file: {result['original_filename']}")
    print(f"Target voice UUID: {result['target_voice_uuid']}")

Voice Profile Management

Find Available Voices

Before converting, you can browse available voice profiles to use as targets.
import requests

api_key = "your_api_key"

# Get available voices for conversion
response = requests.get(
    "https://api.audiopod.ai/api/v1/voice/voice-profiles",
    headers={"X-API-Key": api_key},
    params={
        "voice_type": "CUSTOM",      # Filter by voice type
        "include_public": "true",    # Include public voices
        "limit": 20
    }
)

if response.status_code == 200:
    voices_data = response.json()
    print("Available voices for conversion:")
    for voice in voices_data["voices"]:
        if voice["is_custom"] or voice.get("is_public"):
            print(f"- {voice['name']} (UUID: {voice['voice_id']})")
            print(f"  Category: {voice['category']}")
            print(f"  Language: {voice['language']}")
            print(f"  Type: {'Custom' if voice['is_custom'] else 'Public'}")
            print()

Use Cases & Examples

Podcast Voice Consistency

import time

import requests

def convert_podcast_guest_to_host_voice(guest_audio, host_voice_uuid, api_key):
    """Convert guest audio to match host voice for consistency"""
    
    print("Converting guest audio to match host voice...")
    
    with open(guest_audio, "rb") as audio_file:
        response = requests.post(
            "https://api.audiopod.ai/api/v1/voice/voice-convert",
            headers={"X-API-Key": api_key},
            data={"voice_uuid": host_voice_uuid},
            files={"file": audio_file}
        )
    
    if response.status_code != 200:
        return {"error": "Failed to start conversion"}
    
    job_data = response.json()
    job_id = job_data["id"]
    
    # Poll the status endpoint until the job finishes
    while True:
        status_response = requests.get(
            f"https://api.audiopod.ai/api/v1/voice/convert/{job_id}/status",
            headers={"X-API-Key": api_key}
        )
        if status_response.status_code != 200:
            return {"error": "Failed to fetch conversion status"}
        
        job_status = status_response.json()
        print(f"Progress: {job_status.get('progress', 0)}%")
        
        if job_status["status"] == "COMPLETED":
            break
        elif job_status["status"] == "FAILED":
            return {"error": job_status.get("error_message") or "Conversion failed"}
        
        time.sleep(5)
    
    # Download result. output_url is an assumed field: the sample status
    # response only shows output_path, so report that key if no URL is present.
    output_url = job_status.get("output_url")
    if not output_url:
        return {
            "error": "No download URL available",
            "output_path": job_status.get("output_path")
        }
    converted_filename = f"guest_as_host_{job_id}.wav"
    
    audio_response = requests.get(output_url)
    with open(converted_filename, "wb") as f:
        f.write(audio_response.content)
    
    return {
        "success": True,
        "job_id": job_id,
        "output_file": converted_filename,
        "original_duration": job_status.get("original_duration"),
        "processing_time": job_status.get("processing_stats", {}).get("total_time")
    }

# Usage
result = convert_podcast_guest_to_host_voice(
    "guest_interview.wav",
    "host_voice_uuid_here",
    "your_api_key"
)

if result.get("success"):
    print("Guest voice converted successfully!")
    print(f"Output: {result['output_file']}")
    if result["processing_time"] is not None:
        print(f"Processing time: {result['processing_time']:.1f}s")

Content Localization

import time

import requests

def localize_content_voice(source_audio, target_locale_voices, api_key):
    """Convert content to different regional voice variations"""
    
    localized_versions = {}
    
    for locale, voice_uuid in target_locale_voices.items():
        print(f"Creating {locale} version...")
        
        with open(source_audio, "rb") as audio_file:
            response = requests.post(
                "https://api.audiopod.ai/api/v1/voice/voice-convert",
                headers={"X-API-Key": api_key},
                data={"voice_uuid": voice_uuid},
                files={"file": audio_file}
            )
        
        if response.status_code == 200:
            job_data = response.json()
            localized_versions[locale] = {
                "job_id": job_data["id"],
                "status": "processing",
                "target_voice": job_data["target_voice_name"]
            }
        else:
            print(f"Failed to start {locale} conversion")
    
    # Monitor all jobs until each one has completed or failed
    completed_versions = {}
    failed_locales = set()
    
    while len(completed_versions) + len(failed_locales) < len(localized_versions):
        for locale, job_info in localized_versions.items():
            if locale in completed_versions or locale in failed_locales:
                continue
            
            status_response = requests.get(
                f"https://api.audiopod.ai/api/v1/voice/convert/{job_info['job_id']}/status",
                headers={"X-API-Key": api_key}
            )
            if status_response.status_code != 200:
                print(f"Could not fetch status for {locale}")
                failed_locales.add(locale)
                continue
            
            job_status = status_response.json()
            
            if job_status["status"] == "FAILED":
                print(f"{locale} conversion failed: {job_status.get('error_message')}")
                failed_locales.add(locale)
            elif job_status["status"] == "COMPLETED":
                # output_url is an assumed field; the sample status response
                # only shows output_path
                output_url = job_status.get("output_url")
                if not output_url:
                    print(f"No download URL for {locale}: {job_status.get('output_path')}")
                    failed_locales.add(locale)
                    continue
                
                # Download localized version
                audio_response = requests.get(output_url)
                filename = f"content_{locale}_{job_info['job_id']}.wav"
                
                with open(filename, "wb") as f:
                    f.write(audio_response.content)
                
                completed_versions[locale] = {
                    "filename": filename,
                    "target_voice": job_info["target_voice"],
                    "duration": job_status.get("conversion_metadata", {}).get("duration", 0.0)
                }
                
                print(f"Completed {locale} version: {filename}")
        
        time.sleep(5)
    
    return completed_versions

# Usage - create multiple regional versions
locale_voices = {
    "US": "us_professional_voice_uuid",
    "UK": "uk_professional_voice_uuid", 
    "AU": "au_professional_voice_uuid"
}

versions = localize_content_voice(
    "original_content.wav",
    locale_voices,
    "your_api_key"
)

print("Localized versions created:")
for locale, info in versions.items():
    print(f"  {locale}: {info['filename']} ({info['duration']:.1f}s)")

Voice Anonymization

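import time

import requests

def anonymize_voice_recording(source_audio, api_key):
    """Convert sensitive audio to an anonymous voice for privacy"""
    
    # Use a generic public voice for anonymization (placeholder value)
    anonymous_voice_uuid = "public_generic_voice_uuid"
    
    with open(source_audio, "rb") as audio_file:
        response = requests.post(
            "https://api.audiopod.ai/api/v1/voice/voice-convert",
            headers={"X-API-Key": api_key},
            data={"voice_uuid": anonymous_voice_uuid},
            files={"file": audio_file}
        )
    
    if response.status_code != 200:
        return {"error": "Failed to start conversion"}
    
    job_id = response.json()["id"]
    
    # Wait for completion, stopping if the job fails
    while True:
        status_response = requests.get(
            f"https://api.audiopod.ai/api/v1/voice/convert/{job_id}/status",
            headers={"X-API-Key": api_key}
        )
        if status_response.status_code != 200:
            return {"error": "Failed to fetch conversion status"}
        
        job_status = status_response.json()
        if job_status["status"] == "COMPLETED":
            break
        if job_status["status"] == "FAILED":
            return {"error": job_status.get("error_message") or "Conversion failed"}
        time.sleep(3)
    
    # Download anonymized audio. output_url is an assumed field; the sample
    # status response only shows output_path.
    output_url = job_status.get("output_url")
    if not output_url:
        return {"error": "No download URL available", "output_path": job_status.get("output_path")}
    
    audio_response = requests.get(output_url)
    anonymized_filename = f"anonymized_{job_id}.wav"
    
    with open(anonymized_filename, "wb") as f:
        f.write(audio_response.content)
    
    return {
        "anonymized_file": anonymized_filename,
        "original_content_preserved": True,
        "voice_anonymized": True,
        "duration": job_status.get("conversion_metadata", {}).get("duration")
    }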

Error Handling

Causes:
  • Target voice UUID doesn’t exist
  • Voice is not accessible to user
  • Voice is not completed or ready
Solutions:
  • Verify voice UUID is correct
  • Ensure voice is completed and has audio file available
  • Use public voices or your own custom voices

Causes:
  • Source audio file has no speech content
  • Unsupported audio format
  • Audio file corrupted
Solutions:
  • Ensure audio contains clear speech
  • Use supported formats (WAV, MP3, M4A)
  • Verify file integrity

Causes:
  • Source audio quality too poor
  • Incompatible voice characteristics
  • Processing timeout
Solutions:
  • Improve source audio quality
  • Try different target voice
  • Use shorter audio segments

Causes:
  • Not enough credits for conversion duration
Solutions:
  • Purchase additional credits
  • Check credit requirements for audio duration
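Some of the file-related failures above can be caught before uploading. The sketch below is a hypothetical pre-flight check, not an AudioPod function. It assumes the file extension reflects the actual format and covers only the formats named on this page; the service may accept others.

```python
from pathlib import Path

# Formats named on this page; the full list accepted by the API may be longer
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}

def preflight_check(path):
    """Return a list of problems found before uploading; empty means OK."""
    problems = []
    p = Path(path)
    if not p.is_file():
        problems.append("file does not exist")
        return problems
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if p.stat().st_size == 0:
        problems.append("file is empty")
    return problems
```

A check like this cannot detect missing speech or corrupted audio data; those still surface as errors from the conversion job itself.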

Best Practices

Audio Quality Guidelines

For optimal voice conversion results:
# Audio requirements for best results
audio_guidelines = {
    "sample_rate": "16kHz or higher recommended",
    "format": "WAV preferred, MP3/M4A acceptable",
    "duration": "10 seconds to 10 minutes optimal",
    "content": "Clear speech without background music",
    "speaker": "Single speaker recommended",
    "noise": "Minimal background noise"
}

# Pre-processing recommendations
preprocessing_tips = [
    "Use noise reduction for noisy recordings",
    "Normalize audio levels for consistent volume",
    "Remove long silent periods to improve processing speed",
    "Separate speakers if multiple voices present"
]
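For WAV sources, the sample-rate and duration guidelines above can be checked locally with Python's standard wave module. This is a sketch with a hypothetical function name; the thresholds mirror the recommendations above (16 kHz or higher, 10 seconds to 10 minutes), and it handles only PCM WAV files, not MP3 or M4A.

```python
import wave

def check_wav_guidelines(path, min_rate=16000, min_seconds=10.0, max_seconds=600.0):
    """Compare a WAV file against the sample-rate and duration guidelines."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
        channels = wav.getnchannels()
    return {
        "sample_rate_ok": rate >= min_rate,
        "duration_ok": min_seconds <= duration <= max_seconds,
        "sample_rate": rate,
        "duration_seconds": duration,
        "channels": channels,
    }
```

The channel count is reported for information only. It says nothing about how many speakers are present or how noisy the recording is.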

Target Voice Selection

def choose_optimal_target_voice(source_characteristics, available_voices):
    """Rank candidate target voices with a simple compatibility heuristic"""
    
    recommendations = []
    
    for voice in available_voices:
        score = 0
        reasons = []  # Why this voice scored the way it did
        
        # Language match
        if voice["language"] == source_characteristics["language"]:
            score += 30
            reasons.append("language match")
        
        # Gender match (if applicable)
        if voice["gender"] == source_characteristics.get("gender"):
            score += 20
            reasons.append("gender match")
        
        # Age range compatibility
        if voice["age_range"] == source_characteristics.get("age_range"):
            score += 15
            reasons.append("age range match")
        
        # Style appropriateness
        if voice["style"] in source_characteristics.get("preferred_styles", []):
            score += 25
            reasons.append("preferred style")
        
        # Quality indicators
        if voice.get("usage_stats", {}).get("avg_rating", 0) >= 4.5:
            score += 10
            reasons.append("high average rating")
        
        recommendations.append({
            "voice": voice,
            "compatibility_score": score,
            "reasons": reasons
        })
    
    # Sort by compatibility score
    recommendations.sort(key=lambda x: x["compatibility_score"], reverse=True)
    
    return recommendations[:5]  # Top 5 recommendations

Pricing

Voice conversion pricing is based on audio duration:
Service            Cost                 Description
Voice Conversion   990 credits/minute   Transform voice characteristics using OpenVoice v2

Cost Examples

Duration     Credits   USD Cost
30 seconds   495       $0.0659
2 minutes    1980      $0.2634
5 minutes    4950      $0.6584
10 minutes   9900      $1.3167
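The credit figures above follow from the per-minute rate. As a sketch, assuming linear proration by the second (which the 30-second example suggests, though the exact rounding rule is not stated on this page):

```python
CREDITS_PER_MINUTE = 990  # rate from the pricing table above

def estimate_credits(duration_seconds):
    """Estimate credits for a clip, assuming linear per-second proration."""
    return duration_seconds * CREDITS_PER_MINUTE / 60
```

For example, `estimate_credits(30)` gives 495.0 and `estimate_credits(120)` gives 1980.0, matching the first two rows of the cost examples.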

Cost Optimization Tips

  1. Pre-process audio to remove silence and optimize duration
  2. Batch similar conversions using the same target voice
  3. Test with shorter clips before converting long content
  4. Use high-quality source audio to avoid re-processing

Next Steps

Voice Management

Browse and manage available voice profiles for conversion.

Speech-to-Text

Extract text from converted audio for verification.