Overview

AudioPod AI’s Voice Changer API uses OpenVoice v2 technology to convert source audio to match target voice characteristics. Transform any speech recording to sound like a different voice while preserving the original speech content, timing, and emotional expression.

Key Features

  • Voice Conversion: Transform source audio to match target voice characteristics
  • Content Preservation: Maintains original speech content, timing, and emotional expression
  • Multiple Voice Sources: Use any completed voice profile as a target voice
  • High Quality Processing: Advanced OpenVoice v2 technology for natural-sounding results
  • Public Voice Support: Access to both user-owned and public voice profiles
  • Flexible Input: Support for various audio formats (WAV, MP3, M4A, etc.)
  • Real-time Processing: Fast conversion for production workflows

Authentication

All endpoints require authentication:
  • API Key: Authorization: Bearer your_api_key
  • JWT Token: Authorization: Bearer your_jwt_token
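
Both credential types travel in the same Authorization header, so one small helper can serve either. The sketch below is illustrative only; the function name build_auth_headers is hypothetical and not part of the API.

```python
def build_auth_headers(token: str) -> dict[str, str]:
    """Return the Authorization header for an API key or a JWT token."""
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    # Both credential types use the same "Bearer" scheme
    return {"Authorization": f"Bearer {token}"}


headers = build_auth_headers("your_api_key")
print(headers)
```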

Voice Conversion

Convert Audio to Target Voice

Transform the voice characteristics in a source audio file to match a target voice profile.
POST /api/v1/voice/voice-convert
Authorization: Bearer {api_key}
Content-Type: multipart/form-data

file: (source audio file)
voice_uuid: "550e8400-e29b-41d4-a716-446655440000"
Parameters:
  • file (required): Source audio file containing speech to convert
  • voice_uuid (required): UUID of the target voice profile to match
Voice UUID Sources:
  • User’s own custom voice profiles (from voice cloning)
  • Public voice profiles available in the voice library
  • In either case, the voice must have status “COMPLETED” and an available audio file
Response:
{
  "id": 123,
  "status": "PROCESSING",
  "progress": 0,
  "source_type": "FILE",
  "original_filename": "source_speech.wav",
  "target_voice_uuid": "550e8400-e29b-41d4-a716-446655440000",
  "target_voice_name": "Professional Speaker Voice",
  "conversion_method": "openvoice_v2",
  "created_at": "2024-01-15T10:30:00Z",
  "task_id": "celery_task_uuid_here",
  "user_id": "user_abc123def456"
}
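
One way to work with this response is to parse it into a small typed record before polling. The sketch below assumes the response has the shape shown above; the ConversionJob class is hypothetical and keeps only a few of the fields.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ConversionJob:
    """Subset of the fields returned when a conversion is started."""
    id: int
    status: str
    progress: int
    target_voice_uuid: str
    created_at: datetime

    @classmethod
    def from_response(cls, payload: dict) -> "ConversionJob":
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11
        created = datetime.fromisoformat(payload["created_at"].replace("Z", "+00:00"))
        return cls(
            id=payload["id"],
            status=payload["status"],
            progress=payload.get("progress", 0),
            target_voice_uuid=payload["target_voice_uuid"],
            created_at=created,
        )


# Sample payload copied from the response shown above
sample = {
    "id": 123,
    "status": "PROCESSING",
    "progress": 0,
    "target_voice_uuid": "550e8400-e29b-41d4-a716-446655440000",
    "created_at": "2024-01-15T10:30:00Z",
}
job = ConversionJob.from_response(sample)
print(job.id, job.status, job.created_at.isoformat())
```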

Job Management

Get Conversion Status

Monitor the progress of voice conversion jobs.
GET /api/v1/voice/convert/{conversion_id}/status
Authorization: Bearer {api_key}
Response (Completed Job):
{
  "id": 123,
  "status": "COMPLETED",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:33:45Z",
  "completed_at": "2024-01-15T10:33:45Z",
  "output_path": "outputs/converted/uuid/voice_converted.wav",
  "error_message": null,
  "task_id": "celery_task_uuid_here",
  "source_path": "temp/uploads/uuid/source_speech.wav",
  "original_filename": "source_speech.wav",
  "conversion_metadata": {
    "voice_uuid": "550e8400-e29b-41d4-a716-446655440000",
    "voice_id": 456,
    "voice_file_path": "voices/456/uuid/voice.wav",
    "duration": 45.2
  }
}
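
Because conversions run as background jobs, a client typically polls this endpoint until the job reaches a final status. The following is a minimal sketch of such a loop with a timeout. The helper name wait_for_conversion and its parameters are hypothetical, and it assumes "COMPLETED" and "FAILED" are the only terminal statuses. The status lookup is passed in as a function so the loop can be tried without network access.

```python
import time
from typing import Callable

# Assumption: these are the only statuses after which a job stops changing
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_conversion(
    fetch_status: Callable[[], dict],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("conversion did not finish before the timeout")
        sleep(poll_interval)


# Demo with canned responses instead of real HTTP calls
responses = iter([
    {"status": "PROCESSING"},
    {"status": "PROCESSING"},
    {"status": "COMPLETED", "output_path": "outputs/converted/uuid/voice_converted.wav"},
])
final = wait_for_conversion(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

In real use, fetch_status would wrap a GET request to the status endpoint above and return the decoded JSON body.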

List Conversion Jobs

Get all voice conversion jobs for the authenticated user.
GET /api/v1/voice/convert/jobs?status=COMPLETED&limit=10
Authorization: Bearer {api_key}
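
The request above filters by status and caps the number of results. As a rough sketch, the URL can be assembled from optional filters like this. Only the status and limit parameters appear in the example request, so no others are assumed; the helper name build_jobs_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.audiopod.ai"


def build_jobs_url(status=None, limit=None):
    """Build the job-listing URL, adding only the filters that were given."""
    params = {}
    if status is not None:
        params["status"] = status
    if limit is not None:
        params["limit"] = limit
    url = f"{BASE_URL}/api/v1/voice/convert/jobs"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(build_jobs_url(status="COMPLETED", limit=10))
```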

Download Converted Audio

import requests

def download_converted_audio(conversion_id, api_key):
    """Download converted audio from completed job"""
    
    # Get conversion details
    response = requests.get(
        f"https://api.audiopod.ai/api/v1/voice/convert/{conversion_id}/status",
        headers={"Authorization": f"Bearer {api_key}"}
    )
    
    if response.status_code != 200:
        return {"error": "Conversion not found"}
    
    conversion = response.json()
    
    if conversion["status"] != "COMPLETED":
        return {"error": f"Conversion not completed. Status: {conversion['status']}"}
    
    if not conversion.get("output_path"):
        return {"error": "No output file available"}
    
    # Note: The output_path is the S3 key. You would need to use your S3 service
    # or a separate download endpoint to get the actual file.
    
    return {
        "success": True,
        "output_path": conversion["output_path"],
        "original_filename": conversion.get("original_filename"),
        "target_voice_uuid": conversion.get("conversion_metadata", {}).get("voice_uuid"),
        "status": "File ready for download"
    }

# Usage
result = download_converted_audio(123, "your_api_key")
if result.get("success"):
    print(f"Conversion completed: {result['output_path']}")
    print(f"Original file: {result['original_filename']}")
    print(f"Target voice UUID: {result['target_voice_uuid']}")

Voice Profile Management

Find Available Voices

Before converting, you can browse available voice profiles to use as targets.
import requests

api_key = "your_api_key"

# Get available voices for conversion
response = requests.get(
    "https://api.audiopod.ai/api/v1/voice/voices",
    headers={"Authorization": f"Bearer {api_key}"},
    params={
        "category": "professional",  # Filter by category
        "language": "en",           # Filter by language
        "limit": 20
    }
)

if response.status_code == 200:
    voices_data = response.json()
    print("Available voices for conversion:")
    for voice in voices_data["voices"]:
        if voice["is_custom"] or voice.get("is_public"):
            print(f"- {voice['name']} (UUID: {voice['voice_id']})")
            print(f"  Category: {voice['category']}")
            print(f"  Language: {voice['language']}")
            print(f"  Type: {'Custom' if voice['is_custom'] else 'Public'}")
            print()

Use Cases & Examples

Podcast Voice Consistency

import time

import requests

API_BASE = "https://api.audiopod.ai/api/v1/voice"

def convert_podcast_guest_to_host_voice(guest_audio, host_voice_uuid, api_key):
    """Convert guest audio to match host voice for consistency"""
    headers = {"Authorization": f"Bearer {api_key}"}

    print("Converting guest audio to match host voice...")

    with open(guest_audio, "rb") as audio_file:
        response = requests.post(
            f"{API_BASE}/voice-convert",
            headers=headers,
            data={"voice_uuid": host_voice_uuid},
            files={"file": audio_file}
        )

    if response.status_code != 200:
        return {"error": "Failed to start conversion"}

    job_id = response.json()["id"]
    started = time.monotonic()

    # Poll the status endpoint documented under "Get Conversion Status"
    while True:
        status_response = requests.get(
            f"{API_BASE}/convert/{job_id}/status",
            headers=headers
        )
        if status_response.status_code != 200:
            return {"error": "Failed to fetch conversion status"}

        job_status = status_response.json()
        # "progress" is absent from the sample status response, so read it defensively
        print(f"Progress: {job_status.get('progress', 0)}%")

        if job_status["status"] == "COMPLETED":
            break
        elif job_status["status"] == "FAILED":
            return {"error": job_status.get("error_message") or "Conversion failed"}

        time.sleep(5)

    # Download result. The sample status response exposes only "output_path"
    # (a storage key); "output_url" is used here only if the API returns one.
    converted_filename = None
    output_url = job_status.get("output_url")
    if output_url:
        converted_filename = f"guest_as_host_{job_id}.wav"
        audio_response = requests.get(output_url)
        audio_response.raise_for_status()
        with open(converted_filename, "wb") as f:
            f.write(audio_response.content)

    return {
        "success": True,
        "job_id": job_id,
        "output_file": converted_filename,
        "output_path": job_status.get("output_path"),
        "duration": job_status.get("conversion_metadata", {}).get("duration"),
        # Wall-clock time measured by this client, from job start to completion
        "processing_time": time.monotonic() - started
    }

# Usage
result = convert_podcast_guest_to_host_voice(
    "guest_interview.wav", 
    "host_voice_uuid_here",
    "your_api_key"
)

if result.get("success"):
    print(f"Guest voice converted successfully!")
    print(f"Output: {result['output_file']}")
    print(f"Processing time: {result['processing_time']:.1f}s")

Content Localization

import time

import requests

API_BASE = "https://api.audiopod.ai/api/v1/voice"

def localize_content_voice(source_audio, target_locale_voices, api_key):
    """Convert content to different regional voice variations"""
    headers = {"Authorization": f"Bearer {api_key}"}
    localized_versions = {}

    for locale, voice_uuid in target_locale_voices.items():
        print(f"Creating {locale} version...")

        with open(source_audio, "rb") as audio_file:
            response = requests.post(
                f"{API_BASE}/voice-convert",
                headers=headers,
                data={"voice_uuid": voice_uuid},
                files={"file": audio_file}
            )

        if response.status_code == 200:
            job_data = response.json()
            localized_versions[locale] = {
                "job_id": job_data["id"],
                "target_voice": job_data.get("target_voice_name")
            }
        else:
            print(f"Could not start {locale} conversion (HTTP {response.status_code})")

    # Monitor all jobs until each one has either completed or failed
    completed_versions = {}
    pending = set(localized_versions)

    while pending:
        for locale in list(pending):
            job_info = localized_versions[locale]
            status_response = requests.get(
                f"{API_BASE}/convert/{job_info['job_id']}/status",
                headers=headers
            )
            job_status = status_response.json()

            if job_status["status"] == "FAILED":
                # Drop failed jobs so the loop can terminate
                print(f"{locale} version failed: {job_status.get('error_message')}")
                pending.discard(locale)
            elif job_status["status"] == "COMPLETED":
                # Download only if the API returns "output_url"; the sample
                # status response exposes "output_path" (a storage key) instead
                filename = None
                output_url = job_status.get("output_url")
                if output_url:
                    filename = f"content_{locale}_{job_info['job_id']}.wav"
                    audio_response = requests.get(output_url)
                    audio_response.raise_for_status()
                    with open(filename, "wb") as f:
                        f.write(audio_response.content)

                metadata = job_status.get("conversion_metadata", {})
                completed_versions[locale] = {
                    "filename": filename,
                    "output_path": job_status.get("output_path"),
                    "target_voice": job_info["target_voice"],
                    "duration": metadata.get("duration") or 0.0
                }
                pending.discard(locale)
                print(f"Completed {locale} version: {filename}")

        if pending:
            time.sleep(5)

    return completed_versions

# Usage - create multiple regional versions
locale_voices = {
    "US": "us_professional_voice_uuid",
    "UK": "uk_professional_voice_uuid", 
    "AU": "au_professional_voice_uuid"
}

versions = localize_content_voice(
    "original_content.wav",
    locale_voices,
    "your_api_key"
)

print("Localized versions created:")
for locale, info in versions.items():
    print(f"  {locale}: {info['filename']} ({info['duration']:.1f}s)")

Voice Anonymization

import time

import requests

API_BASE = "https://api.audiopod.ai/api/v1/voice"

def anonymize_voice_recording(source_audio, api_key):
    """Convert sensitive audio to an anonymous voice for privacy"""
    headers = {"Authorization": f"Bearer {api_key}"}

    # Use a generic public voice for anonymization
    anonymous_voice_uuid = "public_generic_voice_uuid"

    with open(source_audio, "rb") as audio_file:
        response = requests.post(
            f"{API_BASE}/voice-convert",
            headers=headers,
            data={"voice_uuid": anonymous_voice_uuid},
            files={"file": audio_file}
        )

    if response.status_code != 200:
        return {"error": "Failed to start conversion"}

    job_id = response.json()["id"]

    # Wait for completion, stopping if the job fails
    while True:
        status_response = requests.get(
            f"{API_BASE}/convert/{job_id}/status",
            headers=headers
        )

        job_status = status_response.json()
        if job_status["status"] == "COMPLETED":
            break
        if job_status["status"] == "FAILED":
            return {"error": job_status.get("error_message") or "Conversion failed"}
        time.sleep(3)

    # Download anonymized audio. The sample status response exposes only
    # "output_path" (a storage key); "output_url" is used only if returned.
    anonymized_filename = None
    output_url = job_status.get("output_url")
    if output_url:
        anonymized_filename = f"anonymized_{job_id}.wav"
        audio_response = requests.get(output_url)
        audio_response.raise_for_status()
        with open(anonymized_filename, "wb") as f:
            f.write(audio_response.content)

    return {
        "anonymized_file": anonymized_filename,
        "output_path": job_status.get("output_path"),
        "original_content_preserved": True,
        "voice_anonymized": True,
        "duration": job_status.get("conversion_metadata", {}).get("duration")
    }

Error Handling

Best Practices

Audio Quality Guidelines

For optimal voice conversion results:
# Audio requirements for best results
audio_guidelines = {
    "sample_rate": "16kHz or higher recommended",
    "format": "WAV preferred, MP3/M4A acceptable",
    "duration": "10 seconds to 10 minutes optimal",
    "content": "Clear speech without background music",
    "speaker": "Single speaker recommended",
    "noise": "Minimal background noise"
}

# Pre-processing recommendations
preprocessing_tips = [
    "Use noise reduction for noisy recordings",
    "Normalize audio levels for consistent volume",
    "Remove long silent periods to improve processing speed",
    "Separate speakers if multiple voices present"
]
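
Two of the guidelines above, sample rate and duration, can be checked locally before uploading. The sketch below uses Python's standard wave module, so it handles WAV files only. The function name check_wav and the thresholds as constants are illustrative, taken from the guideline values rather than from any API requirement.

```python
import os
import tempfile
import wave

MIN_SAMPLE_RATE = 16_000   # "16kHz or higher recommended"
MIN_SECONDS = 10.0         # "10 seconds to 10 minutes optimal"
MAX_SECONDS = 600.0


def check_wav(path: str) -> list[str]:
    """Return guideline warnings for a WAV file; an empty list means none."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    warnings = []
    if rate < MIN_SAMPLE_RATE:
        warnings.append(f"sample rate {rate} Hz is below 16 kHz")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        warnings.append(f"duration {duration:.1f}s is outside the 10 s to 10 min range")
    return warnings


# Demo: write one second of 8 kHz mono silence and check it
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    with wave.open(demo_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)      # 16-bit samples
        out.setframerate(8000)
        out.writeframes(b"\x00\x00" * 8000)
    demo_warnings = check_wav(demo_path)

for warning in demo_warnings:
    print(warning)
```

Content-level guidelines such as background noise or multiple speakers need audio analysis and are outside the scope of this check.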

Target Voice Selection

def choose_optimal_target_voice(source_characteristics, available_voices):
    """Choose the best target voice for conversion"""

    def matches(voice, key):
        # Count a match only when the source actually specifies the attribute
        wanted = source_characteristics.get(key)
        return wanted is not None and voice.get(key) == wanted

    recommendations = []

    for voice in available_voices:
        score = 0
        reasons = []

        # Language match
        if matches(voice, "language"):
            score += 30
            reasons.append("language match")

        # Gender match (if applicable)
        if matches(voice, "gender"):
            score += 20
            reasons.append("gender match")

        # Age range compatibility
        if matches(voice, "age_range"):
            score += 15
            reasons.append("age range match")

        # Style appropriateness
        if voice.get("style") in source_characteristics.get("preferred_styles", []):
            score += 25
            reasons.append("preferred style")

        # Quality indicators
        if voice.get("usage_stats", {}).get("avg_rating", 0) >= 4.5:
            score += 10
            reasons.append("high average rating")

        recommendations.append({
            "voice": voice,
            "compatibility_score": score,
            "reasons": reasons
        })

    # Sort by compatibility score
    recommendations.sort(key=lambda x: x["compatibility_score"], reverse=True)

    return recommendations[:5]  # Top 5 recommendations

Pricing

Voice conversion pricing is based on audio duration:
Service | Cost | Description
Voice Conversion | 990 credits/minute | Transform voice characteristics using OpenVoice v2

Cost Examples

Duration | Credits | USD Cost
30 seconds | 495 | $0.0659
2 minutes | 1980 | $0.2634
5 minutes | 4950 | $0.6584
10 minutes | 9900 | $1.3167
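
The credit figures above follow directly from the rate of 990 credits per minute, which makes a rough estimator easy to write. In the sketch below, the per-credit price of about $0.000133 is an assumption inferred from the table, not a published rate. Rounding partial credits up is also an assumption, and the function name estimate_cost is hypothetical.

```python
import math

CREDITS_PER_MINUTE = 990
# Assumption: inferred from the table (9900 credits is about $1.3167)
USD_PER_CREDIT = 0.000133


def estimate_cost(duration_seconds: float) -> tuple[int, float]:
    """Return (credits, approximate USD) for a clip of the given length."""
    # Assumption: partial credits are rounded up
    credits = math.ceil(CREDITS_PER_MINUTE * duration_seconds / 60)
    return credits, credits * USD_PER_CREDIT


for seconds in (30, 120, 300, 600):
    credits, usd = estimate_cost(seconds)
    print(f"{seconds:>4} s: {credits} credits, about ${usd:.4f}")
```

The USD values printed here may differ from the table in the last decimal place, depending on how the service rounds.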

Cost Optimization Tips

  1. Pre-process audio to remove silence and optimize duration
  2. Batch similar conversions using the same target voice
  3. Test with shorter clips before converting long content
  4. Use high-quality source audio to avoid re-processing

Next Steps