This guide provides a step-by-step approach to using the PlayDialog API to create a multi-turn scripted conversation between two distinct speakers.

In this example, we’ll have PlayDialog create a scripted conversation between two speakers: Country Mouse and Town Mouse.

Prerequisites

  • Access credentials (Secret key and User ID) for the PlayDialog API.
  • Python environment for executing the API request.

Set Up Your API Key

To keep your API key and user ID secure and avoid hardcoding them in your code, store them as environment variables. Your script can then read them at runtime without exposing the values in the source.

Step 1: Set the Environment Variable

For macOS and Linux

  • Open your terminal.
  • Add these lines to your ~/.bashrc or ~/.zshrc file to make them persistent across sessions. The commands below append to ~/.bashrc; substitute ~/.zshrc if you use zsh:
bash
echo 'export PLAYDIALOG_API_KEY="your_api_key_here"' >> ~/.bashrc
echo 'export PLAYDIALOG_USER_ID="your_user_id_here"' >> ~/.bashrc

  • Run source ~/.bashrc (or source ~/.zshrc for zsh) to load the variables into your current session.

For Windows

  • Open Command Prompt or PowerShell.
  • Use the setx command to create each environment variable individually:
cmd
setx PLAYDIALOG_API_KEY "your_api_key_here"
cmd
setx PLAYDIALOG_USER_ID "your_user_id_here"
  • Open a new terminal window to apply the changes. setx does not affect the session in which it was run.

Step 2: Access the Variables in Python

In your Python script, use the os module to access the environment variables:

python

import os

api_key = os.getenv("PLAYDIALOG_API_KEY")
user_id = os.getenv("PLAYDIALOG_USER_ID")

headers = {
    'Authorization': f'Bearer {api_key}',
    'Content-Type': 'application/json',
    'X-USER-ID': user_id
}

# Add the following API request with the payload
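
If either variable is unset, os.getenv returns None and the Authorization header silently becomes "Bearer None". A minimal sketch of a guard against that is shown here; load_credentials is a hypothetical helper name, not part of the PlayDialog API.

```python
import os


def load_credentials(env=os.environ):
    """Return (api_key, user_id), or raise if either variable is missing or empty.

    Hypothetical helper for illustration; only the variable names come from this guide.
    """
    names = ("PLAYDIALOG_API_KEY", "PLAYDIALOG_USER_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env[names[0]], env[names[1]]
```

Calling api_key, user_id = load_credentials() at the top of the script makes a missing variable fail immediately with a clear message instead of producing a rejected request later.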

Key API Parameters

The following payload parameters define the conversation, the speaker details, and the audio generation options. The names are written exactly as they appear in the request payload:

  • model: The model to use. Set it to PlayDialog, which supports multi-turn conversation generation.
  • text: The scripted conversation, with each turn prefixed by the speaker’s name (e.g., "Country Mouse:" and "Town Mouse:").
  • voice: URL of the voice manifest for the first speaker.
  • voice2: URL of the voice manifest for the second speaker.
  • turnPrefix / turnPrefix2: The prefixes that mark each speaker’s turns within the text field. For example, setting turnPrefix to "Country Mouse:" marks where Speaker 1’s turns begin, and setting turnPrefix2 to "Town Mouse:" marks where Speaker 2’s turns begin.
  • outputFormat: Format of the generated audio file, typically wav or mp3.
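
Writing a long script by hand as one string makes it easy to mistype a prefix. As a rough sketch, the text field can be assembled from a list of (speaker, line) pairs; build_dialogue_text is a hypothetical helper, and it assumes the prefix is simply the speaker’s name followed by a colon, as in this guide’s example.

```python
def build_dialogue_text(turns):
    """Join (speaker, line) pairs into one string, prefixing each turn with 'Speaker: '."""
    return " ".join(f"{speaker}: {line.strip()}" for speaker, line in turns)


turns = [
    ("Country Mouse", "Welcome to my humble home, cousin!"),
    ("Town Mouse", "Thank you, cousin. It's quite peaceful here."),
]
text = build_dialogue_text(turns)
print(text)
```

The resulting string can be used as the value of text, with the two turn prefixes set to "Country Mouse:" and "Town Mouse:".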

Call the API

The following Python script demonstrates how to send a request to the PlayDialog API and save the generated audio file.

python
import requests
import os

# Set up headers with your API authentication token and user ID

api_key = os.getenv("PLAYDIALOG_API_KEY")
user_id = os.getenv("PLAYDIALOG_USER_ID")

headers = {
    'Authorization': f'Bearer {api_key}',
    'Content-Type': 'application/json',
    'X-USER-ID': user_id
}


# JSON payload containing the script and configuration settings

json_data = {
    'model': 'PlayDialog',
    'text': "Country Mouse: Welcome to my humble home, cousin! Please, make yourself comfortable. Town Mouse: Thank you, cousin. It's quite peaceful here. Country Mouse: It is indeed. I hope you're hungry. I've prepared a simple meal of beans, barley, and fresh roots. Town Mouse: Well, it's earthy. Do you eat this every day?",
    'voice': 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json',
    'voice2': 's3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json',
    'turnPrefix': 'Country Mouse:',
    'turnPrefix2': 'Town Mouse:',
    'prompt': None,
    'prompt2': None,
    'outputFormat': 'wav',
    'voiceConditioningSeconds2': 0,
    'voiceConditioningSeconds': 0,
    'quality': 'draft',
}

# Send the POST request to the PlayDialog API endpoint
response = requests.post('https://api.play.ai/api/v1/tts/stream', headers=headers, json=json_data)

# Handle response and save audio file
if response.status_code == 200:
    with open('dialogue.wav', 'wb') as f:
        f.write(response.content)
    print("Audio file saved as dialogue.wav")
else:
    print(f"Request failed with status code {response.status_code}: {response.text}")

If you save the code as country_mouse.py, open a terminal in the directory that contains the file and run python3 country_mouse.py. The script saves dialogue.wav in the current working directory.
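
The script above holds the entire response in memory before writing it to disk. For longer dialogues, one option is to write the body in chunks using the stream=True and iter_content features of the requests library. This is a sketch only: save_audio_stream is a hypothetical helper, and whether the endpoint delivers audio progressively is an assumption not confirmed in this guide.

```python
def save_audio_stream(response, path, chunk_size=8192):
    """Write a response body to `path` chunk by chunk and return the bytes written.

    `response` is expected to be a requests.Response created with stream=True.
    """
    written = 0
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                written += len(chunk)
    return written


# Usage with the headers and json_data defined in the script above:
#
#   with requests.post('https://api.play.ai/api/v1/tts/stream',
#                      headers=headers, json=json_data,
#                      stream=True, timeout=60) as response:
#       response.raise_for_status()
#       save_audio_stream(response, 'dialogue.wav')
```

The timeout value of 60 seconds is an arbitrary choice for illustration; requests applies no timeout unless one is given.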

Code Explanation

This script uses the PlayDialog API to generate a multi-turn conversation between two characters. The Authorization header (a Bearer token) and the X-USER-ID header are required for authentication; the script fills them in from the PLAYDIALOG_API_KEY and PLAYDIALOG_USER_ID environment variables, which must hold your own credentials.

Each line of dialogue is labeled by character name (e.g., “Country Mouse” or “Town Mouse”) to simulate a natural conversation. The script assigns a unique voice to each character using voice and voice2. On a successful API call, the generated audio is saved as dialogue.wav. Any errors are reported with status details.
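
Because each voice is tied to a turn prefix, a prefix that never appears in the text is worth catching before a request is sent. The following is a minimal local check, assuming the payload keys used in this guide; missing_prefixes is a hypothetical helper and performs no API-side validation.

```python
def missing_prefixes(payload):
    """Return the turn-prefix keys whose value is unset or never appears in payload['text']."""
    text = payload.get("text", "")
    return [
        key
        for key in ("turnPrefix", "turnPrefix2")
        if not payload.get(key) or payload[key] not in text
    ]


payload = {
    "text": "Country Mouse: Welcome! Town Mouse: Thank you, cousin.",
    "turnPrefix": "Country Mouse:",
    "turnPrefix2": "City Mouse:",  # deliberate mismatch for illustration
}
print(missing_prefixes(payload))
```

An empty list means both prefixes occur in the text; any key in the list points to a prefix to correct.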

To run the script:

  • Set the PLAYDIALOG_API_KEY and PLAYDIALOG_USER_ID environment variables to your API key and user ID.
  • Update text with your scripted conversation.
  • Update turnPrefix, turnPrefix2, voice, and voice2 to match your speakers and their voices.
  • Run the script. If successful, an audio file, dialogue.wav, will be saved in the current directory, capturing the dialogue as configured.

This setup adapts to longer dialogues or different speakers by changing only the payload.

Troubleshooting

  • Authentication Issues: Verify your API key and user ID. Ensure the Authorization header contains “Bearer ” followed by your token, and that both environment variables are set in the session running the script.
  • API Endpoint Errors: Confirm you’re using the correct PlayDialog API endpoint URL and that the model name is PlayDialog.
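
As a rough aid for the authentication checks, the headers can be inspected locally before a request is sent. The sketch below only looks at the shape of the headers used in this guide; auth_header_problems is a hypothetical helper, and it cannot tell whether the credentials themselves are valid.

```python
def auth_header_problems(headers):
    """Return a list of human-readable problems found in the auth headers."""
    problems = []
    auth = headers.get("Authorization") or ""
    if not auth.startswith("Bearer "):
        problems.append('Authorization must start with "Bearer "')
    elif auth[len("Bearer "):].strip() in ("", "None"):
        # "Bearer None" is what an f-string produces when os.getenv returns None
        problems.append("Authorization has no token (is PLAYDIALOG_API_KEY set?)")
    if not headers.get("X-USER-ID"):
        problems.append("X-USER-ID is empty (is PLAYDIALOG_USER_ID set?)")
    return problems


print(auth_header_problems({"Authorization": "Bearer None", "X-USER-ID": None}))
```

An empty list means the headers are well formed; a failed request after that points to the credentials or the endpoint rather than the header layout.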