Overview
- This guide uses an asynchronous API endpoint.
- You make one request to trigger podcast generation.
- You then poll a second endpoint until the podcast is ready.
Prerequisites
- Access your credentials (secret key and user ID).
- Development environment for your chosen programming language.
Steps
1. Set up environment variables
Choose your operating system and set up the environment variables:
2. Create a new script
Create a new file and add the following code:
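As a minimal sketch, the script might begin by reading the credentials from the environment and building the request headers. The variable names PLAYAI_API_KEY and PLAYAI_USER_ID are assumptions (use whatever names you set in step 1), and build_headers is an illustrative helper. The AUTHORIZATION and X-USER-ID header names and the Bearer prefix are the ones this guide mentions.

```python
import os


def build_headers(env=os.environ):
    """Build request headers from environment variables.

    PLAYAI_API_KEY and PLAYAI_USER_ID are assumed variable names, not
    names required by the API; match them to what you set in step 1.
    """
    secret_key = env.get("PLAYAI_API_KEY")
    user_id = env.get("PLAYAI_USER_ID")
    if not secret_key or not user_id:
        raise RuntimeError("Set PLAYAI_API_KEY and PLAYAI_USER_ID first.")
    # Add the "Bearer " prefix only if the stored key does not already have it.
    if not secret_key.startswith("Bearer "):
        secret_key = f"Bearer {secret_key}"
    return {
        "AUTHORIZATION": secret_key,
        "X-USER-ID": user_id,
        "Content-Type": "application/json",
    }
```

Failing early when a variable is missing tends to be easier to debug than an authentication error returned later by the API.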
3. Configure the model and voices
Define the model and select voices for your hosts:
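A sketch of what this configuration could look like. The model name PlayDialog comes from this guide; the two voice values are placeholders, not real manifest URLs, and the constant names are illustrative.

```python
# Model name as given in this guide.
MODEL = "PlayDialog"

# Placeholder values (hypothetical): substitute the voice manifest URLs
# of the voices you picked for each host.
VOICE_1 = "<voice-manifest-url-for-host-1>"
VOICE_2 = "<voice-manifest-url-for-host-2>"
```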
4. Add your podcast transcript
Add your scripted conversation in the correct format:
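One way to keep the transcript consistent is to build it from a list of turns, so every line starts with its speaker's name. This is a sketch only: the "Name: line" layout with a colon separator is an assumption, the sample dialogue is invented for illustration, and format_transcript is a hypothetical helper.

```python
def format_transcript(turns):
    """Join (speaker, line) pairs into one string, one turn per line."""
    return "\n".join(f"{speaker}: {line}" for speaker, line in turns)


# Sample turns, invented for illustration.
turns = [
    ("Country Mouse", "Welcome to my home in the fields."),
    ("Town Mouse", "Thank you, cousin. It is very quiet here."),
]

transcript = format_transcript(turns)
print(transcript)
```

Whatever prefix you use in the transcript should match the turn prefix values you send in the payload.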
5. Configure the API payload
Set up the payload with your configuration:
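The payload might be assembled as in the sketch below. The field names are spelled as in the Key API Parameters section of this guide; the guide also writes voice2 in one place, so confirm the exact spelling against the API reference. build_payload and its argument names are illustrative.

```python
import json


def build_payload(transcript, voice_1, voice_2, prefix_1, prefix_2,
                  output_format="wav"):
    """Assemble the request body as a plain dict.

    Key names follow this guide's parameter list and are an assumption
    about the exact spelling the API expects.
    """
    return {
        "model": "PlayDialog",
        "text": transcript,
        "voice": voice_1,
        "voice_2": voice_2,
        "turn_prefix": prefix_1,
        "turn_prefix_2": prefix_2,
        "output_format": output_format,
    }


payload = build_payload(
    "Country Mouse: Hello.\nTown Mouse: Hello, cousin.",
    "<voice-manifest-url-for-host-1>",  # placeholder
    "<voice-manifest-url-for-host-2>",  # placeholder
    "Country Mouse:",
    "Town Mouse:",
)
print(json.dumps(payload, indent=2))
```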
6. Send the request and monitor progress
Add the code to send the request and check the status:
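The submit-then-poll flow could be sketched as follows, using only the Python standard library. The helper names are illustrative. The endpoint URLs and the response fields in the commented wiring are placeholders, because this guide does not state them; take the real values from the API reference.

```python
import json
import time
import urllib.request


def post_json(url, headers, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def get_json(url, headers):
    """GET a URL and return the decoded JSON response."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_ready(fetch_status, is_ready, interval=5.0, max_attempts=60,
                     sleep=time.sleep):
    """Call fetch_status() until is_ready(result) is true or attempts run out."""
    for _ in range(max_attempts):
        result = fetch_status()
        if is_ready(result):
            return result
        sleep(interval)
    raise TimeoutError(f"Not ready after {max_attempts} attempts")


# Example wiring. CREATE_URL, STATUS_URL_BASE, the "id" field and the
# "status" value are hypothetical placeholders, not documented in this guide:
#
# job = post_json(CREATE_URL, headers, payload)
# status_url = f"{STATUS_URL_BASE}/{job['id']}"
# result = wait_until_ready(
#     lambda: get_json(status_url, headers),
#     lambda r: r.get("status") == "COMPLETED",
# )
```

Passing the status check in as a function keeps the polling loop independent of the response format, and capping the number of attempts avoids waiting forever on a job that never completes.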
Complete Code
Key API Parameters
The following API payload fields define the conversation, speaker details, and audio generation options:
- model: Specifies PlayAI’s Dialog 1.0 model. Here, PlayDialog supports multi-turn conversation generation.
- text: Contains the scripted conversation, with each turn prefixed by the speaker’s name (e.g., "Country Mouse" and "Town Mouse").
- voice: URL path to the voice manifest for the first speaker.
- voice_2: URL path to the voice manifest for the second speaker.
- turn_prefix / turn_prefix_2: Mark each speaker’s turns within the text field. For example, turn_prefix is set to Country Mouse to mark where Speaker 1’s lines begin, and turn_prefix_2 is set to Town Mouse to mark where Speaker 2’s lines begin.
- output_format: Format for the generated audio file, typically wav or mp3.
Save the code as country_mouse.py, then run it with python3 country_mouse.py from the directory where the file is stored. This saves dialogue.wav in the same working directory.
Code Explanation
This script uses the Dialog 1.0 model to generate a multi-turn conversation between two characters. The AUTHORIZATION token and X-USER-ID are required for authentication; replace them with your own credentials.
Each line of dialogue is labeled by character name (e.g., “Country Mouse” or “Town Mouse”) to simulate a natural conversation. The script assigns a unique voice to each character using voice and voice2. On a successful API call, the generated audio is saved as dialogue.wav. Any errors are reported with status details.
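The save-or-report behavior described above could look like the sketch below. It assumes a successful response has HTTP status 200 and carries the raw audio bytes in its body; save_audio_or_report is an illustrative helper, not a function from the original script.

```python
from pathlib import Path


def save_audio_or_report(status_code, body, path="dialogue.wav"):
    """Write audio bytes on HTTP 200; otherwise print status details."""
    if status_code == 200:
        Path(path).write_bytes(body)
        print(f"Audio saved to {path}")
        return True
    # Show only the start of the body so large error pages stay readable.
    print(f"Request failed with status {status_code}: {body[:200]!r}")
    return False
```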
To run the script:
- Replace placeholders in the headers with your API key and user ID.
- Update the text with your scripted conversation.
- Update the speaker details and their respective voices.
- Run the script. If successful, an audio file, dialogue.wav, will be saved in the current directory, capturing the dialogue as configured.
This setup adapts easily to more complex dialogues or different speakers.
Troubleshooting
- Authentication Issues: Verify your API key and user ID. Ensure the AUTHORIZATION header includes “Bearer ” followed by your token.
- API Endpoint Errors: Confirm you’re using the correct PlayAI Dialog 1.0 API endpoint URL and that the model name is PlayDialog.