How to use PlayDialog Async Text-to-Speech API
PlayDialog Async (Non-Streaming) API Guide
This guide walks through using the PlayDialog API to convert text into natural, human-like audio with the Async (non-streaming) API endpoint.
In this example, PlayDialog generates a single audio file from the given input text.
Prerequisites
- Access credentials (secret key and user ID) for the PlayDialog API.
- A Python environment for executing the API requests.
Set Up Your API Key
To keep your API key secure and avoid hardcoding it in your code, store it as an environment variable. Your script can then read the key at runtime without the key appearing in the source.
Step 1: Set the Environment Variable
For macOS and Linux
- Open your terminal.
- Add an export line for each credential to your ~/.bashrc or ~/.zshrc file to make it persistent across sessions.
- Run source ~/.bashrc (or source ~/.zshrc for zsh) to load the variables into your current session.
For Windows
- Open Command Prompt or PowerShell.
- Use the setx command to create each environment variable individually.
- Restart your terminal to apply the changes.
Step 2: Access the Variables in Python
In your Python script, use the os module to access the environment variables:
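A minimal sketch of reading the credentials follows. The variable names PLAYDIALOG_API_KEY and PLAYDIALOG_USER_ID and the helper load_credentials are assumptions made for illustration; substitute whichever names you exported in Step 1.

```python
import os

# Hypothetical variable names; use the names you exported in Step 1.
API_KEY_VAR = "PLAYDIALOG_API_KEY"
USER_ID_VAR = "PLAYDIALOG_USER_ID"


def load_credentials(environ=os.environ):
    """Return (api_key, user_id), raising a clear error if either is unset."""
    missing = [name for name in (API_KEY_VAR, USER_ID_VAR) if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ[API_KEY_VAR], environ[USER_ID_VAR]
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error returned later by the API.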
Key API Parameters
The API payload uses the following fields to define the input text, the voice, and the audio generation options:
- model: Specifies the PlayDialog API model to be used.
- text: Contains the input text for which speech audio is generated.
- voice: URL path to the voice manifest for the speaker.
- outputFormat: Format of the generated audio file, typically wav or mp3.
- speed: (Optional) Adjusts the speaking speed.
- language: (Optional) Language of the input text.
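The parameters above could be assembled into a payload roughly as follows. The helper build_payload is hypothetical, and the model name and voice manifest URL are placeholders, not real values; take the actual ones from your PlayDialog account and the API reference.

```python
def build_payload(text, voice, model, output_format="mp3", speed=None, language=None):
    """Assemble the request payload from the parameters listed above."""
    if not text:
        raise ValueError("text must not be empty")
    payload = {
        "model": model,
        "text": text,
        "voice": voice,
        "outputFormat": output_format,
    }
    # Optional fields are included only when the caller sets them.
    if speed is not None:
        payload["speed"] = speed
    if language is not None:
        payload["language"] = language
    return payload


# Placeholder values for illustration only.
example_payload = build_payload(
    text="Hello from PlayDialog.",
    voice="<voice-manifest-url>",
    model="<model-name>",
)
```

Leaving the optional keys out entirely, rather than sending null values, lets the API apply its own defaults.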
Call the API - Non-Streaming Endpoint with Polling
The new non-streaming API requires submitting a job and polling to check its status. Once the job is completed, you can retrieve the audio file.
Submit a Job
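The submission step might look like the sketch below, which uses only the Python standard library. Several details are assumptions to verify against the official API reference: the endpoint URL is a placeholder, the header scheme (a Bearer token plus an X-USER-ID header) is assumed, and the job ID is assumed to be returned under an "id" key. The helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder: replace with the async TTS endpoint from the official API reference.
TTS_ENDPOINT = "https://example.invalid/tts"


def build_submit_request(api_key, user_id, payload, url=TTS_ENDPOINT):
    """Build (but do not send) the POST request that submits a TTS job."""
    headers = {
        # Assumed header scheme; verify against the API reference.
        "Authorization": f"Bearer {api_key}",
        "X-USER-ID": user_id,
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def submit_job(request, opener=urllib.request.urlopen):
    """Send the request and return the job ID from the JSON response."""
    with opener(request) as response:
        job = json.loads(response.read().decode("utf-8"))
    # Assumption: the job ID is returned under the "id" key.
    return job["id"]
```

Building the request separately from sending it makes it possible to inspect the headers and body before anything goes over the network.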
Once this code is executed, you will receive a job ID, which means:
- The async job has been successfully submitted.
- The job_id can be used to poll the status of the TTS job.
Sample job ID: '9726f318-410a-4c8e-99c6-e2e1da6615e1'
Poll for Job Status
After submitting the job, poll the status until it is completed.
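A generic polling loop is sketched below. It is deliberately independent of the response format: fetch_job and is_finished are caller-supplied functions, since the status URL and the name of the status field are not given here and should be taken from the API reference. The helper poll_job and its interval and timeout defaults are assumptions.

```python
import time


def poll_job(fetch_job, is_finished, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_job() until is_finished(job) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if is_finished(job):
            return job
        # Give up if another wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("job did not finish within the timeout")
        sleep(interval)
```

In practice, fetch_job would issue an authenticated GET request for the job_id returned at submission, and is_finished would inspect the status field of the returned object. A timeout keeps the script from looping forever if the job never completes.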
Sample output object:
Save the Generated Audio File
Once the job is completed, download and save the audio file from the provided URL.
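The download step can be sketched with the standard library as follows. The helper save_audio is hypothetical, and it assumes the audio URL from the completed job can be fetched with a plain GET request; if the URL requires authentication headers, the request would need to carry them.

```python
import shutil
import urllib.request


def save_audio(audio_url, destination="output.mp3"):
    """Stream the file at audio_url to destination and return the path."""
    with urllib.request.urlopen(audio_url) as response, open(destination, "wb") as out:
        # Copy in chunks so large audio files are not held in memory at once.
        shutil.copyfileobj(response, out)
    return destination
```

The destination extension should match the outputFormat requested in the payload, for example output.wav when wav was requested.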
If all the above code blocks executed successfully, you now have an audio file named output.mp3 saved in the current working directory on your computer.
Troubleshooting
- Authentication Issues: Verify your API key and user ID. Ensure the header is correctly set up.
- Job Status Polling: Ensure you use the correct job ID to check the status.
- API Endpoint Errors: Confirm you’re using the correct PlayDialog API endpoint URL and model name.