This guide walks you through generating a conversational podcast from a PDF with the PlayNote API. You'll submit a PDF source file, have it synthesized into an audio conversation between two voices, and retrieve the URL of the generated podcast. The tutorial assumes basic knowledge of Python and HTTP API requests.

Prerequisites

Before you start, ensure you have the following:

  1. API Key and User ID: You need your PLAYDIALOG_API_KEY and PLAYDIALOG_USER_ID from PlayDialog. Set these as environment variables to avoid exposing them directly in your code.

  2. Python Libraries: Install the requests library if you haven’t already.

bash
pip install requests

Set Up Your API Key

To avoid hardcoding your API key and user ID, store them as environment variables. Your script can then read them at runtime without the values appearing in your source code.

Set the Environment Variables

For macOS and Linux

  • Open your terminal.

  • Run the following commands to append the exports to ~/.bashrc (use ~/.zshrc instead for zsh) so the variables persist across sessions:

bash
echo 'export PLAYDIALOG_API_KEY="your_api_key_here"' >> ~/.bashrc
echo 'export PLAYDIALOG_USER_ID="your_user_id_here"' >> ~/.bashrc

  • Run source ~/.bashrc (or source ~/.zshrc for zsh) to load the variables into your current session.

For Windows

  • Open Command Prompt or PowerShell.

  • Use the setx command to create each environment variable individually:

cmd
setx PLAYDIALOG_API_KEY "your_api_key_here"
setx PLAYDIALOG_USER_ID "your_user_id_here"

  • Restart your terminal to apply the changes.
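
Before moving on, it can help to confirm that Python actually sees both variables. The following is a minimal sketch: the helper name missing_env_vars is hypothetical and not part of the PlayNote API; only the two variable names come from this guide.

```python
import os

# Variable names used throughout this guide.
REQUIRED_VARS = ("PLAYDIALOG_API_KEY", "PLAYDIALOG_USER_ID")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the environment.

    `environ` defaults to os.environ; pass a plain dict to check other values.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

If a variable is reported as missing, open a new terminal (or re-run the source command) and try again.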

Step 1: Set Up and Initialize

Start by importing the necessary libraries and setting up the API endpoint and headers.

python
import requests
import os

# Define the URL of your PDF file
SOURCE_FILE_URL = "https://godinton.kent.sch.uk/media/2601/goldilocks-story.pdf"

# PlayNote API URL
url = "https://api.play.ai/api/v1/playnotes"

# Retrieve API key and User ID from environment variables
api_key = os.getenv("PLAYDIALOG_API_KEY")
user_id = os.getenv("PLAYDIALOG_USER_ID")

# Set up headers with authorization details
headers = {
    'AUTHORIZATION': api_key,
    'X-USER-ID': user_id,
    'accept': 'application/json'
}

Step 2: Send Request to Generate PlayNote

This step initiates the podcast creation process. Define the parameters such as synthesisStyle, sourceFileUrl, and the voices you want to use.

python
# Configure the request parameters
files = {
    'sourceFileUrl': (None, SOURCE_FILE_URL),
    'synthesisStyle': (None, 'podcast'),
    'voice1': (None, 's3://voice-cloning-zero-shot/baf1ef41-36b6-428c-9bdf-50ba54682bd8/original/manifest.json'),
    'voice1Name': (None, 'Angelo'),
    'voice2': (None, 's3://voice-cloning-zero-shot/e040bd1b-f190-4bdb-83f0-75ef85b18f84/original/manifest.json'),
    'voice2Name': (None, 'Deedee'),
}

# Send the POST request
response = requests.post(url, headers=headers, files=files)

# Check the response
if response.status_code == 201:
    print("Request sent successfully!")
    playNoteId = response.json().get('id')
    print(f"Generated PlayNote ID: {playNoteId}")
else:
    print(f"Failed to generate PlayNote: {response.text}")


Note: The PlayNote ID will be used to poll for the podcast’s generation status.

If the request succeeds, the script prints output like the following:

output
Request sent successfully!
Generated PlayNote ID: goldilocks-story.pdf
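
The (None, value) tuples in Step 2 tell the requests library to send each entry as a plain multipart form field rather than a file upload, because the filename slot is None. If you plan to swap voices or source files often, a small helper can build that dictionary for you. This is a sketch under assumptions: the function name build_playnote_fields and its parameter names are hypothetical, while the form field names are the ones used in this guide.

```python
def build_playnote_fields(source_file_url, voice1, voice1_name,
                          voice2, voice2_name, synthesis_style="podcast"):
    """Build the multipart form fields used in Step 2.

    Each value is wrapped as (None, value): with the `files=` argument of
    requests, a None filename sends the entry as a plain form field.
    """
    values = {
        "sourceFileUrl": source_file_url,
        "synthesisStyle": synthesis_style,
        "voice1": voice1,
        "voice1Name": voice1_name,
        "voice2": voice2,
        "voice2Name": voice2_name,
    }
    # Fail early on empty values instead of sending an incomplete request.
    empty = [name for name, value in values.items() if not value]
    if empty:
        raise ValueError(f"Missing values for: {', '.join(empty)}")
    return {name: (None, value) for name, value in values.items()}
```

The returned dictionary can be passed directly as the files argument of requests.post, in place of the hand-written files dictionary.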

Step 3: Poll for Completion

After requesting a PlayNote, wait a few minutes before checking the status. Use the PlayNote ID to poll for the audio’s readiness.

python
import urllib.parse
import time

# URL-encode the PlayNote ID so it is safe to use in the URL path
# (safe='' also encodes any '/' characters in the ID)
encoded_id = urllib.parse.quote(playNoteId, safe='')

# Construct the final URL to check the status
status_url = f"https://api.play.ai/api/v1/playnotes/{encoded_id}"

# Poll for completion
while True:
    response = requests.get(status_url, headers=headers)
    if response.status_code == 200:
        playnote_data = response.json()
        status = playnote_data['status']
        if status == 'completed':
            print("PlayNote generation complete!")
            print("Audio URL:", playnote_data['audioUrl'])
            break
        elif status == 'generating':
            print("Please wait, your PlayNote is still generating...")
            time.sleep(120)  # Wait 2 minutes before polling again
        else:
            print("PlayNote creation failed, please try again.")
            break
    else:
        print(f"Error polling for PlayNote status: {response.text}")
        break
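
The loop above keeps polling for as long as the status is generating. If you would rather give up after a fixed number of checks, the polling logic can be separated from the HTTP call, as in the sketch below. The names status_url_for and poll_until_done and the default of 15 attempts are assumptions for illustration; the endpoint and the generating status value are the ones used in this guide.

```python
import time
import urllib.parse

API_BASE = "https://api.play.ai/api/v1/playnotes"  # endpoint used in this guide


def status_url_for(playnote_id):
    """URL-encode the ID (including any '/') and append it to the endpoint."""
    return f"{API_BASE}/{urllib.parse.quote(playnote_id, safe='')}"


def poll_until_done(fetch_status, interval_seconds=120, max_attempts=15,
                    sleep=time.sleep):
    """Call fetch_status() until it stops reporting 'generating'.

    fetch_status is any zero-argument callable that returns a dict with a
    'status' key. Returns the final dict, or raises TimeoutError once
    max_attempts checks have all reported 'generating'.
    """
    for attempt in range(1, max_attempts + 1):
        data = fetch_status()
        if data.get("status") != "generating":
            return data  # 'completed', or any other (failure) status
        if attempt < max_attempts:
            sleep(interval_seconds)
    raise TimeoutError(f"Still generating after {max_attempts} checks")
```

With the variables from the earlier steps, fetch_status could be lambda: requests.get(status_url_for(playNoteId), headers=headers).json(). Passing the fetcher in as a callable keeps the loop easy to test without network access.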

Final Output

Once the status shows as completed, the audio URL will be printed. You can use this link to access the generated podcast-style conversation.

Example Output

After completion, expect the output to look like this:

output
PlayNote generation complete!
Audio URL: https://path.to.generated.audio/podcast.mp3

Troubleshooting

  • Authentication Errors: Ensure that the PLAYDIALOG_API_KEY and PLAYDIALOG_USER_ID environment variables are set correctly and visible to the session running your script.

  • Source File Issues: Ensure the PDF URL is publicly accessible, that is, it can be downloaded without logging in.

  • Generation Time: If the response indicates the PlayNote is still being generated, allow more time before retrying.
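
For source file issues, one quick heuristic is to request the PDF URL without any credentials and look at the response. The sketch below uses only the Python standard library; the function names are hypothetical, and the Content-Type check is an assumption about what a typical PDF server returns, not a documented PlayNote requirement. Some servers reject HEAD requests or label PDFs with a generic type, so treat the result as a hint only.

```python
import urllib.error
import urllib.request


def describe_pdf_response(status_code, content_type):
    """Return a list of likely problems for an unauthenticated response."""
    problems = []
    if status_code != 200:
        problems.append(f"unexpected HTTP status {status_code}")
    if "pdf" not in (content_type or "").lower():
        problems.append(f"Content-Type is {content_type!r}, not a PDF type")
    return problems


def check_public_pdf(url, timeout=10):
    """Send a HEAD request without credentials and report likely problems."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return describe_pdf_response(
                response.status, response.headers.get("Content-Type"))
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 403 or 404.
        return describe_pdf_response(
            error.code, error.headers.get("Content-Type"))
    except urllib.error.URLError as error:
        # DNS failure, refused connection, timeout, and similar.
        return [f"could not reach the URL: {error.reason}"]
```

An empty list from check_public_pdf(SOURCE_FILE_URL) suggests the file is reachable without authentication.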

This guide shows a simple way to turn the text content of a PDF into a conversational podcast using the PlayNote API. Modify the voice parameters to customize the conversation to match your desired style.