# Get Agent Stats Source: https://docs.play.ai/api-reference/agents/endpoints/v1/agent-stats/get GET /api/v1/agent-stats/{agentId} Retrieve the usage statistics of an agent. # Get Agent Source: https://docs.play.ai/api-reference/agents/endpoints/v1/agents/get GET /api/v1/agents/{agentId} Retrieve all information about an agent. # Update Agent Source: https://docs.play.ai/api-reference/agents/endpoints/v1/agents/patch PATCH /api/v1/agents/{agentId} Updates the properties of the agent with the specified ID. # Create Agent Source: https://docs.play.ai/api-reference/agents/endpoints/v1/agents/post POST /api/v1/agents Create a new PlayAI Agent. Required parameters include the agent's name and the agent's prompt. After you create your agent, you can proceed to start a conversation using our [Websocket API](/api-reference/agents/websocket), or you can try it out through our web interface at `https://play.ai/agent/`. To update the agents see the [Update Agent](/api-reference/agents/endpoints/v1/agents/patch) endpoint. # Get Agent Conversations Source: https://docs.play.ai/api-reference/agents/endpoints/v1/conversations/get GET /api/v1/agents/{agentId}/conversations Retrieve all information about an agent's conversations. ### Response Headers for Pagination | Header Name | Type | Description | | -------------------- | ------- | --------------------------------------------- | | `X-Page-Size` | integer | The number of items per page. | | `X-Start-After` | string | The ID of the last item on the previous page. | | `X-Next-Start-After` | string | The ID of the last item on the current page. | | `X-Total-Count` | integer | The total number of items. | These headers are included in the response to help manage pagination when retrieving conversations for a specific agent. # Get Agent Conversation Source: https://docs.play.ai/api-reference/agents/endpoints/v1/conversations/get-one GET /api/v1/agents/{agentId}/conversations/{conversationId} Retrieve all information about an agent conversation. # Get Conversation Transcript Source: https://docs.play.ai/api-reference/agents/endpoints/v1/conversations/get-transcript GET /api/v1/agents/{agentId}/conversations/{conversationId}/transcript Retrieve the transcript of a specific agent conversation. ### Response Headers for Pagination | Header Name | Type | Description | | -------------------- | ------- | --------------------------------------------- | | `X-Page-Size` | integer | The number of items per page. | | `X-Start-After` | string | The ID of the last item on the previous page. | | `X-Next-Start-After` | string | The ID of the last item on the current page. | | `X-Total-Count` | integer | The total number of items. | These headers are included in the response to help manage pagination when retrieving conversation transcript for a specific agent conversation. # Delete External Function Source: https://docs.play.ai/api-reference/agents/endpoints/v1/external-functions/delete DELETE /api/v1/external-functions/{functionId} Deletes the external function with the specified ID. # Get All External Functions Source: https://docs.play.ai/api-reference/agents/endpoints/v1/external-functions/get GET /api/v1/external-functions Retrieve all information about all external functions that you have created. # Get External Function Source: https://docs.play.ai/api-reference/agents/endpoints/v1/external-functions/get-one GET /api/v1/external-functions/{functionId} Retrieve all information about the external function with the specified ID. 
# Update External Function Source: https://docs.play.ai/api-reference/agents/endpoints/v1/external-functions/patch PATCH /api/v1/external-functions/{functionId} Updates the properties of the external function with the specified ID. # Create External Function Source: https://docs.play.ai/api-reference/agents/endpoints/v1/external-functions/post POST /api/v1/external-functions Use this endpoint to create new external functions. Required parameters include the external function's name and the external function's description. After you create an external function, you can attach it to an agent. To update an external function, see the [Update External Function](/api-reference/agents/endpoints/v1/external-functions/patch) endpoint. # Introduction Source: https://docs.play.ai/api-reference/agents/introduction Create and manage PlayAI agents via the API PlayAI provides a simple, easy-to-use HTTP API to create and manage AI Agents. After you create your agent, you can proceed to start a conversation using our [Websocket API](/api-reference/agents/websocket), or you can try it out through our web interface at `https://play.ai/agent/`. ## Authentication All API endpoints are authenticated using a User ID and API Key. After you have created an account and logged in, you can get your API Key from the [For Developers](https://play.ai/api/keys) page. # Websocket API Source: https://docs.play.ai/api-reference/agents/websocket Enhance your app with our audio-in, audio-out API, enabling seamless, natural conversations with your PlayAI agent. Transform your user experience with the power of voice. Before using our WebSocket API, you will need: * A [PlayAI account](https://play.ai/pricing) * An [API key to authenticate](https://play.ai/api/keys) with the PlayAI API * An agent ID of a PlayAI Agent (created via our [Web UI](https://play.ai/my-agents) or our [Create Agent endpoint](/api-reference/agents/endpoints/v1/agents/post)) To fully leverage our WebSocket API, the steps are: * Connect to our `wss://api.play.ai/v1/talk/{agentId}` URL * Send a `{"type":"setup","apiKey":"yourKey"}` message as the first message * Send audio input as a base64 encoded string in `{"type":"audioIn","data":"base64Data"}` messages * Receive audio output in `{"type":"audioStream","data":"base64Data"}` messages # Establishing a Connection To initiate a conversation, establish a WebSocket connection to our `talk` URL, including the `agentId` as a path parameter: ```text wss://api.play.ai/v1/talk/{agentId} ``` For example, assuming `Agent-XP5tVPa8GDWym6j` is the ID of an agent you have created via our [Web UI](https://play.ai/my-agents) or through our [Create Agent endpoint](/api-reference/agents/endpoints/v1/agents/post), the WebSocket URL should look like: ```js const myWs = new WebSocket('wss://api.play.ai/v1/talk/Agent-XP5tVPa8GDWym6j'); ``` # Initial Setup Message Before you can start sending and receiving audio data, you must first send a `setup` message to authenticate and configure your session. ```mermaid graph TB subgraph "conversation" C --> D[Send 'audioIn' messages containing your user's audio data] D --> C end B --> C[Receive 'audioStream' messages containing Agent's audio data] subgraph setup A[Establish WebSocket Connection] --> B[Send 'setup' message] end ``` The only required field in the setup message is the `apiKey`. This assumes you are comfortable with the default values for audio input and audio output formats. 
In this scenario, your first setup message could be as simple as: ```json { "type": "setup", "apiKey": "yourKey" } ``` Get your API Key at our [Developers](https://play.ai/api/keys) page Code example: ```js const myWs = new WebSocket('wss://api.play.ai/v1/talk/Agent-XP5tVPa8GDWym6j'); myWs.onopen = () => { console.log('connected!'); myWs.send(JSON.stringify({ type: 'setup', apiKey: 'yourApiKey' })); }; ``` ## Setup Options The setup message configures important details of the session, including the format/encoding of the audio that you intend to send us and the format that you expect to receive. ```json Example setup messages with various options: // mulaw 16KHz as input { "type": "setup", "apiKey": "...", "inputEncoding": "mulaw", "inputSampleRate": 16000 } // 24Khz mp3 output { "type": "setup", "apiKey": "...", "outputFormat": "mp3", "outputSampleRate": 24000 } // mulaw 8KHz in and out { "type": "setup", "apiKey": "...", "inputEncoding": "mulaw", "inputSampleRate": 8000, "outputFormat": "mulaw", "outputSampleRate": 8000 } ``` The following fields are available for configuration:
| Property | Accepted values | Description | Default value |
| -------- | --------------- | ----------- | ------------- |
| `type` (required) | `"setup"` | Specifies that the message is a setup command. | - |
| `apiKey` (required) | `string` | [Your API Key](https://play.ai/api/keys). | - |
| `outputFormat` (optional) | `"mp3"`, `"raw"`, `"wav"`, `"ogg"`, `"flac"`, `"mulaw"` | The format of audio you want our agent to output in the `audioStream` messages. `mp3` = 128kbps MP3, `raw` = PCM\_FP32, `wav` = 16-bit (uint16) PCM, `ogg` = 80kbps OGG Vorbis, `flac` = 16-bit (int16) FLAC, `mulaw` = 8-bit (uint8) PCM, headerless. | `"mp3"` |
| `outputSampleRate` (optional) | `number` | The sample rate of the audio you want our agent to output in the `audioStream` messages. | `44100` |
| `inputEncoding` (optional) | For formats with media containers: `"media-container"`. For headerless formats: `"mulaw"`, `"linear16"`, `"flac"`, `"amr-nb"`, `"amr-wb"`, `"opus"`, `"speex"`, `"g729"` | The encoding of the audio you intend to send in the `audioIn` messages. If you are sending audio in a format that uses a media container (that is, audio that contains headers, such as `mp4`, `m4a`, `mp3`, `ogg`, `flac`, `wav`, `mkv`, `webm`, `aiff`), just use `"media-container"` as the value (or omit the field, since `"media-container"` is the default); our servers will process the audio based on its headers. If, on the other hand, you will send audio in a headerless format, you have to specify which one, e.g. by setting `inputEncoding` to `"mulaw"`, `"flac"`, etc. | `"media-container"` |
| `inputSampleRate` (optional) | `number` | The sample rate of the audio you intend to send. Required if you specify an `inputEncoding` other than `"media-container"`; optional otherwise. | - |
| `customGreeting` (optional) | `string` | Your agent will say this message to start every conversation. This overrides the agent's greeting. | - |
| `prompt` (optional) | `string` | Instructions for your AI about how it should behave and interact with others in conversation. This is appended to the agent's prompt. | `""` |
| `continueConversation` (optional) | `string` | To continue a conversation from a previous session, pass the `conversationId` here. The agent will pick up the conversation where it left off. | - |


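Putting several of these options together, a setup message for 8 kHz mulaw telephony audio in and out, with a custom greeting and extra prompt instructions, could look like the sketch below. The greeting and prompt strings are illustrative, and the agent ID and API key are the same placeholders used above:

```js
const myWs = new WebSocket('wss://api.play.ai/v1/talk/Agent-XP5tVPa8GDWym6j');

myWs.onopen = () => {
  // All fields besides "type" and "apiKey" are optional; defaults are listed in the table above.
  myWs.send(
    JSON.stringify({
      type: 'setup',
      apiKey: 'yourApiKey',
      inputEncoding: 'mulaw',   // headerless mulaw input...
      inputSampleRate: 8000,    // ...so the sample rate must be specified
      outputFormat: 'mulaw',
      outputSampleRate: 8000,
      customGreeting: 'Thanks for calling! How can I help you today?', // example value
      prompt: 'Keep answers short and suitable for a phone call.',     // example value
    })
  );
};
```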
# `audioIn`: Sending Audio Input After the setup, you can send audio input in the form of an `audioIn` message. The audio must be sent as a base64 encoded string in the `data` field. The message format is: ```json { "type": "audioIn", "data": "" } ``` The audio you send must match the `inputEncoding` and `inputSampleRate` you configured in the setup options. ## Sample Code for Sending Audio Assuming `myWs` is a WebSocket connected to our `/v1/talk` endpoint, the sample code below would send audio directly from the browser: ```javascript const stream = await navigator.mediaDevices.getUserMedia({ audio: { channelCount: 1, echoCancellation: true, autoGainControl: true, noiseSuppression: true, }, }); const mediaRecorder = new MediaRecorder(stream); mediaRecorder.ondataavailable = async (event) => { const base64Data = await blobToBase64(event.data); // Relevant: myWs.send(JSON.stringify({ type: 'audioIn', data: base64Data })); }; async function blobToBase64(blob) { const reader = new FileReader(); reader.readAsDataURL(blob); return new Promise((resolve) => { reader.onloadend = () => resolve(reader.result.split(',')[1]); }); } ``` # `audioStream`: Receiving Audio Output Audio output from the server will be received in an `audioStream` message. The message format is: ```json { "type": "audioStream", "data": "" } ``` The audio you receive will match the `outputFormat` and `outputSampleRate` you configured in the setup options. ## Sample Code for Receiving Audio ```javascript myWs.on('message', (message) => { const event = JSON.parse(message); if (event.type === 'audioStream') { // deserialize event.data from a base64 string to binary // enqueue/play the binary data at your player return; } }); ``` # Voice Activity Detection: `voiceActivityStart` and `voiceActivityEnd` During the conversation, you will receive `voiceActivityStart` and `voiceActivityEnd` messages indicating the detection of speech activity in the audio input. These messages help in understanding when the user starts and stops speaking. When our service detects that the user started to speak, it will emit a `voiceActivityStart` event. Such a message will have the format: ```json { "type": "voiceActivityStart" } ``` It is up to you to decide how to react to this event. We highly recommend you stop playing whatever audio is being played, since the `voiceActivityStart` generally indicates the user wanted to interrupt the agent. Similarly, when our service detects that the user stopped speaking, it emits a `voiceActivityEnd` event: ```json { "type": "voiceActivityEnd" } ``` # `newAudioStream`: Handling New Audio Streams A `newAudioStream` message indicates the start the audio of a new response. It is recommended to clear your player buffer and start playing the new stream content upon receiving this message. This message contains no additional fields. # Error Handling Errors from the server are sent as `error` message type, a numeric code and a message in the following format: ```json { "type": "error", "code": , "message": "" } ``` The table below provides a quick reference to the various error codes and their corresponding messages for the Agent Websocket API. | Error Code | Error Message | | ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | 1001 | Invalid authorization token. | | 1002 | Invalid agent id. 
| | 1003 | Invalid authorization credentials. | | 1005 | Not enough credits. | | 4400 | Invalid parameters. Indicates the message sent to the server failed to match the expected format. Double check the logic and try again. | | 4401 | Unauthorized. Invalid authorization token or invalid authorization credentials for specified agent. | | 4429 | You have reached the maximum number of concurrent connections allowed by your current plan. Please consider upgrading your plan or reducing the number of active connections to continue. | | 4500 | Generic error code for internal errors, such as failures to generate responses.
Generally, the user is not at fault when these happen. An appropriate reaction is to wait a few moments and try again. If the problem persists, contacting support is advised. | *** This documentation covers the essential aspects of interacting with the PlayAI Websocket API for agent conversations. Ensure that your implementation handles the specified message types and follows the outlined protocols for a seamless integration. # Get All PlayNotes Source: https://docs.play.ai/api-reference/playnote/get-all GET /api/v1/playnotes Retrieve all PlayNotes. # Get PlayNote Source: https://docs.play.ai/api-reference/playnote/get-id GET /api/v1/playnotes/{playNoteId} Retrieve all information about a PlayNote. # Create PlayNote Source: https://docs.play.ai/api-reference/playnote/post POST /api/v1/playnotes Create a new PlayNote by providing a file URL. Check out the [Generate Conversation from PDF with PlayNote API](/documentation/playnote/playnote-quickstart) guide for a step-by-step approach to using the PlayNote API to create a podcast-style conversation (and more!) from a PDF. After you create your PlayNotes, you can proceed to poll its status via the [Get PlayNote](/api-reference/playnote/get-id) endpoint. Note: You can have only **one active generation**. If you face this error code `403` with the message `{"errorMessage":"User already has an active generation","errorId":"UNAUTHORIZED"}` then please wait for some time and try again later. # Create Speech Source: https://docs.play.ai/api-reference/text-to-speech/endpoints/v1/create-speech POST /api/v1/tts Convert text to speech with our top-of-the-line PlayAI models. Convert text to speech with our top-of-the-line PlayAI models. This endpoint supports two models: * **Play 3.0 Mini**: Our fast and efficient model for single-voice text-to-speech. * **Dialog 1.0**: Our flagship model with best quality and multi-turn dialogue capabilities. We also offer **Dialog 1.0 Turbo** which is a faster version of Dialog 1.0 from [a separate endpoint](/api-reference/text-to-speech/endpoints/v1/stream-speech-turbo). For more information, see [Models](/documentation/text-to-speech/tts-models). Check out the [How to use Dialog 1.0 Text-to-Speech API](/documentation/tutorials/tts/dialogs/how-to-use-tts-api) guide for a step-by-step approach to using the Dialog 1.0 API to convert text into natural human-like sounding audio. Make sure to see the [Create a Multi-Turn Scripted Conversation with the Dialog 1.0 API](/documentation/tutorials/tts/dialogs/create-ai-podcast) guide for examples on how to create a multi-turn scripted conversation between two distinct speakers. # Get an Async TTS Job Source: https://docs.play.ai/api-reference/text-to-speech/endpoints/v1/get-async GET /api/v1/tts/{asyncTtsJobId} Gets the current status of an async TTS job. # List Voices Source: https://docs.play.ai/api-reference/text-to-speech/endpoints/v1/list-voices GET /api/v1/voices Get a list of all pre-built voices. # Stream Speech Source: https://docs.play.ai/api-reference/text-to-speech/endpoints/v1/stream-speech openapi POST /api/v1/tts/stream Streams the audio bytes with our ultra-fast text-in, audio-out API. Convert text to speech and receive audio bytes in real-time. This endpoint supports two models: * **Play 3.0 Mini**: Our fast and efficient model for single-voice text-to-speech. * **Dialog 1.0**: Our flagship model with best quality and multi-turn dialogue capabilities. 
We also offer **Dialog 1.0 Turbo** which is a faster version of Dialog 1.0 from [a separate endpoint](/api-reference/text-to-speech/endpoints/v1/stream-speech-turbo). For more information, see [Models](/documentation/text-to-speech/tts-models). Check out the [How to use Dialog 1.0 Text-to-Speech API](/documentation/tutorials/tts/dialogs/how-to-use-tts-api) guide for a step-by-step approach to using the Dialog 1.0 API to convert text into natural human-like sounding audio. Make sure to see the [Create a Multi-Turn Scripted Conversation with the Dialog 1.0 API](/documentation/tutorials/tts/dialogs/create-ai-podcast) guide for examples on how to create a multi-turn scripted conversation between two distinct speakers. # Stream Speech in Turbo Mode Source: https://docs.play.ai/api-reference/text-to-speech/endpoints/v1/stream-speech-turbo openapi-tts-dialog-turbo POST /api/v1/tts/stream Stream speech with our fastest and best-quality model, Dialog 1.0 Turbo. Convert text to speech and receive audio bytes in real-time in Turbo mode. This endpoint only supports **Dialog 1.0 Turbo**: Our fastest model with best quality and multi-turn dialogue capabilities, but with a narrower feature set than **Dialog 1.0** and **Play 3.0 Mini**. For more information, see [Models](/documentation/text-to-speech/tts-models). Check out the [How to use Dialog 1.0 Text-to-Speech API](/documentation/tutorials/tts/dialogs/how-to-use-tts-api) guide for a step-by-step approach to using the PlayAI API to convert text into natural human-like sounding audio. Or click "Try it" above to see the API in action! # Introduction Source: https://docs.play.ai/api-reference/text-to-speech/introduction Create lifelike speech via the API The PlayAI Text-to-Speech API enables you to convert written text into natural-sounding speech. Our API provides high-quality voice synthesis with multiple voices, languages, and customization options to suit your needs. ## Models We offer three models: * [**Dialog 1.0**](/documentation/text-to-speech/tts-models#dialog-1-0): Our flagship model with best quality and multi-turn dialogue capabilities. * [**Dialog 1.0 Turbo**](/documentation/text-to-speech/tts-models#dialog-1-0-turbo): A faster version of Dialog 1.0, available exclusively via the [Dialog 1.0 Turbo endpoint](/api-reference/text-to-speech/endpoints/v1/stream-speech-turbo). * [**Play 3.0 Mini**](/documentation/text-to-speech/tts-models#play-3-0-mini): Our fast and efficient model for single-voice text-to-speech. ## Features * Multiple voice options with different accents and styles * Support for various languages and dialects * Adjustable speech parameters (speed, pitch, volume) * Real-time streaming capabilities * High-quality audio output in multiple formats # Websocket API Source: https://docs.play.ai/api-reference/text-to-speech/websocket Enhance your app with our audio-in, audio-out API, enabling seamless, natural conversations with your PlayAI agent. Transform your user experience with the power of voice. 
To fully leverage our WebSocket API, the steps are: * Send a POST request to `https://api.play.ai/api/v1/tts/websocket-auth` with `Authorization: Bearer <your API key>` and `X-User-Id: <your user ID>` headers * Receive a JSON response with a `webSocketUrls` field containing a WebSocket URL for each available model * Connect to the provided WebSocket URL * Send TTS commands with the same options as our [TTS streaming API](/api-reference/text-to-speech/endpoints/v1/stream-speech), but in `snake_case`, e.g., `{"text":"Hello World","voice":"...","output_format":"mp3"}` * Receive audio output as binary messages ## Prerequisites * Your [access credentials](https://play.ai/api/keys): an API key and User ID. # Quickstart - Runnable Demo If you want to get started quickly, you can clone the [`playai-showcase`](https://github.com/playht/playai-showcase) repository and run the [`tts-websocket`](https://github.com/playht/playai-showcase/tree/main/tts-websocket) app locally. ```shell # Clone this repository git clone https://github.com/playht/playai-showcase.git # Navigate to the tts-websocket demo app cd playai-showcase/tts-websocket # Install dependencies npm install # Run the server and follow the instructions npm start ```
# Establishing a WebSocket Connection To establish a WebSocket connection, you will need to send a POST request to the `https://api.play.ai/api/v1/tts/websocket-auth` endpoint with the following headers: ```Text HTTP Authorization: Bearer <your_api_key> X-User-Id: <your_user_id> Content-Type: application/json ``` You can obtain your `api_key` and `user_id` from your [PlayAI account](https://play.ai/api/keys). The response will contain a JSON object with a `webSocketUrls` field holding one WebSocket URL per model. ```json { "webSocketUrls": { "Play3.0-mini": "wss://ws.fal.run/playht-fal/playht-tts/stream?fal_jwt_token=<token>", "PlayDialog": "wss://ws.fal.run/playht-fal/playht-tts-ldm/stream?fal_jwt_token=<token>", "PlayDialogMultilingual": "wss://ws.fal.run/playht-fal/playht-tts-multilingual-ldm/stream?fal_jwt_token=<token>" }, "expiresAt": "2025-01-06T05:13:04.650Z" } ``` After this point, you can forward the `webSocketUrls` entry for your chosen model to your WebSocket client to establish a connection, such as in the following example: ```javascript const ws = new WebSocket('wss://ws.fal.run/playht-fal/playht-tts/stream?fal_jwt_token=<token>'); ``` The WebSocket connection is valid for **1 hour**. After this period, you will need to re-authenticate and establish a new connection.
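For reference, here is a minimal sketch of that authentication request in JavaScript using `fetch`. Reading the credentials from environment variables is an assumption of this sketch, not a requirement of the API:

```javascript
// Request short-lived WebSocket URLs for the TTS models.
// (Uses top-level await: run as an ES module or wrap in an async function.)
const response = await fetch('https://api.play.ai/api/v1/tts/websocket-auth', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.PLAY_AI_API_KEY}`, // your API key
    'X-User-Id': process.env.PLAY_AI_USER_ID,               // your User ID
    'Content-Type': 'application/json',
  },
});

if (!response.ok) {
  throw new Error(`WebSocket auth failed: ${response.status}`);
}

const { webSocketUrls, expiresAt } = await response.json();
console.log('URLs expire at:', expiresAt);

// Pick the URL for the model you want to use, e.g. Play 3.0 Mini.
const ws = new WebSocket(webSocketUrls['Play3.0-mini']);
```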
# Sending TTS Commands Once connected to the WebSocket, you can send TTS commands as JSON messages. The structure of these commands is similar to our [TTS streaming API](/api-reference/text-to-speech/endpoints/v1/stream-speech), but in `snake_case`. Here's an example: ```javascript const ttsCommand = { text: 'Hello, world! This is a test of the PlayAI TTS WebSocket API.', voice: 's3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json', output_format: 'mp3', temperature: 0.7, }; ws.send(JSON.stringify(ttsCommand)); ``` Examples of the [available options for the TTS command](/api-reference/text-to-speech/endpoints/v1/stream-speech) are: * `request_id` (optional): A unique identifier for the request, useful for correlating responses (see more details below). * `text` (required): The text to be converted to speech. * `voice` (required): The voice ID or URL to use for synthesis. * `output_format` (optional): The desired audio format (default is "mp3"). * `temperature` (optional): Controls the randomness of the generated speech (0.0 to 1.0). * `speed` (optional): The speed of the generated speech (0.5 to 2.0). For the complete list of parameters, refer to the [TTS API documentation](/api-reference/text-to-speech/endpoints/v1/stream-speech).
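Since audio responses come back in the same order as the requests (see the next section), including a `request_id` makes it easy to tell consecutive streams apart. A small sketch, assuming `ws` is an already-open connection from the previous step and reusing the voice URL shown above:

```javascript
const voice =
  's3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json';

// Send two commands back to back; the responses will carry the matching request_id values.
const sentences = ['First sentence to synthesize.', 'Second sentence to synthesize.'];

sentences.forEach((text, index) => {
  ws.send(
    JSON.stringify({
      request_id: `utterance-${index}`, // echoed back in the "start" and "end" messages
      text,
      voice,
      output_format: 'mp3',
    })
  );
});
```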
# Receiving Audio Output If you send a sequence of TTS commands, the audio output will arrive in the same order as the requests. After sending a TTS command, you'll receive the following messages: * One initial text message with the format `{"type":"start","request_id":<id>}` to acknowledge the request. * The audio output as a series of binary messages. * One final text message with the format `{"type":"end","request_id":<id>}` to indicate the end of the audio stream. * In these text messages, `request_id` is the unique identifier you provided in the TTS command, or `null` if you didn't provide one. To handle these messages and play the audio, you can use the following approach: ```javascript let audioChunks = []; ws.onmessage = (event) => { if (event.data instanceof Blob) { // Received binary audio data audioChunks.push(event.data); } else { // Received a text message (the "start" or "end" acknowledgement carrying the request_id) const message = JSON.parse(event.data); if (message.type === 'end') { // If you provided a request_id, you can use it to correlate responses // End of audio stream, play the audio // If you specified a different output_format, you may need to adjust the audio player logic accordingly const audioBlob = new Blob(audioChunks, { type: 'audio/mpeg' }); const audioUrl = URL.createObjectURL(audioBlob); const audio = new Audio(audioUrl); audio.play(); // Clear the audio chunks for the next request audioChunks = []; } } }; ``` This code collects the binary audio chunks as they arrive and combines them into a single audio blob when the end-of-stream message (`{"type":"end","request_id":<id>}`) is received. It then creates an object URL and plays the audio with an `Audio` element.
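Outside the browser (for example, in a Node.js script using the `ws` package), the same messages can be handled by writing the binary chunks to a file instead of playing them. A rough sketch, assuming `webSocketUrl` was obtained from the `websocket-auth` step; the voice placeholder and output file name are arbitrary:

```javascript
import fs from 'node:fs';
import WebSocket from 'ws';

const ws = new WebSocket(webSocketUrl); // URL from the websocket-auth response (assumed in scope)
const out = fs.createWriteStream('output.mp3'); // matches the default "mp3" output_format

ws.on('open', () => {
  // Send a TTS command exactly as in the browser example above.
  ws.send(JSON.stringify({ text: 'Hello from Node!', voice: '<voice URL>', output_format: 'mp3' }));
});

ws.on('message', (data, isBinary) => {
  if (isBinary) {
    out.write(data); // binary frames carry the audio bytes
    return;
  }
  const message = JSON.parse(data.toString());
  if (message.type === 'end') {
    out.end(); // all audio for this request has arrived
    ws.close();
  }
});
```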
# Error Handling It's important to implement error handling in your WebSocket client. Here's an example of how to handle errors and connection closures: ```javascript ws.onerror = (error) => { console.error('WebSocket Error:', error); }; ws.onclose = (event) => { console.log('WebSocket connection closed:', event.code, event.reason); // Implement reconnection logic if needed }; ```
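The reconnection logic hinted at in the `onclose` handler above is up to your application. One common approach is a capped exponential backoff that repeats the `websocket-auth` step before opening a new socket. A sketch, assuming a hypothetical `connect()` helper that performs that step, re-attaches your handlers, and resolves to a fresh WebSocket, and that `ws` is declared with `let`:

```javascript
let retries = 0;

function scheduleReconnect() {
  // Exponential backoff capped at 30 seconds: 1s, 2s, 4s, ..., 30s.
  const delay = Math.min(1000 * 2 ** retries, 30000);
  retries += 1;
  setTimeout(async () => {
    try {
      // connect() is a hypothetical helper: it re-runs websocket-auth,
      // opens a new socket, and re-attaches your message/error handlers.
      ws = await connect();
      retries = 0; // reset the backoff once connected again
    } catch (err) {
      console.error('Reconnect failed:', err);
      scheduleReconnect();
    }
  }, delay);
}

ws.onclose = (event) => {
  console.log('WebSocket connection closed:', event.code, event.reason);
  scheduleReconnect();
};
```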
# Best Practices 1. **Authentication**: Always keep your API key secure. While the WebSocket URL can be shared with client-side code, the API Key and User ID should be kept private. 2. **Error Handling**: Implement robust error handling and reconnection logic in your WebSocket client. 3. **Resource Management**: Close the WebSocket connection when it's no longer needed to free up server resources. 4. **Rate Limiting**: Be aware of [rate limits](/documentation/resources/rate-limits) on the API and implement appropriate throttling in your application. 5. **Testing**: Thoroughly test your implementation with various inputs and network conditions to ensure reliability. By following these guidelines and using the provided examples, you can effectively integrate the PlayAI TTS WebSocket API into your application, enabling real-time text-to-speech functionality with low latency and high performance. # Agent Flutter SDK Source: https://docs.play.ai/documentation/agent-sdks/flutter-sdk Integrate AI agents into your Flutter applications ## ๐Ÿš€ Features * ๐ŸŽ™๏ธ **Two-way voice conversations** with AI agents * ๐Ÿ”Š **Voice Activity Detection (VAD)** for natural conversations * ๐Ÿง  **Custom actions** that allow agents to trigger code in your app * ๐Ÿ“ฑ **Cross-platform** - works on iOS, Android, and Web * ๐Ÿ”Œ **Audio session management** for handling interruptions and device changes * ๐Ÿ“ **Real-time transcripts** of both user and agent speech * ๐Ÿšฆ **Rich state management** with ValueNotifiers for UI integration ## Installation Add the package to your `pubspec.yaml`: ```yaml dependencies: agents: git: url: https://github.com/playht/agents-client-sdk-flutter.git ref: main ``` Then save, or run: ```bash flutter pub get ``` ### Platform Configuration #### iOS 1. Add the following to your `Info.plist`: ```xml NSMicrophoneUsageDescription We need access to your microphone to enable voice conversations with the AI agent. ``` 2. Add the following to your `Podfile`, since we depend on `permission_handler` to manage permissions and `audio_session` to manage audio sessions. ``` post_install do |installer| installer.pods_project.targets.each do |target| target.build_configurations.each do |config| config.build_settings['GCC_PREPROCESSOR_DEFINITIONS'] ||= [ '$(inherited)', # audio_session settings 'AUDIO_SESSION_MICROPHONE=0', # For microphone access 'PERMISSION_MICROPHONE=1' end end end ``` 3. Due to an [issue](https://github.com/gtbluesky/onnxruntime_flutter/issues/24) of the Onnx Runtime getting stripped by XCode when archived, you need to follow these steps in XCode for the voice activity detector (VAD) to work on iOS builds: * Under "Targets", choose "Runner" (or your project's name) * Go to "Build Settings" tab * Filter for "Deployment" * Set "Stripped Linked Product" to "No" * Set "Strip Style" to "Non-Global-Symbols" #### Android 1. Add the following permissions to your `AndroidManifest.xml`: ```xml ``` 2. Add the following to `android/gradle.properties` (unless they're already there): ``` android.useAndroidX=true android.enableJetifier=true ``` 3. Add the following settings to `android/app/build.gradle`: ``` android { compileSdkVersion 34 ... } ``` #### Web For VAD to work on web platforms, please following the instructions [here](https://pub.dev/packages/vad#web). ## Getting Started ### 1. Create an Agent on PlayAI Follow the instructions [here](/documentation/agent/agent-quickstart) to create an agent on PlayAI. ### 2. 
Implement the Agent in Your Flutter App ```dart [expandable] final agent = Agent( // Replace with your agent ID from PlayAI agentId: 'your-agent-id-here', // Customize your agent's behavior prompt: 'You are a helpful assistant who speaks in a friendly, casual tone.', // Define actions the agent can take in your app actions: [ AgentAction( name: 'show_weather', triggerInstructions: 'Trigger this when the user asks about weather.', argumentSchema: { 'city': AgentActionParameter( type: 'string', description: 'The city to show weather for', ), }, callback: (data) async { final city = data['city'] as String; // In a real app, you would fetch weather data here return 'Weather data fetched for $city!'; }, ), ], // Configure callbacks to respond to agent events callbackConfig: AgentCallbackConfig( // Get user speech transcript onUserTranscript: (text) { setState(() => _messages.add(ChatMessage(text, isUser: true))); }, // Get agent speech transcript onAgentTranscript: (text) { setState(() => _messages.add(ChatMessage(text, isUser: false))); }, // Handle any errors onError: (error, isFatal) { ScaffoldMessenger.of(context).showSnackBar( SnackBar(content: Text('Error: $error')), ); }, ), ); ``` ### 3. Connect the Agent to Start a Conversation ```dart await agent.connect(); ``` ### 4. Mute and Unmute the User during a Conversation ```dart await agent.muteUser(); await agent.unmuteUser(); ``` ### 5. Disconnect the Agent ```dart await agent.disconnect(); ``` ## Key Features ### Monitor the Agent's State 1. `AgentState`: The agent can be in one of four states: * `idle`: Not connected to a conversation * `connecting`: In the process of establishing a connection * `connected`: Connected and ready to converse * `disconnecting`: In the process of ending a conversation 2. `Agent` also exposes `ValueListenable`s which you can listen to for changes in the agent's state. ```dart ValueListenableBuilder( valueListenable: agent.isUserSpeakingNotifier, builder: (context, isUserSpeaking, _) => Text('User is speaking: $isUserSpeaking'), ) ``` 3. Pass callbacks as `AgentCallbackConfig` to the `Agent` constructor to handle events from the agent. ```dart final config = AgentCallbackConfig( onUserTranscript: (text) => print('User just said: $text'), onAgentTranscript: (text) => print('Agent just said: $text'), ) final agent = Agent( // ... callbackConfig: config, ); ``` ### Agent Actions One of the most exciting features of the PlayAI Agents SDK is the ability to define custom actions that allow the agent to interact with your app. ```dart AgentAction( name: 'open_settings', triggerInstructions: 'Trigger this when the user asks to open settings', argumentSchema: { 'section': AgentActionParameter( type: 'string', description: 'The settings section to open', ), }, callback: (data) async { final section = data['section'] as String; // Navigate to settings section in your app return 'Opened $section settings'; }, ) ``` ### Developer Messages Send contextual information to the agent during a conversation to inform it of changes in your app. ```dart // When user navigates to a new screen void _onNavigate(String routeName) { agent.sendDeveloperMessage( 'User navigated to $routeName screen. 
You can now discuss the content on this page.', ); } // When relevant data changes void _onCartUpdated(List products) { agent.sendDeveloperMessage( 'User\'s cart has been updated, now containing: ${products.map((p) => p.name).join(", ")}.', ); } ``` ## Error Handling The package uses a robust error handling system with specific exception types: ```dart try { await agent.connect(); } on MicrophonePermissionDenied { // Handle microphone permission issues } on WebSocketConnectionError catch (e) { // Handle connection issues } on ServerError catch (e) { // Handle server-side errors if (e.isFatal) { // Handle fatal errors } } on AgentException catch (e) { // Handle all other agent exceptions print('Error code: ${e.code}, Message: ${e.readableMessage}'); } ``` ## Lifecycle Management Don't forget to dispose of the agent when it's no longer needed to free up resources. ```dart @override void dispose() { // Clean up resources agent.dispose(); super.dispose(); } ``` ## UI Integration Examples ### Mute Button ```dart ValueListenableBuilder( valueListenable: agent.isMutedNotifier, builder: (context, isMuted, _) => IconButton( icon: Icon(isMuted ? Icons.mic_off : Icons.mic), onPressed: () => isMuted ? agent.unmuteUser() : agent.muteUser(), tooltip: isMuted ? 'Unmute' : 'Mute', ), ) ``` ### Speaking Indicator ```dart ValueListenableBuilder( valueListenable: agent.isAgentSpeakingNotifier, builder: (context, isSpeaking, _) => AnimatedContainer( duration: const Duration(milliseconds: 300), width: 40, height: 40, decoration: BoxDecoration( shape: BoxShape.circle, color: isSpeaking ? Colors.blue : Colors.grey.shade300, ), child: Center( child: Icon( Icons.record_voice_over, size: 24, color: Colors.white, ), ), ), ) ``` ## Tips for Effective Usage 1. **Prompt Engineering**: Craft clear, specific prompts to guide agent behavior 2. **Action Design**: Design actions with clear trigger instructions and parameter descriptions 3. **Context Management**: Use `sendDeveloperMessage` to keep the agent updated on app state 4. **Error Handling**: Implement comprehensive error handling for a smooth user experience 5. **UI Feedback**: Use the provided `ValueListenable`s to give clear feedback on conversation state ## Acknowledgments * Voice Activity Detection powered by [vad](https://pub.dev/packages/vad) * Audio session management by [audio\_session](https://pub.dev/packages/audio_session) # Agent Web SDK Source: https://docs.play.ai/documentation/agent-sdks/web-sdk Integrate AI agents into your web applications ## Overview The **Agent Web SDK** is a TypeScript SDK that facilitates real-time, bi-directional audio conversations with your PlayAI Agent via [WebSocket API](/api-reference/agents/websocket). It takes care of the following: * WebSocket connection management * Microphone capture and voice activity detection (VAD) * Sending user audio to the Agent * Receiving Agent audio and playing it back in the browser * Managing event listeners such as user or agent transcripts * Muting/unmuting the user's microphone * Hanging up (ending) the agent conversation * Error handling This SDK is designed for modern web browsers that support the [Web Audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API) and [WebSockets](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket). If you want to integrate PlayAI Agent into a Flutter app, check out our [Flutter SDK](/documentation/agent-sdks/flutter-sdk). We plan to support other platforms in the future. 
## Installation ```bash npm npm install @play-ai/agent-web-sdk ``` ```bash yarn yarn add @play-ai/agent-web-sdk ``` ```bash pnpm pnpm add @play-ai/agent-web-sdk ``` ## Create agent To start a conversation with your agent, first create an agent in [PlayAI app](https://play.ai/my-agents). Once you have an agent, you can find the agent ID in the agent "Deploy ยท Web" section, which is required to connect to the agent. ## Basic usage Below is a simple example illustrating how to initiate a conversation with your agent using the `connectAgent` function: ```ts import { connectAgent } from '@play-ai/agent-web-sdk'; async function startConversation() { try { const agentController = await connectAgent('YOUR_AGENT_ID'); console.log('Connected to agent. Conversation ID:', agentController.conversationId); // Use agentController to control the conversation... } catch (error) { console.error('Failed to start conversation:', error); } } startConversation(); ``` The function `connectAgent` returns a Promise * If any error occurs during the connection process, the Promise is rejected. * When the conversation is successfully established, the Promise resolves to `AgentConnectionController` object. ## Config You can customize the agent's configuration by passing an optional `ConnectAgentConfig` object as the second parameter to `connectAgent`. ```ts const agentController = await connectAgent('YOUR_AGENT_ID', { debug: true, // Enable debug logging in the console customGreeting: 'Hello, and welcome to my custom agent!', // Override the default greeting prompt: 'You are an AI that helps with scheduling tasks.', // Append additional instructions to the agent's prompt continueConversation: 'PREVIOUS_CONVERSATION_ID', // Continue a previous conversation }); ``` **Config Options**: * **`debug`:** Enables debug logging for troubleshooting. * **`customGreeting`:** Overrides the default greeting used by the agent. * **`prompt`:** Appends additional instructions to the agent's core prompt. * **`continueConversation`:** An optional conversation ID to continue a previous conversation. * **`listeners`:** Attach various listener callbacks (see [Event listeners](#event-listeners) section). ## Event listeners Event listeners enable you to handle specific moments during the conversation: ```ts const agentController = await connectAgent('YOUR_AGENT_ID', { listeners: { onUserTranscript: (transcript) => console.log(`USER said: "${transcript}".`), onAgentTranscript: (transcript) => console.log(`AGENT will say: "${transcript}".`), onUserStartedSpeaking: () => console.log(`USER started speaking...`), onUserStoppedSpeaking: () => console.log(`USER stopped speaking.`), onAgentDecidedToSpeak: () => console.log(`AGENT decided to speak... (not speaking yet, just thinking)`), onAgentStartedSpeaking: () => console.log(`AGENT started speaking...`), onAgentStoppedSpeaking: () => console.log(`AGENT stopped speaking.`), onHangup: (endedBy) => console.log(`Conversation has ended by ${endedBy}`), onError: (err) => console.error(err), }, }); ``` ## Mute/unmute Once you have an active `AgentConnectionController` from `connectAgent`, you can mute or unmute the user's microphone: ```ts const agentController = await connectAgent('YOUR_AGENT_ID'); agentController.mute(); // The agent won't hear any mic data agentController.unmute(); // The agent hears the mic data again ``` ## Hangup Use `agentController.hangup()` to end the conversation from the user side. 
```ts const agentController = await connectAgent('YOUR_AGENT_ID'); setTimeout(() => { // End the conversation after 60 seconds agentController.hangup(); }, 60000); ``` When the conversation ends (either by user or agent), the `onHangup` callback (if provided) is triggered. ## Error handling Errors can occur at different stages of the conversation: * Starting the conversation. For example: * Microphone permissions denied * WebSocket fails to connect or closes unexpectedly * Invalid agent ID * During the conversation. For example: * Agent fails to generate a response * Internal Agent errors * Network issues Errors that occur before the conversation starts are caught by the `connectAgent` Promise. You can handle these errors in the `catch` block. Errors that occur during the conversation are caught by the `onError` listener. ```ts import { connectAgent } from '@play-ai/agent-web-sdk'; async function startConversation() { try { const agentController = await connectAgent('YOUR_AGENT_ID', { listeners: { onError: (error) => { console.error('Error occurred:', error.description); if (error.isFatal) { // Possibly reconnection logic or UI error message } }, }, }); } catch (err) { console.error('Failed to start the conversation:', err); } } ``` **Error object**: ```ts interface ErrorDuringConversation { description: string; // Human-readable message isFatal: boolean; // Whether the error ended the conversation serverCode?: number; // If the server gave a specific error code wsEvent?: Event; // Low-level WebSocket event cause?: Error; // JS error cause } ``` ## Code example
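As a starting point, here is a minimal sketch that puts the pieces above together: connecting, logging transcripts, muting briefly, and hanging up after one minute. The agent ID, greeting, and timings are placeholders for your own values:

```ts
import { connectAgent } from '@play-ai/agent-web-sdk';

async function runConversation() {
  try {
    const agentController = await connectAgent('YOUR_AGENT_ID', {
      customGreeting: 'Hi! Ask me anything.', // example value
      listeners: {
        onUserTranscript: (transcript) => console.log(`USER: ${transcript}`),
        onAgentTranscript: (transcript) => console.log(`AGENT: ${transcript}`),
        onHangup: (endedBy) => console.log(`Conversation ended by ${endedBy}`),
        onError: (error) => console.error('Conversation error:', error.description),
      },
    });

    console.log('Conversation ID:', agentController.conversationId);

    // Example: mute the microphone for the first 5 seconds, then unmute.
    agentController.mute();
    setTimeout(() => agentController.unmute(), 5000);

    // End the conversation after 60 seconds.
    setTimeout(() => agentController.hangup(), 60000);
  } catch (err) {
    console.error('Failed to start conversation:', err);
  }
}

runConversation();
```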