
# WebSocket API

> Stream text-to-speech audio over a WebSocket connection: send text to a PlayAI voice model and receive the generated audio in real time.

<Tip>
  Using the WebSocket API takes five steps:

  * Send a POST request to `https://api.play.ai/api/v1/tts/websocket-auth` with `Authorization: Bearer <your_api_key>` and `X-User-Id: <your_user_id>` headers
  * Receive a JSON response with a `webSocketUrls` field containing the WebSocket URL according to the desired model
  * Connect to the provided WebSocket URL
  * Send TTS commands with the same options as our [TTS streaming API](/api-reference/text-to-speech/endpoints/v1/stream-speech), but in `snake_case`, e.g., `{"text":"Hello World","voice":"...","output_format":"mp3"}`
  * Receive audio output as binary messages
</Tip>

## Prerequisites

* An API key and User ID, both available from your [access credentials](https://play.ai/api/keys) page.

# Quickstart - Runnable Demo

If you want to get started quickly, you can clone the [`playai-showcase`](https://github.com/playht/playai-showcase) repository
and run the [`tts-websocket`](https://github.com/playht/playai-showcase/tree/main/tts-websocket) app locally.

```shell theme={null}
# Clone this repository
git clone https://github.com/playht/playai-showcase.git
# Navigate to the tts-websocket demo app inside the cloned repository
cd playai-showcase/tts-websocket
# NPM install
npm install
# Run the server and follow the instructions
npm start
```

<br />

# Establishing a WebSocket Connection

To establish a WebSocket connection, you will need to send a POST request to the `https://api.play.ai/api/v1/tts/websocket-auth` endpoint with the following headers:

```Text HTTP theme={null}
Authorization: Bearer <your_api_key>
X-User-Id: <your_user_id>
Content-Type: application/json
```

You can obtain your `api_key` and `user_id` from your [PlayAI account](https://play.ai/api/keys).

The response will contain a JSON object with a `webSocketUrls` field that you can use to connect to the WebSocket server according to the desired model.

```json theme={null}
{
  "webSocketUrls": {
    "Play3.0-mini": "wss://ws.fal.run/playht-fal/playht-tts/stream?fal_jwt_token=<your_session_token>",
    "PlayDialog": "wss://ws.fal.run/playht-fal/playht-tts-ldm/stream?fal_jwt_token=<your_session_token>",
    "PlayDialogArabic": "wss://ws.fal.run/playht-fal/playht-tts-arabic-ldm/stream?fal_jwt_token=<your_session_token>",
    "PlayDialogHindi": "wss://ws.fal.run/playht-fal/playht-tts-hindi-ldm/stream?fal_jwt_token=<your_session_token>",
    "PlayDialogLora": "wss://ws.fal.run/playht-fal/playht-tts-ldm-lora/stream?fal_jwt_token=<your_session_token>",
    "PlayDialogMultilingual": "wss://ws.fal.run/playht-fal/playht-tts-multilingual-ldm/stream?fal_jwt_token=<your_session_token>"
  },
  "expiresAt": "2025-01-06T05:13:04.650Z"
}
```

After this point, you can forward the `webSocketUrls[<desired model>]` to your WebSocket client to establish a connection, such as in the following example:

```javascript theme={null}
const ws = new WebSocket('wss://ws.fal.run/playht-fal/playht-tts/stream?fal_jwt_token=<your_session_token>');
```
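The authentication request and model selection described above can be sketched as a single helper. This is a minimal illustration, not an official client: the function name `getWebSocketUrl` and the injectable `fetchImpl` parameter are assumptions made for this example, and the request should run on your server so the API key never reaches the browser.

```javascript
// Sketch: request session URLs from the auth endpoint and pick one model's URL.
// `getWebSocketUrl` and `fetchImpl` are hypothetical names used for illustration.
// `fetchImpl` is injectable so the helper can be exercised without network access.
async function getWebSocketUrl(apiKey, userId, model = 'Play3.0-mini', fetchImpl = globalThis.fetch) {
  const response = await fetchImpl('https://api.play.ai/api/v1/tts/websocket-auth', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'X-User-Id': userId,
      'Content-Type': 'application/json',
    },
  });
  if (!response.ok) {
    throw new Error(`Authentication failed with HTTP status ${response.status}`);
  }
  const { webSocketUrls, expiresAt } = await response.json();
  // Look up the URL for the requested model key, e.g. "PlayDialog".
  const url = webSocketUrls?.[model];
  if (!url) {
    throw new Error(`No WebSocket URL returned for model "${model}"`);
  }
  return { url, expiresAt };
}

// Usage (not executed here):
// const { url } = await getWebSocketUrl(API_KEY, USER_ID, 'PlayDialog');
// const ws = new WebSocket(url);
```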

<Warning>
  The WebSocket connection duration is **1 hour**.
  After this period, you will need to re-authenticate and establish a new connection.
</Warning>
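One way to decide when to re-authenticate is to compare the current time against the `expiresAt` value from the auth response. The sketch below assumes that `expiresAt` marks the moment the session stops being valid; the helper name `needsReauth` and the one-minute safety margin are illustrative choices, not part of the API.

```javascript
// Sketch: returns true when the session has expired or will expire within `marginMs`.
// Assumption: `expiresAt` (ISO 8601 string from the auth response) is the session expiry.
function needsReauth(expiresAt, now = Date.now(), marginMs = 60_000) {
  const expiry = Date.parse(expiresAt);
  if (Number.isNaN(expiry)) {
    return true; // Treat unparseable timestamps as expired to stay on the safe side.
  }
  return now + marginMs >= expiry;
}
```

Calling `needsReauth(expiresAt)` before opening or reusing a connection lets you request a fresh URL ahead of time instead of waiting for the socket to close.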

<br />

# Sending TTS Commands

Once connected to the WebSocket, you can send TTS commands as JSON messages.
The structure of these commands is similar to our [TTS streaming API](/api-reference/text-to-speech/endpoints/v1/stream-speech), but in `snake_case`.

Here's an example:

```javascript theme={null}
const ttsCommand = {
  text: 'Hello, world! This is a test of the PlayAI TTS WebSocket API.',
  voice: 's3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json',
  output_format: 'mp3',
  temperature: 0.7,
};

ws.send(JSON.stringify(ttsCommand));
```
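If you already build option objects for the HTTP streaming API, a small key converter can produce the `snake_case` form expected over the WebSocket. This sketch assumes your existing keys are camelCase (for example `outputFormat`); `toSnakeCaseKeys` is a hypothetical helper name, and it only converts top-level keys.

```javascript
// Sketch: convert the top-level keys of an options object from camelCase to snake_case.
// Assumption: the input uses camelCase keys such as `outputFormat` or `requestId`.
function toSnakeCaseKeys(options) {
  const result = {};
  for (const [key, value] of Object.entries(options)) {
    // Prefix each uppercase letter with "_" and lowercase it: outputFormat -> output_format.
    const snakeKey = key.replace(/[A-Z]/g, (letter) => `_${letter.toLowerCase()}`);
    result[snakeKey] = value;
  }
  return result;
}

// Usage (not executed here):
// ws.send(JSON.stringify(toSnakeCaseKeys({ text: 'Hi', voice: '...', outputFormat: 'mp3' })));
```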

Examples of the [available options for the TTS command](/api-reference/text-to-speech/endpoints/v1/stream-speech) are:

* `request_id` (optional): A unique identifier for the request, useful for correlating responses (see more details below).
* `text` (required): The text to be converted to speech.
* `voice` (required): The voice ID or URL to use for synthesis.
* `output_format` (optional): The desired audio format (default is "mp3").
* `temperature` (optional): Controls the randomness of the generated speech (0.0 to 1.0).
* `speed` (optional): The speed of the generated speech (0.5 to 2.0).

For the complete list of parameters, refer to the [TTS API documentation](/api-reference/text-to-speech/endpoints/v1/stream-speech).
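The required fields and value ranges listed above can be checked on the client before a command is sent. The following is a sketch only: `buildTtsCommand` is a hypothetical helper, and it validates just the four parameters whose constraints are stated in this section. How the server itself responds to out-of-range values is not covered here.

```javascript
// Sketch: validate a TTS command against the constraints listed above, then serialize it.
// `buildTtsCommand` is an illustrative name, not part of the PlayAI API.
function buildTtsCommand(options) {
  const { text, voice, temperature, speed } = options;
  if (typeof text !== 'string' || text.length === 0) {
    throw new Error('`text` is required');
  }
  if (typeof voice !== 'string' || voice.length === 0) {
    throw new Error('`voice` is required');
  }
  if (temperature !== undefined && (temperature < 0 || temperature > 1)) {
    throw new RangeError('`temperature` must be between 0.0 and 1.0');
  }
  if (speed !== undefined && (speed < 0.5 || speed > 2)) {
    throw new RangeError('`speed` must be between 0.5 and 2.0');
  }
  // All other options pass through unchanged.
  return JSON.stringify(options);
}

// Usage (not executed here):
// ws.send(buildTtsCommand({ text: 'Hello', voice: '...', output_format: 'mp3' }));
```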

<br />

# Receiving Audio Output

<Tip>
  If you send a sequence of TTS commands, the audio output will be in the same order as the requests.
</Tip>

After sending a TTS command, you'll receive three kinds of messages, in this order:

* One initial text message with the format `{"type":"start","request_id":<request_id>}` to acknowledge the request.
* The audio output as a series of binary messages.
* One final text message with the format `{"type":"end","request_id":<request_id>}` to indicate the end of the audio stream.

In both text messages, `request_id` is the unique identifier you provided in the TTS command, or `null` if you didn't provide one.

To handle these messages and play the audio, you can use the following approach:

```javascript theme={null}
let audioChunks = [];

ws.onmessage = (event) => {
  if (event.data instanceof Blob) {
    // Received binary audio data
    audioChunks.push(event.data);
  } else {
    // Received a text message (a "start" or "end" notification)
    const message = JSON.parse(event.data);
    if (message.type === 'end') {
      // If you provided a request_id, you can use it to correlate responses
      // End of audio stream, play the audio
      // If you specified a different output_format, you may need to adjust the audio player logic accordingly
      const audioBlob = new Blob(audioChunks, { type: 'audio/mpeg' });
      const audioUrl = URL.createObjectURL(audioBlob);
      const audio = new Audio(audioUrl);
      audio.play();

      // Clear the audio chunks for the next request
      audioChunks = [];
    }
  }
};
```

This code collects the binary audio chunks as they arrive and combines them into a single audio blob when the
*End of Request* message (`{"type":"end","request_id":<request_id>}`) is received.
It then creates an object URL for the blob and plays it with an `Audio` element.
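When several commands are queued on one connection, it can help to group chunks per request instead of sharing a single array. The sketch below relies on the ordering described in this section (a `start` message, then binary chunks, then an `end` message); `createAudioCollector` and its callback signature are hypothetical names chosen for this example.

```javascript
// Sketch: group binary chunks by the request they belong to.
// Assumption: messages arrive as start -> binary chunks -> end, one request at a time.
function createAudioCollector(onComplete) {
  let current = null; // { requestId, chunks } for the request currently streaming

  return function handleMessage(data) {
    if (typeof data !== 'string') {
      // Binary frame: attach it to the request whose "start" was seen most recently.
      if (current) {
        current.chunks.push(data);
      }
      return;
    }
    const message = JSON.parse(data);
    if (message.type === 'start') {
      current = { requestId: message.request_id ?? null, chunks: [] };
    } else if (message.type === 'end' && current) {
      onComplete(current.requestId, current.chunks);
      current = null;
    }
  };
}

// Usage (not executed here):
// const handle = createAudioCollector((requestId, chunks) => { /* build a Blob and play it */ });
// ws.onmessage = (event) => handle(event.data);
```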

<br />

# Error Handling

It's important to implement error handling in your WebSocket client. Here's an example of how to handle errors and connection closures:

```javascript theme={null}
ws.onerror = (error) => {
  console.error('WebSocket Error:', error);
};

ws.onclose = (event) => {
  console.log('WebSocket connection closed:', event.code, event.reason);
  // Implement reconnection logic if needed
};
```
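The reconnection logic mentioned in the `onclose` handler could be built on exponential backoff. This is a sketch under stated assumptions: the helper names, the 500 ms base delay, and the 30 s cap are arbitrary illustrative values. The only protocol fact used is that close code `1000` means a normal closure in the WebSocket standard.

```javascript
// Sketch: delay before reconnect attempt number `attempt` (0-based), doubling each time.
// The base delay and cap are illustrative defaults, not values prescribed by the API.
function backoffDelay(attempt, baseMs = 500, maxMs = 30_000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Sketch: reconnect only after abnormal closures (code 1000 signals a normal close).
function shouldReconnect(closeEvent) {
  return closeEvent.code !== 1000;
}

// Usage (not executed here; `connect` is a hypothetical function that
// re-authenticates and opens a new WebSocket):
// ws.onclose = (event) => {
//   if (shouldReconnect(event)) setTimeout(connect, backoffDelay(attempt++));
// };
```

Because the session URL expires, a reconnect routine would typically request a new URL from the auth endpoint rather than reuse the old one.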

<br />

# Best Practices

1. **Authentication**: Always keep your API key secure. While the WebSocket URL can be shared with client-side code, the API Key and User ID should be kept private.

2. **Error Handling**: Implement robust error handling and reconnection logic in your WebSocket client.

3. **Resource Management**: Close the WebSocket connection when it's no longer needed to free up server resources.

4. **Rate Limiting**: Be aware of [rate limits](/documentation/resources/rate-limits) on the API and implement appropriate throttling in your application.

5. **Testing**: Thoroughly test your implementation with various inputs and network conditions to ensure reliability.
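For the throttling mentioned under rate limiting, one simple approach is to cap how many commands are in flight at once and release the next one when an `end` message arrives. The sketch below is an assumption-laden illustration: `createThrottledSender` is a hypothetical helper, and `maxInFlight = 2` is an arbitrary value, not a documented limit.

```javascript
// Sketch: queue commands and keep at most `maxInFlight` of them outstanding.
// `send` is any function that transmits a string, e.g. (msg) => ws.send(msg).
function createThrottledSender(send, maxInFlight = 2) {
  const pending = [];
  let inFlight = 0;

  function drain() {
    while (inFlight < maxInFlight && pending.length > 0) {
      inFlight += 1;
      send(pending.shift());
    }
  }

  return {
    // Queue a command object; it is sent immediately if capacity allows.
    enqueue(command) {
      pending.push(JSON.stringify(command));
      drain();
    },
    // Call this when an {"type":"end"} message is received for a request.
    onRequestEnd() {
      inFlight = Math.max(0, inFlight - 1);
      drain();
    },
  };
}
```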

By following these guidelines and using the provided examples, you can effectively integrate the PlayAI TTS WebSocket API into your application, enabling real-time text-to-speech functionality with low latency and high performance.
