Enhance your app with our audio-in, audio-out API, enabling seamless, natural conversations with your PlayAI agent. Transform your user experience with the power of voice.
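The basic flow has four steps:
1. Connect to the wss://api.play.ai/v1/talk/<your_agent_id> URL.
2. Send a {"type":"setup","apiKey":"yourKey"} message as the first message.
3. Send audio input in {"type":"audioIn","data":"base64Data"} messages.
4. Receive audio output in {"type":"audioStream","data":"base64Data"} messages.

To start a conversation, open a WebSocket connection to our talk URL, including the agentId as a path parameter: wss://api.play.ai/v1/talk/<your_agent_id>

For example, if Agent-XP5tVPa8GDWym6j is the ID of an agent you have created via our Web UI or through our Create Agent endpoint, the WebSocket URL should look like: wss://api.play.ai/v1/talk/Agent-XP5tVPa8GDWym6j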
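As a minimal sketch of the URL rule above, the helper below builds the talk URL from an agent ID. The function name `talk_url` and the constant `TALK_BASE_URL` are illustrative and not part of the API; percent-encoding the ID is a defensive assumption, not a documented requirement.

```python
from urllib.parse import quote

# Base URL as given in the text above.
TALK_BASE_URL = "wss://api.play.ai/v1/talk"


def talk_url(agent_id: str) -> str:
    """Return the WebSocket URL for a conversation with the given agent."""
    if not agent_id:
        raise ValueError("agent_id must be a non-empty string")
    # Percent-encode the ID so it always forms a single path segment
    # (an assumption; IDs like the example contain only safe characters).
    return f"{TALK_BASE_URL}/{quote(agent_id, safe='')}"


print(talk_url("Agent-XP5tVPa8GDWym6j"))
```

The resulting string can be passed to whichever WebSocket client you use.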
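Before sending or receiving audio, send a setup message to authenticate and configure your session.

Figure: WebSocket basic connection, setup and message flow

At minimum, the setup message needs your apiKey. This assumes you are comfortable with the default values for the audio input and audio output formats. In this scenario, your first setup message could be as simple as: {"type":"setup","apiKey":"yourKey"}

The setup message accepts the following properties: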
Property | Accepted values | Description | Default value |
---|---|---|---|
type (required) | "setup" | Specifies that the message is a setup command. | - |
apiKey (required) | string | Your API Key. | - |
outputFormat (optional) | | The format of audio you want our agent to output in the audioStream messages. | "mp3" |
outputSampleRate (optional) | number | The sample rate of the audio you want our agent to output in the audioStream messages | 44100 |
inputEncoding (optional) | "media-container" for formats with headers; the format name (e.g. "mulaw", "flac") for headerless formats | The encoding of the audio you intend to send in the audioIn messages. If you are sending audio in formats that use media containers (that is, audio that contains headers, such as mp4, m4a, mp3, ogg, flac, wav, mkv, webm, aiff), use "media-container" as the value for inputEncoding, or don’t pass any value at all, since "media-container" is the default. This instructs our servers to process the audio based on the data headers. If, on the other hand, you send audio in a headerless format, you must specify that format, e.g. by setting inputEncoding to "mulaw", "flac", etc. | "media-container" |
inputSampleRate (optional) | number | The sample rate of the audio you intend to send. Required if you specify an inputEncoding other than "media-container"; optional otherwise. | - |
customGreeting (optional) | string | Your agent will say this message to start every conversation. This overrides the agent’s greeting. | - |
prompt (optional) | string | Give instructions to your AI about how it should behave and interact with others in conversation. This is appended to the agent’s prompt. | "" |
continueConversation (optional) | string | If you want to continue a conversation from a previous session, pass the conversationId here. The agent will continue the conversation from where it left off. | - |
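The table's rules can be sketched as a small builder that serializes a setup message and enforces the one cross-field constraint stated above (inputSampleRate is required when inputEncoding is not "media-container"). The function name `build_setup_message` is hypothetical; the property names come from the table, and any extra options are passed through without validation.

```python
import json
from typing import Any, Optional


def build_setup_message(
    api_key: str,
    input_encoding: Optional[str] = None,
    input_sample_rate: Optional[int] = None,
    **options: Any,
) -> str:
    """Serialize a setup message, leaving unset options to the server defaults."""
    headerless = input_encoding not in (None, "media-container")
    if headerless and input_sample_rate is None:
        # Per the table: a headerless encoding requires an explicit sample rate.
        raise ValueError(
            "inputSampleRate is required when inputEncoding is not 'media-container'"
        )
    message = {"type": "setup", "apiKey": api_key}
    if input_encoding is not None:
        message["inputEncoding"] = input_encoding
    if input_sample_rate is not None:
        message["inputSampleRate"] = input_sample_rate
    # Remaining options (outputFormat, prompt, customGreeting, ...) are
    # copied verbatim, using the property names from the table.
    message.update(options)
    return json.dumps(message)
```

For example, `build_setup_message("yourKey")` yields the minimal message, while `build_setup_message("yourKey", input_encoding="mulaw", input_sample_rate=8000, outputFormat="mp3")` configures both directions; the 8000 Hz rate here is only an illustrative value.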
audioIn: Sending Audio Input
To send audio input to the agent, use the audioIn message. The audio must be sent as a base64-encoded string in the data field. The message format is: {"type":"audioIn","data":"base64Data"}
The audio you send must match the inputEncoding and inputSampleRate you configured in the setup options.
Assuming myWs is a WebSocket connected to our /v1/talk endpoint, audio can be sent directly from the browser as it is captured.
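The original browser sample is not reproduced here. As a rough stand-in, written in Python rather than browser JavaScript, the sketch below turns raw audio bytes into audioIn messages. Splitting the audio into independently encoded chunks and the 4096-byte default are assumptions for illustration, not documented requirements; `audio_in_messages` is a hypothetical helper name.

```python
import base64
import json
from typing import Iterator


def audio_in_messages(audio: bytes, chunk_size: int = 4096) -> Iterator[str]:
    """Yield one audioIn message per chunk of audio bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        # The data field carries the chunk as a base64 string.
        encoded = base64.b64encode(chunk).decode("ascii")
        yield json.dumps({"type": "audioIn", "data": encoded})
```

Each yielded string would then be sent over the open connection, for instance by calling the send method of your WebSocket client once per message.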
audioStream: Receiving Audio Output
Audio output from the agent arrives in audioStream messages. The message format is: {"type":"audioStream","data":"base64Data"}
The audio you receive will match the outputFormat and outputSampleRate you configured in the setup options.

voiceActivityStart and voiceActivityEnd
During the conversation you will receive voiceActivityStart and voiceActivityEnd messages indicating the detection of speech activity in the audio input. These messages help in understanding when the user starts and stops speaking.
When our service detects that the user started to speak, it emits a voiceActivityStart event, a message whose type is voiceActivityStart.
A voiceActivityStart generally indicates the user wanted to interrupt the agent.
Similarly, when our service detects that the user stopped speaking, it emits a voiceActivityEnd event.
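One way to consume these server messages is sketched below: audioStream payloads are base64-decoded, and the two voice-activity events update a small state object. `ConversationState` and `handle_server_message` are hypothetical names, and the sketch assumes the voice-activity messages are identified by their type field like the other messages in this document. Treating every voiceActivityStart as an interruption is a design choice, not a rule of the API.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationState:
    user_speaking: bool = False  # True between voiceActivityStart and voiceActivityEnd
    interrupted: bool = False    # set once the user starts speaking


def handle_server_message(raw: str, state: ConversationState) -> Optional[bytes]:
    """Update state from one server message; return audio bytes for audioStream."""
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "audioStream":
        # The payload is base64 text; decode it back to audio bytes.
        return base64.b64decode(message["data"])
    if kind == "voiceActivityStart":
        state.user_speaking = True
        state.interrupted = True  # assumption: speech means "stop the agent"
    elif kind == "voiceActivityEnd":
        state.user_speaking = False
    return None
```

A caller could check `state.interrupted` after each message and pause or discard queued agent audio when it becomes true.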
newAudioStream: Handling New Audio Streams
A newAudioStream message indicates the start of the audio of a new response.
It is recommended to clear your player buffer and start playing the new stream content upon receiving this message.
This message contains no additional fields.
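The recommendation above can be sketched as a playback queue that drops pending audio whenever a new stream starts. `PlaybackQueue` is a hypothetical class; it takes already-decoded audio bytes and leaves actual playback to whatever audio library you use.

```python
from collections import deque
from typing import Optional


class PlaybackQueue:
    """Holds decoded audio chunks that are waiting to be played."""

    def __init__(self) -> None:
        self._chunks = deque()

    def on_message(self, message_type: str, audio: bytes = b"") -> None:
        if message_type == "newAudioStream":
            # A new response begins: discard audio left from the previous one.
            self._chunks.clear()
        elif message_type == "audioStream":
            self._chunks.append(audio)

    def next_chunk(self) -> Optional[bytes]:
        """Return the next chunk to play, or None when the queue is empty."""
        return self._chunks.popleft() if self._chunks else None
```

Clearing on newAudioStream keeps stale audio from a superseded response from playing after the new one has begun.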
Error handling
Errors are reported in messages of the error message type, which carry a numeric code and a message. The possible codes are:
Error Code | Error Message |
---|---|
1001 | Invalid authorization token. |
1002 | Invalid agent id. |
1003 | Invalid authorization credentials. |
1005 | Not enough credits. |
4400 | Invalid parameters. Indicates the message sent to the server failed to match the expected format. Double check the logic and try again. |
4401 | Unauthorized. Invalid authorization token or invalid authorization credentials for specified agent. |
4429 | You have reached the maximum number of concurrent connections allowed by your current plan. Please consider upgrading your plan or reducing the number of active connections to continue. |
4500 | Generic error code for internal errors, such as failures to generate responses. Generally, the user is not at fault when these happen. An appropriate reaction is to wait a few moments and try again. If the problem persists, contacting support is advised. |
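As a sketch of how a client might react to these codes, the function below maps each code from the table to a coarse action. The action labels and the name `error_action` are illustrative; only the "wait and try again" advice for 4500 comes directly from the table.

```python
# Coarse client-side reactions, keyed by the error codes listed in the table.
ERROR_ACTIONS = {
    1001: "check-credentials",   # invalid authorization token
    1002: "check-agent-id",      # invalid agent id
    1003: "check-credentials",   # invalid authorization credentials
    1005: "add-credits",         # not enough credits
    4400: "fix-request",         # message did not match the expected format
    4401: "check-credentials",   # unauthorized for the specified agent
    4429: "reduce-connections",  # concurrent connection limit reached
    4500: "retry",               # internal error: wait a few moments, retry
}


def error_action(code: int) -> str:
    """Return a suggested reaction for an error code, or 'unknown'."""
    return ERROR_ACTIONS.get(code, "unknown")


def should_retry(code: int) -> bool:
    """Only internal errors are treated as worth retrying unchanged."""
    return error_action(code) == "retry"
```

With this mapping, `should_retry(4500)` is true, while authentication and parameter errors are surfaced to the caller instead of being retried.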