A successful non-streamed response includes the following members:
Unique ID for each request (not message). Same ID for all responses in a streaming response.
The model used to generate the response.
One or more responses, depending on the n parameter from the request.
Each response includes the following members.
Why the message ended.
The token counts for this request. Per-token billing is based on the prompt token and completion token counts and rates.
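As a rough illustration of the members described above, the sketch below parses a non-streamed response body and estimates its cost from the token counts. The field names (`id`, `model`, `choices`, `finish_reason`, `usage`, `prompt_tokens`, `completion_tokens`), the sample payload, and the rates are assumptions made for this example; they are not confirmed by this reference.

```python
import json

# Hypothetical response body. Every field name and value here is an
# assumption for illustration, not taken from the reference above.
SAMPLE_RESPONSE = json.dumps({
    "id": "req-123",
    "model": "example-model",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello"},
            "finish_reason": "stop",
        },
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 5},
})


def summarize_response(body: str) -> dict:
    """Pull out the request ID, model, finish reasons, and token counts."""
    data = json.loads(body)
    usage = data["usage"]
    return {
        "id": data["id"],
        "model": data["model"],
        # One entry per response; the count depends on the n request parameter.
        "finish_reasons": [c["finish_reason"] for c in data["choices"]],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
    }


def estimate_cost(prompt_tokens, completion_tokens, prompt_rate, completion_rate):
    """Per-token billing: each count multiplied by its own (hypothetical) rate."""
    return prompt_tokens * prompt_rate + completion_tokens * completion_rate


summary = summarize_response(SAMPLE_RESPONSE)
```

Actual rates depend on the model and plan, so `prompt_rate` and `completion_rate` are left as parameters rather than hard-coded.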
Setting stream = true in the request will return a stream of messages, each containing one token. You can read more about streaming calls using the SDK.
The final message will be data: [DONE]. All other messages will have data set to a JSON object with the following fields:
Either an object with the following members or, for the last message, the string “DONE”.
Unique ID for each request (not message). Same ID for all streaming responses.
An array with one object containing the following fields:
Always zero.
Either {"role":"assistant"} or {"content": **token**} with the generated token.
Why the message ended.
usage will be null except for the last chunk, which contains the token usage statistics for the entire request.
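The streaming format described above can be consumed with a small line parser. The sketch below is a minimal example, assuming the messages arrive as `data: ...` lines and that the JSON fields are named `choices`, `delta`, `finish_reason`, and `usage`; those names and the sample lines are assumptions for illustration, and no network call is made.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[str], Optional[dict]]:
    """Join streamed tokens and return (text, finish_reason, usage)."""
    parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and anything unexpected
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # final message of the stream
        chunk = json.loads(payload)
        choice = chunk["choices"][0]  # the array holds exactly one object
        delta = choice.get("delta") or {}
        parts.append(delta.get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
        usage = chunk.get("usage") or usage  # null except on the last chunk
    return "".join(parts), finish_reason, usage


# Hypothetical stream used only to exercise the parser.
SAMPLE_LINES = [
    'data: {"id":"req-1","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}],"usage":null}',
    'data: {"id":"req-1","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}],"usage":null}',
    'data: {"id":"req-1","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":3,"completion_tokens":2}}',
    "data: [DONE]",
]

text, reason, usage = collect_stream(SAMPLE_LINES)
```

In practice `lines` would come from whatever HTTP client or SDK is in use; the parser only needs an iterable of decoded text lines.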
500 - Internal Server Error
429 - Too Many Requests (You are sending requests too quickly.)
503 - Service Unavailable (The engine is currently overloaded, please try again later)
401 - Unauthorized (Incorrect API key provided/Invalid Authentication)
403 - Access Denied
422 - Unprocessable Entity (Request body is malformed)
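One possible way to act on these status codes is sketched below. It retries only 429 and 503, whose descriptions suggest trying again later, and fails immediately on the others. This retry policy, the exponential backoff values, and the `send` callable (a stand-in for whatever function performs the request and returns a `(status, body)` pair) are assumptions of this example, not behavior documented by the API.

```python
import time

ERROR_MESSAGES = {
    401: "Unauthorized",
    403: "Access Denied",
    422: "Unprocessable Entity",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumption: only rate-limit and overload errors are worth retrying.
RETRYABLE = {429, 503}


def should_retry(status: int) -> bool:
    return status in RETRYABLE


def backoff_delays(max_attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts)]


def call_with_retry(send, max_attempts: int = 4, sleep=time.sleep):
    """Call the hypothetical send() until it returns 200 or attempts run out."""
    for attempt, delay in enumerate(backoff_delays(max_attempts)):
        status, body = send()
        if status == 200:
            return body
        if not should_retry(status) or attempt == max_attempts - 1:
            message = ERROR_MESSAGES.get(status, "Unknown error")
            raise RuntimeError(f"{status} - {message}")
        sleep(delay)  # wait before the next attempt
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays, for example by substituting `list.append` to record the waits.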