# Responses API

## Overview
The Responses API is OpenAI’s newer API surface — an alternative to Chat
Completions. It uses a different wire format: system messages become an
`instructions` parameter, user and assistant messages become an `input` array,
tools use a flat structure, and structured output is configured via `text.format`
instead of `response_format`.

To use it, set `apiType: responses` in your `.prompty` file. No code changes are
needed — the runtime handles the wire format conversion automatically.
```mermaid
flowchart LR
    subgraph CC["Chat Completions"]
        direction LR
        A1["messages array\n(system + user\n+ assistant)"] --> B1["chat.completions\n.create()"]
        B1 --> C1["choices[0]\n.message.content"]
    end
    subgraph RA["Responses API"]
        direction LR
        A2["instructions\n+ input array"] --> B2["responses\n.create()"]
        B2 --> C2["output[]\nitems"]
    end
    style CC fill:#dbeafe,stroke:#3b82f6,color:#1e40af
    style RA fill:#d1fae5,stroke:#10b981,color:#065f46
    style A1 fill:#bfdbfe,stroke:#3b82f6,color:#1e3a8a
    style B1 fill:#bfdbfe,stroke:#3b82f6,color:#1e3a8a
    style C1 fill:#bfdbfe,stroke:#3b82f6,color:#1e3a8a
    style A2 fill:#a7f3d0,stroke:#10b981,color:#065f46
    style B2 fill:#a7f3d0,stroke:#10b981,color:#065f46
    style C2 fill:#a7f3d0,stroke:#10b981,color:#065f46
```
## Basic Configuration

Set `apiType: responses` in the `model` section of your `.prompty` frontmatter:
```markdown
---
name: responses-example
model:
  id: gpt-4o
  provider: openai
  apiType: responses
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
  options:
    temperature: 0.7
    maxOutputTokens: 1000
inputs:
  question:
    kind: string
    default: What is Prompty?
---
system:
You are a helpful assistant.

user:
{{question}}
```

For Microsoft Foundry:
```markdown
---
name: foundry-responses
model:
  id: gpt-4o
  provider: foundry
  apiType: responses
  connection:
    kind: key
    endpoint: ${env:AZURE_AI_PROJECT_ENDPOINT}
    apiKey: ${env:AZURE_AI_PROJECT_KEY}
---
system:
You are a helpful assistant.

user:
{{question}}
```

The Responses API is transparent to your calling code. The same `invoke()` and
`invoke_async()` functions work — the runtime dispatches to `responses.create()`
based on the `apiType` in the `.prompty` file.
```python
from prompty import invoke, invoke_async

# Sync
result = invoke("my-prompt.prompty", inputs={"question": "Hello!"})
print(result)  # "Hi there! How can I help?"

# Async
result = await invoke_async("my-prompt.prompty", inputs={"question": "Hello!"})
print(result)
```

```typescript
import { invoke } from "@prompty/core";
import "@prompty/openai"; // or "@prompty/foundry"

const result = await invoke("my-prompt.prompty", { question: "Hello!" });
console.log(result); // "Hi there! How can I help?"
```

```csharp
using Prompty.Core;

var result = await Pipeline.InvokeAsync("my-prompt.prompty", new() { ["question"] = "Hello!" });
Console.WriteLine(result); // "Hi there! How can I help?"
```

Rust does not yet support the OpenAI Responses API. Use the standard Chat Completions API instead. Contributions welcome!
## How It Differs from Chat Completions

Under the hood, the runtime converts your messages to a different wire format
when `apiType: responses` is set. You don’t need to handle this yourself — but
understanding the differences helps when debugging.
### Request Format

| Aspect | Chat Completions | Responses API |
|---|---|---|
| API call | `client.chat.completions.create()` | `client.responses.create()` |
| System messages | In `messages` array with `role: system` | Separate `instructions` parameter |
| User/assistant messages | In `messages` array | In `input` array |
| Tool definitions | Nested: `{type: "function", function: {name, parameters}}` | Flat: `{type: "function", name, parameters}` |
| Structured output | `response_format` parameter | `text.format` parameter |
| Max tokens option | `max_completion_tokens` | `max_output_tokens` |
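The tool-definition row above is the easiest difference to see in code. Here is an illustrative sketch of the nested-to-flat conversion — `flatten_tool` is a hypothetical helper for demonstration, not part of the Prompty runtime:

```python
# Illustrative sketch of the nested -> flat tool-definition conversion.
# `flatten_tool` is a hypothetical helper, not a Prompty runtime API.

def flatten_tool(chat_tool: dict) -> dict:
    """Convert a Chat Completions tool definition to the flat Responses API shape."""
    fn = chat_tool["function"]
    flat = {"type": "function", "name": fn["name"], "parameters": fn["parameters"]}
    if "description" in fn:
        flat["description"] = fn["description"]
    return flat

nested = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

flat = flatten_tool(nested)
print(flat["name"])        # get_weather
print("function" in flat)  # False -- no nested wrapper in the flat shape
```

The flat shape simply hoists `name`, `description`, and `parameters` up one level; nothing else about the JSON Schema changes.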
### Response Format

| Aspect | Chat Completions | Responses API |
|---|---|---|
| Response object | `object: "chat.completion"` | `object: "response"` |
| Content location | `choices[0].message.content` | `output[]` items or `output_text` |
| Tool calls | `choices[0].message.tool_calls` | `output[]` items with `type: "function_call"` |
| Finish indicator | `finish_reason: "stop"` | No `function_call` items in `output` |
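The content-location difference can be sketched in a few lines. This hypothetical helper (not a Prompty API) works on a plain-dict payload shaped like the table above, preferring the `output_text` convenience field when present:

```python
# Hypothetical helper showing where assistant text lives in a
# Responses API-shaped payload (plain dicts, for illustration only).

def extract_text(response: dict) -> str:
    """Pull assistant text out of a Responses API-shaped payload."""
    # Some SDKs expose an `output_text` convenience field; prefer it.
    if response.get("output_text"):
        return response["output_text"]
    # Otherwise walk output[] items and collect output_text content parts.
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    parts.append(part["text"])
    return "".join(parts)

response = {
    "object": "response",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hi there!"}],
        }
    ],
}
print(extract_text(response))  # Hi there!
```

Compare with Chat Completions, where the same text would sit at `choices[0].message.content`.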
### Wire Format Example

Here’s what the runtime sends for a simple prompt with `apiType: responses`:

```json
{
  "model": "gpt-4o",
  "instructions": "You are a helpful assistant.",
  "input": [
    { "role": "user", "content": "What is Prompty?" }
  ],
  "max_output_tokens": 1000,
  "temperature": 0.7
}
```

Compare with the equivalent Chat Completions request:

```json
{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is Prompty?" }
  ],
  "max_completion_tokens": 1000,
  "temperature": 0.7
}
```

## Tool Calling
Tool calling works with the Responses API through the same `turn()` and
`turn_async()` functions used for Chat Completions. The agent loop
automatically detects the Responses API response format and handles it
correctly.
### Prompty File with Tools

```markdown
---
name: weather-agent
model:
  id: gpt-4o
  provider: openai
  apiType: responses
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
  options:
    temperature: 0
inputs:
  question:
    kind: string
    default: What's the weather?
tools:
  - name: get_weather
    kind: function
    description: Get the current weather for a city
    parameters:
      - name: city
        kind: string
        description: City name
        required: true
    strict: true
---
system:
You are a helpful assistant with access to weather tools.

user:
{{question}}
```

### Running the Agent Loop
```python
from prompty import load, turn, tool, bind_tools

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"72°F and sunny in {city}"

agent = load("responses-agent.prompty")
tools = bind_tools(agent, [get_weather])

result = turn(
    agent,
    inputs={"question": "What's the weather in Seattle?"},
    tools=tools,
    max_iterations=10,
)
print(result)  # "It's currently 72°F and sunny in Seattle!"
```

```typescript
import { load, turn, tool, bindTools } from "@prompty/core";

const getWeather = tool(
  (city: string) => `72°F and sunny in ${city}`,
  {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: [{ name: "city", kind: "string", required: true }],
  },
);

const agent = await load("responses-agent.prompty");
const tools = bindTools(agent, [getWeather]);

const result = await turn(agent, {
  question: "What's the weather in Seattle?",
}, { tools, maxIterations: 10 });
console.log(result);
```

```csharp
using Prompty.Core;

public class WeatherTools
{
    [Tool(Name = "get_weather", Description = "Get the current weather")]
    public string GetWeather(string city)
    {
        return $"72°F and sunny in {city}";
    }
}

var agent = PromptyLoader.Load("responses-agent.prompty");
var service = new WeatherTools();
var tools = ToolAttribute.BindTools(agent, service);

var result = await Pipeline.TurnAsync(
    agent,
    new() { ["question"] = "What's the weather in Seattle?" },
    tools: tools,
    maxIterations: 10);
Console.WriteLine(result); // "It's currently 72°F and sunny in Seattle!"
```

Rust does not yet support the OpenAI Responses API. Use the standard Chat Completions API instead. Contributions welcome!
### How Tool Calls Work with the Responses API

The Responses API uses a different format for tool calls than Chat Completions. The runtime handles this automatically, but here’s what happens under the hood:
```mermaid
flowchart TD
    A["Send instructions + input\nto responses.create()"] --> B["Receive response\nwith output[] items"]
    B --> C{"output contains\nfunction_call items?"}
    C -- Yes --> D["Execute tool functions"]
    D --> E["Build input with\nfunction_call +\nfunction_call_output items"]
    E -.-> A
    C -- No --> F["Extract output_text\nand return result"]
    style A fill:#3b82f6,stroke:#1d4ed8,color:#fff
    style B fill:#3b82f6,stroke:#1d4ed8,color:#fff
    style C fill:#f59e0b,stroke:#d97706,color:#fff
    style D fill:#10b981,stroke:#059669,color:#fff
    style E fill:#10b981,stroke:#059669,color:#fff
    style F fill:#1d4ed8,stroke:#1e3a8a,color:#fff
```
The wire format for tool interactions differs from Chat Completions:

| Step | Chat Completions | Responses API |
|---|---|---|
| Tool call from LLM | `message.tool_calls[]` with `id`, `function.name`, `function.arguments` | `output[]` item with `type: "function_call"`, `call_id`, `name`, `arguments` |
| Tool result to LLM | Message with `role: "tool"` and `tool_call_id` | `function_call_output` input item with `call_id` and `output` |
| Context preservation | Assistant message with `tool_calls` | Original `function_call` items re-sent in `input` |
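The last two rows — returning tool results and re-sending the original `function_call` items — can be sketched as a small helper. This is an illustrative sketch of the shape the agent loop produces, not the Prompty runtime’s actual implementation:

```python
# Hypothetical sketch of building the next `input` array after executing tools.
# Field names (`call_id`, `function_call_output`, `output`) follow the table above.

def build_followup_input(prior_input: list, calls: list, results: dict) -> list:
    """Re-send each function_call item, paired with its function_call_output."""
    next_input = list(prior_input)
    for call in calls:
        next_input.append(call)  # re-send the model's original function_call item
        next_input.append({
            "type": "function_call_output",
            "call_id": call["call_id"],
            "output": results[call["call_id"]],
        })
    return next_input

prior = [{"role": "user", "content": "What's the weather in Seattle?"}]
calls = [{"type": "function_call", "call_id": "call_1",
          "name": "get_weather", "arguments": '{"city": "Seattle"}'}]
results = {"call_1": "72°F and sunny in Seattle"}

next_input = build_followup_input(prior, calls, results)
print(len(next_input))         # 3
print(next_input[-1]["type"])  # function_call_output
```

Pairing each output with the `call_id` of its originating call is what lets the model match results back to the tools it requested.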
## Structured Output

When `outputs` is defined, the runtime converts it to the Responses API’s
`text.format` parameter (instead of Chat Completions’ `response_format`). The
processor automatically parses the JSON response.
```markdown
---
name: weather-report
model:
  id: gpt-4o
  provider: openai
  apiType: responses
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
outputs:
  - name: city
    kind: string
    description: The city name
  - name: temperature
    kind: integer
    description: Temperature in Fahrenheit
  - name: conditions
    kind: string
    description: Current weather conditions
---
system:
Return the current weather for the requested city.

user:
Weather in {{city}}?
```

```python
from prompty import invoke

result = invoke("structured-responses.prompty", inputs={"city": "Seattle"})

# result is already a parsed dict
print(result["city"])         # "Seattle"
print(result["temperature"])  # 62
print(result["conditions"])   # "Partly cloudy"
print(type(result))           # <class 'dict'>
```

```typescript
import { invoke } from "@prompty/core";
import "@prompty/openai";

const result = await invoke("structured-responses.prompty", { city: "Seattle" });

// result is already a parsed object
console.log(result.city);        // "Seattle"
console.log(result.temperature); // 62
console.log(result.conditions);  // "Partly cloudy"
```

```csharp
using Prompty.Core;

var result = await Pipeline.InvokeAsync("structured-responses.prompty", new() { ["city"] = "Seattle" });

// result is a parsed JsonElement when outputs is present
Console.WriteLine(result.GetProperty("city"));        // "Seattle"
Console.WriteLine(result.GetProperty("temperature")); // 62
Console.WriteLine(result.GetProperty("conditions"));  // "Partly cloudy"
```

Rust does not yet support the OpenAI Responses API. Use the standard Chat Completions API instead. Contributions welcome!
The runtime sends structured output as `text.format` instead of `response_format`:

```json
{
  "model": "gpt-4o",
  "instructions": "Return the current weather for the requested city.",
  "input": [{ "role": "user", "content": "Weather in Seattle?" }],
  "text": {
    "format": {
      "type": "json_schema",
      "name": "weather_report",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "city": { "type": "string", "description": "The city name" },
          "temperature": { "type": "integer", "description": "Temperature in Fahrenheit" },
          "conditions": { "type": "string", "description": "Current weather conditions" }
        },
        "required": ["city", "temperature", "conditions"],
        "additionalProperties": false
      }
    }
  }
}
```

## Provider Support
Not all providers support every API type. Here’s what’s available:
| Provider | `apiType: chat` | `apiType: responses` | `apiType: embedding` | `apiType: image` |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ |
| Microsoft Foundry | ✅ | ✅ | ✅ | ❌ |
| Anthropic | ✅ | ❌ | ❌ | ❌ |
## When to Use Responses API vs Chat Completions

| Use Case | Recommendation |
|---|---|
| Maximum provider compatibility | `apiType: chat` (default) |
| OpenAI or Foundry only, want the latest API features | `apiType: responses` |
| Anthropic models | `apiType: chat` (only option) |
| Existing prompts that work fine | Keep `apiType: chat` |
| Starting a new project on OpenAI | Either works — `responses` is newer |