Description
Which component is this bug for?
OpenAI Instrumentation
Description
When using the Responses API with the OpenAI instrumentor, input messages are missing from the trace.
Other collected attributes, like the token count, have correctly increased. I also sniffed the requests directly and can confirm that additional input messages were sent in the request to OpenAI's API, but these were not in the trace.
Additional context: I'm using store=False and appending responses manually to the API calls due to ZDR (Zero Data Retention). In other words, I'm using the Responses API in a stateless manner.
Reproduction steps
Here's a self-contained reproducible example:
import json

import pytest
from openai import AsyncOpenAI


def add_two_numbers(a: int, b: int) -> int:
    """Adds two numbers together."""
    return a + b


@pytest.mark.asyncio
async def test_openai_responses_tmp():
    client = AsyncOpenAI()
    tools = [
        {
            "type": "function",
            "name": "add_two_numbers",
            "description": "Adds two numbers together.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {
                        "description": "The first number to add.",
                        "type": "integer",
                    },
                    "b": {
                        "description": "The second number to add.",
                        "type": "integer",
                    },
                },
                "additionalProperties": False,
                "required": ["a", "b"],
            },
            "strict": True,
        },
    ]
    context = [{"role": "user", "content": "Please add 1 + 2 and 3 + 4 for me."}]
    initial_params = {
        "model": "gpt-5",
        "instructions": "You are a helpful assistant. Use the add_two_numbers function when asked to add numbers. Make parallel tool calls where possible.",
        "input": context,
        "tools": tools,
        "include": ["reasoning.encrypted_content"],
        "parallel_tool_calls": True,
        "store": False,
        "text": {"verbosity": "low"},
        "reasoning": {"effort": "medium", "summary": "auto"},
    }
    response = await client.responses.create(**initial_params)
    context += response.output

    # Extract tool calls
    tool_calls = []
    for item in response.output:
        if hasattr(item, "type") and item.type == "function_call":
            tool_calls.append(item)
    assert tool_calls, "No tool calls found"

    # Step 2: Execute tool calls
    for tool_call in tool_calls:
        # Parse arguments and execute the function
        args = json.loads(tool_call.arguments)
        assert tool_call.name == "add_two_numbers"
        result = add_two_numbers(args["a"], args["b"])
        context.append(
            {
                "type": "function_call_output",
                "call_id": tool_call.call_id,
                "output": str(result),
            }
        )

    continue_params = {
        "model": "gpt-5",
        "instructions": "You are a helpful assistant. Use the add_two_numbers function when asked to add numbers.",
        "input": context,
        "tools": tools,
        "include": ["reasoning.encrypted_content"],
        "parallel_tool_calls": True,
        "store": False,
        "text": {"verbosity": "low"},
        "reasoning": {"effort": "medium", "summary": "auto"},
    }
    final_response = await client.responses.create(**continue_params)
    print(f"Final response: {final_response}")
When tracing the second request, we see that the input field has the value:
"input": [
  {
    "role": "user",
    "content": "Please add 1 + 2 and 3 + 4 for me.",
    "type": "message"
  },
  {
    "id": "rs_ABC",
    "summary": [],
    "type": "reasoning",
    "encrypted_content": "content"
  },
  {
    "arguments": "{\"a\":1,\"b\":2}",
    "call_id": "call_1",
    "name": "add_two_numbers",
    "type": "function_call",
    "id": "fc_1",
    "status": "completed"
  },
  {
    "arguments": "{\"a\":3,\"b\":4}",
    "call_id": "call_2",
    "name": "add_two_numbers",
    "type": "function_call",
    "id": "fc_2",
    "status": "completed"
  },
  {
    "type": "function_call_output",
    "call_id": "call_1",
    "output": "3"
  },
  {
    "type": "function_call_output",
    "call_id": "call_2",
    "output": "7"
  }
],
Expected behavior
The logged span should contain the tool calls, tool responses, and reasoning summaries (if available) from previous steps.
When tracing the second LLM request, we should see gen_ai fields logged for the tool call and response provided in the input. In practice, these are not included.
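To make the expectation concrete, here is a minimal sketch of how the items in input could be flattened into gen_ai.prompt.N.* attributes. This is not the instrumentor's actual code. The function name flatten_input_items, the tool_calls.0.* and tool_call_id attribute suffixes, and the choice to skip reasoning items without a readable summary are all assumptions made for illustration; only the gen_ai.prompt.N.role and gen_ai.prompt.N.content keys appear in the trace above.

```python
import json
from typing import Any, Dict, List, Optional


def _get(item: Any, key: str, default: Any = None) -> Any:
    """Read a field from either a plain dict or an SDK response object."""
    if isinstance(item, dict):
        return item.get(key, default)
    return getattr(item, key, default)


def flatten_input_items(instructions: Optional[str], items: List[Any]) -> Dict[str, Any]:
    """Hypothetical helper: map Responses API input items to span attributes."""
    attrs: Dict[str, Any] = {}
    idx = 0
    if instructions:
        attrs[f"gen_ai.prompt.{idx}.role"] = "system"
        attrs[f"gen_ai.prompt.{idx}.content"] = instructions
        idx += 1
    for item in items:
        prefix = f"gen_ai.prompt.{idx}"
        # Plain messages may omit "type", as in the first request above.
        item_type = _get(item, "type", "message")
        if item_type == "message":
            content = _get(item, "content")
            attrs[f"{prefix}.role"] = _get(item, "role")
            attrs[f"{prefix}.content"] = (
                content if isinstance(content, str) else json.dumps(content, default=str)
            )
        elif item_type == "function_call":
            # Assumed attribute names, one prompt entry per tool call.
            attrs[f"{prefix}.role"] = "assistant"
            attrs[f"{prefix}.tool_calls.0.id"] = _get(item, "call_id")
            attrs[f"{prefix}.tool_calls.0.name"] = _get(item, "name")
            attrs[f"{prefix}.tool_calls.0.arguments"] = _get(item, "arguments")
        elif item_type == "function_call_output":
            attrs[f"{prefix}.role"] = "tool"
            attrs[f"{prefix}.tool_call_id"] = _get(item, "call_id")
            attrs[f"{prefix}.content"] = _get(item, "output")
        elif item_type == "reasoning":
            texts = [_get(part, "text", "") for part in (_get(item, "summary") or [])]
            if not any(texts):
                continue  # no readable summary; encrypted content is never recorded
            attrs[f"{prefix}.role"] = "assistant"
            attrs[f"{prefix}.content"] = "\n".join(texts)
        else:
            continue  # unknown item types are ignored in this sketch
        idx += 1
    return attrs
```

Applied to the input shown in the reproduction, a mapping like this would yield entries for both function calls and both function call outputs, in addition to the system and user prompts that are already traced.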
Actual Behavior with Screenshots
Relevant metadata for second LLM trace (note: no tool calls traced):
gen_ai.completion.0.content: |-
  1 + 2 = 3
  3 + 4 = 7
gen_ai.completion.0.role: assistant
gen_ai.prompt.0.content: You are a helpful assistant. Use the add_two_numbers function when asked to add numbers.
gen_ai.prompt.0.role: system
gen_ai.prompt.1.content: Please add 1 + 2 and 3 + 4 for me.
gen_ai.prompt.1.role: user
gen_ai.request.model: gpt-5
gen_ai.response.id: resp_xxx
gen_ai.response.model: gpt-5-2025-08-07
gen_ai.system: openai
gen_ai.usage.cache_read_input_tokens: 0
gen_ai.usage.input_tokens: 351
gen_ai.usage.output_tokens: 19
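As a quick way to quantify the gap, the sketch below counts how many prompt entries a span recorded and compares that with the number of items sent. The helper names are hypothetical, and the comparison assumes one gen_ai.prompt.N entry per input item plus one for instructions, which is an assumption about the desired behavior rather than something the instrumentor documents.

```python
import re
from typing import Any, Dict, List, Set

_PROMPT_KEY = re.compile(r"^gen_ai\.prompt\.(\d+)\.")


def traced_prompt_indices(attributes: Dict[str, Any]) -> Set[int]:
    """Collect the distinct N values from gen_ai.prompt.N.* attribute keys."""
    indices = set()
    for key in attributes:
        match = _PROMPT_KEY.match(key)
        if match:
            indices.add(int(match.group(1)))
    return indices


def count_untraced_items(
    attributes: Dict[str, Any], input_items: List[Any], has_instructions: bool
) -> int:
    """Hypothetical check: items sent minus prompt entries found on the span."""
    expected = len(input_items) + (1 if has_instructions else 0)
    return expected - len(traced_prompt_indices(attributes))


# Attributes copied from the second span above (prompt keys only).
span_attributes = {
    "gen_ai.prompt.0.content": "You are a helpful assistant. Use the add_two_numbers function when asked to add numbers.",
    "gen_ai.prompt.0.role": "system",
    "gen_ai.prompt.1.content": "Please add 1 + 2 and 3 + 4 for me.",
    "gen_ai.prompt.1.role": "user",
}
# The second request carried six input items; only their types matter here.
sent_items = [
    {"type": "message"},
    {"type": "reasoning"},
    {"type": "function_call"},
    {"type": "function_call"},
    {"type": "function_call_output"},
    {"type": "function_call_output"},
]
missing = count_untraced_items(span_attributes, sent_items, has_instructions=True)
```

With the data from this report, two prompt entries are traced against seven items sent (six input items plus the instructions).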
Python Version
No response
Provide any additional context for the Bug.
- When I used OpenAI's Chat Completions API, the instrumentation worked correctly, so I believe this comes down to a difference in how traces are collected for Responses versus Chat Completions.
Have you spent some time to check if this bug has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit PR?
None