
[Feature Request]: Expose Ollama "thinking" in FunctionAgent's AgentStream #19270

@RakeshReddyKondeti

Description

Feature Description

Ollama models now expose a thinking field in their streaming responses, providing insight into the model's intermediate reasoning steps. Support for this was recently added to the Ollama class in LlamaIndex.

I am requesting that FunctionAgent (specifically in function_agent.py) and the AgentStream class be updated to include the thinking field from the LLM response in the streamed events. This would allow downstream applications and UIs to display the model's "thinking" process in real time:

ctx.write_event_to_stream(
    AgentStream(
        delta=last_chat_response.delta or "",
        response=last_chat_response.message.content or "",
        tool_calls=tool_calls or [],
        raw=raw,
        current_agent_name=self.name,
        thinking_delta=last_chat_response.additional_kwargs.get("thinking_delta", ""),
    )
)
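As a self-contained sketch of the proposed shape (plain dataclasses standing in for LlamaIndex's actual Pydantic event classes; the thinking_delta field and its default are assumptions from this request, not the current API), the new field could default to an empty string so providers without a thinking field are unaffected:

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class AgentStream:
    """Stand-in for the real AgentStream event (hypothetical shape)."""
    delta: str = ""
    response: str = ""
    tool_calls: List[Any] = field(default_factory=list)
    raw: Any = None
    current_agent_name: str = ""
    # Proposed field: stays empty for providers with no thinking tokens.
    thinking_delta: str = ""

# A provider that lacks "thinking" simply never sets the field...
ev = AgentStream(delta="Hello", current_agent_name="agent")
assert ev.thinking_delta == ""

# ...while Ollama-style responses can populate it from additional_kwargs,
# using .get() so the lookup is safe when the key is absent.
additional_kwargs = {"thinking_delta": "Considering tool use..."}
ev = AgentStream(
    delta="",
    thinking_delta=additional_kwargs.get("thinking_delta", ""),
)
print(ev.thinking_delta)
```

Defaulting to an empty string keeps the event backward compatible: existing consumers that never read thinking_delta see no behavior change.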

Reason

Currently, the AgentStream object does not include or expose the thinking field, even though it is available in the Ollama LLM response (last_chat_response.additional_kwargs["thinking_delta"]). As a result, there is no way for downstream applications to display this information. Since LlamaIndex is a framework supporting multiple LLM providers, and not all providers expose a thinking field, this field is not currently part of the standard agent streaming interface.

Value of Feature

Exposing the thinking field in AgentStream would enable developers to build interactive UIs that can show the model's intermediate reasoning ("thinking") as it streams, improving transparency and user experience.
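To illustrate the UI value, here is a minimal hypothetical consumer that routes thinking tokens and answer tokens to separate channels as events arrive (SimpleNamespace objects stand in for the streamed AgentStream events; the render helper and event contents are invented for illustration):

```python
from types import SimpleNamespace

def render(events):
    """Split AgentStream-like events into (thinking text, answer text)."""
    thinking, answer = [], []
    for ev in events:
        # thinking_delta may be absent for providers without reasoning traces,
        # so fall back to an empty string.
        if getattr(ev, "thinking_delta", ""):
            thinking.append(ev.thinking_delta)
        if ev.delta:
            answer.append(ev.delta)
    return "".join(thinking), "".join(answer)

# Simulated stream: one thinking chunk followed by two answer chunks.
events = [
    SimpleNamespace(delta="", thinking_delta="Let me check the weather tool. "),
    SimpleNamespace(delta="It is ", thinking_delta=""),
    SimpleNamespace(delta="sunny.", thinking_delta=""),
]
print(render(events))
```

A real UI would stream each channel incrementally rather than joining at the end, e.g. rendering thinking text in a collapsible panel above the answer.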


    Labels

    enhancement (New feature or request), triage (Issue needs to be triaged/prioritized)
