[FEATURE] Prompt caching support for LiteLLM #937

@Didir19

Description

Problem Statement

LiteLLM supports prompt caching for Bedrock, but Strands does not support prompt caching when the model is accessed through LiteLLM.

Proposed Solution

LiteLLM supports prompt caching for Bedrock by following the OpenAI prompt caching usage object format: https://docs.litellm.ai/docs/completion/prompt_caching

"usage": {
  "prompt_tokens": 2006,
  "completion_tokens": 300,
  "total_tokens": 2306,
  "prompt_tokens_details": {
    "cached_tokens": 1920
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0
  },
  # ANTHROPIC_ONLY #
  "cache_creation_input_tokens": 0
}

Strands should support that format as well.
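A minimal sketch of how the LiteLLM provider in Strands could translate that usage object into the Bedrock-style cache metrics Strands already reports. The function name and the exact integration point in strands.models.litellm are assumptions; only the LiteLLM field names come from the docs linked above.

# Sketch: map a LiteLLM/OpenAI-style usage payload onto Bedrock-style
# usage metrics. The cacheReadInputTokens / cacheWriteInputTokens names
# follow the Bedrock Converse usage shape; where this hooks into
# strands.models.litellm is an assumption.
from typing import Any, Dict


def map_litellm_usage(usage: Dict[str, Any]) -> Dict[str, int]:
    """Translate a LiteLLM usage payload into Bedrock-style usage metrics."""
    prompt_details = usage.get("prompt_tokens_details") or {}

    mapped = {
        "inputTokens": usage.get("prompt_tokens", 0),
        "outputTokens": usage.get("completion_tokens", 0),
        "totalTokens": usage.get("total_tokens", 0),
    }

    # cached_tokens: how many prompt tokens were served from the cache.
    cached = prompt_details.get("cached_tokens")
    if cached:
        mapped["cacheReadInputTokens"] = cached

    # Anthropic-only field: tokens written to the cache on this request.
    cache_write = usage.get("cache_creation_input_tokens")
    if cache_write:
        mapped["cacheWriteInputTokens"] = cache_write

    return mapped


if __name__ == "__main__":
    example = {
        "prompt_tokens": 2006,
        "completion_tokens": 300,
        "total_tokens": 2306,
        "prompt_tokens_details": {"cached_tokens": 1920},
        "completion_tokens_details": {"reasoning_tokens": 0},
        "cache_creation_input_tokens": 0,
    }
    print(map_litellm_usage(example))
    # {'inputTokens': 2006, 'outputTokens': 300, 'totalTokens': 2306,
    #  'cacheReadInputTokens': 1920}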

Use Case

The same way prompt caching is used with a Bedrock model directly; see the sketch below.
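A hypothetical sketch of what parity with the direct Bedrock path could look like. The cachePoint content block mirrors the Bedrock Converse API; whether Agent/LiteLLMModel would accept it in system_prompt like this is an assumption about a possible design, not current behavior.

from strands import Agent
from strands.models.litellm import LiteLLMModel

# Route to a Bedrock Claude model through LiteLLM; any Bedrock model that
# supports prompt caching would do here.
model = LiteLLMModel(model_id="bedrock/anthropic.claude-3-7-sonnet-20250219-v1:0")

agent = Agent(
    model=model,
    system_prompt=[
        {"text": "You are a helpful assistant. <long, reusable instructions>"},
        # Bedrock-style cache point marking everything above it as cacheable.
        {"cachePoint": {"type": "default"}},
    ],
)

result = agent("Summarize the cached instructions in one sentence.")
# With LiteLLM support in place, the agent's usage metrics would be expected
# to include cacheReadInputTokens / cacheWriteInputTokens derived from the
# usage object shown in the proposed solution.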

Alternative Solutions

No response

Additional Context

No response

Labels

enhancement (New feature or request); refined (Issue is discussed with the team and the team has come to an effort estimate consensus)
