| onRequest | onResponse | onMessageRequest | onMessageResponse |
|---|---|---|---|
| ✅ | | | |
The `ai-semantic-caching` policy enables semantic caching of responses based on the similarity of request content. It uses an embedding model to transform incoming requests into vector representations, then compares them against previously cached vectors in a vector store. If a sufficiently similar vector is found, the cached response can be reused, saving computation and reducing latency.
This policy integrates with AI resources such as:
- Text embedding models
- Vector stores
Semantic caching decisions and storage can be customized using Gravitee EL expressions.
ℹ️ This policy works best when used with stateless APIs or when identical responses can be safely reused for similar requests.
You can configure the policy with the following options:
| Property | Required | Description | Type | Default |
|---|---|---|---|---|
| modelName | ✅ | The name of the AI embedding model resource to use. | string | — |
| vectorStoreName | ✅ | The name of the vector store resource used to store and retrieve semantic embeddings. | string | — |
| promptExpression | | EL expression to extract the content to embed (e.g. the request body). | string | `{#request.content}` |
| cacheCondition | | EL expression that determines whether the response is cacheable. | string | `{#response.status >= 200 && #response.status < 300}` |
| parameters | | List of key-value pairs to store as metadata with the vector and/or in the query. Values support EL expressions. | array | — |
Each `parameters` item contains:

| Property | Description | Type |
|---|---|---|
| key | Name of the metadata field. | string |
| value | EL expression to extract the value from the context. | string |
| encode | Whether the value should be hashed using a secure encoding (e.g. for indexing sensitive data). | boolean |
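When `encode` is `true`, the extracted value is hashed rather than stored in clear text. A minimal sketch of that idea, assuming a SHA-256 digest; the actual encoding the policy applies is not specified here.

```python
# Hedged sketch: hashing a metadata value before it is stored with the vector.
# SHA-256 is an illustrative assumption, not the policy's documented algorithm.
import hashlib

def encode_metadata(value: str) -> str:
    # Deterministic digest: equal inputs map to the same index entry
    # without exposing the raw (possibly sensitive) value.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

print(encode_metadata("tenant-42"))  # 64-character hex digest
```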
Example configuration:

```json
{
  "name": "AI Semantic Caching",
  "enabled": true,
  "policy": "ai-semantic-caching",
  "configuration": {
    "modelName": "ai-model-text-embedding-resource",
    "vectorStoreName": "vector-store-redis-resource",
    "promptExpression": "{#jsonPath(#request.content, '$.messages[-1:].content')}",
    "cacheCondition": "{#response.status >= 200}",
    "parameters": [
      {
        "key": "retrieval_context_key",
        "value": "{#context.attributes['api']}",
        "encode": true
      }
    ]
  }
}
```