Adding a very simple implementation of an embeddings cache #18864

AstraBert · 2025-05-27T15:49:45Z

Description

Added a very simple implementation of an embeddings cache that:

Extends the BaseEmbedding class with an embeddings_cache, which is a SimpleKVStore.
SimpleKVStore now has a maximum number of data points that can be uploaded to it, so that the cache does not swamp memory (the cache is never cleared at runtime).
When the cache exceeds the maximum number of data points (default is 1000), it is cleared following FIFO logic.
If caching is enabled, we search the cache before embedding. The search is not semantic: it is an exact key-value match, since the embeddings are stored with the text to be embedded as the key and the embedding vector as the value. If there is a match in the cache, the corresponding embeddings are returned; otherwise we compute the embeddings, cache them, and return them.

Example of a cache:

{
    "embeddings": {
        "text-to-be-embedded": {
            "uuid": [0, 0, 1, 0.3]
        },
        "text-to-be-embedded-2": {
            "uuid": [0, 0, 1, 0.3]
        }
    }
}
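
As a rough illustration of the lookup described above, here is a minimal sketch of an exact-match "get or compute" cache. The names TinyEmbeddingCache, embed_with_cache, and fake_embed are hypothetical and are not part of LlamaIndex; the nested layout only mirrors the example cache shown above.

```python
import uuid
from typing import Callable, Dict, List, Optional

Vector = List[float]


class TinyEmbeddingCache:
    """Hypothetical stand-in for a key-value store (not the LlamaIndex API)."""

    def __init__(self) -> None:
        # collection -> text -> {uuid: vector}, mirroring the example layout.
        self._data: Dict[str, Dict[str, Dict[str, Vector]]] = {"embeddings": {}}

    def get(self, text: str) -> Optional[Vector]:
        # Exact string match on the text; no semantic similarity involved.
        entry = self._data["embeddings"].get(text)
        if entry is None:
            return None
        # Each entry holds a single uuid -> vector pair.
        return next(iter(entry.values()))

    def put(self, text: str, vector: Vector) -> None:
        self._data["embeddings"][text] = {str(uuid.uuid4()): vector}


def embed_with_cache(
    text: str,
    embed_fn: Callable[[str], Vector],
    cache: TinyEmbeddingCache,
) -> Vector:
    """Return the cached vector if present, else compute, store, and return it."""
    cached = cache.get(text)
    if cached is not None:
        return cached
    vector = embed_fn(text)
    cache.put(text, vector)
    return vector


calls: List[str] = []


def fake_embed(text: str) -> Vector:
    # Placeholder for a real embedding model; records each invocation.
    calls.append(text)
    return [float(len(text)), 0.0]


cache = TinyEmbeddingCache()
first = embed_with_cache("hello", fake_embed, cache)
second = embed_with_cache("hello", fake_embed, cache)
```

In this sketch the second call is served from the cache, so fake_embed runs only once for the same text.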

Fixes #18849

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

I added new unit tests to cover this change
I believe this change is already covered by existing unit tests

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran uv run make format; uv run make lint to appease the lint gods

Florian-BACHO · 2025-05-27T16:24:02Z

Hi @AstraBert,

Thanks for taking care of this. However, I believe it would be better to accept a generic BaseKVStore object in BaseEmbedding instead of restricting the user to SimpleKVStore, especially because there are many ways to implement caches. I invite you to have a look at the documentation of the cachetools package, or simply at the use of Redis, to get an idea. I think it would be better to leave the SimpleKVStore class unchanged and only modify BaseEmbedding. That way, we could implement different caches (FIFO, LRU, TTL, and others) as LlamaIndex integrations in future PRs.
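
To make the suggestion concrete, here is a hedged sketch of the design: the embedder depends only on an abstract store, so the eviction policy lives entirely in the store implementation. KVStoreLike, LRUStore, and CachingEmbedder are hypothetical names for illustration; the real BaseKVStore interface in LlamaIndex is richer than this.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict
from typing import Callable, List, Optional

Vector = List[float]


class KVStoreLike(ABC):
    """Hypothetical minimal interface; not the actual BaseKVStore."""

    @abstractmethod
    def get(self, key: str) -> Optional[Vector]: ...

    @abstractmethod
    def put(self, key: str, value: Vector) -> None: ...


class LRUStore(KVStoreLike):
    """One possible policy: evict the least recently used entry."""

    def __init__(self, max_size: int = 1000) -> None:
        self._max_size = max_size
        self._data: "OrderedDict[str, Vector]" = OrderedDict()

    def get(self, key: str) -> Optional[Vector]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: Vector) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)  # drop the least recently used


class CachingEmbedder:
    """Embedder that accepts any KVStoreLike; caching is off when cache is None."""

    def __init__(
        self,
        embed_fn: Callable[[str], Vector],
        cache: Optional[KVStoreLike] = None,
    ) -> None:
        self._embed_fn = embed_fn
        self._cache = cache

    def embed(self, text: str) -> Vector:
        if self._cache is None:
            return self._embed_fn(text)
        hit = self._cache.get(text)
        if hit is not None:
            return hit
        vector = self._embed_fn(text)
        self._cache.put(text, vector)
        return vector
```

Swapping LRUStore for a FIFO or TTL store would not require touching CachingEmbedder, which is the point of accepting the generic interface.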

logan-markewich · 2025-05-27T16:35:09Z

llama-index-core/llama_index/core/storage/kvstore/simple_kvstore.py

+        if collection in self._data and len(collection) > self._maximum_data_point:
+            if self._strict:
+                raise ValueError(f"Exceeded the maximum number of data points that can be uploaded to collection {collection}")
+            first_key = next(iter(self._data[collection].keys()))


I guess this relies on the dict being ordered 👀 (I can't remember how dict ordering works in Python, or whether OrderedDict is needed)

Claude suggests that Python dicts preserve the order in which keys were added (since Python 3.7), but maybe we can follow @Florian-BACHO's suggestion and revert this, adding support for BaseKVStore and not modifying SimpleKVStore at all

Yea supporting the BaseKVStore makes total sense (I missed that it was hardcoded to SimpleKVStore)
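
For reference on the ordering question in this thread: since Python 3.7 the language guarantees that a plain dict preserves insertion order, so the first key yielded by iteration is the oldest one. Below is a minimal sketch of FIFO eviction built on that guarantee. FifoCollection and its parameters are hypothetical and are not the code from this PR; note that the size check here measures the number of stored items.

```python
from typing import Dict, List


class FifoCollection:
    """Hypothetical bounded mapping that evicts the oldest key first."""

    def __init__(self, max_items: int = 1000, strict: bool = False) -> None:
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._max_items = max_items
        self._strict = strict
        self._items: Dict[str, List[float]] = {}

    def put(self, key: str, value: List[float]) -> None:
        is_new = key not in self._items
        if is_new and len(self._items) >= self._max_items:
            if self._strict:
                raise ValueError("Exceeded the maximum number of data points")
            # Plain dicts iterate in insertion order (Python 3.7+),
            # so the first key is the oldest entry.
            oldest_key = next(iter(self._items))
            del self._items[oldest_key]
        # Updating an existing key keeps its original position.
        self._items[key] = value

    def keys(self) -> List[str]:
        return list(self._items)


store = FifoCollection(max_items=2)
store.put("a", [1.0])
store.put("b", [2.0])
store.put("c", [3.0])  # evicts "a", the oldest entry
remaining = store.keys()
```

An OrderedDict is therefore not required for FIFO behavior on supported Python versions, although it remains convenient for policies such as LRU that need to reorder entries.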

AstraBert · 2025-05-27T17:06:20Z

For some reason ruff-format passes locally on pre-commit but does not pass here @logan-markewich :/


AstraBert · 2025-05-28T14:57:38Z

I'd say with this last commit this PR can be merged? @logan-markewich
I can see that @Florian-BACHO is already baking some cache integrations to implement this!

AstraBert added 5 commits May 26, 2025 21:40

first rough changes (untested)

c525b14

using only embeddings_cache in BaseEmbedding to enable/disable caching

803e988

using only embeddings_cache in BaseEmbedding to enable/disable caching

3c0cf20

Implementing caching in non-abstract methods for BaseEmbedding

9c78c9d

Handling circular imports and adding tests!

0039ab9

AstraBert linked an issue May 27, 2025 that may be closed by this pull request

[Feature Request]: Add Embedding Caching #18849

Closed

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 27, 2025

linting changes

9bdb4e4

logan-markewich reviewed May 27, 2025


linting changes

3b490c8

AstraBert added 2 commits May 27, 2025 20:39

linting and formatting changes (big changes)

8a806a8

Extending compatibility to BaseKVStore and reverting to the original SimpleKVStore

56d6159

Florian-BACHO mentioned this pull request May 28, 2025

Add cachetools key-value store #18883

Closed

15 tasks

AstraBert merged commit a3c7991 into main May 28, 2025
10 checks passed

AstraBert deleted the clelia/feature-request-add-embedding-caching branch May 28, 2025 15:21

colca mentioned this pull request Jun 9, 2025

add message id colca/llama_index#2

Closed

18 tasks
