@AstraBert
Member

Description

Added a very simple implementation of an embeddings cache that:

  • Extends the BaseEmbedding class with an embeddings_cache, which is a SimpleKVStore.
  • SimpleKVStore now has a maximum number of data points that can be stored in it, so that the cache does not swamp memory (the cache is never cleared at runtime).
  • When the cache exceeds the maximum number of data points (default: 1000), entries are evicted in FIFO order.
  • If caching is enabled, the cache is searched before embedding. The search is not semantic: it is an exact key-value match, with the text to be embedded as the key and the embedding vector as the value. On a match, the cached embeddings are returned; otherwise the embeddings are computed, cached, and returned.
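The lookup-then-compute flow and FIFO eviction described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FIFOEmbeddingCache`, `cached_embed`, and `embed_fn` are hypothetical names, and the sketch maps text directly to a vector rather than using the nested layout shown in the cache example.

```python
from collections import OrderedDict
from typing import Callable, List, Optional


class FIFOEmbeddingCache:
    """Hypothetical exact-match cache: text -> embedding vector, FIFO eviction."""

    def __init__(self, max_size: int = 1000) -> None:
        self.max_size = max_size
        self._data: "OrderedDict[str, List[float]]" = OrderedDict()

    def get(self, text: str) -> Optional[List[float]]:
        # Exact string match only; no semantic similarity is involved.
        return self._data.get(text)

    def put(self, text: str, embedding: List[float]) -> None:
        # Evict the oldest inserted entry when a new key would exceed max_size.
        if text not in self._data and len(self._data) >= self.max_size:
            self._data.popitem(last=False)
        self._data[text] = embedding


def cached_embed(
    text: str,
    embed_fn: Callable[[str], List[float]],
    cache: FIFOEmbeddingCache,
) -> List[float]:
    """Return the cached embedding if present, else compute, cache, and return it."""
    hit = cache.get(text)
    if hit is not None:
        return hit
    embedding = embed_fn(text)
    cache.put(text, embedding)
    return embedding
```

With `max_size=2`, embedding three distinct texts in a row would evict the first one, while repeating a text that is still cached would skip the call to `embed_fn`.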

Example of a cache:

{
    "embeddings": {
        "text-to-be-embedded": {
            "uuid": [0, 0, 1, 0.3]
        },
        "text-to-be-embedded-2": {
            "uuid": [0, 0, 1, 0.3]
        }
    }
}

Fixes #18849

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

@AstraBert linked an issue on May 27, 2025 that may be closed by this pull request
@dosubot (bot) added the size:L label (this PR changes 100-499 lines, ignoring generated files) on May 27, 2025
@Florian-BACHO
Contributor

Florian-BACHO commented May 27, 2025

Hi @AstraBert,

Thanks for taking care of this. However, I believe it would be better to accept a generic BaseKVStore object in BaseEmbedding instead of restricting the user to SimpleKVStore, especially because there are many ways to implement caches. I invite you to have a look at the documentation of the cachetools package, or simply at how Redis is used, to get an idea. I think it would be better to leave the SimpleKVStore class unchanged and only modify BaseEmbedding. That way, we could implement different caches (FIFO, LRU, TTL, and others) as LlamaIndex integrations in future PRs.
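To illustrate the idea, here is a rough sketch of an embedder that depends only on an abstract key-value interface, so any cache backend can be plugged in. Everything here is hypothetical: `KVStore`, `InMemoryKVStore`, `Embedder`, and the `put`/`get` signatures are stand-ins for illustration and are not the actual BaseKVStore or BaseEmbedding API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class KVStore(ABC):
    """Hypothetical minimal key-value interface (a stand-in, not the real BaseKVStore)."""

    @abstractmethod
    def put(self, key: str, val: dict, collection: str) -> None: ...

    @abstractmethod
    def get(self, key: str, collection: str) -> Optional[dict]: ...


class InMemoryKVStore(KVStore):
    """One possible backend; an LRU, TTL, or Redis-backed store could replace it."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, dict]] = {}

    def put(self, key: str, val: dict, collection: str) -> None:
        self._data.setdefault(collection, {})[key] = val

    def get(self, key: str, collection: str) -> Optional[dict]:
        return self._data.get(collection, {}).get(key)


class Embedder:
    """Hypothetical embedder that only knows about the abstract KVStore."""

    def __init__(
        self,
        embed_fn: Callable[[str], List[float]],
        cache: Optional[KVStore] = None,
    ) -> None:
        self.embed_fn = embed_fn
        self.cache = cache

    def embed(self, text: str) -> List[float]:
        if self.cache is None:
            # Caching disabled: always compute.
            return self.embed_fn(text)
        hit = self.cache.get(text, collection="embeddings")
        if hit is not None:
            return hit["embedding"]
        vector = self.embed_fn(text)
        self.cache.put(text, {"embedding": vector}, collection="embeddings")
        return vector
```

Because `Embedder` never references a concrete store, the eviction policy becomes a property of whichever `KVStore` subclass is passed in.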

if collection in self._data and len(collection) > self._maximum_data_point:
    if self._strict:
        raise ValueError(f"Exceeded the maximum number of data points that can be uploaded to collection {collection}")
    first_key = next(iter(self._data[collection].keys()))
Collaborator

I guess this relies on the dict being ordered 👀 (I can't remember how dict ordering works in python or if OrderedDict is needed or not)

Member Author

Claude suggests that Python dicts preserve the insertion order of their keys (since Python 3.7), but maybe we can follow @Florian-BACHO's suggestion and revert this, adding support for BaseKVStore and not modifying SimpleKVStore at all
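For reference, a small sketch of how FIFO eviction can rely on a plain dict's insertion order (guaranteed by the language since Python 3.7). `evict_oldest` is a hypothetical helper written for illustration, not code from this PR.

```python
def evict_oldest(store: dict, max_size: int) -> list:
    """Remove the oldest inserted keys until at most max_size entries remain."""
    evicted = []
    while len(store) > max_size:
        # Iteration follows insertion order, so the first key is the oldest.
        oldest = next(iter(store))
        del store[oldest]
        evicted.append(oldest)
    return evicted


cache = {}
for key in ["first", "second", "third"]:
    cache[key] = [0.0]

removed = evict_oldest(cache, max_size=2)

# Overwriting an existing key keeps its original position in the order.
cache["second"] = [1.0]
remaining = list(cache)
```

One caveat: because overwriting a key does not move it to the end, this gives strict first-in-first-out behaviour rather than LRU. An LRU policy would need something like `OrderedDict.move_to_end`.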

Collaborator

Yea supporting the BaseKVStore makes total sense (I missed that it was hardcoded to SimpleKVStore)

@AstraBert
Member Author

For some reason ruff-format passes locally on pre-commit but does not pass here @logan-markewich :/

@Florian-BACHO Florian-BACHO mentioned this pull request May 28, 2025
15 tasks
@AstraBert
Member Author

I'd say with this last commit this PR can be merged? @logan-markewich
I can see that @Florian-BACHO is already baking some cache integrations to implement this!

@AstraBert merged commit a3c7991 into main on May 28, 2025
10 checks passed
@AstraBert deleted the clelia/feature-request-add-embedding-caching branch on May 28, 2025 15:21
@colca colca mentioned this pull request Jun 9, 2025
18 tasks
Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Add Embedding Caching

3 participants