UCS: adding multi-dimensional hash tables #7578

alex--m · 2021-10-23T13:34:34Z

No description provided.

swx-jenkins4 · 2021-10-23T13:39:19Z

Can one of the admins verify this patch?

shamisp · 2021-10-23T17:15:55Z

ok to test

shamisp · 2021-10-23T17:16:40Z

@alex--m any update on CLA ?

alex--m · 2021-10-23T18:36:16Z

@shamisp afraid not, and I don't expect it'll be resolved soon. I'm posting stuff I plan to upstream once it does - I just don't seem to have permissions to put the "CLA missing" label...

yosefe · 2021-11-14T13:38:14Z

how is it different from using regular khash with a custom key type that contains multiple values?

alex--m · 2021-11-14T14:28:30Z

> how is it different from using regular khash with a custom key type that contains multiple values?

This is the result of some research, and the paper is still in progress, but the gist is the difference in iteration. A multi-dimensional hash-table would only access the keys matching the query vector in every dimension (Figure 1), whereas this implementation has a special way to iterate over neighboring vectors (Figure 2) in order to locate the nearest neighbor. I plan to use this as an advanced form of caching (in a separate commit).
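To make the iteration difference concrete, here is a minimal Python sketch. It is not the code from this PR (which is C on top of khash), and the figures are not reproduced. It assumes each coordinate is quantized into fixed-width bins and that "neighboring vectors" means entries stored in the adjacent bins of every dimension; `BIN_WIDTH`, `bin_of` and `BinnedIndex` are hypothetical names chosen for illustration.

```python
import itertools
import math
from collections import defaultdict

BIN_WIDTH = 10  # hypothetical bin size, identical in every dimension


def bin_of(vector):
    """Quantize each coordinate into a bin index (assumed scheme)."""
    return tuple(math.floor(x / BIN_WIDTH) for x in vector)


class BinnedIndex:
    """Toy N-dimensional index: a dict keyed by the tuple of per-dimension bins."""

    def __init__(self):
        self.bins = defaultdict(list)

    def insert(self, vector, value):
        self.bins[bin_of(vector)].append((tuple(vector), value))

    def neighbors(self, query):
        """Yield entries from the query's bin and from every adjacent bin."""
        center = bin_of(query)
        for offset in itertools.product((-1, 0, 1), repeat=len(center)):
            key = tuple(c + o for c, o in zip(center, offset))
            # .get() avoids creating empty bins in the defaultdict
            yield from self.bins.get(key, ())

    def nearest(self, query):
        """Approximate nearest neighbor: best candidate among adjacent bins only."""
        return min(self.neighbors(query),
                   key=lambda entry: math.dist(entry[0], query),
                   default=None)
```

A plain dictionary keyed by the full vector returns nothing for `(13, 48)` when only `(12, 47)` was stored, while `BinnedIndex.nearest((13, 48))` returns the `(12, 47)` entry. The result is approximate because a closer vector could sit just outside the adjacent bins.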

yosefe · 2021-11-14T14:55:58Z

IIUC, the expected lookup performance of the special multi-dim implementation should be better than default khash with a vector key?

alex--m · 2021-11-14T15:20:19Z

> IIUC, the expected lookup performance of the special multi-dim implementation should be better than default khash with a vector key?

No, I'm afraid the lookup is not faster, just different (and in fact typically slower). For example, if both dimensions fall into bin #1 in the figures above, the default khash with vector keys will check V3 and V4, whereas this special 2D khash will check V3 - V8. The only advantage (and purpose) of this special multi-dimensional khash-based data-structure is a fast nearest-neighbor lookup.
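The extra work per lookup can be illustrated with a small Python sketch that counts which entries each approach examines. The figures are not available here, so the placement of V1 to V8 below is invented purely so that the counts echo the example; the adjacent-bin search is likewise an assumption, not the PR's actual traversal, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical placement: label -> (bin in dimension 0, bin in dimension 1).
PLACEMENT = {
    "V1": (4, 4), "V2": (3, 0),
    "V3": (1, 1), "V4": (1, 1),
    "V5": (0, 1), "V6": (1, 0), "V7": (2, 1), "V8": (1, 2),
}


def build_table(placement):
    """Group labels by their bin tuple, like a hash table keyed by bins."""
    table = defaultdict(list)
    for label, bin_key in placement.items():
        table[bin_key].append(label)
    return table


def checked_by_exact_lookup(table, query_bin):
    """A vector-key lookup only examines entries in the query's own bin."""
    return sorted(table.get(query_bin, []))


def checked_by_neighbor_lookup(table, query_bin):
    """A neighbor lookup also examines entries in every adjacent bin."""
    checked = []
    for offset in product((-1, 0, 1), repeat=len(query_bin)):
        key = tuple(b + o for b, o in zip(query_bin, offset))
        checked.extend(table.get(key, []))
    return sorted(checked)
```

With this invented layout and a query in bin `(1, 1)`, the exact lookup examines V3 and V4 while the neighbor lookup examines V3 through V8, which is why the latter is typically slower for a plain lookup.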

shamisp · 2021-11-14T15:41:48Z

@alex--m Can you give a bit more detail on how it will be used? That is, how will we benefit from the nearest-neighbor lookup speedup?

alex--m · 2021-11-14T15:51:46Z

> @alex--m Can you give a bit more detail on how it will be used? That is, how will we benefit from the nearest-neighbor lookup speedup?

Sure. The basic idea is this: caches tend to be all-or-nothing matches, so if you're looking up, say, a past request, you only get identical past requests. In such a case, the nearest-neighbor lookup allows you to find a "similar" past request and modify it, rather than creating a brand new request (which is presumably more expensive, otherwise this makes little sense). The speedup comes primarily from re-using similar past objects rather than creating new ones, and secondarily from the reduced memory consumption that results from this recycling. Now this raises the question: what objects are so hard to create that they justify all this? One answer is QPs, but there are others.
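The caching idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the follow-up patch: the `SimilarityCache` class, its `create` and `adapt` callbacks, and the `max_distance` threshold are all hypothetical, and a linear scan stands in for the nearest-neighbor search that the multi-dimensional hash table would provide.

```python
import math


class SimilarityCache:
    """Toy cache that recycles a similar past object instead of creating a new one."""

    def __init__(self, create, adapt, max_distance):
        self.create = create              # expensive: builds an object from scratch
        self.adapt = adapt                # cheap: modifies an existing object in place
        self.max_distance = max_distance  # how far apart "similar" keys may be
        self.entries = {}                 # key vector -> cached object
        self.created = 0
        self.reused = 0

    def get(self, key):
        key = tuple(key)
        # Linear scan as a stand-in for the nearest-neighbor lookup.
        nearest = min(self.entries, key=lambda k: math.dist(k, key), default=None)
        if nearest is not None and math.dist(nearest, key) <= self.max_distance:
            # Recycle: re-key the similar object rather than allocating a new one.
            obj = self.adapt(self.entries.pop(nearest), key)
            self.reused += 1
        else:
            obj = self.create(key)
            self.created += 1
        self.entries[key] = obj
        return obj
```

A conventional cache would treat `(10, 10)` and `(11, 10)` as unrelated keys and create two objects; here the second request adapts the first object, so both the creation cost and the memory footprint stay at one object.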

shamisp · 2021-11-14T18:16:24Z

@alex--m Sounds like some kind of cache approximation. Can you please be a bit more specific about what the follow-up patch is and how it will be useful for UCX?


alex--m changed the title ~~UCS: adding multi-dimentional hash tables~~ UCS: adding multi-dimensional hash tables Oct 23, 2021

alex--m force-pushed the topic/mdht branch from 5f8c76f to df83e59 Compare October 23, 2021 22:43

shamisp added the Approved pending CLA label Oct 23, 2021

alex--m mentioned this pull request Oct 23, 2021

Add benchmark for n-dimensional khash (from UCX) erikbern/ann-benchmarks#268

Closed

alex--m force-pushed the topic/mdht branch 4 times, most recently from 30b3604 to 89094bf Compare November 1, 2021 00:49

alex--m force-pushed the topic/mdht branch from 89094bf to 02fca5f Compare November 14, 2021 12:21

alex--m force-pushed the topic/mdht branch from 02fca5f to 204e686 Compare December 1, 2021 21:51

alex--m force-pushed the topic/mdht branch from 204e686 to 74948f6 Compare December 24, 2021 19:05

alex--m force-pushed the topic/mdht branch 5 times, most recently from 541177f to efc9355 Compare February 19, 2022 15:17

alex--m force-pushed the topic/mdht branch 2 times, most recently from d4186f5 to 6f1bc0b Compare May 19, 2022 15:35

alex--m force-pushed the topic/mdht branch from 6f1bc0b to a75157a Compare June 11, 2022 14:05

alex--m force-pushed the topic/mdht branch 9 times, most recently from 7007bb5 to bcc164b Compare June 14, 2022 07:18

alex--m added 2 commits May 2, 2024 10:51

UCS/KHASH: update to v0.2.9

9715638

UCS/KHASH: fix/supress clang warnings

48c0039

Signed-off-by: Alex Margolin <[email protected]>

alex--m force-pushed the topic/mdht branch 4 times, most recently from 62be79b to 4678f39 Compare May 5, 2024 05:59

alex--m added 2 commits May 5, 2024 11:51

UCS/DATASTRUCT: new data-strucutre for N-dimensional approximate nearest neighbor search (based on khash)

dad8676

EXAMPLES/UCS_ANN_SEARCH: new example to be also used for benchmarking

c9401e0

alex--m force-pushed the topic/mdht branch from 4678f39 to c9401e0 Compare May 5, 2024 08:54
