Skip to content

[MARF] Investigate sparse trie representations #6593

@jcnelson

Description

@jcnelson

@kantai and myself suspect that the bulk of redundant data in a MARF index comes from one of a handful of places:

  • When performing a copy-on-write to produce a new trie, most of the nodes on the trie path will not change very much between two subsequent tries.

  • When performing a copy-on-write, most trie backpointers will point to the same nodes they did when they were created

  • A substantial number of leaves will have the (same) empty hash, since their keys are unmapped

  • We don't use varint encodings for integers, even though most integers fit into fewer than four bytes

I need to figure out how much of a potential space savings we can get by addressing some or all of the above.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

Status: 🆕 New

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions