Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Note: this is likely one of the most complex parts of implementing Shredded Variants, so it is not a good first task.
We are trying to support the general case of the variant_get function, which allows runtime dynamic access to Variants (either shredded or unshredded).
- We found in [Variant] Support Shredded Objects in variant_get #8083 that supporting variant_get is quite complicated (see here), so we are proposing to break it down into multiple pieces.
This ticket tracks: support variant_get for any input (shredded or otherwise), any depth of object field path steps, and casting to one primitive data type, e.g. Some(DataType::Int32). This should close the loop and is potentially a good candidate for a first PR.
Implementing this functionality will likely require the basic representation for shredded Variant arrays along with path traversal in variant_get. However, it does NOT cover the following (which are / will be broken into separate tickets):
- Support for retrieving as other data types (e.g. Some(DataType::Utf8))
- Retrieving any arbitrary path and returning what is there (no type specified)
- Retrieving an arbitrary path as a "Struct" (aka implementing shredding)
- Retrieving any arbitrary path as a Variant (aka "unshredding")
Describe the solution you'd like
@scovich sketched out a high level design for Shredded Objects (see Representing Variant In Arrow Proposal: "Shredding an Object" and Variant Shredding::Objects) in this PR
So roughly that means supporting:
// get the named field of a variant object as a typed field
variant_get(array, "$.field_name", DataType::Int32)
Where $.field_name represents some arbitrary VariantPath, such as a for field "a", or a.b for field "b" of field "a".
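To make the path semantics concrete, here is a minimal sketch of field-path traversal followed by a cast to a single primitive type. It uses a toy value model: ToyVariant, parse_path, get_path and get_i32 are hypothetical names invented for this illustration and are not the arrow-rs / parquet-variant API. The real implementation would operate on Arrow arrays rather than on one value at a time.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a variant value (assumption: NOT the real `Variant` type).
#[derive(Debug, Clone, PartialEq)]
enum ToyVariant {
    Int32(i32),
    Utf8(String),
    Object(BTreeMap<String, ToyVariant>),
}

/// Split a dotted path such as "a.b" into field-name steps.
/// A leading "$." is accepted to match the notation used above.
fn parse_path(path: &str) -> Vec<&str> {
    let trimmed = path.strip_prefix("$.").unwrap_or(path);
    trimmed.split('.').filter(|s| !s.is_empty()).collect()
}

/// Walk `steps` through nested objects. Returns `None` if a field is
/// missing or a non-object is reached before the path is exhausted.
fn get_path<'a>(root: &'a ToyVariant, steps: &[&str]) -> Option<&'a ToyVariant> {
    let mut current = root;
    for step in steps {
        match current {
            ToyVariant::Object(fields) => current = fields.get(*step)?,
            _ => return None,
        }
    }
    Some(current)
}

/// Follow the path, then "cast" to i32. In this sketch only an exact
/// Int32 value succeeds; every other type yields `None`.
fn get_i32(root: &ToyVariant, path: &str) -> Option<i32> {
    match get_path(root, &parse_path(path))? {
        ToyVariant::Int32(v) => Some(*v),
        _ => None,
    }
}
```

In this model, a missing field and a value of the wrong type both come back as None, which loosely mirrors producing a null in the output array; whether the real cast should return null or an error is left open here.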
This should work for:
- Variants where the field_name is in a typed_value
- Variants where the field_name is not in the typed_value
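The two cases above can be sketched as a lookup that prefers the shredded column and falls back to the unshredded remainder. This is a simplified model under stated assumptions: ToyShreddedRow, ToyValue, get_field_i32 and get_column_i32 are hypothetical names, the layout is flattened to one level, and only Int32 shredding is modeled. The actual shredded layout is more deeply nested than this.

```rust
use std::collections::HashMap;

/// Toy unshredded value (assumption: NOT the real `Variant` type).
#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    Int32(i32),
    Utf8(String),
}

/// One row of a hypothetical, partially shredded object column.
/// `typed_value` holds fields that were shredded as Int32;
/// `value` holds every field that was left unshredded.
struct ToyShreddedRow {
    typed_value: HashMap<String, i32>,
    value: HashMap<String, ToyValue>,
}

/// Look up `field_name` as an i32 for a single row.
fn get_field_i32(row: &ToyShreddedRow, field_name: &str) -> Option<i32> {
    // Case 1: the field was shredded, so the typed column answers directly.
    if let Some(v) = row.typed_value.get(field_name) {
        return Some(*v);
    }
    // Case 2: the field is not in typed_value, so fall back to the
    // unshredded value and cast; non-Int32 values yield None here.
    match row.value.get(field_name)? {
        ToyValue::Int32(v) => Some(*v),
        _ => None,
    }
}

/// Apply the lookup to every row, yielding one nullable result per row.
fn get_column_i32(rows: &[ToyShreddedRow], field_name: &str) -> Vec<Option<i32>> {
    rows.iter().map(|row| get_field_i32(row, field_name)).collect()
}
```

Checking typed_value first means the common, already-typed case never has to decode the unshredded bytes; the fallback only runs for rows where shredding did not capture the field.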
Describe alternatives you've considered
- Add a test that manually constructs a shredded variant array (follow the example in the arrow proposal)
- Add a test that calls variant_get appropriately
- Implement the code
I suggest getting this working for non-nested objects first, and then working on nesting / pathing as a second PR.
Additional context
Reference