[Variant] Support Shredded Objects in variant_get: typed path access (STEP 1) #8150

@alamb

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Note: this is likely one of the most complex parts of implementing Shredded Variants, so it is not a good first task.

We are trying to support the general case of the variant_get function, which allows runtime dynamic access to Variants (either shredded or unshredded).

This ticket tracks:
Support variant_get for any input (shredded or otherwise), any depth of object field path steps, and casting to one primitive data type, e.g. Some(DataType::Int32). This should close the loop and is potentially a good candidate for a first PR.

Implementing this functionality will likely require the basic representation for shredded Variant arrays, along with path traversal in variant_get. However, it does NOT cover the following (which are, or will be, broken into separate tickets):

  • Support for retrieving as other data types (e.g. Some(DataType::Utf8))
  • Retrieving any arbitrary path and returning what is there (no type specified)
  • Retrieving an arbitrary path as a "Struct" (aka implementing shredding)
  • Retrieving any arbitrary path as a Variant (aka "unshredding")

Describe the solution you'd like
@scovich sketched out a high level design for Shredded Objects (see Representing Variant In Arrow Proposal: "Shredding an Object" and Variant Shredding::Objects) in this PR

So roughly that means supporting

// get the named field of a variant object as a typed field
variant_get(array, "$.field_name", DataType::Int32)

Here $.field_name stands for some arbitrary VariantPath, such as a for field "a", or a.b for field "b" of field "a".
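To make the path notation concrete, here is a minimal sketch of parsing a dotted path into field steps and walking an unshredded value with it. `FieldPath`, `Value`, `parse_field_path`, and `get_path` are hypothetical names invented for this illustration; they are not the `VariantPath` type or any other API from the crate, and the parser makes the simplifying assumption that field names contain neither `.` nor a leading `$`.

```rust
use std::collections::BTreeMap;

/// Hypothetical path type for this sketch (NOT the crate's `VariantPath`).
#[derive(Debug, Clone, PartialEq)]
struct FieldPath(Vec<String>);

/// Hypothetical stand-in for an unshredded variant value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int32(i32),
    Object(BTreeMap<String, Value>),
}

/// Parse "a", "a.b", or "$.a.b" into object field steps.
/// Returns `None` when a step is empty, as in "a..b".
fn parse_field_path(s: &str) -> Option<FieldPath> {
    // Accept an optional "$" root marker followed by an optional ".".
    let rest = s.strip_prefix('$').unwrap_or(s);
    let rest = rest.strip_prefix('.').unwrap_or(rest);
    if rest.is_empty() {
        // "" or "$" addresses the root value: zero steps.
        return Some(FieldPath(Vec::new()));
    }
    let mut steps = Vec::new();
    for step in rest.split('.') {
        if step.is_empty() {
            return None;
        }
        steps.push(step.to_string());
    }
    Some(FieldPath(steps))
}

/// Follow each field step in turn. Returns `None` if a step is missing
/// or if a non-object value is reached before the path is exhausted.
fn get_path<'a>(root: &'a Value, path: &FieldPath) -> Option<&'a Value> {
    let mut current = root;
    for step in &path.0 {
        match current {
            Value::Object(fields) => current = fields.get(step)?,
            _ => return None,
        }
    }
    Some(current)
}
```

Under these assumptions, "a.b" parses to the steps ["a", "b"], and `get_path` descends one object level per step.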

This should work for:

  1. Variants where the field_name is in the typed_value
  2. Variants where the field_name is not in the typed_value
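The two cases above can be sketched for a single row as follows. This is a simplified in-memory model, not the Arrow representation: `Variant`, `ShreddedField`, `ShreddedObjectRow`, and `get_field_as_i32` are hypothetical names, and returning `None` (null) when a value cannot be read as Int32 is an assumption of this sketch rather than a decision stated in this ticket.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an unshredded variant value.
#[derive(Debug, Clone, PartialEq)]
enum Variant {
    Int32(i32),
    Str(String),
    Object(HashMap<String, Variant>),
}

/// One shredded field of one row (this sketch shreds fields as Int32 only).
struct ShreddedField {
    /// Set when the row's value matched the shredded type.
    typed_value: Option<i32>,
    /// Set when the row's value did not match the shredded type.
    value: Option<Variant>,
}

/// One row of a shredded object column.
struct ShreddedObjectRow {
    /// Fields that were shredded into typed columns.
    typed_value: HashMap<String, ShreddedField>,
    /// Residual object holding every field that was not shredded.
    value: Option<Variant>,
}

/// Assumption of this sketch: only an exact Int32 converts; anything else is null.
fn as_i32(v: &Variant) -> Option<i32> {
    match v {
        Variant::Int32(i) => Some(*i),
        _ => None,
    }
}

fn get_field_as_i32(row: &ShreddedObjectRow, field: &str) -> Option<i32> {
    // Case 1: the field is in typed_value.
    if let Some(f) = row.typed_value.get(field) {
        return match (f.typed_value, f.value.as_ref()) {
            (Some(i), _) => Some(i),         // shredded with the requested type
            (None, Some(v)) => as_i32(v),    // present, but stored as a variant
            (None, None) => None,            // field missing in this row
        };
    }
    // Case 2: the field is not in typed_value, so consult the residual object.
    match row.value.as_ref() {
        Some(Variant::Object(fields)) => fields.get(field).and_then(as_i32),
        _ => None,
    }
}
```

The shredded columns are checked first because they are the cheap, already-typed path; the residual object is only consulted for fields that were never shredded.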

Describe alternatives you've considered

  1. Add a test that manually constructs a shredded variant array (follow the example in the arrow proposal)
  2. Add a test that calls variant_get appropriately
  3. Implement the code

I suggest getting this working for non-nested objects first, and then working on nesting / pathing as a second PR.
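As a rough illustration of the test-first plan above, the sketch below hand-builds a tiny columnar stand-in for one field shredded as Int32 and checks a column-level getter against it. It uses plain `Vec`s rather than Arrow arrays; `ShreddedInt32Column`, `Fallback`, and `variant_get_i32` are hypothetical names, and casting an in-range Int64 fallback to Int32 is an assumption of this sketch.

```rust
use std::convert::TryFrom;

/// Hypothetical fallback encoding for rows that were not shredded as Int32.
enum Fallback {
    Int64(i64),
    Str(String),
}

/// Hypothetical columnar stand-in: two parallel columns, one entry per row.
struct ShreddedInt32Column {
    typed_value: Vec<Option<i32>>,
    value: Vec<Option<Fallback>>,
}

/// Column-level getter: prefer typed_value, otherwise try to cast the fallback.
fn variant_get_i32(col: &ShreddedInt32Column) -> Vec<Option<i32>> {
    col.typed_value
        .iter()
        .zip(col.value.iter())
        .map(|(typed, fallback)| match (typed, fallback) {
            (Some(i), _) => Some(*i),
            // Assumption: an Int64 that fits in i32 is cast, otherwise null.
            (None, Some(Fallback::Int64(i))) => i32::try_from(*i).ok(),
            (None, Some(Fallback::Str(_))) => None,
            (None, None) => None,
        })
        .collect()
}

#[test]
fn manually_constructed_column_reads_back_as_int32() {
    // Step 1: manually construct the shredded column.
    let col = ShreddedInt32Column {
        typed_value: vec![Some(1), None, None, None],
        value: vec![
            None,
            Some(Fallback::Int64(2)),
            Some(Fallback::Str("x".to_string())),
            None,
        ],
    };
    // Step 2: call the getter and compare with the expected column.
    assert_eq!(variant_get_i32(&col), vec![Some(1), Some(2), None, None]);
}
```

Writing the expected output column before the implementation pins down the null and cast behaviour for each row kind, which is the part most likely to be debated in review.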

Labels: enhancement (Any new improvement worthy of an entry in the changelog)
