Skip to content

Conversation

cpetig
Copy link
Contributor

@cpetig cpetig commented Feb 2, 2025

Implement fixed length list support as defined in WebAssembly/component-model#384

  • wit-parser
  • wasm-encoder
  • wasmparser
  • wasmprinter
  • wat
  • wit-encoder
  • wit-smith

@cpetig cpetig marked this pull request as draft February 2, 2025 16:51
@cpetig cpetig changed the title Draft: Fixed length list support Fixed length list support Feb 2, 2025
@cpetig cpetig force-pushed the fixed-length-list branch from a0c1bc4 to e5a20d0 Compare February 9, 2025 21:57
@cpetig cpetig marked this pull request as ready for review February 9, 2025 23:21
@cpetig
Copy link
Contributor Author

cpetig commented Feb 9, 2025

Fixed size lists became a separate type because they have a different ABI representation and map to different Rust/C types.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! In addition to the comments below, mind adding some more tests? For example:

  • WIth a feature gate there should be a test that using this type is invalid without the feature gate (e.g. tests/local/missing-features)
  • Tests for the *.wat syntax in tests/local/component-model/*. This also tests round-trip parsing and such
  • Tests for list<ty, 10000000000000000> (something out of the 64-bit range but still looks integer-like)

Also with the new support in wit-smith (thanks!) mind running the fuzzer for a few minutes locally to ensure it doesn't turn up anything? (FUZZER=roundtrip_wit cargo +nightly fuzz run -s none run -j30)

@alexcrichton
Copy link
Member

Also, as a possible bikeshed, which you can take or leave, in Rust at least there's the term "slice" which is &[T] which is sort of like a WIT "list", and in Rust the term for [T; N] is "array" which could possibly be used here instead of "fixed size list". It's easy to confuse "list" and "array" though and harder to confuse "fixed sized list" and "list". The downside of "fixed size list" is that it's a bit of a mouthful to say and type

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

@alexcrichton do you prefer me to resolve the ones I implemented or do you want to check first?

I will implement wat roundtrip tests later today (I didn't find them although I did one manually) and think about limiting on-stack elements.

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

Also with the new support in wit-smith (thanks!) mind running the fuzzer for a few minutes locally to ensure it doesn't turn up anything? (FUZZER=roundtrip_wit cargo +nightly fuzz run -s none run -j30)

It found something ;-)
Fixed size lists require the component model fixed size list feature (at offset 0x26)
so I need to figure out where to activate this feature inside the fuzzer.

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

I need to figure out where to activate this feature inside the fuzzer.

The default Validator used for the component encoder has this feature disabled (as it is correctly disabled by default), I don't see an easy way to only activate it for the WIT fuzzing, is there a secret environment variable?

@cpetig
Copy link
Contributor Author

cpetig commented Feb 11, 2025

Thanks for this!

Thank you for your in-depth review!

* Tests for `list<ty, 10000000000000000>` (something out of the 64-bit range but still looks integer-like)

I tried but wasn't able to figure out the right way to set up this test: https://github.com/bytecodealliance/wasm-tools/pull/1992/files#diff-1532b9f29c0a59c8d44ce87b39476ec5242d1b2a2329e0b72bf7b2f23c2ae485R24-R38

@alexcrichton
Copy link
Member

do you prefer me to resolve the ones I implemented or do you want to check first?

Feel free to do so yourself, I'll give the code a once-over anyway once it's ready too.

The default Validator used for the component encoder has this feature disabled (as it is correctly disabled by default), I don't see an easy way to only activate it for the WIT fuzzing, is there a secret environment variable?

For this I'd recommend in the various locations validation is performed for tests/etc to just enable the new feature unconditionally. Adding for example Validator::new_with_features(WasmFeatures::all()) I think is reasonable since the validation in the tooling is mostly test assertions and we want to exercise it all during fuzzing anyway.

I tried but wasn't able to figure out the right way to set up this test

Ah yeah it's a lexical error right in *.wat if the integer literal is out-of-bounds so I think it's reasonable to skip the *.wat test, but I think *.wit may want to be tested? It might also be reasonable to just leave the *.wat test around to ensure that some error message comes out, and it'd use assert_malformed I think or whatever directive supports module quote ...

@cpetig cpetig force-pushed the fixed-length-list branch from 77ca584 to 3b93cd7 Compare March 9, 2025 13:37
@alexcrichton
Copy link
Member

I've added a merge commit here which resolves the conflicts with #2096, but feel free to discard that if you'd like.

@cpetig cpetig force-pushed the fixed-length-list branch from 64f03a3 to 1fc9e72 Compare April 2, 2025 21:14
@cpetig
Copy link
Contributor Author

cpetig commented Apr 20, 2025

Oh, dear, I got a first hand impression on why I should get this merged soon … adding fixed size lists to wit-bindgen is moderate effort, but to run any runtime test you need to add support for the extension to wasm-component-ld (shipped with the Rust compiler), wac, and wasmtime.

And this is the improved situation after Alex simplified the runtime test to no longer require full support on the host (wasmtime host code generation) side.

@cpetig
Copy link
Contributor Author

cpetig commented Apr 27, 2025

Thanks for this!

Thank you for your in-depth review!

* Tests for `list<ty, 10000000000000000>` (something out of the 64-bit range but still looks integer-like)

I tried but wasn't able to figure out the right way to set up this test: https://github.com/bytecodealliance/wasm-tools/pull/1992/files#diff-1532b9f29c0a59c8d44ce87b39476ec5242d1b2a2329e0b72bf7b2f23c2ae485R24-R38

fixed now

Comment on lines +1180 to +1182
if *length as usize > lowered_types.max {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further reflection of this I'm starting to have second thoughts about this approach. This input for example:

package a:b;

world x {
  type t = list<t2, 16>;
  type t2 = list<t3, 16>;
  type t3 = list<t4, 16>;
  type t4 = list<t5, 16>;
  type t5 = list<t6, 16>;
  type t6 = list<t7, 16>;
  type t7 = list<t8, 16>;
  type t8 = list<u32, 16>;
  import x: func(t: t);
}

when locally run as:

$ cargo run component embed --dummy foo.wit -t

takes ~30 seconds to produce the result. I think the reason is that this case isn't hit and then the recursion gets to go hog wild for a bit.

Perhaps an alternative fix would be to iterate from 0 to length but then check each iteration of the loop if the max has been reached and bail out if so? That way each level would check the break-out and once that happens it'd quickly exit.

Also if you're up for it I think that these would be good test cases to add to the test suite to ensure processing large or nested lists doesn't accidentally take forever

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, ... but the same problem shows with

package a:b;

world x {
  type t = tuple<t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,t2,>;
  type t2 = tuple<t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,t3,>;
  type t3 = tuple<t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,t4,>;
  type t4 = tuple<t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,t5,>;
  type t5 = tuple<t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,t6,>;
  type t6 = tuple<t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,t7,>;
  type t7 = tuple<t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,t8,>;
  type t8 = tuple<u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,u32,>;
  import x: func(t: t);
}

I will take a look into whether I can shrink the time needed to process this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a really good point! Ok in that case I'm going to go ahead and flag this for merge and follow-ups can handle fixes for cases like this.

@alexcrichton alexcrichton added this pull request to the merge queue Apr 28, 2025
Merged via the queue into bytecodealliance:main with commit 4cb6e03 Apr 28, 2025
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants