Fix unchecked vector pre-allocation #221
Conversation
Not sure if https://github.com/3Hren/msgpack-rust/blob/master/rmp-serde/src/decode.rs#L807 needs to be adapted as well (I don't use the serde bindings).
I'm concerned that this may be too drastic. Would it suffice to put some arbitrary limit on it? This would reduce the problem from outright DoS to mere amplification of memory usage in the worst case of infinitely recursive arrays overpromising their size. Amplification seems like a lesser problem, and it likely happens already anyway (all memory management has some overhead; even a non-preallocated Vec grows by doubling its capacity, so there's always 2x amplification there). For a completely robust solution we'd need a completely different approach, e.g. allowing the caller to specify a maximum total memory budget for the decoding process, tracking it precisely, and bailing out when the limit is reached.
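The capped-reservation idea can be sketched as follows (an illustrative stand-alone example, not rmp's actual decoder; the 4K limit and the function name are assumptions):

```rust
// Illustrative sketch: cap the up-front reservation instead of
// trusting an attacker-controlled length field from the input.
const PREALLOC_LIMIT: usize = 4096; // arbitrary cap, as suggested above

fn read_array_stub(claimed_len: usize) -> Vec<u32> {
    // Reserve at most PREALLOC_LIMIT slots; if the input genuinely
    // contains more elements, the Vec grows as data is actually read.
    let mut out = Vec::with_capacity(claimed_len.min(PREALLOC_LIMIT));
    for i in 0..claimed_len {
        out.push(i as u32); // stand-in for decoding a real element
    }
    out
}

fn main() {
    // A header claiming a huge length would reserve only 4096 slots
    // up front; actual growth is bounded by the data really present.
    let v = read_array_stub(10);
    assert_eq!(v.len(), 10);
    println!("ok");
}
```

The key point is that the cap only bounds the speculative allocation; legitimate large inputs still decode correctly, just with the usual doubling growth beyond the cap.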
Yeah, a limit of 4K would probably cover most of the cases and make the amplification much less serious. However, I don't think the spec limits the nesting depth, so if unlimited nesting is possible, then with nested array16 items an attacker with a 2 MiB input (1048576 nested arrays) could still cause an allocation of at least 4 GiB (amplification factor 2048). (I think it's even worse, since it's a …)

What if we don't preallocate (with a limit) for arrays, but only for bin/str types? The case is much less serious for binary data than for arrays, because binary data cannot be nested. There could also be a nesting limit for arrays, though that might need to be user-configurable.
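For concreteness, the amplification estimate above works out as follows (assuming 2 bytes per nested array header, to match the 1048576 count in the comment, and a hypothetical 4096-slot preallocation cap):

```rust
// Worked version of the amplification estimate: many small nested
// headers each trigger a capped preallocation, multiplying the waste.
fn amplification(input_bytes: u64, header_bytes: u64, cap_slots: u64) -> u64 {
    let nested_arrays = input_bytes / header_bytes; // 1_048_576 arrays
    let reserved_slots = nested_arrays * cap_slots; // 4_294_967_296 slots
    // At >= 1 byte per reserved slot, that is at least 4 GiB reserved.
    reserved_slots / input_bytes
}

fn main() {
    let input = 2 * 1024 * 1024; // 2 MiB attacker-controlled input
    // 2 MiB of headers, 2 bytes each, 4096-slot cap => 2048x amplification
    assert_eq!(amplification(input, 2, 4096), 2048);
    println!("amplification: {}x", amplification(input, 2, 4096));
}
```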
And in any case, we are speculating and should probably have some benchmarks. @kornelski the current rmpv benchmarks are from 2017 and broken. Do we still need them, or can they be disabled or removed for now?
It'd be nice to fix the benchmarks, so we don't have to speculate :) Reserving capacity with a limit for non-nested types, and no reservation for nested types, sounds safe and fast enough. |
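The policy agreed here can be sketched like this (illustrative names and the 64 KiB limit are assumptions, not the crate's actual API):

```rust
// Sketch of the policy: flat byte buffers (bin/str) get a capped
// reservation, nested containers (arrays/maps) get none.
const BIN_PREALLOC_LIMIT: usize = 64 * 1024;

fn alloc_bin_buffer(claimed_len: usize) -> Vec<u8> {
    // bin/str cannot nest, so a capped reservation bounds the
    // speculative allocation at BIN_PREALLOC_LIMIT bytes per value.
    Vec::with_capacity(claimed_len.min(BIN_PREALLOC_LIMIT))
}

fn alloc_array_items(_claimed_len: usize) -> Vec<u64> {
    // Arrays can nest, so any cap multiplies across nesting levels;
    // reserve nothing and let the Vec grow as elements are decoded.
    Vec::new()
}

fn main() {
    // Even an absurd claimed length only reserves the capped amount.
    assert!(alloc_bin_buffer(usize::MAX).capacity() >= BIN_PREALLOC_LIMIT);
    assert_eq!(alloc_array_items(usize::MAX).capacity(), 0);
    println!("ok");
}
```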
Rebased on #222. I added a few benchmarks. Before the change:
With no pre-allocation at all:
I can't quite explain why, but the array 50/100/1024 KiB cases were actually faster without the preallocation, and the array 20 MiB case was the same speed. The same goes for bin32: the cases up to 1 MiB were faster without preallocation, the 20 MiB bin32 test was about 5 times slower, and the 100 MiB bin32 test was again the same speed.
Can someone explain this? In any case, it indicates that those preallocations are probably premature optimization.
When preallocating up to 64 KiB for bin data:
Comparison:
Doesn't hurt, but doesn't help a lot either.
Rebased against master.
Thanks for fixing this!
No more vector pre-allocation if the data size is not known in advance. See discussion in #151.
Fixes #151 / RUSTSEC-2017-0006.