RFC for structs with unspecified layouts. #79
@@ -0,0 +1,113 @@
- Start Date: 2014-05-17
- RFC PR #:
- Rust Issue #:

# Summary

Leave structs with unspecified layout by default like enums, for
optimisation & security purposes. Use something like `#[repr(C)]` to
expose C compatible layout.

# Motivation

The members of a struct are currently always laid out in memory in the
order in which they were specified, e.g.

```rust
struct A {
    x: u8,
    y: u64,
    z: i8,
    w: i64,
}
```

will put the `u8` first in memory, then the `u64`, the `i8` and lastly
the `i64`. Due to the alignment requirements of various types, padding
is often required to ensure the members start at an appropriately
aligned byte. Hence the above struct is not `1 + 8 + 1 + 8 == 18`
bytes, but rather `1 + 7 + 8 + 1 + 7 + 8 == 32` bytes, since it is
laid out like

```rust
#[packed] // no automatically inserted padding
struct AFull {
    x: u8,
    _padding1: [u8, .. 7],
    y: u64,
    z: i8,
    _padding2: [u8, .. 7],
    w: i64
}
```

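The padding arithmetic above can be checked in present-day Rust. This is a minimal sketch and not part of the RFC: it assumes a target where `u64` and `i64` are 8-byte aligned (e.g. x86-64), uses `#[repr(C)]` to pin declaration order, and spells the hand-padded struct with the modern `#[repr(C, packed)]` and `[u8; 7]` syntax rather than the 2014 syntax shown above. The helper name `sizes_of_a` is illustrative only.

```rust
use std::mem::size_of;

// Declaration-order layout; the compiler inserts the padding itself.
#[repr(C)]
pub struct A {
    pub x: u8,
    pub y: u64,
    pub z: i8,
    pub w: i64,
}

// The same layout with the padding written out by hand
// (`packed` means no implicit padding is added).
#[repr(C, packed)]
pub struct AFull {
    pub x: u8,
    pub _padding1: [u8; 7],
    pub y: u64,
    pub z: i8,
    pub _padding2: [u8; 7],
    pub w: i64,
}

/// Returns (size of `A`, size of `AFull`) in bytes.
/// Assumption: 8-byte alignment for `u64`/`i64`, giving 32 for both.
pub fn sizes_of_a() -> (usize, usize) {
    (size_of::<A>(), size_of::<AFull>())
}
```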
If the fields were reordered to

```rust
struct B {
    y: u64,
    w: i64,

    x: u8,
    z: i8
}
```

then the struct is (strictly) only 18 bytes (but the alignment
requirements of `u64` force it to take up 24).

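As a rough check of the 18-versus-24 figure, the sketch below pins the reordered declaration with `#[repr(C)]` in present-day Rust. It assumes 8-byte alignment for `u64` and `i64` (as on x86-64); `size_and_align_of_b` is a hypothetical helper name, not something from the RFC.

```rust
use std::mem::{align_of, size_of};

// Largest fields first, so the only padding left is at the tail.
#[repr(C)]
pub struct B {
    pub y: u64,
    pub w: i64,
    pub x: u8,
    pub z: i8,
}

/// Returns (size, alignment) of `B` in bytes.
/// The fields occupy 18 bytes; the size is rounded up to a multiple
/// of the struct's alignment.
pub fn size_and_align_of_b() -> (usize, usize) {
    (size_of::<B>(), align_of::<B>())
}
```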
There is also some security advantage to being able to randomise
struct layouts: for example,
[the Grsecurity suite](http://grsecurity.net/) of security
enhancements to the Linux kernel provides
[`GRKERNSEC_RANDSTRUCT`](http://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Randomize_layout_of_sensitive_kernel_structures)
which randomises "sensitive kernel datastructures" at compile time.

Notably, Rust's `enum`s already have undefined layout, and provide the
`#[repr]` attribute to control layout more precisely (specifically,
selecting the size of the discriminant).

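For comparison, here is a small sketch of what `#[repr]` does for fieldless enums in present-day Rust: it fixes the integer type used for the discriminant, and with it the size of the enum. The enum and function names are made up for illustration.

```rust
use std::mem::size_of;

// Discriminant stored as a single byte.
#[repr(u8)]
pub enum Small {
    Off,
    On,
}

// Discriminant stored as a 32-bit integer.
#[repr(u32)]
pub enum Wide {
    Off,
    On,
}

/// Returns (size of `Small`, size of `Wide`) in bytes.
pub fn discriminant_sizes() -> (usize, usize) {
    (size_of::<Small>(), size_of::<Wide>())
}
```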
# Drawbacks

Forgetting to add `#[repr(C)]` for a struct intended for FFI use can
cause surprising bugs and crashes. There is already a lint for FFI use
of `enum`s without a `#[repr(...)]` attribute, so this can be extended
to include structs.

# Detailed design

A struct declaration like

```rust
struct Foo {
    ...
}
```

has no fixed layout; that is, a compiler can choose whichever order of
fields it prefers.

A fixed layout can be selected with the `#[repr]` attribute:

```rust
#[repr(C)]
struct Foo {
    ...
}
```

> I think the whole idea is a good one; +1. Bikeshed: I'm not sure if [...]

> If we're going to bikeshed, maybe [...] I think [...] In any case, I don't particularly care about the name.

> I had the same thought, but then it occurred to me that the only people who will insist on such control over representation would be coming from a C background anyway. I do like the analogy to [...]

> I think [...]

> Good points.

This will force a struct to be laid out like the equivalent definition
in C.

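One way to observe the declaration-order guarantee is to measure field offsets. The sketch below is an illustration in present-day Rust, with a hypothetical `FooC` struct standing in for `Foo`, and it assumes the usual alignments of 4 bytes for `u32` and 2 bytes for `u16`.

```rust
// Hypothetical stand-in for `Foo`, with concrete fields.
#[repr(C)]
pub struct FooC {
    pub a: u8,
    pub b: u32,
    pub c: u16,
}

/// Returns the byte offsets of `a`, `b` and `c` from the start of the struct.
/// With `#[repr(C)]` the fields appear in declaration order, each padded
/// up to its own alignment.
pub fn field_offsets() -> (usize, usize, usize) {
    let foo = FooC { a: 0, b: 0, c: 0 };
    let base = &foo as *const FooC as usize;
    (
        &foo.a as *const u8 as usize - base,
        &foo.b as *const u32 as usize - base,
        &foo.c as *const u16 as usize - base,
    )
}
```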
# Alternatives

- Have non-C layouts opt-in, via `#[repr(smallest)]` and
  `#[repr(random)]` (or similar).
- Have layout defined, but not declaration order (like Java(?)), for
  example, from largest field to smallest, so `u8` fields get placed
  last, and `[u8, .. 1000000]` fields get placed first. The `#[repr]`
  attributes would still allow for selecting declaration-order layout.

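To make the second alternative concrete, the following sketch compares declaration order with a largest-first order. It is only a model: fields are described as hypothetical `(size, alignment)` pairs in bytes, alignments are assumed to be non-zero, and `layout_size` and `largest_first` are illustrative names rather than anything the compiler exposes.

```rust
/// Size in bytes of a struct whose fields are placed in the given order.
/// Each field is a `(size, alignment)` pair; alignment must be non-zero.
pub fn layout_size(fields: &[(usize, usize)]) -> usize {
    let mut offset = 0;
    let mut max_align = 1;
    for &(size, align) in fields {
        // Pad up to the next multiple of this field's alignment.
        offset = (offset + align - 1) / align * align;
        offset += size;
        max_align = max_align.max(align);
    }
    // Tail padding: the total size is a multiple of the largest alignment.
    (offset + max_align - 1) / max_align * max_align
}

/// Returns the fields reordered from largest to smallest size.
pub fn largest_first(fields: &[(usize, usize)]) -> Vec<(usize, usize)> {
    let mut sorted = fields.to_vec();
    sorted.sort_by(|a, b| b.0.cmp(&a.0)); // descending by size, stable
    sorted
}
```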
# Unresolved questions

- How does this interact with binary compatibility of dynamic libraries?

> That's completely unnecessary if you're confident that you are memory safe, which (modulo compiler bugs) Rust can give you (except unsafe blocks).
>
> IMO C-struct-compatibility is a major selling point of Rust, it should be the default.

> If you're writing a kernel in Rust, you presumably aren't guaranteeing that all the programs your kernel runs are also written in Rust. To that end, being able to randomize fields sounds plausibly useful.
>
> However, I would imagine it's probably better done by writing a custom item decorator that randomizes the field order (and places whatever `#[repr()]` attribute is necessary to tell the compiler to use the declaration order). Which is to say, the kernel author can write the necessary extension, Rust doesn't need to provide it.

> @o11c, it is also a major selling point of Rust to be highly efficient. IMO, this selling point is more important than C interop.
>
> Of course, easy C interop will still be a selling point, but I suspect it is the wrong default. If we stick to our current default, every struct that is not used for C interop will pay the price of unoptimized representation, unless we add an annotation to the struct definition. This violates the notion of "pay for what you use". And given that the number of structs intended for C interop is relatively few, requiring this annotation for the majority of structs would be comparable to the burden of const-correctness in C++.

> @kballard that is a good point. It's a trivial syntax extension: https://gist.github.com/huonw/be05427dc80e44f1a594
>
> I'll remove randomisation as a reason for this RFC.