@@ -255,7 +255,75 @@ but [you can read about those below](#promoted)).
255255
256256## Representing constants
257257
258- *to be written*
258+ When code has reached the MIR stage, constants can generally come in two forms:
259+ *MIR constants* ([`mir::Constant`]) and *type system constants* ([`ty::Const`]).
260+ MIR constants are used as operands: in `x + CONST`, `CONST` is a MIR constant;
261+ similarly, in `x + 2`, `2` is a MIR constant. Type system constants are used in
262+ the type system, in particular for array lengths but also for const generics.
263+
264+ Generally, both kinds of constants can be "unevaluated" or "already evaluated".
265+ An unevaluated constant simply stores the `DefId` of what needs to be evaluated
266+ to compute this result. An evaluated constant (a "value") has already been
267+ computed; its representation differs between type system constants and MIR
268+ constants: MIR constants evaluate to a `mir::ConstValue`; type system constants
269+ evaluate to a `ty::ValTree`.
270+
271+ Type system constants have some more variants to support const generics: they
272+ can refer to local const generic parameters, and they are subject to inference.
273+ Furthermore, the `mir::Constant::Ty` variant lets us use an arbitrary type
274+ system constant as a MIR constant; this happens whenever a const generic
275+ parameter is used as an operand.
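To make the distinction concrete, here is a small sketch of source code in which each kind of constant would appear. The item names (`OFFSET`, `add_constants`, `zeroed`, `len_plus_one`) are hypothetical and chosen only for illustration; the comments restate the classification given above rather than showing actual compiler output.

```rust
// Hypothetical names throughout; only the shapes of the expressions matter.
const OFFSET: u32 = 10;

// Both `OFFSET` and the literal `2` are used as operands, so at the MIR
// stage each of them would be a MIR constant.
fn add_constants(x: u32) -> u32 {
    x + OFFSET + 2
}

// The array length `4` is part of the type `[u8; 4]`, so it is a type
// system constant.
fn zeroed() -> [u8; 4] {
    [0; 4]
}

// `N` is a const generic parameter, so it is a type system constant in
// `[u8; N]`. Using `N` as an operand in the body is the case where a type
// system constant is used as a MIR constant.
fn len_plus_one<const N: usize>(_bytes: [u8; N]) -> usize {
    N + 1
}
```

Calling `len_plus_one([0u8; 3])` instantiates `N` with `3`, so the same constant is used both in a type and as an operand.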
276+
277+ ### MIR constant values
278+
279+ In general, a MIR constant value (`mir::ConstValue`) was computed by evaluating
280+ some constant the user wrote. This [const evaluation](../const-eval.md) produces
281+ a very low-level representation of the result in terms of individual bytes. We
282+ call this an "indirect" constant (`mir::ConstValue::Indirect`) since the value
283+ is stored in memory.
284+
285+ However, storing everything in memory would be awfully inefficient. Hence there
286+ are some other variants in `mir::ConstValue` that can represent certain simple
287+ and common values more efficiently. In particular, everything that can be
288+ directly written as a literal in Rust (integers, floats, chars, bools, but also
289+ `"string literals"` and `b"byte string literals"`) has an optimized variant that
290+ avoids the full overhead of the in-memory representation.
291+
292+ ### ValTrees
293+
294+ An evaluated type system constant is a "valtree". The `ty::ValTree` data structure
295+ allows us to represent
296+
297+ * arrays,
298+ * many structs,
299+ * tuples,
300+ * enums, and
301+ * most primitives.
302+
303+ The most important rule for
304+ this representation is that every value must be uniquely represented. In other
305+ words: a specific value must only be representable in one specific way. For example, there is only
306+ one way to represent an array of two integers as a `ValTree`:
307+ `ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
308+ Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
309+ `ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
310+ (and is very complex to do, so it is unlikely anyone is tempted to do so).
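The uniqueness rule can be sketched with a toy model. `ToyValTree` and `array_to_valtree` below are hypothetical stand-ins invented for this example; they are not the real `ty::ValTree` definition, which differs in its details (the quoted snippet above uses a borrowed slice, for instance).

```rust
// Toy stand-in for illustration only; NOT the real `ty::ValTree`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyValTree {
    // A single scalar value, stored here as raw bits.
    Leaf(u128),
    // An aggregate: one child per element or field, in order.
    Branch(Vec<ToyValTree>),
}

// The only encoding this model permits for a `[u32; 2]`: a branch with
// exactly one leaf per element. Packing both elements into a single leaf
// is deliberately not offered.
fn array_to_valtree(values: [u32; 2]) -> ToyValTree {
    ToyValTree::Branch(
        values
            .iter()
            .map(|&v| ToyValTree::Leaf(u128::from(v)))
            .collect(),
    )
}
```

Because each value has exactly one encoding in this model, comparing two trees structurally gives the same answer as comparing the values they encode.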
311+
312+ These rules also mean that some values are not representable. There can be no `union`s in type
313+ level constants, as it is not clear how they should be represented, because their active variant
314+ is unknown. Similarly, there is no way to represent raw pointers, as addresses are unknown at
315+ compile-time and thus we cannot make any assumptions about them. References, on the other hand,
316+ *can* be represented, as equality for references is defined as equality on their value, so we
317+ ignore their address and just look at the backing value. We must make sure that the pointer values
318+ of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
319+ Any conversion from a
320+ valtree back to a MIR constant value must reintroduce an actual indirection. At codegen time the
321+ addresses may be deduplicated between multiple uses or not, entirely depending on arbitrary
322+ optimization choices.
323+
324+ As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
325+ decisions depending on that. The value itself gives no useful information without the type that
326+ belongs to it.
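A minimal sketch of type-directed decoding, again using a toy model: `ToyValTree`, `ToyTy`, and `describe` are hypothetical names invented for this example and do not correspond to rustc APIs. The sketch shows that the same leaf decodes differently depending on the type it is paired with, and that a reference adds no node of its own to the tree.

```rust
// Toy stand-ins for illustration only; NOT rustc's real types.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyValTree {
    Leaf(u128),
    Branch(Vec<ToyValTree>),
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyTy {
    U32,
    // `&T`: encoded exactly like the `T` it points to.
    Ref(Box<ToyTy>),
    // `[T; N]`: a branch with `N` children.
    Array(Box<ToyTy>, usize),
}

// Decode by matching on the type first; the tree alone is ambiguous.
// Returns `None` if the tree does not have the shape the type requires.
fn describe(ty: &ToyTy, tree: &ToyValTree) -> Option<String> {
    match (ty, tree) {
        (ToyTy::U32, ToyValTree::Leaf(bits)) => Some(bits.to_string()),
        // A reference has no node of its own: decode the pointee from the
        // same tree and reintroduce the indirection in the output.
        (ToyTy::Ref(inner), _) => describe(inner, tree).map(|s| format!("&{s}")),
        (ToyTy::Array(elem, len), ToyValTree::Branch(items)) if items.len() == *len => {
            let parts: Option<Vec<String>> =
                items.iter().map(|item| describe(elem, item)).collect();
            parts.map(|p| format!("[{}]", p.join(", ")))
        }
        _ => None,
    }
}
```

In this model the tree `Leaf(42)` decodes as `42` when paired with `ToyTy::U32` and as `&42` when paired with `ToyTy::Ref(Box::new(ToyTy::U32))`, which is why the type has to drive the decoding.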
259327
260328<a name="promoted"></a>
261329
@@ -283,3 +351,5 @@ See the const-eval WG's [docs on promotion](https://github.com/rust-lang/const-e
283351[`ProjectionElem::Deref`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html#variant.Deref
284352[`Rvalue`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Rvalue.html
285353[`Operand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Operand.html
354+ [`mir::Constant`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Constant.html
355+ [`ty::Const`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Const.html