1 | | -# Dependency graph for incremental compilation |
| 1 | +To learn more about how dependency tracking works in rustc, see the [rustc |
| 2 | +guide]. |
2 | 3 |
3 | | -This module contains the infrastructure for managing the incremental |
4 | | -compilation dependency graph. This README aims to explain how it ought |
5 | | -to be used. In this document, we'll first explain the overall |
6 | | -strategy, and then share some tips for handling specific scenarios. |
7 | | - |
8 | | -The high-level idea is that we want to instrument the compiler to |
9 | | -track which parts of the AST and other IR are read/written by what. |
10 | | -This way, when we come back later, we can look at this graph and |
11 | | -determine what work needs to be redone. |
12 | | - |
13 | | -### The dependency graph |
14 | | - |
15 | | -The nodes of the graph are defined by the enum `DepNode`. They represent |
16 | | -one of three things: |
17 | | - |
18 | | -1. HIR nodes (like `Hir(DefId)`) represent the HIR input itself. |
19 | | -2. Data nodes (like `TypeOfItem(DefId)`) represent some computed |
20 | | - information about a particular item. |
21 | | -3. Procedure nodes (like `CoherenceCheckTrait(DefId)`) represent some |
22 | | - procedure that is executing. Usually this procedure is |
23 | | - performing some kind of check for errors. You can think of them as |
24 | | - computed values where the value being computed is `()` (and the |
25 | | - value may fail to be computed, if an error results). |
26 | | - |
27 | | -An edge `N1 -> N2` is added between two nodes if either: |
28 | | - |
29 | | -- the value of `N1` is used to compute `N2`; |
30 | | -- `N1` is read by the procedure `N2`; |
31 | | -- the procedure `N1` writes the value `N2`. |
32 | | - |
33 | | -The latter two conditions are equivalent to the first one if you think |
34 | | -of procedures as values. |
35 | | - |
36 | | -### Basic tracking |
37 | | - |
38 | | -There is a very general strategy to ensure that you have a correct, if |
39 | | -sometimes overconservative, dependency graph. The two main things you have |
40 | | -to do are (a) identify shared state and (b) identify the current tasks. |
41 | | - |
42 | | -### Identifying shared state |
43 | | - |
44 | | -Identify "shared state" that will be written by one pass and read by |
45 | | -another. In particular, we need to identify shared state that will be |
46 | | -read "across items" -- that is, anything where changes in one item |
47 | | -could invalidate work done for other items. So, for example: |
48 | | - |
49 | | -1. The signature for a function is "shared state". |
50 | | -2. The computed type of some expression in the body of a function is |
51 | | - not shared state, because if it changes it does not itself |
52 | | - invalidate other functions (though it may be that it causes new |
53 | | - monomorphizations to occur, but that's handled independently). |
54 | | - |
55 | | -Put another way: if the HIR for an item changes, we are going to |
56 | | -recompile that item for sure. But we need the dep tracking map to tell |
57 | | -us what *else* we have to recompile. Shared state is anything that is |
58 | | -used to communicate results from one item to another. |
59 | | - |
60 | | -### Identifying the current task, tracking reads/writes, etc |
61 | | - |
62 | | -FIXME(#42293). This text needs to be rewritten for the new red-green |
63 | | -system, which doesn't fully exist yet. |
64 | | - |
65 | | -#### Dependency tracking map |
66 | | - |
67 | | -`DepTrackingMap` is a particularly convenient way to correctly store |
68 | | -shared state. A `DepTrackingMap` is a special hashmap that will add |
69 | | -edges automatically when `get` and `insert` are called. The idea is |
70 | | -that, when you get/insert a value for the key `K`, we will add an edge |
71 | | -from/to the node `DepNode::Variant(K)` (for some variant specific to |
72 | | -the map). |
73 | | - |
74 | | -Each `DepTrackingMap` is parameterized by a special type `M` that |
75 | | -implements `DepTrackingMapConfig`; this trait defines the key and value |
76 | | -types of the map, and also defines a fn for converting from the key to |
77 | | -a `DepNode` label. You don't usually have to muck about with this by |
78 | | -hand, there is a macro for creating it. You can see the complete set |
79 | | -of `DepTrackingMap` definitions in `librustc/middle/ty/maps.rs`. |
80 | | - |
81 | | -As an example, let's look at the `adt_defs` map. The `adt_defs` map |
82 | | -maps from the def-id of a struct/enum to its `AdtDef`. It is defined |
83 | | -using this macro: |
84 | | - |
85 | | -```rust |
86 | | -dep_map_ty! { AdtDefs: ItemSignature(DefId) -> ty::AdtDefMaster<'tcx> } |
87 | | -// ~~~~~~~ ~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~~~~~~~~~~~ |
88 | | -// | | Key type Value type |
89 | | -// | DepNode variant |
90 | | -// Name of map id type |
91 | | -``` |
92 | | - |
93 | | -this indicates that a map id type `AdtDefs` will be created. The key |
94 | | -of the map will be a `DefId` and value will be |
95 | | -`ty::AdtDefMaster<'tcx>`. The `DepNode` will be created by |
96 | | -`DepNode::ItemSignature(K)` for a given key. |
97 | | - |
98 | | -Once that is done, you can just use the `DepTrackingMap` like any |
99 | | -other map: |
100 | | - |
101 | | -```rust |
102 | | -let mut map: DepTrackingMap<M> = DepTrackingMap::new(dep_graph); |
103 | | -map.insert(key, value); // registers dep_graph.write |
104 | | -map.get(key); // registers dep_graph.read
105 | | -``` |
106 | | - |
107 | | -#### Memoization |
108 | | - |
109 | | -One particularly interesting case is memoization. If you have some |
110 | | -shared state that you compute in a memoized fashion, the correct thing |
111 | | -to do is to define a `RefCell<DepTrackingMap>` for it and use the |
112 | | -`memoize` helper: |
113 | | - |
114 | | -```rust |
115 | | -map.memoize(key, || /* compute value */) |
116 | | -``` |
117 | | - |
118 | | -This will create a graph that looks like |
119 | | - |
120 | | - ... -> MapVariant(key) -> CurrentTask |
121 | | - |
122 | | -where `MapVariant` is the `DepNode` variant that the map is associated with, |
123 | | -and `...` are whatever edges the `/* compute value */` closure creates. |
124 | | - |
125 | | -In particular, using the memoize helper is much better than writing |
126 | | -the obvious code yourself: |
127 | | - |
128 | | -```rust |
129 | | -if let Some(result) = map.get(key) { |
130 | | - return result; |
131 | | -} |
132 | | -let value = /* compute value */; |
133 | | -map.insert(key, value); |
134 | | -``` |
135 | | - |
136 | | -If you write that code manually, the dependency graph you get will |
137 | | -include artificial edges that are not necessary. For example, imagine that |
138 | | -two tasks, A and B, both invoke the manual memoization code, but A happens |
139 | | -to go first. The resulting graph will be: |
140 | | - |
141 | | - ... -> A -> MapVariant(key) -> B |
142 | | - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ // caused by A writing to MapVariant(key) |
143 | | - ~~~~~~~~~~~~~~~~~~~~ // caused by B reading from MapVariant(key) |
144 | | - |
145 | | -This graph is not *wrong*, but it encodes a path from A to B that |
146 | | -should not exist. In contrast, using the memoized helper, you get: |
147 | | - |
148 | | - ... -> MapVariant(key) -> A |
149 | | - | |
150 | | - +----------> B |
151 | | - |
152 | | -which is much cleaner. |
153 | | - |
154 | | -**Be aware though that the closure is executed with `MapVariant(key)` |
155 | | -pushed onto the stack as the current task!** That means that you must |
156 | | -add explicit `read` calls for any shared state that it accesses |
157 | | -implicitly from its environment. See the section on "explicit calls to |
158 | | -read and write when starting a new subtask" above for more details. |
159 | | - |
160 | | -### How to decide where to introduce a new task |
161 | | - |
162 | | -Certainly, you need at least one task on the stack: any attempt to |
163 | | -`read` or `write` shared state will panic if there is no current |
164 | | -task. But where does it make sense to introduce subtasks? The basic |
165 | | -rule is that a subtask makes sense for any discrete unit of work you |
166 | | -may want to skip in the future. Adding a subtask separates out the |
167 | | -reads/writes from *that particular subtask* versus the larger |
168 | | -context. An example: you might have a 'meta' task for all of borrow |
169 | | -checking, and then subtasks for borrow checking individual fns. (Seen |
170 | | -in this light, memoized computations are just a special case where we |
171 | | -may want to avoid redoing the work even within the context of one |
172 | | -compilation.) |
173 | | - |
174 | | -The other case where you might want a subtask is to help with refining |
175 | | -the reads/writes for some later bit of work that needs to be memoized. |
176 | | -For example, we create a subtask for type-checking the body of each |
177 | | -fn. However, in the initial version of incr. comp. at least, we do |
178 | | -not expect to actually *SKIP* type-checking -- we only expect to skip |
179 | | -trans. However, it's still useful to create subtasks for type-checking |
180 | | -individual items, because, otherwise, if a fn sig changes, we won't |
181 | | -know which callers are affected -- in fact, because the graph would be |
182 | | -so coarse, we'd just have to retrans everything, since we can't |
183 | | -distinguish which fns used which fn sigs. |
184 | | - |
185 | | -### Testing the dependency graph |
186 | | - |
187 | | -There are various ways to write tests against the dependency graph. |
188 | | -The simplest mechanism is the
189 | | -`#[rustc_if_this_changed]` and `#[rustc_then_this_would_need]` |
190 | | -annotations. These are used in compile-fail tests to test whether the |
191 | | -expected set of paths exist in the dependency graph. As an example, |
192 | | -see `src/test/compile-fail/dep-graph-caller-callee.rs`. |
193 | | - |
194 | | -The idea is that you can annotate a test like: |
195 | | - |
196 | | -```rust |
197 | | -#[rustc_if_this_changed] |
198 | | -fn foo() { } |
199 | | - |
200 | | -#[rustc_then_this_would_need(TypeckTables)] //~ ERROR OK |
201 | | -fn bar() { foo(); } |
202 | | - |
203 | | -#[rustc_then_this_would_need(TypeckTables)] //~ ERROR no path |
204 | | -fn baz() { } |
205 | | -``` |
206 | | - |
207 | | -This will check whether there is a path in the dependency graph from |
208 | | -`Hir(foo)` to `TypeckTables(bar)`. An error is reported for each |
209 | | -`#[rustc_then_this_would_need]` annotation that indicates whether a |
210 | | -path exists. `//~ ERROR` annotations can then be used to test if a |
211 | | -path is found (as demonstrated above). |
212 | | - |
213 | | -### Debugging the dependency graph |
214 | | - |
215 | | -#### Dumping the graph |
216 | | - |
217 | | -The compiler is also capable of dumping the dependency graph for your |
218 | | -debugging pleasure. To do so, pass the `-Z dump-dep-graph` flag. The |
219 | | -graph will be dumped to `dep_graph.{txt,dot}` in the current |
220 | | -directory. You can override the filename with the `RUST_DEP_GRAPH` |
221 | | -environment variable. |
222 | | - |
223 | | -Frequently, though, the full dep graph is quite overwhelming and not |
224 | | -particularly helpful. Therefore, the compiler also allows you to filter |
225 | | -the graph. You can filter in three ways: |
226 | | - |
227 | | -1. All edges originating in a particular set of nodes (usually a single node). |
228 | | -2. All edges reaching a particular set of nodes. |
229 | | -3. All edges that lie between given start and end nodes. |
230 | | - |
231 | | -To filter, use the `RUST_DEP_GRAPH_FILTER` environment variable, which should |
232 | | -look like one of the following: |
233 | | - |
234 | | -``` |
235 | | -source_filter // nodes originating from source_filter |
236 | | --> target_filter // nodes that can reach target_filter |
237 | | -source_filter -> target_filter // nodes in between source_filter and target_filter |
238 | | -``` |
239 | | - |
240 | | -`source_filter` and `target_filter` are a `&`-separated list of strings. |
241 | | -A node is considered to match a filter if all of those strings appear in its |
242 | | -label. So, for example: |
243 | | - |
244 | | -``` |
245 | | -RUST_DEP_GRAPH_FILTER='-> TypeckTables' |
246 | | -``` |
247 | | - |
248 | | -would select the predecessors of all `TypeckTables` nodes. Usually though you |
249 | | -want the `TypeckTables` node for some particular fn, so you might write: |
250 | | - |
251 | | -``` |
252 | | -RUST_DEP_GRAPH_FILTER='-> TypeckTables & bar' |
253 | | -``` |
254 | | - |
255 | | -This will select only the `TypeckTables` nodes for fns with `bar` in their name. |
256 | | - |
257 | | -Perhaps you are finding that when you change `foo` you need to re-type-check `bar`, |
258 | | -but you don't think you should have to. In that case, you might do: |
259 | | - |
260 | | -``` |
261 | | -RUST_DEP_GRAPH_FILTER='Hir&foo -> TypeckTables & bar' |
262 | | -``` |
263 | | - |
264 | | -This will dump out all the nodes that lead from `Hir(foo)` to |
265 | | -`TypeckTables(bar)`, from which you can (hopefully) see the source |
266 | | -of the erroneous edge. |
267 | | - |
268 | | -#### Tracking down incorrect edges |
269 | | - |
270 | | -Sometimes, after you dump the dependency graph, you will find some |
271 | | -path that should not exist, but you will not be quite sure how it came |
272 | | -to be. **When the compiler is built with debug assertions,** it can |
273 | | -help you track that down. Simply set the `RUST_FORBID_DEP_GRAPH_EDGE` |
274 | | -environment variable to a filter. Every edge created in the dep-graph |
275 | | -will be tested against that filter -- if it matches, a `bug!` is |
276 | | -reported, so you can easily see the backtrace (`RUST_BACKTRACE=1`). |
277 | | - |
278 | | -The syntax for these filters is the same as described in the previous |
279 | | -section. However, note that this filter is applied to every **edge** |
280 | | -and doesn't handle longer paths in the graph, unlike the previous |
281 | | -section. |
282 | | - |
283 | | -Example: |
284 | | - |
285 | | -You find that there is a path from the `Hir` of `foo` to the type |
286 | | -check of `bar` and you don't think there should be. You dump the |
287 | | -dep-graph as described in the previous section and open `dep_graph.txt`
288 | | -to see something like: |
289 | | - |
290 | | - Hir(foo) -> Collect(bar) |
291 | | - Collect(bar) -> TypeckTables(bar) |
292 | | - |
293 | | -That first edge looks suspicious to you. So you set |
294 | | -`RUST_FORBID_DEP_GRAPH_EDGE` to `Hir&foo -> Collect&bar`, re-run, and |
295 | | -then observe the backtrace. Voila, bug fixed! |
| 4 | +[rustc guide]: https://rust-lang-nursery.github.io/rustc-guide/query.html |
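
For reference while reading the removed text: the filter rule it describes (a filter is a `&`-separated list of strings, and a node matches when all of them appear in its label) can be sketched in a few lines of Rust. This is only an illustration of the rule as worded in the old README, not the compiler's actual implementation; `node_matches` and `edge_matches` are hypothetical names, and treating an empty source side as "match anything" is an assumption.

```rust
/// Returns true if every `&`-separated fragment of `filter` occurs in `label`.
/// Hypothetical helper; whitespace around fragments is ignored.
fn node_matches(filter: &str, label: &str) -> bool {
    filter
        .split('&')
        .map(str::trim)
        // An empty fragment is contained in every label, so an empty
        // filter matches all nodes (an assumption of this sketch).
        .all(|fragment| label.contains(fragment))
}

/// Tests a single edge `source -> target` against a filter written as
/// `source_filter -> target_filter`. Returns `None` when the filter
/// contains no `->` separator.
fn edge_matches(filter: &str, source: &str, target: &str) -> Option<bool> {
    let (src_filter, tgt_filter) = filter.split_once("->")?;
    Some(node_matches(src_filter, source) && node_matches(tgt_filter, target))
}
```

With these definitions, `edge_matches("Hir&foo -> Collect&bar", "Hir(foo)", "Collect(bar)")` evaluates to `Some(true)`, which corresponds to the per-edge check the README describes for `RUST_FORBID_DEP_GRAPH_EDGE`. The sketch tests one edge at a time and does not search for longer paths.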