Give overview of MIR dataflow framework #583
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
# Dataflow Analysis | ||
|
||
If you work on the MIR, you will frequently come across various flavors of | ||
[dataflow analysis][wiki]. `rustc` uses dataflow to find uninitialized | ||
variables, determine what variables are live across a generator `yield` | ||
statement, and compute which `Place`s are borrowed at a given point in the | ||
control-flow graph. Dataflow analysis is a fundamental concept in modern | ||
compilers, and knowledge of the subject will be helpful to prospective | ||
contributors. | ||
|
||
However, this documentation is not a general introduction to dataflow analysis. | ||
It is merely a description of the framework used to define these analyses in | ||
`rustc`. It assumes that the reader is familiar with the core ideas as well as | ||
some basic terminology, such as "transfer function", "fixpoint" and "lattice". | ||
If you're unfamiliar with these terms, or if you want a quick refresher, | ||
[*Static Program Analysis*] by Anders Møller and Michael I. Schwartzbach is an | ||
excellent, freely available textbook. For those who prefer audiovisual | ||
learning, the Goethe University Frankfurt has published a series of short | ||
[lectures on YouTube][goethe] in English that are very approachable. | ||
|
||
## Defining a Dataflow Analysis | ||
|
||
The interface for dataflow analyses is split into three traits. The first is | ||
[`AnalysisDomain`], which must be implemented by *all* analyses. In addition to | ||
the type of the dataflow state, this trait defines the initial value of that | ||
state at entry to each block, as well as the direction of the analysis, either | ||
forward or backward. The domain of your dataflow analysis must be a [lattice][] | ||
(strictly speaking a join-semilattice) with a well-behaved `join` operator. See | ||
documentation for the [`lattice`] module, as well as the [`JoinSemiLattice`] | ||
trait, for more information. | ||
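
To make the lattice requirement concrete, here is a minimal, self-contained sketch of a join-semilattice over a small set of indices. The trait and the `SmallBitSet` type are simplified stand-ins written for this example, not the definitions in `rustc`; in particular, having `join` report whether the state changed is an assumption of the sketch.

```rust
/// Simplified stand-in for a join-semilattice (hypothetical, for illustration only).
trait JoinSemiLattice {
    /// Replaces `self` with the least upper bound of `self` and `other`.
    /// Returns `true` if `self` changed.
    fn join(&mut self, other: &Self) -> bool;
}

/// A toy domain: a set of up to 64 indices, ordered by inclusion.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SmallBitSet(u64);

impl JoinSemiLattice for SmallBitSet {
    fn join(&mut self, other: &Self) -> bool {
        let old = self.0;
        self.0 |= other.0; // set union is the least upper bound under inclusion
        self.0 != old
    }
}

fn main() {
    let mut a = SmallBitSet(0b0011);
    let b = SmallBitSet(0b0110);
    assert!(a.join(&b)); // `a` grew, so the join reports a change
    assert_eq!(a, SmallBitSet(0b0111));
    assert!(!a.join(&b)); // joining the same value again changes nothing
}
```

A "well-behaved" join in this sense is commutative, associative and idempotent, which is what lets a solver merge states from several predecessors in any order.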
|
||
You must then provide *either* a direct implementation of the [`Analysis`] trait | ||
*or* an implementation of the proxy trait [`GenKillAnalysis`]. The latter is for | ||
so-called ["gen-kill" problems], which have a simple class of transfer function | ||
that can be applied very efficiently. Analyses whose domain is not a `BitSet` | ||
of some index type, or whose transfer functions cannot be expressed through | ||
"gen" and "kill" operations, must implement `Analysis` directly, and will run | ||
slower as a result. All implementers of `GenKillAnalysis` also implement | ||
`Analysis` automatically via a default `impl`. | ||
|
||
|
||
```text | ||
AnalysisDomain | ||
      ^ | ||
      |          | = has as a supertrait | ||
      |          . = provides a default impl for | ||
      | | ||
  Analysis | ||
    ^   ^ | ||
    |   . | ||
    |   . | ||
    |   . | ||
GenKillAnalysis | ||
|
||
``` | ||
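
As a rough illustration of why gen-kill problems are cheap, the sketch below models a transfer function as a pair of bit masks. The `Transfer` type and its methods are hypothetical names invented for this example, not part of `rustc`. Applying a transfer function is two bitwise operations, and two of them compose into a single pair of masks, so a whole block can be summarized once up front.

```rust
/// A gen-kill transfer function over a 64-bit set (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Transfer {
    gen_bits: u64,
    kill_bits: u64,
}

impl Transfer {
    /// Applies the function to a state: remove the killed bits, then add the generated ones.
    fn apply(self, state: u64) -> u64 {
        (state & !self.kill_bits) | self.gen_bits
    }

    /// Returns the single gen-kill function equivalent to `self` followed by `next`.
    fn then(self, next: Transfer) -> Transfer {
        Transfer {
            // Bits generated by `self` survive only if `next` does not kill them.
            gen_bits: (self.gen_bits & !next.kill_bits) | next.gen_bits,
            kill_bits: self.kill_bits | next.kill_bits,
        }
    }
}

fn main() {
    let stmt1 = Transfer { gen_bits: 0b0001, kill_bits: 0b0100 };
    let stmt2 = Transfer { gen_bits: 0b1000, kill_bits: 0b0001 };
    let entry = 0b0110;

    // Applying the statements one by one and applying the composed
    // function give the same result.
    let step_by_step = stmt2.apply(stmt1.apply(entry));
    let composed = stmt1.then(stmt2).apply(entry);
    assert_eq!(step_by_step, composed);
    println!("{:#06b}", composed);
}
```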
|
||
### Transfer Functions and Effects | ||
|
||
The dataflow framework in `rustc` allows each statement (and terminator) inside | ||
a basic block to define its own transfer function. For brevity, these | ||
individual transfer functions are known as "effects". Each effect is applied | ||
successively in dataflow order, and together they define the transfer function | ||
for the entire basic block. It's also possible to define an effect for | ||
particular outgoing edges of some terminators (e.g. | ||
[`apply_call_return_effect`] for the `success` edge of a `Call` | ||
terminator). Collectively, these are referred to as "per-edge effects". | ||
|
||
The only meaningful difference (besides the "apply" prefix) between the methods | ||
of the `GenKillAnalysis` trait and the `Analysis` trait is that an `Analysis` | ||
has direct, mutable access to the dataflow state, whereas a `GenKillAnalysis` | ||
only sees an implementer of the `GenKill` trait, which only allows the `gen` | ||
and `kill` operations for mutation. | ||
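
The practical consequence of the restricted interface can be sketched as follows. All names here (`GenKill`, `State`, `Recorded`, `assignment_effect`, and the `gen_` spelling, which avoids a reserved word in newer Rust editions) are hypothetical stand-ins for this example, not the `rustc` definitions. The point is that an effect written only in terms of "gen" and "kill" can be run against a live state or recorded for later.

```rust
/// Restricted mutation interface (hypothetical stand-in for this sketch).
trait GenKill {
    fn gen_(&mut self, i: u32);
    fn kill(&mut self, i: u32);
}

/// A concrete dataflow state: each operation takes effect immediately.
struct State(u64);

impl GenKill for State {
    fn gen_(&mut self, i: u32) {
        self.0 |= 1u64 << i;
    }
    fn kill(&mut self, i: u32) {
        self.0 &= !(1u64 << i);
    }
}

/// A recorder: remembers which bits were generated or killed, without any state.
#[derive(Default)]
struct Recorded {
    gen_bits: u64,
    kill_bits: u64,
}

impl GenKill for Recorded {
    fn gen_(&mut self, i: u32) {
        self.gen_bits |= 1u64 << i;
        self.kill_bits &= !(1u64 << i);
    }
    fn kill(&mut self, i: u32) {
        self.kill_bits |= 1u64 << i;
        self.gen_bits &= !(1u64 << i);
    }
}

/// An effect written once against the restricted interface.
fn assignment_effect(trans: &mut impl GenKill, assigned: u32, moved_from: u32) {
    trans.gen_(assigned);
    trans.kill(moved_from);
}

fn main() {
    let mut state = State(0b0010);
    assignment_effect(&mut state, 0, 1);
    assert_eq!(state.0, 0b0001);

    let mut recorded = Recorded::default();
    assignment_effect(&mut recorded, 0, 1);
    assert_eq!((recorded.gen_bits, recorded.kill_bits), (0b0001, 0b0010));
}
```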
|
||
### "Before" Effects | ||
|
||
Observant readers of the documentation may notice that there are actually *two* | ||
possible effects for each statement and terminator, the "before" effect and the | ||
unprefixed (or "primary") effect. The "before" effects are applied immediately | ||
before the unprefixed effect **regardless of the direction of the analysis**. | ||
In other words, a backward analysis will apply the "before" effect and then | ||
the "primary" effect when computing the transfer function for a basic block, | ||
just like a forward analysis. | ||
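
The ordering rule can be sketched with a small, self-contained model. The `Effect` enum and `effect_order` function are hypothetical and exist only for this example; locations are numbered `0..len` within one block, with the terminator last.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Effect {
    Before(usize),
    Primary(usize),
}

/// Lists the order in which effects are applied within one block of `len`
/// locations. Only the order of *locations* depends on the direction; at each
/// location the "before" effect always precedes the primary effect.
fn effect_order(len: usize, forward: bool) -> Vec<Effect> {
    let locations: Vec<usize> = if forward {
        (0..len).collect()
    } else {
        (0..len).rev().collect()
    };
    locations
        .into_iter()
        .flat_map(|i| [Effect::Before(i), Effect::Primary(i)])
        .collect()
}

fn main() {
    println!("forward:  {:?}", effect_order(2, true));
    println!("backward: {:?}", effect_order(2, false));
}
```

For a block with two locations, the backward order under this model is `Before(1), Primary(1), Before(0), Primary(0)`.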
|
||
The vast majority of analyses should use only the unprefixed effects: having | ||
multiple effects for each statement makes it difficult for consumers to know | ||
where they should be looking. However, the "before" variants can be useful in | ||
some scenarios, such as when the effect of the right-hand side of an assignment | ||
statement must be considered separately from the left-hand side. | ||
|
||
### Convergence | ||
|
||
TODO | ||
|
||
## Inspecting the Results of a Dataflow Analysis | ||
|
||
Once you have constructed an analysis, you must pass it to an [`Engine`], which | ||
is responsible for finding the steady-state solution to your dataflow problem. | ||
You should use the [`into_engine`] method defined on the `Analysis` trait for | ||
this, since it will use the more efficient `Engine::new_gen_kill` constructor | ||
when possible. | ||
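
For readers who want a picture of what "finding the steady-state solution" involves, the following is a minimal worklist solver for a forward gen-kill problem on a toy control-flow graph. It is a sketch under simplifying assumptions (64-bit sets, union as the join, an empty initial state), and `Block` and `solve_forward` are hypothetical names; it does not reflect how `Engine` is implemented.

```rust
use std::collections::VecDeque;

/// Per-block transfer function in gen-kill form, plus outgoing edges.
struct Block {
    gen_bits: u64,
    kill_bits: u64,
    successors: Vec<usize>,
}

/// Returns the state at entry to each block once no state changes any more.
fn solve_forward(blocks: &[Block]) -> Vec<u64> {
    let mut entry = vec![0u64; blocks.len()];
    let mut worklist: VecDeque<usize> = (0..blocks.len()).collect();

    while let Some(bb) = worklist.pop_front() {
        // Apply the block's transfer function to its entry state.
        let exit = (entry[bb] & !blocks[bb].kill_bits) | blocks[bb].gen_bits;
        for &succ in &blocks[bb].successors {
            // Join the exit state into each successor's entry state.
            let joined = entry[succ] | exit;
            if joined != entry[succ] {
                entry[succ] = joined;
                // Revisit the successor, since its input changed.
                if !worklist.contains(&succ) {
                    worklist.push_back(succ);
                }
            }
        }
    }
    entry
}

fn main() {
    // bb0 -> bb1, bb1 -> {bb1, bb2}: a loop on bb1.
    let blocks = vec![
        Block { gen_bits: 0b01, kill_bits: 0b00, successors: vec![1] },
        Block { gen_bits: 0b10, kill_bits: 0b01, successors: vec![1, 2] },
        Block { gen_bits: 0b00, kill_bits: 0b00, successors: vec![] },
    ];
    println!("{:?}", solve_forward(&blocks));
}
```

The loop terminates because each entry state can only grow and the sets are finite, which is the role the lattice structure plays.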
|
||
Calling `iterate_to_fixpoint` on your `Engine` will return a `Results`, which | ||
contains the dataflow state at fixpoint upon entry of each block. Once you have | ||
a `Results`, you can inspect the dataflow state at fixpoint at any point in | ||
the CFG. If you only need the state at a few locations (e.g., each `Drop` | ||
terminator), use a [`ResultsCursor`]. If you need the state at *every* location, | ||
a [`ResultsVisitor`] will be more efficient. | ||
|
||
```text | ||
                        Analysis | ||
                            | | ||
                            | into_engine(…) | ||
                            | | ||
                         Engine | ||
                            | | ||
                            | iterate_to_fixpoint() | ||
                            | | ||
                         Results | ||
                          /   \ | ||
into_results_cursor(…)   /     \ visit_with(…) | ||
                        /       \ | ||
            ResultsCursor      ResultsVisitor | ||
``` | ||
|
||
For example, the following code uses a [`ResultsVisitor`]... | ||
|
||
|
||
```rust,ignore | ||
// Assuming `MyVisitor` implements `ResultsVisitor<FlowState = MyAnalysis::Domain>`... | ||
let mut my_visitor = MyVisitor::new(); | ||
|
||
// Inspect the fixpoint state for every location within every block, in reverse postorder (RPO). | ||
let results = MyAnalysis::new() | ||
    .into_engine(tcx, body, def_id) | ||
    .iterate_to_fixpoint() | ||
    .visit_in_rpo_with(body, &mut my_visitor); | ||
``` | ||
|
||
whereas this code uses [`ResultsCursor`]: | ||
|
||
```rust,ignore | ||
let mut cursor = MyAnalysis::new() | ||
    .into_engine(tcx, body, def_id) | ||
    .iterate_to_fixpoint() | ||
    .into_results_cursor(body); | ||
|
||
// Inspect the fixpoint state immediately before each `Drop` terminator. | ||
for (bb, block) in body.basic_blocks().iter_enumerated() { | ||
    if let TerminatorKind::Drop { .. } = block.terminator().kind { | ||
        cursor.seek_before_primary_effect(body.terminator_loc(bb)); | ||
        let state = cursor.get(); | ||
        println!("state before drop: {:#?}", state); | ||
    } | ||
} | ||
``` | ||
|
||
["gen-kill" problems]: https://en.wikipedia.org/wiki/Data-flow_analysis#Bit_vector_problems | ||
[*Static Program Analysis*]: https://cs.au.dk/~amoeller/spa/ | ||
[`AnalysisDomain`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.AnalysisDomain.html | ||
[`Analysis`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.Analysis.html | ||
[`Engine`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/struct.Engine.html | ||
[`GenKillAnalysis`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.GenKillAnalysis.html | ||
[`JoinSemiLattice`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/lattice/trait.JoinSemiLattice.html | ||
[`ResultsCursor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/struct.ResultsCursor.html | ||
[`ResultsVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.ResultsVisitor.html | ||
[`apply_call_return_effect`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.Analysis.html#tymethod.apply_call_return_effect | ||
[`into_engine`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/trait.Analysis.html#method.into_engine | ||
[`lattice`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/lattice/index.html | ||
[goethe]: https://www.youtube.com/watch?v=NVBQSR_HdL0&list=PL_sGR8T76Y58l3Gck3ZwIIHLWEmXrOLV_&index=2 | ||
[lattice]: https://en.wikipedia.org/wiki/Lattice_(order) | ||
[wiki]: https://en.wikipedia.org/wiki/Data-flow_analysis#Basic_principles | ||