-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add alias analysis #194
base: develop
Are you sure you want to change the base?
Add alias analysis #194
Conversation
stmt match { | ||
case goto: Goto => | ||
val targetPC = tac.stmts(goto.targetStmt).pc | ||
if (targetPC <= pc && goto.pc >= pc) return false // jumping from behind the allocation site in front of it -> might break aliasing because allocation site is executed multiple times |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is inefficient and incorrect:
Inefficient:
- Checking every statement whether it is a goto, instead of using already available information (CFG, dominator tree)
- Matching on the type of the instruction instead of its astID, which allows for an integer check instead of a type check
- Extracting pcs instead of just working with stmt ids, which are ordered the same way
Incorrect:
- Would interpret some linear control flow as loops (not unsound, but still):
...
5: goto 8
6: allocation site
7: goto 9
8: goto 6
...
- Misses other branching instructions, such as if and switch, so it is unsound as it may miss loops
A better way to do this is to check whether a jump is dominated by its target. That is still not perfect (it misses non-natural loops). OPAL provides facilities to compute dominator trees and check dominance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it correct, a better way to do it would be iterate over all basic block with a endPC >= allocationSitePC (don't care what previous bb's do) and check whether they are dominated by a bb in front of the allocation site? I don't see a way to directly check "whether a jump is dominated by its target". That would still require me to iterate over all goto statements to identify them. Or should i keep iterating over the statements and only perform the check, whether the goto is problematic, using the dominatorTree? Regarding the implementation: To check the dominance I only found the DominatorTree.foreachDom(int)(int => U) method. How can I map the given int back to the associated bb to check whether that is a problematic dominance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to do basic blocks before the allocation, too. See the example above how (linear control flow) jumps can reorder code irrespective of program counters.
Jumps are only at the end of basic blocks, so you don't need to look for gotos (and also, you get all kinds of jumps not just gotos).
There is method strictlyDominates
on DominatorTree (strictly just means it excludes the BB itself).
I can't come up with the correct procedure from my head just now, but I suppose it will be something like: Is there a dominator (loop header) of the allocation (loop body) which has a predecessor that it dominates (back edge).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. So far I have this code:
// check if the allocation site is dominated by a loop header, i.e., is executed multiple times
domTree.foreachDom(cfg.bb(tac.properStmtIndexForPC(pc)).nodeId)(dom => {
// check if the dominator is a loop header
cfg.foreachPredecessor(cfg.allBBs.toSeq(dom).nodeId)(pred => {
// if the dominator itself dominates one of its predecessors, it is a loop header
if (domTree.strictlyDominates(dom, cfg.bb(pred).nodeId)) {
return false
}
})
})
It checks whether the allocation site is dominated by a loop header. It works so far, but overapproximates cases where the allocation site is after the loop and not inside of it. E.g.:
public static void mustAliasLoopInFront() {
for (int i = 0; i < 10; i++) {
[...]
}
Object o2 = new Object();
o2.hashCode();
}
It computes that (o2,o2) is a may alias because the definition site is dominated by a loop header. Do you know whether it is possible to detect such cases, e.g., by checking if the allocation site is behind all predecessors of the loop header? Or would that cause other problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess also check that the backedge is a postdominator of the allocation site?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now found a way to access the postDominatorTree. It currently uses the same way used in the TACAIProvider class. But i am not sure if it is possible to reuse the results used for the tacai instead of calculating them again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is okay for now. We can check whether that computation is performance-sensitive for the analysis, but I don't expect it to be as long as you don't recompute it multiple times per method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just implemented that they are being cached after the first computation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that TACode.cfg.dominatorTree is also recomputed every time and that doesn't seem to be a problem usually.
I think the code you have now will produce different kinds of trees for the dominator and post dominators: the dominator tree from the CFG should be a TAC-based one with Stmts in it, while the post dominator tree you compute is bytecode-based with bytecode Instructions in it, right? That might cause issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that's true (interesting that it still passes all tests* :D). I can of course map the values back to their pcs for the postdominator tree like so: postDomTree.strictlyDominates(cfg.bb(dom).startPC, cfg.bb(allocBB).startPC)
. But a version of the PostDominatorTree that is based on the tac would probably be better.
* They do fail if I change something else
...n/scala/org/opalj/tac/fpcf/analyses/alias/pointsto/TypePointsToBasedAliasAnalysisState.scala
Outdated
Show resolved
Hide resolved
...in/scala/org/opalj/tac/fpcf/analyses/alias/pointsto/AbstractPointsToBasedAliasAnalysis.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/package.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/TypeBasedAliasSet.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/AliasSetLike.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/AllocationSiteBasedAliasSet.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/AbstractAliasAnalysis.scala
Outdated
Show resolved
Hide resolved
OPAL/br/src/main/scala/org/opalj/br/fpcf/properties/alias/AliasEntity.scala
Outdated
Show resolved
Hide resolved
...rc/main/scala/org/opalj/tac/fpcf/analyses/alias/AllocationSiteAndTacBasedAliasAnalysis.scala
Show resolved
Hide resolved
Please fix the merge conflicts listed below. |
…ysis # Conflicts: # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/ArrayAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/FieldAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/NullAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/ParameterAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/ReturnValueAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/StaticFieldAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/fixtures/alias/UVarAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/properties/alias/MayAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/properties/alias/MustAlias.java # DEVELOPING_OPAL/validate/src/test/java/org/opalj/fpcf/properties/alias/NoAlias.java # OPAL/br/src/main/scala/org/opalj/br/fpcf/properties/alias/AliasEntity.scala
…d; fix formatting
...rc/main/scala/org/opalj/tac/fpcf/analyses/alias/AllocationSiteAndTacBasedAliasAnalysis.scala
Outdated
Show resolved
Hide resolved
OPAL/tac/src/main/scala/org/opalj/tac/fpcf/analyses/alias/AliasSetLike.scala
Outdated
Show resolved
Hide resolved
I think I'm mostly happy with this PR now and will approve and merge it soon. Just have a final look into the post-dominator tree issue please. It seems very weird to do dominator and post dominator checks using different underlying representations. It would be much nicer to also use a TAC-based post-dominator tree. |
I have managed to fix my initial approach of adding a postDominatorTree method in the cfg class. The problem was that I directly used the id's of the special exit nodes, which are -1 and -2. I am now using the ids of their predecessors, which seems to work fine: I have also converted the dominatorTree and postDominatorTree methods to However, the same test which made us use a postDominatorTree is still failing. The problem is, that an exception can be thrown inside the loop which creates an edge to the abnormal return node. This again causes the immediate post dominator of the loop header to be the unique exit node of the postDom tree instead of the bb after the loop, breaking our "behind-loop" check. |
Can you have another look at this? It is good work and I'd like to merge it, but I can't as long as there are still failing tests. |
The remaining problem is the handling of abnormal returns inside the loop and I am not sure how to solve it. I would for now change the expected values of the two failing tests to MayAlias. If a more mature loop detection is implemented in the future this could be used here to improve the precision. |
Note: This pull request is based on #178 and should only be merged after that pull request.
This pull request adds two alias analyses for the alias properties added in #171: One intraprocedural alias analysis and one based on the existing points-to analysis.