Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplification Pass DSA #336

Merged
merged 95 commits into from
Mar 12, 2025
Merged

Simplification Pass DSA #336

merged 95 commits into from
Mar 12, 2025

Conversation

sadrabt
Copy link
Contributor

@sadrabt sadrabt commented Feb 24, 2025

Data Structure Analysis re implemented to use simplification pass functionality.
use flag --dsa none to generate constraints or --dsa norm to perform dsa
(both require --simplify)

The following are changes from the DSA as it exists in main (--analyse flag):

  • Can perform DSA on CNTLM
  • Separates Constraint generation from DSA (into SVA and Const Gen) which allows the constraints to be used in other analysis (future version of DSA or any other memory analysis which may have compatible constraints)
  • Remove dependency on SAA which required interprocedural analysis prior to DSA making the results an underapproximate if calls weren't resolved DSA Stack Inconsistencies #267 and Remove Symbolic Address Analysis #273
  • include and update region start points in nodes which allows analysis to determine which exact regions are represented by cells
  • Global and Stack regions are represented by a single node instead of splitting into various nodes creating fewer merges
  • Invariant checking (checks pointees are maintained through merges, and checks that all memory constraints are maintained through each phase)
  • Cells are node, interval pairs instead of node, offset pairs with variable size
  • Nodes only maintain a seq of non-overlapping cells instead of map from offsets to cells simplifying operations
  • Slices are removed. Instead create dummy cells with intervals which are in the middle of the nodes' actual cell interval. This removes the need to update slices (solves bugs with cloning graphs without slice information ) Update stackMapping to Offset -> Cell #274. This means that nodes and cells could have references (point-to) cells with intervals that aren't updated to reflect join of overlapping cells. However, each node still maintains that its cells are non-overlapping. Use get() method to get the most updated version of a cell's interval (unified with all the interval of all the cells in its node that overlap). Additionally, all pointee operations such as setPointee, removePointee or hasPointee are performed on the actual cell corresponding to the dummy cell's intervals.
  • Resolves some calls
  • Unify graphs of procedures in SCCs

@sadrabt sadrabt marked this pull request as ready for review February 26, 2025 04:09
Copy link
Contributor

@ailrst ailrst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments

@@ -313,7 +313,8 @@ case class SymbolicValues(state: Map[LocalVar, SymValueSet]) {
.collect { case locVar: LocalVar if state.contains(locVar) => state(locVar) }
.foldLeft(SymValueSet.empty)((result, operand) => result.join(operand))
.toTop
case _ =>
case e @ _ =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make sure this is sound, there's a couple cases here making too-strong assumptions. The extract cases might be interesting to narrow with the known bits analysis (e.g. checking the high bits are zero for Extract(32, 0, body)) . We would probably also want to handle the case (R0 + R1 * constant). Maybe I'm missing something but it seems complex to expand this representation to arbitrary value analysis domains that don't represent every element, but I guess thats future work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think assuming that slicing/extracting a value preserves the pointers it represents (although it probably loses some precision). What is missing is that it could also represents other pointers but for the Extract(32, 0, body) case hopefully this is less of a concern since now we have 32 bit value.
For (R0 + R1 * constant), we could only handle it if for example R1 is a constant which will be removed in the copyprop.

StaticAnalysisConfig,
StaticAnalysisContext
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the indirect call tests with the DSA to see if it will solve them? Indirect calls are definitely not a priority since we only have one in cntlm but it would be interesting to know.

.sortBy(_.name)
.foreach(proc =>
val SVAResults = getSymbolicValues(proc)
val constraints = generateConstraints(proc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't put comments on Constraints.scala cause its not in the review but it could use the iterator over the procedure directly i.e. for (c: CFGPosition <- proc) which will be more efficient than computeDomain but it will include non-reachable blocks.

temp
}

def getPointee: IntervalCell = {
Copy link
Contributor

@ailrst ailrst Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invariant that every cell has a pointee cell? I'm not sure why if pointee is empty it creates a pointee

Copy link
Contributor

@ailrst ailrst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge when you are happy with this

@sadrabt sadrabt merged commit 800dd6f into main Mar 12, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DSA Stack Inconsistencies
2 participants