Skip to content

C++: Speed up the cpp/unbounded-write query for an upcoming change #18485

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 14, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions cpp/ql/lib/semmle/code/cpp/controlflow/IRGuards.qll
Original file line number Diff line number Diff line change
Expand Up @@ -1453,3 +1453,25 @@ private module Cached {
}

private import Cached

/**
* Holds if `left < right + k` evaluates to `isLt` given that some guard
* evaluates to `value`.
*
* To find the specific guard that performs the comparison
* use `IRGuards.comparesLt`.
*/
predicate comparesLt(Operand left, Operand right, int k, boolean isLt, AbstractValue value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want this predicate and the one below, given that they just forward arguments to compares_eq and compares_lt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that the fanout from g to g.comparesLt(...) is quite large on certain repositories, so we want to create a way to compute a version of comparesLt that can be used to prune the set of guards we want to look at early (in the interestingLessThanOrEqual predicate below).

So the important part is that there's no GuardCondition as a column to this predicate (as that is what's giving the large fan-out). We can include the ValueNumber (i.e., the first column from compares_lt) that is currently omitted from this predicate, but since it wasn't needed for the pruning in UnboundedWrite I decided to leave it out to not expose that implementation detail.

compares_lt(_, left, right, k, isLt, value)
}

/**
* Holds if `left = right + k` evaluates to `isLt` given that some guard
* evaluates to `value`.
*
* To find the specific guard that performs the comparison
* use `IRGuards.comparesEq`.
*/
predicate comparesEq(Operand left, Operand right, int k, boolean isLt, AbstractValue value) {
compares_eq(_, left, right, k, isLt, value)
}
31 changes: 31 additions & 0 deletions cpp/ql/src/Security/CWE/CWE-120/UnboundedWrite.ql
Original file line number Diff line number Diff line change
Expand Up @@ -73,11 +73,42 @@ predicate isSink(DataFlow::Node sink, BufferWrite bw, boolean qualifier) {
unboundedWriteSource(sink.asDefiningArgument(), bw, qualifier)
}

/**
* A configuration that specifies flow from a `FlowSource` to a relational
* comparison.
*
* This configuration is used to speed up the barrier computations in Config.
*/
module BarrierConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { isSource(source, _) }

predicate isSink(DataFlow::Node sink) {
comparesEq(sink.asOperand(), _, _, true, _) or
comparesLt(sink.asOperand(), _, _, true, _)
}
}

module BarrierFlow = TaintTracking::Global<BarrierConfig>;

import semmle.code.cpp.ir.dataflow.internal.DataFlowImplCommon as DataFlowImplCommon

/**
* Holds if `left` is a left operand of some relational comparison that may
* depend on user input.
*/
predicate interestingLessThanOrEqual(Operand left) {
exists(DataFlowImplCommon::NodeEx node |
node.asNode().asOperand() = left and
BarrierFlow::Stages::Stage1::sinkNode(node, _)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my understanding/guessing is that stage one does some cheap flow starting at sources to find sink that could be interesting, and we are using that capability to prune down barriers as well.

Is this something that the data flow library could actually do itself? Prune down barriers in this way. Or could it be bad to do more generally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the lessThanOrEqual predicate itself blows up due to the fanout from g in g.comparesLt(left, _, _, true, branch) (or g.comparesEq(left, _, _, true, branch)) on some projects containing a function with many guards that guard equivalent expressions (so that there are many guards for the same left).

As an alternative we could've inlined lessThanOrEqual and isBarrier, and then we'd most likely get to a point where the resulting predicate was restricted by an internal call inside the dataflow library that ensures that the node being barrier'd is part of the path from a source to a sink, but this more robust 🙂

)
}

predicate lessThanOrEqual(IRGuardCondition g, Expr e, boolean branch) {
exists(Operand left |
g.comparesLt(left, _, _, true, branch) or
g.comparesEq(left, _, _, true, branch)
|
interestingLessThanOrEqual(left) and
left.getDef().getUnconvertedResultExpression() = e
)
}
Expand Down
Loading