Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Java] - Limiting Flows Based on Patterns #18050

Open
KylerKatzUH opened this issue Nov 20, 2024 · 5 comments
Open

[Java] - Limiting Flows Based on Patterns #18050

KylerKatzUH opened this issue Nov 20, 2024 · 5 comments
Labels
question Further information is requested

Comments

@KylerKatzUH
Copy link

Hello, I am trying to restrict flows to only include those that have a source flow that is used as a query parameter.

For example, say authToken is a source,

String urlString = "http://auth.companyportal.com/auth?userId=" + userId + "&token=" + authToken;
URL url = new URL(urlString);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");

However, my current query is picking up false positives where the source isn't used as a query parameter but somehow reaches the sink. Such as a dummy example like this

URL url = new URL(authToken);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");

To address this I added a isValidQueryParamFlow predicate to my query that matches based on ".*\\?.*=.*" however, this causes all of the expected detections to be removed. Even if I remove the regex, or relax the restrictions there still aren't any results. I know the rest of the query is operating as it should since I am getting the expected results without this check. So, I believe it is an issue with how I am performing this filtering.

Here is my full query

 import java
 import semmle.code.java.dataflow.DataFlow
 import semmle.code.java.dataflow.TaintTracking
 import SensitiveInfo.SensitiveInfo
 import Barrier.Barrier
 
 module Flow = TaintTracking::Global<SensitiveInfoToUrlConfig>;
 import Flow::PathGraph
 
 /** A configuration for finding flows from sensitive information sources to URL constructions. */
 module SensitiveInfoToUrlConfig implements DataFlow::ConfigSig {
 
   predicate isSource(DataFlow::Node source) {
     exists(SensitiveVariableExpr sve |  
      source.asExpr() = sve and 
      not sve.toString().toLowerCase().matches("%url%"))
    }

   predicate isSink(DataFlow::Node sink) {
    // Direct use of URL with openConnection followed by setRequestMethod("GET")
    exists(ConstructorCall urlConstructor, MethodCall openConnectionCall, MethodCall setRequestMethod |
      urlConstructor.getConstructedType().hasQualifiedName("java.net", "URL") and
      urlConstructor.getAnArgument() = sink.asExpr() and
      openConnectionCall.getMethod().hasName("openConnection") and
      openConnectionCall.getMethod().getDeclaringType().hasQualifiedName("java.net", "URL") and
      DataFlow::localExprFlow(urlConstructor, openConnectionCall.getQualifier()) and
      setRequestMethod.getMethod().hasName("setRequestMethod") and
      ((StringLiteral)setRequestMethod.getArgument(0)).getValue() = "GET" and
      DataFlow::localExprFlow(openConnectionCall, setRequestMethod.getQualifier())
    )
  }
 
   predicate isBarrier(DataFlow::Node node) {
    Barrier::barrier(node)
   }
 }

 predicate isValidQueryParamFlow(Flow::PathNode source, Flow::PathNode sink) {
  exists(BinaryExpr be |
      be.getOp() = "+" and
      be.getLeftOperand().toString().matches(".*\\?.*=.*") and // Ensure there is a `=` after `?`
      source.getNode().asExpr() = be.getRightOperand() and
      sink.getNode().asExpr() = be
  )
}
 
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink) and
 isValidQueryParamFlow(source, sink)
 select sink.getNode(), source, sink, "Sensitive information used in a URL constructed for a GET request." 
 

Any help is appreciated, thank you,

@KylerKatzUH KylerKatzUH added the question Further information is requested label Nov 20, 2024
@aibaars
Copy link
Contributor

aibaars commented Nov 21, 2024

The predicate isValidQueryParamFlow(Flow::PathNode source, Flow::PathNode sink) requires the source to be an operand of a + expression and the sink to be the result of the + expression. In addition isSource requires the source to be an argument of new URL(..) . That seems overly restrictive. I suppose the query could still find cases like

URL url = new URL("http://auth.companyportal.com/auth?userId=" + userId + "&token=" + authToken);
...

Note: in general it is best to avoid toString() in the logic of queries. That predicate is meant for displaying a short string to a human, the strings are often abbreviations.

@KylerKatzUH
Copy link
Author

Hello @aibaars,

Thank you for your response. I agree it is restrictive, and that there are many more ways that that a URL can be constructed with query params. I am using this as a proof of concept and eventually plan to extend the isValidQueryParamFlow predicate to include more common formats.

For right now, I am confused as to why this query isn't picking up on the simple example that I am testing. Is it correct to do this in the where clause like I have? Or should it be done differently? For example, isBarrier is used to restrict flows that go through something such as an encryption function. With this, I am looking to almost do the opposite and restrict it to only flows that match a certain pattern.

I tried switching my query to use TaintTracking::Configuration so that I can override isAdditionalTaintStep with this, I get the expected result detected, however, the pathgraph fails to be displayed correctly.

Giving me

Showing raw results instead of interpreted ones due to an error. A fatal error occurred: Could not process query metadata.
Error was: Expected result pattern(s) are not present for path-problem query: Expected at least two result patterns. These should include at least an 'edges' result set (see https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/). [INVALID_RESULT_PATTERNS]

I haven't ever been able to get tainttracking to work without this error which is why I use the DataFlow::ConfigSig instead for all of my dataflow queries. However, that won't let me override anything.

Here is my query using this method for reference

/**
 * @name CWE-598: Use of GET request method with sensitive query strings
 * @description Detects sensitive information being sent in query strings over GET requests, which could be exposed in server logs or browser history.
 * @kind path-problem
 * @problem.severity warning
 * @id java/sensitive-get-query2/598
 * @tags security
 *       external/cwe/cwe-598
 * @cwe CWE-598
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 
 class SensitiveDataToQueryParam extends TaintTracking::Configuration {
   SensitiveDataToQueryParam() { this = "SensitiveDataToQueryParam" }
 
   override predicate isSource(DataFlow::Node source) {
     exists(VarAccess v |
       source.asExpr() = v and
       v.toString() = "cardNumber"
     )
   }
 
   override predicate isSink(DataFlow::Node sink) {
     exists(ConstructorCall urlConstructor, MethodCall openConnectionCall, MethodCall setRequestMethod |
       urlConstructor.getConstructedType().hasQualifiedName("java.net", "URL") and
       urlConstructor.getArgument(0) = sink.asExpr() and
       openConnectionCall.getMethod().hasName("openConnection") and
       DataFlow::localExprFlow(urlConstructor, openConnectionCall.getQualifier()) and
       setRequestMethod.getMethod().hasName("setRequestMethod") and
       setRequestMethod.getArgument(0) instanceof StringLiteral and
       ((StringLiteral)setRequestMethod.getArgument(0)).getValue() = "GET" and
       DataFlow::localExprFlow(openConnectionCall, setRequestMethod.getQualifier())
     )
   }
  
   override predicate isAdditionalTaintStep(DataFlow::Node src, DataFlow::Node dest) {
     exists(BinaryExpr binOp |
       binOp.getOp() = "+" and
       binOp = dest.asExpr() and
       (
         (binOp.getLeftOperand() = src.asExpr() and isQueryParamString(binOp.getRightOperand())) or
         (binOp.getRightOperand() = src.asExpr() and isQueryParamString(binOp.getLeftOperand()))
       )
     )
   }
 
   predicate isQueryParamString(Expr expr) {
     expr instanceof StringLiteral and
     ((StringLiteral)expr).getValue().matches("%=%")
   }
 }
 
 from SensitiveDataToQueryParam t, DataFlow::PathNode sourceNode, DataFlow::PathNode sinkNode
 where t.hasFlowPath(sourceNode, sinkNode)
 select sinkNode.getNode(),
   sourceNode.getNode(),
   sinkNode.getNode(),
   sourceNode,
   sinkNode,
   "Sensitive information used in a URL constructed for a GET request."
 

@aibaars
Copy link
Contributor

aibaars commented Nov 22, 2024

Giving me

Showing raw results instead of interpreted ones due to an error. A fatal error occurred: Could not process query metadata.
Error was: Expected result pattern(s) are not present for path-problem query: Expected at least two result patterns. These should include at least an 'edges' result set (see https://codeql.github.com/docs/writing-codeql-queries/creating-path-queries/). [INVALID_RESULT_PATTERNS]

That error means the path graph is missing . I think you need to add

import TaintedPathFlow::PathGraph

@KylerKatzUH
Copy link
Author

Hi @aibaars

I tried

import TaintedPathFlow::PathGraph

However, it says that it's not defined, I am using version 2.17 so I am not sure if it was recently added. However, I can't upgrade due to legacy queries.

With that said, I almost have it working with DataFlow::ConfigSig. After some testing, I figured out that the regex wasn't working correctly, so I have now fixed it. The issue I am facing now it getting it to consider intermediate nodes in the flow.

Using isValidQueryParamFlow(source, sink) Means that it will only consider direct concatenations in the sink such as

URL url = new URL("http://api.companynetwork.com/private/data?apikey=" + authToken);

This means that indirect uses like this aren't detected

String url = "http://api.companynetwork.com/private/data?apikey=" + authToken;
URL url = new URL(url);

I tried recursively traversing back up the flow path using the getASuccessor() predicate. However, this ends up causing false positives.

Recursive attempt.


predicate isQueryParamNode(Flow::PathNode node) {
  exists(AddExpr concatOp, Expr param |
    (param = concatOp.getLeftOperand() or param = concatOp.getRightOperand()) and
    (param.toString().indexOf("?") >= 0 or param.toString().indexOf("&") >= 0) and
    param.toString().indexOf("=") > 0 and
    node.getNode().asExpr() = concatOp
  )
}

/**
 * Walks backward from sink to source, validating if any node matches a query parameter concatenation.
 */
predicate isValidQueryParamFlow(Flow::PathNode source, Flow::PathNode sink) {
  // Start at the sink and walk backward
  exists(Flow::PathNode current |
    // Initialize the traversal at the sink
    sink = current or
    // Traverse backward through successors
    current.getASuccessor() = sink and
    (
      // Check if the current node matches a query parameter
      isQueryParamNode(current) or
      // Continue traversing backward until the source is reached
      isValidQueryParamFlow(source, current)
    )
  )
}

Full query without recursion

import java
 import semmle.code.java.dataflow.DataFlow
 import semmle.code.java.dataflow.TaintTracking
 import SensitiveInfo.SensitiveInfo
 import Barrier.Barrier
 
 module Flow = TaintTracking::Global<SensitiveInfoToUrlConfig>;
 import Flow::PathGraph
 
 /** A configuration for finding flows from sensitive information sources to URL constructions. */
 module SensitiveInfoToUrlConfig implements DataFlow::ConfigSig {
 
   predicate isSource(DataFlow::Node source) {
     exists(SensitiveVariableExpr sve |  
      source.asExpr() = sve and 
      not sve.toString().toLowerCase().matches("%url%"))
    }

   predicate isSink(DataFlow::Node sink) {
    // Direct use of URL with openConnection followed by setRequestMethod("GET")
    exists(ConstructorCall urlConstructor, MethodCall openConnectionCall, MethodCall setRequestMethod |
      urlConstructor.getConstructedType().hasQualifiedName("java.net", "URL") and
      urlConstructor.getAnArgument() = sink.asExpr() and
      openConnectionCall.getMethod().hasName("openConnection") and
      openConnectionCall.getMethod().getDeclaringType().hasQualifiedName("java.net", "URL") and
      DataFlow::localExprFlow(urlConstructor, openConnectionCall.getQualifier()) and
      setRequestMethod.getMethod().hasName("setRequestMethod") and
      ((StringLiteral)setRequestMethod.getArgument(0)).getValue() = "GET" and
      DataFlow::localExprFlow(openConnectionCall, setRequestMethod.getQualifier())
    )
  }
 
   predicate isBarrier(DataFlow::Node node) {
    Barrier::barrier(node)
   }
 }
predicate isValidQueryParamFlow(Flow::PathNode source, Flow::PathNode sink) {
  exists(AddExpr concatOp, Expr param |
    // Check if param is part of the concatenation
    (param = concatOp.getLeftOperand() or param = concatOp.getRightOperand()) and
    // Ensure the string contains `?` or `&`, and also `=`
    (param.toString().indexOf("?") >= 0 or param.toString().indexOf("&") >= 0) and
    param.toString().indexOf("=") > 0 and
    // Map source and sink to the concatenation components
    source.getNode().asExpr() = concatOp.getRightOperand() and
    sink.getNode().asExpr() = concatOp
  )
}

 
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink) and
 isValidQueryParamFlow(source, sink)
 select sink.getNode(), source, sink, "Sensitive information used in a URL constructed for a GET request." 

Thank you, for any help

@aibaars
Copy link
Contributor

aibaars commented Nov 23, 2024

I tried

import TaintedPathFlow::PathGraph

However, it says that it's not defined, I am using version 2.17 so I am not sure if it was recently added. However, I can't upgrade due to legacy queries.

No, I think I just remembered it wrong. In one of the versions of the query tt looks like you were missing

 module Flow = TaintTracking::Global<SensitiveInfoToUrlConfig>;
 import Flow::PathGraph

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants