Sensibility and Leaks analysis

Phase name: jtp.SensibleData

Evaluates whether a sensible value, marked the by the following method call:

Object newSensibleObject;
analysis.SensibilityMarker.markAsSensible(newSensibleObject);

is leaked by some offending method. There's also a way of cleaning the sensitiveness of an object, by calling it upon this method:

Object newSensibleObject;
analysis.SensibilityMarker.sanitize(newSensibleObject);

Both method are detected just by being declared as this example, as long as they respect the package, method and class names.

The sensibility values belong to the lattice described below:

BOTTOM -> NOT_SENSIBLE ---> MAYBE_SENSIBLE (TOP)
       \                /
        `-> HIGH ------´

This is implemented in the class SensibilityLattice. The main operation is the supremeBetween which is the equivalent to the mathematical lattice operator.

When a local variable marked with a sensibility level HIGH reaches an offending method (just checking for java.io.PrintStream#println, but easily extensible), the analysis marks that statement as a leak-producer. This means that this statement may expose a sensible value to a user-facing interface (log, screen, request response , etc.).

The analysis has the following features analyzing it in the data-flow framework:

Inter-procedural: See this section for more details.
Handles just local assignment operations (does not support fields or array references).
Handles polymorphic method invocations: See this section for more details.

Build instructions

Clone the repository.
Clone the points-to analysis repo.
Run mvn clean install in the points-to analysis repo. This is a necessary dependency for the sensible-data-analysis.
Run mvn clean install in the root of this repo. This will also run the whole test suite.
For details on calling the analysis from Soot, check the root readme.

Inter-procedural implementation details

Overview

The inter-procedural implementation of this analysis is both naive and technical. First, let's characterize the features implemented when handling method invocations:

When the called method is defined in the same package being analyzed (in the user-code per-se, and not a third party or JVM library), the method called is fully analyzed running this same procedure in the invoked method . This is done passing a context, which contains the sensibility level of the called method's arguments.
Recursion not explicitly handled. A method that modifies some value every time called recursively might make this analysis produce an StackOverflowException. This is due to a huge number of nested calls, hence running this analysis once for every one of them.
Polymorphic calls handled.
Non-user methods (those described in the first point) are handled with a simplified invocation model

Simplification model

When a non-user method is called, one of two things can be done:

Treat them as method calls, which implies analyzing the called method (which belongs to third-party libraries, JVM libraries of native code). This might look like an easy idea at first, but there are some challenges:
- The call-chains can be really long. For example calling the good-old System.out.prinln leads to a long chain of PrintStream indirections which use low-level JVM features.
- Because the called method might use those low-level JVM features (goto statements for example, or some bit-based operator), their handlers have to be implemented in the analysis. This makes the model more and more complex as new methods need to be handled.
Simplify invocation of non-user methods by a naive model which captures the effect that those calls might have in the analysis we are performing.

While developing this analysis, I tried implementing the former option, but quickly run into problems like StackOverflowError due to long chained-calls, or some other feature of the language not being handled correctly or omitted by my analysis (array references for example).

This led towards choosing the latter, and simplifying the invocation semantics of non-user methods. These can be described with the following rules:

If one of the arguments is labeled as sensible, and the method returns a non-void type, the returned value is sensible. This captures calls to methods like new StringBuilder().append(aSensibleValue).
If the invocation is an instance invocation, and the receiver of the method call is a sensible value, the returned value is sensible. This captures calls like someSensibleString.concat.

Also, non-user called methods are that are not considered offending methods are assumed not never produce a leak as side effect. This is a very naive approach, but since offending methods are checked way before this simplification is reached, this can be assumed.

This rules are implemented here.

Polymorphic calls handling

When an invoked method corresponds to an interface method, some extra information is needed to decide which implementation is being called. Without additional settings (Soot's whole world mode), Soot is not able to provide the necessary to handle this situations.

To cope with this, before running the main analysis a points-to analysis is performed to have information of which Object might pointed from each local variable in the analyzed program. For details, check this repo.

There might be some cases when the base object of the invocation points to several object (due to the lack of context sensitivity of the points-to used, for example). In those cases, every called is analyzed in a similar way, and the results are merged making this decision in a MAY-fashion. This might generate false-positive, but helps in keeping the overall analysis sound.

This is implemented in here

Future work

All over the analysis code there are TODO's statements suggesting future improvements for this project. Overall, they can be summarized with the following:

Add more language features support (field-sensitivity, arrays, etc.).
Support non-user method invocations.
Review context-sensitivity.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Sensibility and Leaks analysis

Build instructions

Inter-procedural implementation details

Overview

Simplification model

Polymorphic calls handling

Future work

Files

README.md

Latest commit

History

README.md

File metadata and controls

Sensibility and Leaks analysis

Build instructions

Inter-procedural implementation details

Overview

Simplification model

Polymorphic calls handling

Future work