llvm2graphml

llvm2graphml is a tool that helps you explore LLVM Bitcode interactively using a graph database.

Installation

Get the latest binary from here or build llvm2graphml yourself:

git clone https://github.com/ShiftLeftSecurity/llvm2graphml.git --recursive
mkdir build.dir; cd build.dir
cmake ../llvm2graphml
make
make install

Usage

Take this file:

; main.ll
define i32 @increment(i32 %x) {
  %result = add i32 %x, 1
  ret i32 %result
}

Convert it into GraphML:

> llvm2graphml --output-dir=/tmp main.ll
[llvm2graphml] [info] More details: /var/folders/pp/lt3pgm5971n1qw7pp2g_bmfr0000gn/T/llvm2graphml-77ed40.log
[llvm2graphml] [info] Loading main.ll
[llvm2graphml] [info] Saved result into /tmp/llvm.graphml.xml
[llvm2graphml] [info] Shutting down

The file /tmp/llvm.graphml.xml now contains the graph representation of the bitcode.
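
Before loading the file into a graph database, it can be handy to check what it contains. The following is a minimal Python sketch, not part of llvm2graphml, that counts nodes and edges per label. It assumes the file follows the Apache TinkerPop GraphML convention of storing labels under the `labelV` and `labelE` data keys. The inline `SAMPLE` is a hand-written stand-in, not real tool output, and the helper names are illustrative only.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written stand-in for /tmp/llvm.graphml.xml (assumption: the real file
# uses the TinkerPop "labelV"/"labelE" keys and has many more properties).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="1"><data key="labelV">function</data></node>
    <node id="2"><data key="labelV">argument</data></node>
    <node id="3"><data key="labelV">instruction</data></node>
    <node id="4"><data key="labelV">instruction</data></node>
    <edge id="5" source="2" target="1"><data key="labelE">function</data></edge>
  </graph>
</graphml>"""


def strip_namespaces(root):
    """Remove the '{namespace}' prefix from every tag so lookups stay simple."""
    for element in root.iter():
        element.tag = element.tag.rsplit("}", 1)[-1]
    return root


def count_labels(root, element_name, label_key):
    """Count <node> or <edge> elements grouped by their label <data> entry."""
    counts = Counter()
    for element in root.iter(element_name):
        for data in element.findall("data"):
            if data.get("key") == label_key:
                counts[data.text] += 1
    return counts


root = strip_namespaces(ET.fromstring(SAMPLE))
vertex_counts = count_labels(root, "node", "labelV")
edge_counts = count_labels(root, "edge", "labelE")
print(dict(vertex_counts))
print(dict(edge_counts))
```

To try this on real output, replace `ET.fromstring(SAMPLE)` with `ET.parse("/tmp/llvm.graphml.xml").getroot()`.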

Run queries

To follow the examples, install the Gremlin Console from the Apache TinkerPop project.

Run gremlin.sh to start an interactive session, then load /tmp/llvm.graphml.xml into it:

> gremlin-console/bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> 
gremlin> graph = TinkerGraph.open()
gremlin> g = graph.traversal()
gremlin> g.io("/tmp/llvm.graphml.xml").read()

List all modules:

gremlin> g.V().hasLabel('module').valueMap().unfold()
==>moduleIdentifier=[main.ll]

List all functions:

gremlin> g.V().hasLabel('function').valueMap().unfold()
==>argSize=[1]
==>basicBlockCount=[1]
==>name=[increment]
==>isDeclaration=[false]
==>isVarArg=[false]
==>isIntrinsic=[false]
==>numOperands=[0]
==>instructionCount=[2]

Count all the instructions:

gremlin> g.V().hasLabel('instruction').groupCount().by('opcode').unfold()
==>ret=1
==>add=1
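
For readers without a Gremlin setup, the same grouping can be approximated directly on the GraphML file. This is a rough Python sketch under two assumptions: vertex labels live under the `labelV` data key (the TinkerPop convention), and the `opcode` property from the query above appears as a data key of the same name. The `SAMPLE` document is hand-written for illustration and is not real llvm2graphml output.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample (assumption: real output stores properties the same way).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="1">
      <data key="labelV">instruction</data>
      <data key="opcode">add</data>
    </node>
    <node id="2">
      <data key="labelV">instruction</data>
      <data key="opcode">ret</data>
    </node>
    <node id="3">
      <data key="labelV">function</data>
      <data key="name">increment</data>
    </node>
  </graph>
</graphml>"""


def strip_namespaces(root):
    """Remove the '{namespace}' prefix from every tag so lookups stay simple."""
    for element in root.iter():
        element.tag = element.tag.rsplit("}", 1)[-1]
    return root


def properties(element):
    """Collect the <data key="..."> children of a node into a dict."""
    return {data.get("key"): data.text for data in element.findall("data")}


def opcode_counts(root):
    """Mimic groupCount().by('opcode') for vertices labelled 'instruction'."""
    counts = Counter()
    for node in root.iter("node"):
        props = properties(node)
        if props.get("labelV") == "instruction" and "opcode" in props:
            counts[props["opcode"]] += 1
    return counts


counts = opcode_counts(strip_namespaces(ET.fromstring(SAMPLE)))
print(dict(counts))
```

The function vertex is skipped because its label is not `instruction`, which mirrors the `hasLabel('instruction')` step in the Gremlin query.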

Explore the types:

gremlin> g.V().hasLabel('type').valueMap().unfold()
==>typeID=[void]
==>typeID=[label]
==>typeID=[pointer]
==>typeID=[function]
==>typeID=[integer]
==>bitwidth=[32]

Find functions with an argument called x:

gremlin> g.V().has('argument', 'name', 'x').out('function').valueMap('name')
==>[name:[increment]]
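
The query above starts at an `argument` vertex and follows an outgoing edge labelled `function`. The Python sketch below walks the same path over raw GraphML to show what the traversal does. It assumes the TinkerPop `labelV` and `labelE` data keys and a `name` property stored under a data key of the same name. The `SAMPLE` document and the helper `functions_with_argument` are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: one argument vertex pointing at its function vertex.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="1">
      <data key="labelV">function</data>
      <data key="name">increment</data>
    </node>
    <node id="2">
      <data key="labelV">argument</data>
      <data key="name">x</data>
    </node>
    <edge id="3" source="2" target="1">
      <data key="labelE">function</data>
    </edge>
  </graph>
</graphml>"""


def strip_namespaces(root):
    """Remove the '{namespace}' prefix from every tag so lookups stay simple."""
    for element in root.iter():
        element.tag = element.tag.rsplit("}", 1)[-1]
    return root


def properties(element):
    """Collect the <data key="..."> children of a node or edge into a dict."""
    return {data.get("key"): data.text for data in element.findall("data")}


def functions_with_argument(root, arg_name):
    """Follow 'function' edges out of argument vertices named arg_name."""
    nodes = {node.get("id"): properties(node) for node in root.iter("node")}
    names = []
    for edge in root.iter("edge"):
        if properties(edge).get("labelE") != "function":
            continue
        source = nodes.get(edge.get("source"), {})
        target = nodes.get(edge.get("target"), {})
        if source.get("labelV") == "argument" and source.get("name") == arg_name:
            names.append(target.get("name"))
    return names


root = strip_namespaces(ET.fromstring(SAMPLE))
result = functions_with_argument(root, "x")
print(result)
```

Gremlin expresses this in a single line; the sketch is only meant to make the edge direction (argument to function) explicit.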

See more example queries in Queries.md.

Feature (in)completeness

llvm2graphml is at a very early stage and is not feature complete. Properties of instructions and values are not emitted yet, and global variables and constants are missing as well. Additional edges between nodes would probably help too.

But we welcome contributions!

Contributing

Please see CONTRIBUTING.md.

License

Apache License 2.0. See LICENSE for more details.