llvm2graphml is a tool that helps you explore LLVM Bitcode interactively using a graph database.
Get the latest binary from here or build llvm2graphml yourself:
git clone https://github.com/ShiftLeftSecurity/llvm2graphml.git --recursive
mkdir build.dir; cd build.dir
cmake ../llvm2graphml
make
make install
Take this file:
; main.ll
define i32 @increment(i32 %x) {
%result = add i32 %x, 1
ret i32 %result
}
Convert it into GraphML:
> llvm2graphml --output-dir=/tmp main.ll
[llvm2graphml] [info] More details: /var/folders/pp/lt3pgm5971n1qw7pp2g_bmfr0000gn/T/llvm2graphml-77ed40.log
[llvm2graphml] [info] Loading main.ll
[llvm2graphml] [info] Saved result into /tmp/llvm.graphml.xml
[llvm2graphml] [info] Shutting down
The file /tmp/llvm.graphml.xml now contains the graph version of the bitcode.
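Before loading the file into a graph database, it can be handy to check what it contains. The following is a minimal sketch in Python using only the standard library. It assumes the file follows the TinkerPop GraphML convention of storing vertex labels under the `labelV` data key. The `SAMPLE` document and the `count_node_labels` helper are hypothetical and written by hand for illustration; they are not real llvm2graphml output.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical hand-written sample, NOT actual llvm2graphml output.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="labelV" for="node" attr.name="labelV" attr.type="string"/>
  <key id="opcode" for="node" attr.name="opcode" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="0"><data key="labelV">module</data></node>
    <node id="1"><data key="labelV">function</data></node>
    <node id="2"><data key="labelV">instruction</data><data key="opcode">add</data></node>
    <node id="3"><data key="labelV">instruction</data><data key="opcode">ret</data></node>
  </graph>
</graphml>
"""


def count_node_labels(graphml_text):
    """Count nodes per vertex label (assumed to live under the 'labelV' key)."""
    root = ET.fromstring(graphml_text)
    labels = Counter()
    for node in root.iterfind(".//g:node", NS):
        for data in node.iterfind("g:data", NS):
            if data.get("key") == "labelV":
                labels[data.text] += 1
    return labels


if __name__ == "__main__":
    for label, count in sorted(count_node_labels(SAMPLE).items()):
        print(label, count)
```

To try it on a real file, pass the contents of /tmp/llvm.graphml.xml to `count_node_labels` instead of `SAMPLE`.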
To follow the example you need to install Gremlin Console from the Apache TinkerPop project.
Run gremlin.sh to start an interactive session, then load /tmp/llvm.graphml.xml into it:
> gremlin-console/bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin>
gremlin> graph = TinkerGraph.open()
gremlin> g = graph.traversal()
gremlin> g.io("/tmp/llvm.graphml.xml").read()
List all modules:
gremlin> g.V().hasLabel('module').valueMap().unfold()
==>moduleIdentifier=[main.ll]
List all functions:
gremlin> g.V().hasLabel('function').valueMap().unfold()
==>argSize=[1]
==>basicBlockCount=[1]
==>name=[increment]
==>isDeclaration=[false]
==>isVarArg=[false]
==>isIntrinsic=[false]
==>numOperands=[0]
==>instructionCount=[2]
Count all the instructions:
gremlin> g.V().hasLabel('instruction').groupCount().by('opcode').unfold()
==>ret=1
==>add=1
Explore the types:
gremlin> g.V().hasLabel('type').valueMap().unfold()
==>typeID=[void]
==>typeID=[label]
==>typeID=[pointer]
==>typeID=[function]
==>typeID=[integer]
==>bitwidth=[32]
Find functions with an argument called x:
gremlin> g.V().has('argument', 'name', 'x').out('function').valueMap('name')
==>[name:[increment]]
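The traversal above does two things: it selects vertices labeled `argument` whose `name` is `x`, then follows their outgoing `function` edges to the owning function. As a rough illustration of those semantics outside Gremlin, here is a minimal Python sketch over GraphML text. It assumes the TinkerPop convention of `labelV` and `labelE` data keys for vertex and edge labels. The `SAMPLE` graph and the helper names are hypothetical and hand-written, not real llvm2graphml output.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical hand-written sample, NOT actual llvm2graphml output.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="1"><data key="labelV">function</data><data key="name">increment</data></node>
    <node id="2"><data key="labelV">argument</data><data key="name">x</data></node>
    <edge id="3" source="2" target="1"><data key="labelE">function</data></edge>
  </graph>
</graphml>
"""


def props(element):
    """Collect the <data> children of a node or edge into a dict."""
    return {d.get("key"): d.text for d in element.iterfind("g:data", NS)}


def functions_with_argument(graphml_text, arg_name):
    """Mimic has('argument', 'name', arg_name).out('function').values('name')."""
    root = ET.fromstring(graphml_text)
    nodes = {n.get("id"): props(n) for n in root.iterfind(".//g:node", NS)}
    names = []
    for edge in root.iterfind(".//g:edge", NS):
        source = nodes.get(edge.get("source"), {})
        target = nodes.get(edge.get("target"), {})
        if (props(edge).get("labelE") == "function"
                and source.get("labelV") == "argument"
                and source.get("name") == arg_name):
            names.append(target.get("name"))
    return names


if __name__ == "__main__":
    print(functions_with_argument(SAMPLE, "x"))
```

For anything beyond quick checks, the Gremlin Console remains the more convenient way to explore the graph; this sketch only spells out what the one-line query does.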
See Queries.md for more example queries.
llvm2graphml is at a very early stage and is not feature complete. Properties of instructions and values are not emitted yet, and global variables and constants are missing as well. Additional edges between entities would probably help too.
Contributions are welcome!
Please see CONTRIBUTING.md.