-
Notifications
You must be signed in to change notification settings - Fork 255
Internal representation
Reko maintains the analyzed binary program in a format called IR. It's useful for developers and users to see this format to gain an understanding of the internals of the analysis. A sample below will serve as a starting point for the definition of the text format of the internal representation.
This identifies the architecture and platform Reko used to load the original binary
target "x86-Win32"
The function factorial
is being defined. It was found at address 00123400
. Nothing is known about its type ("signature")
define factorial
addr "00123400"
{
The 32-bit identifiers ecx
,eax
, and esp
are stored at bit offset 0 in the x86 registers ECX, EAX, and ESP, respectively at bit offset 0
def ecx:word32 reg(ecx 0)
def eax:word32 reg(eax 0)
def esp:word32 reg(esp 0)
The id edx_eax
is a sequence formed by treating the two x86 registers edx and eax as a single unit.
def edx_eax:word64 seq(reg(edx 0) reg(eax 0)
SCZO
is a group of flags stored in the x86 FLAGS register at the bit positions indicated by the bit mask 0xF.
def SCZO:byte flags("flags", 0xF)
fp
is a temporary introduced by Reko that doesn't map to an x86 register. It is used to represent the stack frame of the procedure. The id esp
is initialized to fp
, corresponding to the stack pointer pointing to the stack frame at the start of the procedure.
def fp:ptr32
esp = fp
The first basic block sets the SCZO
flag set with the outcome of subtracting the 32-bit constant 1:word32
from register ecx
(which clearly is an input parameter). The Test
expresses which bits the branch instruction looks at to determine a LE
condition.
l0:
SCZO = cond(ecx - 1:word32)
if Test(SZ,LE) branch l2
An x86 push
instruction is lifted to a stack pointer adjustment and a store.
l1:
esp = esp - 4:word32
Mem0[esp:word32] = ecx
The recursive call to factorial pushes a 4-byte return value on the stack. Here a type indicator for the '4' is not needed because in the RtlCall
instruction, it's always an integer.
call factorial 4
A rewritten x86 pop
instruction restores the old value of ecx
before the call.
ecx = Mem0[esp:word32]
esp = esp + 4:word32
The imul
instruction returns its result in a register pair.
edx_eax = eax *s ecx
SCZO = cond(edx_eax)
goto l3
The base case of the induction sets eax
to 1.
l2:
eax = 1:word32
The return
operation removes 4 bytes from the stack. Consider: make the stack adjustment explicit in the IR?
l3:
return ; 4
}
Later stages of Reko will have modified this code by quite a bit. Here it is after SSA analysis, value propagation, and function signature analysis:
define word32 factorial(ecx:word32 reg(ecx 0))
addr "00123400"
{
def fp:ptr32
esp_1 = fp
l0:
SCZO_2 = cond(ecx - 1:word32)
SZ_3 = SZCO_2
if Test(ecx <= 1:word32) branch l2
l1:
esp_4 = fp - 4:word32
Mem5[fp - 4:word32] = ecx
call factorial 4
use: ecx
def: eax_6 Mem7
ecx_7 = Mem7[fp - 4:word32]
esp_8 = fp
edx_eax_9 = eax_6 *s ecx_7
eax_10 = (word32) edx_eax_9
SCZO_11 = cond(edx_eax_9)
goto l3
l2:
eax_12 = 1:word32
l3:
eax_13 = PHI(eax_10, eax_13)
return eax_13
}