This repository has been archived by the owner on Jun 28, 2022. It is now read-only.

[QUESTION] How does granary core work from a high-level perspective? #23

Open
renzhengeek opened this issue Mar 28, 2014 · 4 comments

@renzhengeek
Contributor

Hi all,
With pgoodman's help 👍, I now understand wrappers and watchpoints much better.

But I still know few details about the Granary core.
So I started from Granary's entry point to follow how Granary proceeds.

In init_granary (https://github.com/Granary/granary/blob/master/module.c#L601):

How Granary works after this is even harder for me to follow (i.e. how it controls and instruments the target module's execution). For example:

  1. Do we split module code into basic blocks (bb) based on direct CTIs only, or on both indirect and direct CTIs?
  2. What is stored in the code cache?
  3. When/where do we actually instrument the module: only in visit_app/host_instructions?
  4. What does the last parameter, ls, of visit_app_instructions actually refer to?
    Does it refer to the instructions of a basic block in the code cache?
granary::instrumentation_policy null_policy::visit_app_instructions(
        granary::cpu_state_handle,
        granary::basic_block_state &,
        granary::instruction_list &
    ) throw() {
        return granary::policy_for<null_policy>();
    }

Thank you very much ;)

@pgoodman
Member

This is a good question and the answer will not be obvious from looking at the code. I will address each question in different replies.

granary/gen/kernel_init.s exists to invoke the so-called "static initializers". If you want to see examples of static initializers, then search the code for uses of the STATIC_INITIALISE or STATIC_INITIALISE_ID macros.

Granary depends on a number of global data structures. The code cache index (https://github.com/Granary/granary/blob/master/granary/code_cache.cc#L53) is one such data structure, and the memory available for the code cache code (https://github.com/Granary/granary/blob/master/granary/allocator.cc#L86) is another one. Space is reserved for these global data structures in the executable, and that space is allocated at load time by the kernel module loader.

The static initializers solve two problems:

  1. Some of Granary's statically allocated data structures have non-trivial constructors/destructors. For example, the code cache index, when "destroyed", deallocates all of the memory backing its internal hash table. Similarly, the fields of the code cache index data structure must be correctly initialized (constructed) before it can be used.

    In user space, statically allocating a data structure with a non-trivial constructor is easy: you just write a global variable for it, and the runtime will just "magically" invoke the constructor for that object before your program ever gets to main. The Linux kernel, however, does not automatically invoke constructors. To get around this issue, Granary code uses the STATIC_INITIALISE macros that basically define functions with specific names, and these functions are responsible for initializing all of those data structures (or doing other miscellaneous initializations). There is a script that goes and finds all of these functions and then generates an assembly routine to call each function.

    When Granary is initialized (by doing sudo touch /dev/granary), the generated function is called, and then all of the initialization routines are put into various initialization lists. These lists are iterated and their callback initialization functions are invoked at different stages of Granary's init process.

  2. The other problem the STATIC_INITIALISE macros and the static_data template solve is user space object destruction. If you weren't already aware, Granary can be used for user space instrumentation, although it's somewhat ad-hoc and not always well-defined.

    The way Granary instruments user space code is that a granary.so shared object file is dynamically loaded into a process's address space using the LD_PRELOAD environment variable. If global objects were statically allocated, and if they had non-trivial destructors, then a funny thing could happen: Granary could instrument the destructors / destruction of its own internal objects! The static_data template avoids this issue by making all non-trivial global data structures "opaque" sequences of bytes (with respect to object initialization and destruction), so that Granary does not encounter any re-entrancy issues like destroying the code cache index, while simultaneously querying the address of a basic block that is part of the destroying procedure :-P
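To make the two ideas above concrete, here is a minimal sketch of an "opaque bytes" holder plus an explicit initializer list. It is not Granary's code: `static_data_sketch`, `code_cache_index_sketch`, `init_index`, and `run_static_initialisers` are hypothetical names, and the hand-written array stands in for the routine that the script generates.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical stand-in for the static_data idea: raw, suitably aligned bytes.
// The type is trivial, so no constructor or destructor ever runs implicitly.
template <typename T>
struct static_data_sketch {
    alignas(T) unsigned char bytes[sizeof(T)];
    T *object;  // null until construct(); static storage is zero-initialized

    template <typename... Args>
    void construct(Args &&... args) {
        // Placement new: build the object inside the reserved bytes.
        object = new (bytes) T(std::forward<Args>(args)...);
    }

    void destroy() {
        // Destruction only happens when explicitly requested.
        if (object) {
            object->~T();
            object = nullptr;
        }
    }
};

// Toy data structure with a non-trivial constructor and destructor.
struct code_cache_index_sketch {
    std::vector<int> table;
    explicit code_cache_index_sketch(std::size_t slots) : table(slots, 0) {}
};

// Space is reserved in the image; nothing is constructed at load time.
static static_data_sketch<code_cache_index_sketch> INDEX;

// One initializer function per global (assumed naming).
static void init_index() { INDEX.construct(16); }

typedef void (*init_func)();
static const init_func INITIALISERS[] = { init_index };

// Plays the role of the generated routine: call every initializer in turn.
void run_static_initialisers() {
    for (init_func f : INITIALISERS) {
        f();
    }
}
```

Calling `run_static_initialisers()` once during startup constructs `INDEX`; until then, and after `INDEX.destroy()`, the storage is just bytes that no runtime will touch on its own.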

@pgoodman
Member

Granary sets the text sections of modules to non-executable so that if the kernel invokes a module directly (and not through a dynamic wrapper), then the hardware will raise a page fault. Granary catches these faults so that it can gain control of the module's execution to ensure that all of the module's code is instrumented. Here's the code that does that:

  1. Granary takes over the interrupt descriptor table (IDT) here: https://github.com/Granary/granary/blob/master/granary/kernel/state.cc#L143. Shadow IDTs are created by this function: https://github.com/Granary/granary/blob/master/granary/kernel/interrupt.cc#L1019
  2. Granary makes sure module text sections are non-executable here: https://github.com/Granary/granary/blob/master/granary/kernel/linux/module.cc#L106 which leads to https://github.com/Granary/granary/blob/master/module.c#L304
  3. If the kernel manages to get a hold of a native module code address (and not the address of a dynamic wrapper for the module's code), then an attempt to execute the code will raise a fault. The fault will be caught by Granary's generic interrupt handler here: https://github.com/Granary/granary/blob/master/granary/kernel/interrupt.cc#L564
  4. The generic interrupt handler will detect that this is a page fault in module code here: https://github.com/Granary/granary/blob/master/granary/kernel/interrupt.cc#L609
  5. Granary handles that case by looking up the native module address in the code cache here: https://github.com/Granary/granary/blob/master/granary/kernel/interrupt.cc#L529
  6. We then replace the return address of the interrupt stack frame, and IRET to return from the interrupt, but in the instrumented module code: https://github.com/Granary/granary/blob/master/granary/kernel/interrupt.cc#L559
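Steps 4 to 6 can be sketched as a toy model that uses plain integers instead of real addresses. Everything here is an assumption made for illustration: `interrupt_frame_sketch`, `module_range_sketch`, `code_cache_sketch`, and `redirect_module_fault` are hypothetical names, and the sketch simply gives up on a code cache miss rather than modelling what Granary does in that case.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical, reduced view of the frame pushed on an interrupt.
struct interrupt_frame_sketch {
    std::uint64_t return_address;  // where IRET will resume execution
};

// Half-open address range [begin, end) of a module's text section.
struct module_range_sketch {
    std::uint64_t begin;
    std::uint64_t end;
};

// Maps a native module address to its instrumented copy.
using code_cache_sketch = std::unordered_map<std::uint64_t, std::uint64_t>;

// Returns true if the fault was redirected into instrumented code.
bool redirect_module_fault(interrupt_frame_sketch &frame,
                           const module_range_sketch &text,
                           const code_cache_sketch &cache) {
    const std::uint64_t pc = frame.return_address;

    // Step 4: is this a fault on native module code?
    if (pc < text.begin || pc >= text.end) {
        return false;
    }

    // Step 5: look up the native address in the code cache.
    const auto it = cache.find(pc);
    if (it == cache.end()) {
        return false;  // the sketch does not model translation on a miss
    }

    // Step 6: rewrite the return address so IRET lands in instrumented code.
    frame.return_address = it->second;
    return true;
}
```

The key point is that the handler never "calls" the instrumented code itself. It only edits the saved return address and lets the normal interrupt return do the transfer.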

@renzhengeek
Contributor Author

Clear explanations. Thank you ;)

@pgoodman
Member

Granary bootstraps by replacing the init function pointer in the kernel's struct module with an instrumented version of the module's init function. Depending on how Granary is configured, this can be an instrumented version of the first basic block, or of a trace of basic blocks. This happens here: https://github.com/Granary/granary/blob/master/granary/kernel/linux/module.cc#L83
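As a rough illustration of that pointer swap, here is a toy version in user-space C++. `module_sketch` is a hypothetical, heavily reduced stand-in for the kernel's struct module, and `instrumented_version_of` and `on_module_load` are made-up names; in real Granary the replacement would point into the code cache rather than at a second hand-written function.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for the kernel's struct module.
struct module_sketch {
    int (*init)();
};

using init_fn = int (*)();

static int native_init_calls = 0;
static int instrumented_init_calls = 0;

// The module's original (native) init function.
static int native_init() {
    ++native_init_calls;
    return 0;
}

// Stands in for the instrumented copy of the init function.
static int instrumented_init() {
    ++instrumented_init_calls;
    return 0;
}

// Hypothetical lookup: map a native entry point to its instrumented version.
static init_fn instrumented_version_of(init_fn native) {
    return native == native_init ? instrumented_init : native;
}

// Plays the role of the module-load notifier: swap the init pointer before
// the kernel gets a chance to call it.
void on_module_load(module_sketch &mod) {
    if (mod.init) {
        mod.init = instrumented_version_of(mod.init);
    }
}
```

After `on_module_load(mod)` runs, any caller that goes through `mod.init` ends up in the instrumented function without knowing that anything changed.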

After Granary's notifier is called, the kernel will try to initialize the just-loaded module by invoking its init function (via the function pointer that Granary replaced). The kernel invokes this function pointer, and control transfers into Granary's code cache. Eventually the module will do one of a few things:

  1. It will do some sort of control-flow instruction (e.g. an if-then-else, function call, indirect call/jump) that is meant to transfer execution to another part of the module.
  2. It will invoke a kernel function to do something like allocate memory or register some data structures (e.g. device driver information, operations structures, etc.) with the kernel.
  3. It will eventually return back to the kernel.

Case (3) is the simplest. In Granary, an instrumented ret (return instruction) is just a ret. Nothing changes.

Case (2) involves wrappers, and has previously been discussed. If Granary sees a control-flow instruction (CFI) transferring to kernel code, then Granary will either leave the CFI as is, or convert it to jump to a wrapped version of the function being called. This is so that Granary's type wrappers can search for module function pointers, and replace them with dynamic wrappers that transparently give Granary control over the module's execution, without incurring the cost of a page fault.

Case (1) is the trickiest case of all because it goes into just how Granary maintains control. For the sake of space, I won't describe how Granary maintains control of indirect CFIs (e.g. call * and jmp *), but it does. If you want some extra details then I've got a sort-of paper with these details. I suggest emailing me (<first name><dot><last name><at><gmail><dot><com>) and I'll send you a copy.

The way that Granary maintains control of direct CFIs is via "edge code". If native code contains a jmp foo, then instrumented code will contain an equivalent jmp edge_foo. Think of edge code as a little stub of code that gives Granary control, and then asks it: "what is the address of the instrumented version of foo?" After potentially translating/instrumenting foo, the Granary code invoked by edge_foo goes and "hot" patches the jmp edge_foo instruction in place, modifying it into jmp instrumented_foo.

  1. Mangling of a direct CFI: https://github.com/Granary/granary/blob/master/granary/mangle.cc#L187
  2. Replace the CFI with an equivalent CFI to edge code: https://github.com/Granary/granary/blob/master/granary/mangle.cc#L287
  3. Create the edge code that is specific to the CFI's target (foo): https://github.com/Granary/granary/blob/master/granary/dbl.cc#L246
  4. This first allocates a patch data structure that contains information about the patch that Granary needs to make: https://github.com/Granary/granary/blob/master/granary/dbl.cc#L23
  5. A key feature of the patch data structure is that it persists the instruction IR of the CFI, in the form of the in_to_patch field. This persisted instruction IR will know where the CFI was encoded in the code cache, so that Granary will then know what bytes to patch.
  6. When the edge code is executed, control will eventually transfer to this entrypoint: https://github.com/Granary/granary/blob/master/granary/dbl.cc#L62, which is responsible for making the instruction patch.
  7. This code will hot patch the instruction: https://github.com/Granary/granary/blob/master/granary/dbl.cc#L162 and return back into the edge code.
  8. Control will eventually return back into some "gencode" that the edge code indirect calls, and that gencode will divert execution to the target basic block via a mechanism that I don't quite remember anymore.
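The edge-code flow above can be modelled with a small simulation in which a "jump" is just a struct holding its target address. This is a sketch under stated assumptions, not Granary's implementation: `jump_sketch`, `patch_sketch`, `translator_sketch`, and `resolve_edge` are hypothetical, the addresses are arbitrary integers, and only the `in_to_patch` field name comes from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using address = std::uint64_t;

// One direct jump encoded in the code cache. `target` is the operand that
// gets hot patched.
struct jump_sketch {
    address target;
};

// Hypothetical analogue of the patch record: which jump to rewrite, and
// which native address that jump originally named.
struct patch_sketch {
    jump_sketch *in_to_patch;
    address native_target;
};

struct translator_sketch {
    std::unordered_map<address, address> cache;  // native -> instrumented
    address next_free = 0x9000;                  // arbitrary toy address
    int translations = 0;

    // Return the instrumented address for `native`, translating on first use.
    address instrumented(address native) {
        const auto it = cache.find(native);
        if (it != cache.end()) {
            return it->second;
        }
        ++translations;
        const address fresh = next_free;
        next_free += 0x100;
        cache[native] = fresh;
        return fresh;
    }

    // What the edge stub triggers the first time it runs: find (or create)
    // the instrumented target, then rewrite the jump in place.
    address resolve_edge(patch_sketch &patch) {
        const address dest = instrumented(patch.native_target);
        patch.in_to_patch->target = dest;  // the "hot patch"
        return dest;
    }
};
```

Once `resolve_edge` has run, the jump points straight at the instrumented block, so later executions of that jump never pass through the edge stub again. That is why the cost of entering Granary is paid only once per direct CFI.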
