
Unilateral unlinking of edge's TB causes infinite loop #52

Closed
cube0x8 opened this issue Mar 13, 2024 · 8 comments
@cube0x8
Contributor

cube0x8 commented Mar 13, 2024

Context: my target allocates RWX pages, which are populated with jit code and executed.

I'm experiencing infinite loops when using the QemuCoverageEdgeHelper. The infinite loop occurs because the edges generated by libafl_gen_edge have the same pc as the last TB executed, and because an edge's TB is unlinked from its successor in the chain when QEMU's signal handler is triggered.

I report below, step by step, a possible scenario that would trigger the infinite loop:

  1. We are in cpu_exec_loop and about to execute a TB, named X from now on
  2. During its execution, X encounters a jne to another TB, named Y
  3. Code for Y is generated, and TBs X and Y are linked through an edge:
    X -> edge -> Y
    the important detail here is that the edge's TB has the same pc as X
  4. Execution proceeds through several TBs until when another TB, named Z from now on, is executed
  5. Z triggers the signal handler and, coincidentally, one of the TBs in the page that is going to be invalidated is Y (in tb_invalidate_phys_page_unwind):

    PAGE_FOR_EACH_TB(addr, last, unused, tb, n) {
        if (current_tb == tb &&
            (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
            /*
             * If we are modifying the current TB, we must stop its
             * execution. We could be more precise by checking that
             * the modification is after the current PC, but it would
             * require a specialized function to partially restore
             * the CPU state.
             */
            current_tb_modified = true;
            cpu_restore_state_from_tb(current_cpu, current_tb, pc);
        }
        tb_phys_invalidate__locked(tb); // TB IS Y
    }
  6. By looking at point 3, we know that Y is referenced by edge, so when tb_jmp_unlink is executed, tb_reset_jump is called on the edge's TB. Now the edge's TB and Y are not linked anymore:
    X -> edge -x-> Y
  7. Execution proceeds...
  8. X is executed again; its basic blocks are executed until the edge's TB (remember, X -> edge), but the edge is unlinked now, so after the edge is executed, it exits instead of jumping to Y
  9. We are now back in cpu_exec_loop, edge is now last_tb, and the pc returned by cpu_get_tb_cpu_state is the same as X's, so tb_lookup will return X and X will be executed again, jumping back to the previous step and resulting in an infinite loop between these last two steps

I see two possible problems here:

  1. We are in an RWX context, where a page is accessed to be written, so potentially the JIT code in that page is going to be replaced by other code. In our example above, the page being rewritten contains the TB named Y and, as we have to invalidate the edge X -> Y, we should also unlink X from the edge to have the chain completely invalidated.
  2. An edge shouldn't be initialized with the pc of the previous TB (so, with X's pc)

I think a possible solution would be:
when the signal handler is triggered and we are unlinking an edge -> TB2 relationship, we should also unlink the other side of the chain: TB1 -> edge.

I'm open to other suggestions.

@cube0x8 cube0x8 changed the title Dummy pc in the libafl_gen_edge generated TB generates an infinite loop Dummy pc in the libafl_gen_edge generated TB, generates an infinite loop Mar 13, 2024
@cube0x8
Contributor Author

cube0x8 commented Mar 13, 2024

I forgot an important detail between the last two steps of my walkthrough. I add it below:

  1. X is executed again; its basic blocks are executed until the edge's TB (remember, X -> edge), but the edge is unlinked now, so after the edge is executed, it exits instead of jumping to Y
  2. We are now back in cpu_exec_loop and edge is now last_tb; the pc returned by cpu_get_tb_cpu_state is the same as X's, so tb_lookup will return X. The check if (last_tb->jmp_reset_offset[1] != TB_JMP_OFFSET_INVALID) fails, since edge (now last_tb) has only one exit point, so last_tb and tb are chained together: tb_add_jump(last_tb, tb_exit, tb);. Current situation:
    X -> edge -> X
  3. X will be executed again, which will jump again to edge, which will jump to X again, and so on in an infinite loop...

@cube0x8 cube0x8 changed the title Dummy pc in the libafl_gen_edge generated TB, generates an infinite loop Unilateral unlinking of edge's TB causes infinite loop Mar 13, 2024
@rmalmain
Collaborator

rmalmain commented Mar 14, 2024

Thanks a lot for the detailed walkthrough, it was very nice to follow!
There is a point in your explanation I am not sure I understand:

  1. "We are now back in cpu_exec_loop, edge is now last_tb, the pc returned by cpu_get_tb_cpu_state is the same as X, so tb_lookup will return X." -> At this point last_tb should be X, not edge, no? (since X and edge are linked together). Also, cpu_get_tb_cpu_state should return the pc right after X, not the pc of X, if I'm not wrong, since the call to ops->tb_stop(db, cpu); will make the TB update the pc at the end of its execution.

I'm not sure about my remark since I can't easily test with an example, so maybe I missed something. I also suspect there is something fishy around if (last_tb->jmp_reset_offset[1] != TB_JMP_OFFSET_INVALID).

@andreafioraldi
Member

andreafioraldi commented Mar 14, 2024 via email

@rmalmain
Collaborator

Yes, I meant this is indeed where we would create the infinite loop if we are in the conditions he describes (last_tb == edge and pc == X). I'm surprised there is an infinite loop there though; if my scenario is the one being executed, then last_tb == X and pc == Y. In that case we should just fall back to the original generation of edge and Y, no?

@andreafioraldi
Member

andreafioraldi commented Mar 14, 2024 via email

@cube0x8
Contributor Author

cube0x8 commented Mar 18, 2024

At this point last_tb should be X, not edge, no?

No? From my debugging, within libafl_gen_code, tcg_out_exit_tb emits a lea instruction whose operand is the rip-relative address of the edge's TB. This address will be the return value of tcg_qemu_tb_exec, which subsequently becomes last_tb. It seems logical to me, considering edge is a distinct TB. Thus, why should the exit code load the TB of X rather than that of the edge?

if my scenario is the one being executed then last_tb == X and pc == Y

This is true for regular TBs. However, for edge TBs, the code responsible for updating env->pc upon exit is not present. Below, I've reported the two lines of code that typically update the PC; these lines are present in a standard TB but missing in an edge's TB:

0x7fffe8000130 <code_gen_buffer+259>:    lea    rbx,[rip+0xe862cb9]  // load Y's pc
0x7fffe8000137 <code_gen_buffer+266>:    mov    QWORD PTR [rbp+0x80],rbx // store it into env->eip
0x7fffe800013e <code_gen_buffer+273>:    lea    rax,[rip+0xfffffffffffffefb] // load current TB's address
0x7fffe8000145 <code_gen_buffer+280>:    jmp    0x7fffe8000018 // jump to epilogue

and the lines below are taken from the edge's generated code:

   0x7fffe8a5f480:	movabs rbx,0x555556e353b0
   0x7fffe8a5f48a:	movzx  r12d,BYTE PTR [rbx+0x2306]
   0x7fffe8a5f492:	inc    r12d
   0x7fffe8a5f495:	mov    BYTE PTR [rbx+0x2306],r12b
   0x7fffe8a5f49c:	data16 xchg ax,ax
   0x7fffe8a5f49f:	jmp    0x7fffe8a5f4a4 // jump to Y
   0x7fffe8a5f4a4:	lea    rax,[rip+0xffffffffffffff15]  // load edge's TB address
   0x7fffe8a5f4ab:	jmp    0x7fffe8000018 // jump to epilogue

As you can see, the part that updates env->eip (rbp+0x80) is missing from the code above.

To be honest, I haven't gone through the specifics of the ops generation part, so I can't say for sure why the PC is updated exclusively for regular TBs and not for edges. However, I can somewhat understand why this occurs: what would be the next PC of an invalidated edge? Conceptually speaking, it doesn't seem reasonable to me for an edge to update the PC of the left-side TB (the TB I referred to as X in my example), as its sole purpose is to execute hooks and transition to Y.

Yes, I suspect something like a cache invalidation in between, that would explain why he cannot reproduce on a small example program

@andreafioraldi could you explain what this cache you mentioned is? I believe reproducing this issue on a small program (e.g., the examples within the fuzzers directory) might not be possible, because there might not be any QEMU user-mode example that replicates the scenario I'm facing (namely, edges between RWX regions being invalidated by the signal handler).

Btw, I think that with some effort I could create a small program that reproduces the issue, but I would do that only if really necessary, as it would require a considerable amount of time.

@cube0x8
Contributor Author

cube0x8 commented Mar 27, 2024

Btw, I think we cannot just simply unlink edge from its predecessor; we should call do_tb_phys_invalidate on it...

@rmalmain
Collaborator

Fixed in #53.
