ZJIT: Lightweight frames #909

@k0kubun

Design

When ZJIT pushes an interpreter frame, it should write only one metadata pointer; the other frame fields should be materialized lazily.

Frame push

ZJIT should bump ec->cfp as usual, but write only one field: cfp->jit_return, which receives the address of the native stack slot that holds the return address.

  • x86_64:
    • Save the address of the stack slot pushed by call instruction, which has the return address.
  • arm64:
    • For JIT-to-JIT calls, save the address of the stack slot pushed by the callee's Insn::FrameSetup, which saves the link register.
    • For any other C calls, write the return address somewhere in the caller's native stack, and save the address of the stack slot.
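
The push described above can be modeled in a few lines of Rust. This is only a sketch under stated assumptions: `ControlFrame`, `ExecutionContext`, `depth`, and `push_lightweight` are hypothetical stand-ins rather than CRuby or ZJIT definitions, and the frame stack is modeled as a growing index instead of a downward-moving `ec->cfp` pointer.

```rust
// Hypothetical, simplified stand-in for CRuby's control frame.
#[derive(Debug, Default, Clone, PartialEq)]
struct ControlFrame {
    pc: usize,         // not written by a lightweight push
    sp: usize,         // not written by a lightweight push
    iseq: usize,       // not written by a lightweight push
    jit_return: usize, // address of the native slot holding the return address; 0 = materialized
}

// Hypothetical model of the execution context's frame stack.
struct ExecutionContext {
    frames: Vec<ControlFrame>,
    depth: usize, // number of pushed frames (models bumping ec->cfp)
}

impl ExecutionContext {
    fn new(capacity: usize) -> Self {
        ExecutionContext {
            frames: vec![ControlFrame::default(); capacity],
            depth: 0,
        }
    }

    /// Lightweight push: bump the frame pointer and store one field only.
    fn push_lightweight(&mut self, return_addr_slot: usize) -> &ControlFrame {
        assert!(self.depth < self.frames.len(), "control frame stack overflow");
        assert!(return_addr_slot != 0, "0 is reserved for materialized frames");
        let frame = &mut self.frames[self.depth];
        frame.jit_return = return_addr_slot; // the only store into the frame
        self.depth += 1;
        &self.frames[self.depth - 1]
    }
}
```

Every other field keeps whatever value it had, which is why readers of the frame need the materialization rules described below.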

Prepare for calls

gen_prepare_call_with_gc and gen_prepare_non_leaf_call should update cfp->jit_return instead of saving PC/SP and spilling stack slots and locals.

How to materialize

When cfp->jit_return is not zero, ZJIT should retrieve compile-time metadata from it as follows:

  1. Read cfp->jit_return to get the return address. Look up a { return_address => metadata } hash table to get metadata for the callsite.
    • The metadata should contain: PC, ISEQ, stack size, cme, env flags, location of self, type/location of specval
  2. The metadata should have the offset from the cfp->jit_return address to the frame's base pointer.
    • Using this base pointer and offsets in the metadata, ZJIT should be able to discover stack slots and locals from the native stack.
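
The two lookup steps can be sketched as follows. Everything here is an assumption made for illustration: `FrameMetadata` carries only a subset of the fields listed above, `NativeStack` simulates the native stack as an array of 8-byte words, and `base_offset` and `self_offset` are hypothetical names for the offsets the metadata would store.

```rust
use std::collections::HashMap;

// Hypothetical subset of the per-callsite metadata.
#[derive(Debug, Clone, PartialEq)]
struct FrameMetadata {
    pc: usize,
    iseq: usize,
    stack_size: usize,
    base_offset: isize, // bytes from the jit_return slot to the frame's base pointer
    self_offset: isize, // bytes from the base pointer to the slot holding self
}

// Simulated native stack: 8-byte words starting at `base_addr`.
struct NativeStack {
    base_addr: usize,
    words: Vec<u64>,
}

impl NativeStack {
    fn read(&self, addr: usize) -> Option<u64> {
        let off = addr.checked_sub(self.base_addr)?;
        if off % 8 != 0 {
            return None; // unaligned read
        }
        self.words.get(off / 8).copied()
    }
}

/// Step 1: read the return address through jit_return and look it up.
/// Step 2: derive the frame's base pointer from the metadata offset.
fn lookup_metadata<'a>(
    table: &'a HashMap<u64, FrameMetadata>,
    stack: &NativeStack,
    jit_return: usize,
) -> Option<(&'a FrameMetadata, usize)> {
    if jit_return == 0 {
        return None; // frame is already materialized
    }
    let return_address = stack.read(jit_return)?;
    let meta = table.get(&return_address)?;
    let base_pointer = (jit_return as isize).checked_add(meta.base_offset)? as usize;
    Some((meta, base_pointer))
}

/// Discover a value (here: self) from the native stack via the base pointer.
fn read_self(stack: &NativeStack, meta: &FrameMetadata, base_pointer: usize) -> Option<u64> {
    let addr = (base_pointer as isize).checked_add(meta.self_offset)? as usize;
    stack.read(addr)
}
```

Keying the table by return address means a callsite needs no extra store at run time, since the call instruction (or the saved link register) already leaves the key on the native stack.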

ZJIT should fully materialize the frame and set cfp->jit_return to 0 when it hands over the frame's execution to the interpreter. Otherwise, it may just query the metadata and leave the frame unmaterialized, e.g. for showing backtraces.
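
A minimal sketch of the difference between querying and materializing, assuming the metadata for the frame has already been found. The types and the functions `frame_pc` and `materialize` are hypothetical, `sp` is modeled as a slot index, and copying stack slots and locals from the native stack to the VM stack is omitted.

```rust
// Hypothetical subset of the per-callsite metadata.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FrameMetadata {
    pc: usize,
    iseq: usize,
    stack_size: usize,
}

// Hypothetical, simplified control frame.
#[derive(Debug, Default, PartialEq)]
struct ControlFrame {
    pc: usize,
    iseq: usize,
    sp: usize,
    jit_return: usize, // non-zero while the frame is lightweight
}

/// Read-only query (e.g. for a backtrace): the frame stays lightweight.
fn frame_pc(cfp: &ControlFrame, meta: &FrameMetadata) -> usize {
    if cfp.jit_return != 0 {
        meta.pc
    } else {
        cfp.pc
    }
}

/// Full materialization before the interpreter takes over the frame.
fn materialize(cfp: &mut ControlFrame, meta: &FrameMetadata, vm_stack_base: usize) {
    if cfp.jit_return == 0 {
        return; // already materialized
    }
    cfp.pc = meta.pc;
    cfp.iseq = meta.iseq;
    cfp.sp = vm_stack_base + meta.stack_size;
    cfp.jit_return = 0; // marks the frame as materialized
}
```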

When to materialize

The frame metadata is supposed to be queried in the following conditions:

  • On-Stack Replacement: An exception is raised, the longjmp discards a JIT frame, and the interpreter takes over the execution of the cfp.
  • Backtraces: An exception is raised, Kernel#caller is called, or rb_profile_frames is used by a profiler.
  • Binding: rb_debug_inspector API is used, and the Binding of a JIT frame is dynamically accessed.

Open questions

  • When a C function pushes a frame on top of a lightweight frame, can we leave the lightweight frame unmaterialized?
    • Do we need to reserve the VM stack slots (VM_ENV_DATA_SIZE + stack size) so that we won't need to move the next frame's env, which might be referenced by pointers on stack, when actually materializing the lightweight frame?

Prior art

Lazy frame push

This is what we successfully merged into YJIT, and it still exists in Ruby master. Unlike lightweight frames, it does not push a frame (does not bump ec->cfp) before the call. Instead, it lazily pushes the frame in rb_yjit_lazy_push_frame, using the metadata queried by cfp->pc, when the callee method is about to raise an exception.

In lightweight frames, because we intend to bump ec->cfp, we shouldn't need to do anything like rb_yjit_lazy_push_frame, which would hopefully eliminate the check overhead in those places. When the VM actually builds a backtrace to raise an exception, it should query the frame metadata to retrieve line numbers for lightweight frames.

Frame outlining

This is what Alan and I experimented with in 2023. We used a tagged pointer in cfp->pc to mark it as an "outlined" frame. Every read of cfp->pc, cfp->sp, or cfp->iseq had a branch on whether cfp->pc is tagged or not. If it's a tagged pointer, it points to frame metadata to materialize the outlined frame. Because we made every read of pc/sp/iseq slower, the interpreter became slower. So we gave it up.
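
The per-read branch can be illustrated with a small sketch. The tag bit, `OUTLINED_TAG`, `PcSlot`, and both functions are assumptions for illustration; the description above does not say which bit was used or how the metadata was laid out.

```rust
// Assumption: the lowest bit marks an outlined frame.
const OUTLINED_TAG: usize = 0x1;

#[derive(Debug, PartialEq)]
enum PcSlot {
    Real(usize),     // an ordinary PC
    Outlined(usize), // address of frame metadata
}

/// Tag a metadata address so it can be stored in the pc field.
fn tag_outlined(metadata_addr: usize) -> usize {
    assert_eq!(metadata_addr & OUTLINED_TAG, 0, "metadata must be at least 2-byte aligned");
    metadata_addr | OUTLINED_TAG
}

/// The branch that every read of pc/sp/iseq had to take.
fn decode_pc_slot(raw: usize) -> PcSlot {
    if raw & OUTLINED_TAG != 0 {
        PcSlot::Outlined(raw & !OUTLINED_TAG)
    } else {
        PcSlot::Real(raw)
    }
}
```

The cost is the `if` in `decode_pc_slot`: it sits on the interpreter's hot path even when no frame is outlined.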

Unlike frame outlining, the idea of lightweight frames is to optimistically skip the cfp->jit_return check on most cfp reads to avoid the interpreter slowdown (we should have assertions in debug mode) and to check cfp->jit_return only under the "When to materialize" conditions above. Hopefully, we will not need to add materialization checks in so many places that the interpreter becomes as slow as it was with frame outlining.
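
A sketch of what an optimistic read with a debug-only assertion might look like. `ControlFrame` and `read_pc_unchecked` are hypothetical names; the point is only that `debug_assert_eq!` compiles to nothing in release builds, so the common read path has no branch.

```rust
// Hypothetical, simplified control frame.
struct ControlFrame {
    pc: usize,
    jit_return: usize, // non-zero while the frame is lightweight
}

/// Optimistic read: assumes the caller only sees materialized frames.
/// The check runs in debug builds and is compiled out in release builds.
fn read_pc_unchecked(cfp: &ControlFrame) -> usize {
    debug_assert_eq!(cfp.jit_return, 0, "read of pc on an unmaterialized lightweight frame");
    cfp.pc
}
```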
