Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure backwards jumps get a yk_location. #96

Merged
merged 1 commit into from
Nov 19, 2024
Merged

Conversation

vext01
Copy link
Contributor

@vext01 vext01 commented Nov 18, 2024

Requires: ykjit/yk#1466

I noticed that the mandelbrot benchmark was not triggering the JIT. This was because we only assigned locations to for loops, but this benchmark contains mostly while looping.

Lua has no deidcated bytecode for while loops. Instead it uses unstructured control flow using conditionals and jumps.

Here's an example:

i = 10
while i > 0 do
    i = i - 1
end

Becomes:

main <a.lua:0,0> (11 instructions at 0xbfc160)
0+ params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     SETTABUP        0 0 1k  ; _ENV "i" 10
        3       [2]     GETTABUP        0 0 0   ; _ENV "i"
        4       [2]     GTI             0 0 0
        5       [2]     JMP             5       ; to 11
        6       [3]     GETTABUP        0 0 0   ; _ENV "i"
        7       [3]     ADDI            0 0 -1
        8       [3]     MMBINI          0 1 7 0 ; __sub
        9       [3]     SETTABUP        0 0 0   ; _ENV "i"
        10      [3]     JMP             -8      ; to 3
        11      [4]     RETURN          0 1 1   ; 0 out

In this example JMP -8 is a CFG back edge that makes the loop.

This commit makes yklua recognise back edges as potential loops so that they get JITted if/when they get hot. This is of course a proxy: not all back edges are necessarily loops, but it should (tm) be a decent enough approximation for our purposes.

I measured about a 12% slowdown with this change. We can't not trace while loops and expect to be faster, so we will have to take this on for now.

I noticed that the mandelbrot benchmark was not triggering the JIT.
This was because we only assigned locations to `for` loops, but this
benchmark contains mostly `while` looping.

Lua has no deidcated bytecode for `while` loops. Instead it uses
unstructured control flow using conditionals and jumps.

Here's an example:
```
i = 10
while i > 0 do
    i = i - 1
end
```

Becomes:
```
main <a.lua:0,0> (11 instructions at 0xbfc160)
0+ params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
        1       [1]     VARARGPREP      0
        2       [1]     SETTABUP        0 0 1k  ; _ENV "i" 10
        3       [2]     GETTABUP        0 0 0   ; _ENV "i"
        4       [2]     GTI             0 0 0
        5       [2]     JMP             5       ; to 11
        6       [3]     GETTABUP        0 0 0   ; _ENV "i"
        7       [3]     ADDI            0 0 -1
        8       [3]     MMBINI          0 1 7 0 ; __sub
        9       [3]     SETTABUP        0 0 0   ; _ENV "i"
        10      [3]     JMP             -8      ; to 3
        11      [4]     RETURN          0 1 1   ; 0 out
```

In this example `JMP -8` is a CFG back edge that makes the loop.

This commit makes yklua recognise back edges as potential loops so that
they get JITted if/when they get hot. This is of course a proxy: not all
back edges are necessarily loops, but it should (tm) be a decent enough
approximation for our purposes.

I measured about a 12% slowdown with this change. We can't not trace
while loops and expect to be faster, so we will have to take this on for
now.
@vext01
Copy link
Contributor Author

vext01 commented Nov 19, 2024

Once ykjit/yk#1469 is merged, this can go in too.

@@ -1222,8 +1222,11 @@ void luaV_execute (lua_State *L, CallInfo *ci) {
vmfetch();
#ifdef USE_YK
YkLocation *ykloc = NULL;
if (GET_OPCODE(i) == OP_FORLOOP || GET_OPCODE(i) == OP_TFORLOOP)
if ((GET_OPCODE(i) == OP_FORLOOP) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is fine but it's probably not what we want in the medium term because it introduces several sort-of-constant comparisons that will have an overhead on the interpreter (probably the JIT removes all of this).

We've got at least two possible ways here:

  1. Rely on yklocs[pc] being NULL/non-NULL for the right opcodes. Downside: we have to do a load for every instruction (which might obliterate any advantage we get from removing the comparisons).
  2. Is there a spare bit in an opcode we can reuse for "this particular opcode should have a non-null YkLocation"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I just did the smallest change possible to get while loops traced.

We used to do 1., but that seems to have been changed at some point:

yklua/src/lvm.c

Line 1181 in 51b872e

YkLocation *ykloc = yk_lookup_ykloc(ci, pc);

2. I couldn't say without studying lua more.

@ltratt ltratt assigned ltratt and unassigned ptersilie Nov 19, 2024
@ltratt ltratt added this pull request to the merge queue Nov 19, 2024
Merged via the queue into ykjit:main with commit 7696f18 Nov 19, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants