Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better heuristics for modulo and division in gcc 2.x #282

Open
joshlory opened this issue Aug 14, 2024 · 5 comments
Open

Better heuristics for modulo and division in gcc 2.x #282

joshlory opened this issue Aug 14, 2024 · 5 comments

Comments

@joshlory
Copy link
Contributor

Hello! @bismurphy and I have noticed a number of division and modulo patterns that aren't picked up automatically.

Division

var_a0_2 = temp_v1_4;
if (temp_v1_4 < 0) {
    var_a0_2 = temp_v1_4 + 0xFF;
}
temp_a0 = var_a0_2 >> 8;
// matches as temp_a0 = var_a0_2 / 256;
 4d4:    bgez    v0,4e0 ~>
 4d8:    nop     
 4dc:    addiu   v0,v0,0xff
 4e0: ~> sra     v0,v0,0x8
 4e4:    sb      v0,4(s1)
 4e8:    sll     v0,v1,0x5

Modulo

var_a0 = D_146000_8017ABD0[a0 - ((a0 / 0x10) * 0x10)];
// matches as D_146000_8017AB90[temp_v1_6 % 16];
 544:    move    v0,a0
 560:    bgez    a0,56c ~>
 564:    sb      zero,0x28(s1)
 568:    addiu   v0,a1,0x10
 56c: ~> sra     v0,v0,0x4
 570:    sll     v0,v0,0x4
 574:    subu    v0,a0,v0
 578:    sll     v0,v0,0x2
 57c:    lui     at,%hi(D_146000_8017ABD0)
 580:    addu    at,at,v0
 584:    lw      a0,%lo(D_146000_8017ABD0)(at)

I see tests for https://github.com/matt-kempster/m2c/tree/master/tests/end_to_end/division-by-power-of-two and https://github.com/matt-kempster/m2c/tree/master/tests/end_to_end/modulo-by-power-of-two but not for gcc. Does it make sense for me to start there and import some of the reduced repros as test cases?

I'm new to m2c so happy to leave this up to the end user to identify, if it's not something we can heuristically determine. But if it's something we can pattern-match in more cases I'd love to take a crack at it!

@ethteck
Copy link
Contributor

ethteck commented Aug 14, 2024

I think any tests / patterns would be appreciated! What we've done in the past after changing decompiler behavior is to run it against an entire project and then diff the result, just to make sure nothing looks crazy.

@simonlindholm
Copy link
Collaborator

There are some more tests in https://github.com/matt-kempster/m2c/tree/master/tests/end_to_end/gcc-division and https://github.com/matt-kempster/m2c/tree/master/tests/end_to_end/gcc-division-by-two (test organization is somewhat haphazard). Not sure whether these patterns are in there already, if not, adding the asm examples from this issue as e.g. new .s files in division-by-power-of-two/ and modulo-by-power-of-two/ sounds great.

Fixing this in m2c should be fairly simple, just a matter of adding more pattern-matching code like

class ModP2Pattern1(SimpleAsmPattern):
(+
ModP2Pattern1(),
).

@simonlindholm
Copy link
Collaborator

What we've done in the past after changing decompiler behavior is to run it against an entire project and then diff the result, just to make sure nothing looks crazy.

There are instructions in the README for setting this up, and it can be a great way to iterate on more complex m2c changes and seeing how they affect output in practice. I would say in this case it's not needed though, it's easy enough to convince yourself of the asm pattern's correctness in isolation.

@ethteck
Copy link
Contributor

ethteck commented Aug 14, 2024

My thought was that patterns that are too general may introduce false positives, but idk if that's actually happened before

@simonlindholm
Copy link
Collaborator

It can definitely happen, I just don't expect it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants