mark_start()/mark_end() sometimes break autovectorization #30

Open
Seelengrab opened this issue Dec 2, 2022 · 0 comments

Seelengrab commented Dec 2, 2022

Adding mark_start() to the tight inner loop here:

@inline function scorep1(opp, me)
    isdraw = opp == me
    iswin  = (opp+0x1 == me) | (me+0x2 == opp)
    me + (0x3*isdraw) + (0x6*iswin)
end

@inline function scorep2(opp, target)
    mychoice = mod1(opp + mod1(target+0x1, 0x3), 0x3)
    mychoice + 0x3*(target-0x1)
end

solve(file::String) = solve(read(file))
function solve(data, f::F=scorep1) where F
    l = length(data)
    acc = UInt16(0)
    @inbounds @simd for idx in 1:4:l
        opp = data[idx + 0] - UInt8('A') + 0x1
        me  = data[idx + 2] - UInt8('X') + 0x1
        acc += f(opp, me)
    end
    acc
end

This breaks vectorization pretty badly: the generated code goes from happily using lots of xmm registers to only using eax & friends. I just wanted to know how much performance was still left on the table, which is kind of hard to do when the tool itself breaks the vectorization. I don't yet know how to fix this, so this issue is just here for tracking it in general, but it ought to be possible to have our cake & eat it too here.
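One way to check for this kind of regression without reading assembly by hand is to look for vector types in the optimized LLVM IR. The following is only a rough sketch: `has_vector_ir` is a hypothetical helper (not part of any package), and matching on `<N x iM>` is a heuristic that can produce false positives or negatives. Whether the loop actually vectorizes depends on the Julia version and the target CPU, so no particular result is claimed here.

```julia
using InteractiveUtils  # provides code_llvm

# Kernel from the report above, repeated so the sketch is self-contained.
@inline function scorep1(opp, me)
    isdraw = opp == me
    iswin  = (opp+0x1 == me) | (me+0x2 == opp)
    me + (0x3*isdraw) + (0x6*iswin)
end

function solve(data, f::F=scorep1) where F
    l = length(data)
    acc = UInt16(0)
    @inbounds @simd for idx in 1:4:l
        opp = data[idx + 0] - UInt8('A') + 0x1
        me  = data[idx + 2] - UInt8('X') + 0x1
        acc += f(opp, me)
    end
    acc
end

# Hypothetical helper: returns true if the optimized LLVM IR for `f`
# with argument types `types` mentions an integer vector type such as
# `<8 x i16>`. This is a heuristic, not a proof of vectorization.
function has_vector_ir(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    ir = String(take!(io))
    return occursin(r"<\d+ x i\d+>", ir)
end

# Tiny sample input in the same 4-bytes-per-round layout the loop expects.
sample = Vector{UInt8}("A Y\nB X\nC Z\n")
total = solve(sample)

vectorized = has_vector_ir(solve, Tuple{Vector{UInt8}, typeof(scorep1)})
println("total = ", Int(total), ", vector IR found: ", vectorized)
```

Running the same check on a variant of `solve` that contains the `mark_start()`/`mark_end()` calls, and comparing the two results, would be one way to track this issue.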
