[SPARK-52103][SQL] Fallback complex expression whole-stage codegen #50871

Open
wants to merge 1 commit into
base: master

wankunde
Contributor

What changes were proposed in this pull request?

Add a config that determines whether to fall back from whole-stage codegen for complex expressions.

Why are the changes needed?

If an expression contains more non-leaf expressions than the configured threshold, the generated method may be too long to be JIT-compiled.

For the test query below, the filter operator contains 193 non-leaf expressions and generates about 3000 lines of code. The generated code cannot be JIT-compiled, so the query runs very slowly.

SELECT  vv
FROM
(
    SELECT  vv, case vv
                when '1' then '1'
                when '2' then '2'
                when '3' then '3'
                when '4' then '4'
                when '5' then '5'
                when '6' then '6'
                when '7' then '7'
                when '8' then '8'
                when '9' then '9'
                when '10' then '10'
                when '11' then '11'
                when '12' then '12'
                when '13' then '13'
                when '14' then '14'
                when '15' then '15'
                when '16' then '16'
                when '17' then '17'
                when '18' then '18'
                when '19' then '19'
                when '20' then '20'
                when '21' then '21'
                when '22' then '22'
                when '23' then '23'
                when '24' then '24'
                when '25' then '25'
                when '26' then '26'
                when '27' then '27'
                when '28' then '28'
                when '29' then '29'
                when '30' then '30'
                when '31' then '31'
                when '32' then '32'
                else ''
                end as cv
    FROM (
        SELECT  regexp_replace(trim(lower(
                   get_json_object(concat(v,'}'),'$$.s'))),'\\n','') AS vv
        FROM values('a') as t(v)
    ) tmp
) t2
WHERE length(cv) > 0
AND cv not LIKE '%xxx%'
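
The threshold check described above can be sketched in a few lines. This is only an illustration, not the code in this PR: `Expr`, `count_non_leaf`, `should_fallback`, `case_when`, and the `max_non_leaf` parameter are hypothetical names, and the toy tree is far simpler than a real Catalyst expression tree, so its count will not match the 193 reported for the query.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Hypothetical stand-in for a node in an expression tree."""
    name: str
    children: List["Expr"] = field(default_factory=list)


def count_non_leaf(expr: Expr) -> int:
    """Count the nodes in the tree that have at least one child."""
    if not expr.children:
        return 0
    return 1 + sum(count_non_leaf(child) for child in expr.children)


def should_fallback(expr: Expr, max_non_leaf: int) -> bool:
    """Assumed policy: fall back when the count exceeds the threshold."""
    return count_non_leaf(expr) > max_non_leaf


def case_when(column: str, branches: int) -> Expr:
    """Build a toy CASE expression with `branches` WHEN/THEN pairs."""
    children = []
    for i in range(1, branches + 1):
        # WHEN condition: column = '<i>' (one non-leaf node, two leaves)
        children.append(Expr("EqualTo", [Expr(column), Expr(f"'{i}'")]))
        # THEN value: a literal leaf
        children.append(Expr(f"'{i}'"))
    children.append(Expr("''"))  # ELSE value
    return Expr("CaseWhen", children)


tree = case_when("vv", 32)
print(count_non_leaf(tree))                     # 33: one CaseWhen + 32 EqualTo
print(should_fallback(tree, max_non_leaf=100))  # False
```

In the real plan the count grows much faster, because `vv` is itself a nested expression (`regexp_replace(trim(lower(get_json_object(...))))`) rather than a leaf, and the `WHERE` clause references `cv` twice. For context on why method size matters: HotSpot by default does not JIT-compile methods whose bytecode exceeds 8000 bytes, so oversized generated methods run in the interpreter.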

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added UT

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label May 13, 2025
@wankunde
Contributor Author

Hi @panbingkun, do you have any ideas about this codegen JIT failure?
