Stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators #16030

Closed
@fmonjalet

Description

Describe the bug

When a Substrait scalar function call is translated to a DataFusion logical plan, and the function name maps to a binary operator while the call has a large number of arguments (say 2000), further processing of the resulting logical plan causes a stack overflow.

To Reproduce

This commit contains a reproducer plan and test. The plan is large (a lot of arguments are required to trigger the stack overflow) but simple: it is an OR(col == a, col == b, col == c, ..., col == x). One could argue that this precise example should really be a Substrait SingularOrList, but the issue can be triggered just the same with other operators and expressions.
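To illustrate why the argument count matters, here is a minimal sketch using a toy expression type. It assumes, without confirmation from the report, that an n-ary call ends up as a chain of nested binary expressions. `Expr`, `fold_left_deep`, `fold_balanced`, `depth`, and `leaves` are hypothetical names for illustration, not DataFusion APIs. A left-deep chain has one nesting level per argument, so any recursive traversal needs one stack frame per argument, while a balanced fold keeps the depth logarithmic.

```rust
// Toy expression type for illustration only; this is NOT DataFusion's `Expr`.
#[allow(dead_code)]
enum Expr {
    Leaf(u32),
    Or(Box<Expr>, Box<Expr>),
}

/// Fold arguments left to right: ((a OR b) OR c) OR d ...
/// The resulting tree is as deep as the number of arguments.
fn fold_left_deep(args: Vec<Expr>) -> Option<Expr> {
    args.into_iter()
        .reduce(|acc, e| Expr::Or(Box::new(acc), Box::new(e)))
}

/// Split the argument list in halves recursively, so the depth
/// grows with log2(n) instead of n.
fn fold_balanced(mut args: Vec<Expr>) -> Option<Expr> {
    match args.len() {
        0 => None,
        1 => args.pop(),
        n => {
            let right = args.split_off(n / 2);
            let l = fold_balanced(args)?;
            let r = fold_balanced(right)?;
            Some(Expr::Or(Box::new(l), Box::new(r)))
        }
    }
}

/// Measure tree depth with an explicit stack, so the measurement
/// itself does not recurse.
fn depth(root: &Expr) -> usize {
    let mut max = 0;
    let mut stack = vec![(root, 1usize)];
    while let Some((e, d)) = stack.pop() {
        max = max.max(d);
        if let Expr::Or(l, r) = e {
            stack.push((l, d + 1));
            stack.push((r, d + 1));
        }
    }
    max
}

/// Build `n` distinct leaf expressions.
fn leaves(n: u32) -> Vec<Expr> {
    (0..n).map(Expr::Leaf).collect()
}
```

With 2000 arguments, `depth(&fold_left_deep(leaves(2000)).unwrap())` is 2000, whereas the balanced variant stays around a dozen levels. Whether a balanced fold is an acceptable fix for DataFusion is a separate question that this sketch does not answer.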

Expected behavior

No stack overflow or crash. At best, the plan is accepted and executed; at worst, DataFusion returns a clean error.
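As a rough sketch of what a "clean error" could look like, the toy traversal below carries a depth counter and returns an error once a limit is exceeded instead of recursing until the stack runs out. `Node`, `TooDeep`, `count_leaves`, `chain`, and the limit of 64 are all hypothetical and chosen for illustration; nothing here reflects how DataFusion actually handles this.

```rust
// Hypothetical error type for a depth-limited traversal.
#[derive(Debug, PartialEq)]
struct TooDeep {
    limit: usize,
}

// Toy binary expression tree; not a DataFusion type.
enum Node {
    Leaf,
    Binary(Box<Node>, Box<Node>),
}

/// Count leaves recursively, but fail with `TooDeep` once `depth`
/// exceeds `limit`, so recursion is bounded by `limit + 1` frames.
fn count_leaves(node: &Node, depth: usize, limit: usize) -> Result<usize, TooDeep> {
    if depth > limit {
        return Err(TooDeep { limit });
    }
    match node {
        Node::Leaf => Ok(1),
        Node::Binary(l, r) => {
            Ok(count_leaves(l, depth + 1, limit)? + count_leaves(r, depth + 1, limit)?)
        }
    }
}

/// Build a left-deep chain with `n` leaves (depth `n`).
fn chain(n: usize) -> Node {
    let mut node = Node::Leaf;
    for _ in 1..n {
        node = Node::Binary(Box::new(node), Box::new(Node::Leaf));
    }
    node
}
```

Calling `count_leaves(&chain(2000), 1, 64)` returns `Err(TooDeep { limit: 64 })` rather than descending 2000 levels, while a shallow tree such as `chain(10)` is processed normally.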

Labels: bug