Failing SQL Feature:
When using a LATERAL VIEW with three or more column aliases, only the first two aliases are correctly absorbed into LateralView.columnAlias. The third and subsequent aliases are silently mis-parsed as additional comma-separated tables in the FROM clause (implicit cross joins) and end up in PlainSelect.joins as bare Table items.
I noticed that closed issue #2088 addressed the case of a LATERAL VIEW with two column aliases. However, that fix appears to cover only the two-alias scenario; the grammar production for the column-alias list is still capped at two identifiers, so any case with three or more aliases is still broken in the same way described in #2088. Hive/Spark allow arbitrary-length alias lists (for example, json_tuple typically yields many output columns), so this remaining gap is hit easily in real workloads.
The parse does not raise a syntax error — the failure is silent. Worse, Statement.toString() re-emits a textually identical SQL string (because the leaked aliases get re-printed as the joined table list), which makes the broken AST very hard to detect by round-trip inspection.
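To illustrate why a capped alias list produces a silent failure that survives a round trip, here is a minimal toy model. It is not JSqlParser's grammar or API: the class `AliasListCapDemo`, its `Parsed` holder, and the `maxAliases` parameter are hypothetical names invented for this sketch. It assumes only what is described above, namely that identifiers beyond the cap fall through to the comma-separated table list, and that printing emits the aliases followed by the joined tables.

```java
import java.util.ArrayList;
import java.util.List;

public class AliasListCapDemo {

    /** Toy parse result: names kept as column aliases vs. names that leaked out as tables. */
    public static final class Parsed {
        public final List<String> columnAliases = new ArrayList<>();
        public final List<String> leakedTables = new ArrayList<>();

        /** Re-emits the aliases first, then the leaked tables, all comma-separated. */
        public String render() {
            List<String> all = new ArrayList<>(columnAliases);
            all.addAll(leakedTables);
            return String.join(", ", all);
        }
    }

    /**
     * Splits a comma-separated identifier list. At most maxAliases names are
     * accepted as column aliases; the rest are treated as extra FROM tables,
     * mimicking the behavior reported in this issue (hypothetical model).
     */
    public static Parsed parse(String aliasList, int maxAliases) {
        Parsed parsed = new Parsed();
        for (String raw : aliasList.split(",")) {
            String name = raw.trim();
            if (parsed.columnAliases.size() < maxAliases) {
                parsed.columnAliases.add(name);
            } else {
                parsed.leakedTables.add(name);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String input = "c1, c2, c3, c4, c5, c6, c7";

        Parsed capped = parse(input, 2);                    // models the reported behavior
        Parsed unbounded = parse(input, Integer.MAX_VALUE); // models the expected behavior

        System.out.println("capped aliases:    " + capped.columnAliases);
        System.out.println("capped leaked:     " + capped.leakedTables);
        System.out.println("unbounded aliases: " + unbounded.columnAliases);

        // Both structures print the same text, so comparing strings cannot tell them apart.
        System.out.println("same text: " + capped.render().equals(unbounded.render()));
    }
}
```

In this model, a check that only compares rendered text passes for both variants. Only a structural check, such as counting the aliases or asserting that no tables leaked, separates the two results, which is why the AST has to be inspected directly.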
SQL Example:
SELECT a
FROM t
LATERAL VIEW json_tuple(j, 'a', 'b', 'c', 'd', 'e', 'f', 'g')
x AS c1, c2, c3, c4, c5, c6, c7;
In this query, c1 and c2 are correctly registered as column aliases of the lateral view. However, c3, c4, c5, c6, c7 are incorrectly interpreted as five separate tables joined to the FROM clause. Reflective inspection of the parsed AST shows:
- PlainSelect.fromItem = Table("t")
- PlainSelect.joins = [Join(simple=true, right=Table("c3")), ..., Join(simple=true, right=Table("c7"))] ← unexpected
- LateralView.columnAlias.aliasColumns.size() = 2 ← should be 7
The expected behavior is that all of c1, c2, c3, c4, c5, c6, c7 are recognized as column aliases of the lateral view, with PlainSelect.joins being null (or empty), and no implicit table joins implied.
For comparison, the parenthesized form LATERAL VIEW ... x AS (c1, c2, ..., c7) is rejected outright with a ParseException at the opening parenthesis, so no alternative syntax currently works for more than two aliases.
Boundary behavior observed (JSqlParser 5.3):
| Aliases | Result |
| --- | --- |
| AS c1, c2 | OK: both absorbed into columnAlias.aliasColumns |
| AS c1, c2, c3 | WRONG: c3 becomes a join Table |
| AS c1, c2, ..., cN (N >= 3) | WRONG: c3..cN become join Tables |
| AS (c1, c2, ..., cN) | ERROR: ParseException at "(" |
Software Information:
- JSqlParser version: 5.3
- Database: Spark SQL (Hive-style LATERAL VIEW)