[postgresql] Ambiguity with select_with_parens/select_no_parens #4328

Open
kaby76 opened this issue Nov 15, 2024 · 0 comments

Consider the input SELECT * FROM ((SELECT 1 AS x)) ss; which parses with two ambiguities:

$ trparse --ambig -i 'SELECT * FROM ((SELECT 1 AS x)) ss;' | trtree -a
CSharp 0 string success 0.1803122
string.716: (root (stmtblock (stmtmulti (stmt (selectstmt (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (STAR "*")))) (from_clause (FROM "FROM") (from_list (table_ref (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (a_expr (a_expr_qual (a_expr_lessless (a_expr_or (a_expr_and (a_expr_between (a_expr_in (a_expr_unary_not (a_expr_isnull (a_expr_is_not (a_expr_compare (a_expr_like (a_expr_qual_op (a_expr_unary_qualop (a_expr_add (a_expr_mul (a_expr_caret (a_expr_unary_sign (a_expr_at_time_zone (a_expr_collate (a_expr_typecast (c_expr (aexprconst (iconst (Integral "1"))))))))))))))))))))))))) (AS "AS") (colLabel (identifier (Identifier "x")))))))))) (CLOSE_PAREN ")")))))) (CLOSE_PAREN ")")) (alias_clause (colid (identifier (Identifier "ss"))))))))))))) (SEMI ";"))) (EOF ""))
string.716: (root (stmtblock (stmtmulti (stmt (selectstmt (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (STAR "*")))) (from_clause (FROM "FROM") (from_list (table_ref (select_with_parens (OPEN_PAREN "(") (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (a_expr (a_expr_qual (a_expr_lessless (a_expr_or (a_expr_and (a_expr_between (a_expr_in (a_expr_unary_not (a_expr_isnull (a_expr_is_not (a_expr_compare (a_expr_like (a_expr_qual_op (a_expr_unary_qualop (a_expr_add (a_expr_mul (a_expr_caret (a_expr_unary_sign (a_expr_at_time_zone (a_expr_collate (a_expr_typecast (c_expr (aexprconst (iconst (Integral "1"))))))))))))))))))))))))) (AS "AS") (colLabel (identifier (Identifier "x")))))))))) (CLOSE_PAREN ")")) (CLOSE_PAREN ")")) (alias_clause (colid (identifier (Identifier "ss"))))))))))))) (SEMI ";"))) (EOF ""))


string.796: (root (stmtblock (stmtmulti (stmt (selectstmt (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (STAR "*")))) (from_clause (FROM "FROM") (from_list (table_ref (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (a_expr (a_expr_qual (a_expr_lessless (a_expr_or (a_expr_and (a_expr_between (a_expr_in (a_expr_unary_not (a_expr_isnull (a_expr_is_not (a_expr_compare (a_expr_like (a_expr_qual_op (a_expr_unary_qualop (a_expr_add (a_expr_mul (a_expr_caret (a_expr_unary_sign (a_expr_at_time_zone (a_expr_collate (a_expr_typecast (c_expr (aexprconst (iconst (Integral "1"))))))))))))))))))))))))) (AS "AS") (colLabel (identifier (Identifier "x")))))))))) (CLOSE_PAREN ")")))))) (CLOSE_PAREN ")")) (alias_clause (colid (identifier (Identifier "ss"))))))))))))) (SEMI ";"))) (EOF ""))
string.796: (root (stmtblock (stmtmulti (stmt (selectstmt (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (STAR "*")))) (from_clause (FROM "FROM") (from_list (table_ref (OPEN_PAREN "(") (table_ref (select_with_parens (OPEN_PAREN "(") (select_no_parens (select_clause (simple_select_intersect (simple_select_pramary (SELECT "SELECT") (target_list_ (target_list (target_el (a_expr (a_expr_qual (a_expr_lessless (a_expr_or (a_expr_and (a_expr_between (a_expr_in (a_expr_unary_not (a_expr_isnull (a_expr_is_not (a_expr_compare (a_expr_like (a_expr_qual_op (a_expr_unary_qualop (a_expr_add (a_expr_mul (a_expr_caret (a_expr_unary_sign (a_expr_at_time_zone (a_expr_collate (a_expr_typecast (c_expr (aexprconst (iconst (Integral "1"))))))))))))))))))))))))) (AS "AS") (colLabel (identifier (Identifier "x")))))))))) (CLOSE_PAREN ")"))) (CLOSE_PAREN ")") (alias_clause (colid (identifier (Identifier "ss"))))))))))))) (SEMI ";"))) (EOF ""))

In the first pair (string.716), the doubly parenthesized sub-select is derived either as select_with_parens containing a select_no_parens that itself reduces to a select_with_parens through simple_select_pramary, or as select_with_parens directly containing a nested select_with_parens. In the second pair (string.796), the outer parentheses belong either to select_with_parens or to a parenthesized table_ref.

The comment in gram.y explains why select_with_parens was created:

/* A complete SELECT statement looks like this.
 *
 * The rule returns either a single SelectStmt node or a tree of them,
 * representing a set-operation tree.
 *
 * There is an ambiguity when a sub-SELECT is within an a_expr and there
 * are excess parentheses: do the parentheses belong to the sub-SELECT or
 * to the surrounding a_expr?  We don't really care, but bison wants to know.
 * To resolve the ambiguity, we are careful to define the grammar so that
 * the decision is staved off as long as possible: as long as we can keep
 * absorbing parentheses into the sub-SELECT, we will do so, and only when
 * it's no longer possible to do that will we decide that parens belong to
 * the expression.	For example, in "SELECT (((SELECT 2)) + 3)" the extra
 * parentheses are treated as part of the sub-select.  The necessity of doing
 * it that way is shown by "SELECT (((SELECT 2)) UNION SELECT 2)".	Had we
 * parsed "((SELECT 2))" as an a_expr, it'd be too late to go back to the
 * SELECT viewpoint when we see the UNION.
 *
 * This approach is implemented by defining a nonterminal select_with_parens,
 * which represents a SELECT with at least one outer layer of parentheses,
 * and being careful to use select_with_parens, never '(' SelectStmt ')',
 * in the expression grammar.  We will then have shift-reduce conflicts
 * which we can resolve in favor of always treating '(' <select> ')' as
 * a select_with_parens.  To resolve the conflicts, the productions that
 * conflict with the select_with_parens productions are manually given
 * precedences lower than the precedence of ')', thereby ensuring that we
 * shift ')' (and then reduce to select_with_parens) rather than trying to
 * reduce the inner <select> nonterminal to something else.  We use UMINUS
 * precedence for this, which is a fairly arbitrary choice.
 *
 * To be able to define select_with_parens itself without ambiguity, we need
 * a nonterminal select_no_parens that represents a SELECT structure with no
 * outermost parentheses.  This is a little bit tedious, but it works.
 *
 * In non-expression contexts, we use SelectStmt which can represent a SELECT
 * with or without outer parentheses.
 */
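
To make the select_with_parens/select_no_parens point concrete, here is a minimal sketch in Python that counts derivations for a toy version of the two rules. Everything in it is an assumption made for illustration: the token "S" stands in for a whole unparenthesized SELECT body, the function name count_with_parens and its flag are hypothetical, and the two-rule grammar is a drastic simplification of both gram.y and the ANTLR grammar. With the flag set, no_parens may also reduce to a bare with_parens (roughly what the string.716 trees show); with it cleared, no_parens never has outermost parentheses, as the gram.y comment describes.

```python
from functools import lru_cache


def count_with_parens(tokens, no_parens_may_be_parenthesized):
    """Count the distinct derivations of `tokens` as a toy with_parens.

    Toy grammar (an illustration only, not the real rules):
        with_parens : '(' no_parens ')' | '(' with_parens ')'
        no_parens   : 'S'                  # always
                    | with_parens          # only when the flag is True
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def with_parens(i, j):
        # The span tokens[i:j] must be wrapped in one pair of parentheses.
        if j - i < 3 or tokens[i] != "(" or tokens[j - 1] != ")":
            return 0
        # Either alternative may account for what is inside the parentheses.
        return no_parens(i + 1, j - 1) + with_parens(i + 1, j - 1)

    @lru_cache(maxsize=None)
    def no_parens(i, j):
        # Base alternative: a single unparenthesized SELECT body.
        count = 1 if (j - i == 1 and tokens[i] == "S") else 0
        if no_parens_may_be_parenthesized:
            # Loosened rule: no_parens may also be a bare with_parens.
            count += with_parens(i, j)
        return count

    return with_parens(0, len(tokens))


if __name__ == "__main__":
    for text in ("(S)", "((S))", "(((S)))"):
        loose = count_with_parens(text, True)
        strict = count_with_parens(text, False)
        print(f"{text}: loose={loose} strict={strict}")
```

In this toy model the strict variant always yields exactly one derivation, while the loose variant doubles the count with every extra layer of parentheses. The second ambiguity (the parenthesized table_ref) is not modeled here.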