Java20 ambiguities and improvements #4238

Open
kaby76 opened this issue Sep 14, 2024 · 0 comments

kaby76 commented Sep 14, 2024

This is the first in a series of ambiguities and improvements that I'm finding with the newest version of trparse, using its --ambig option to display ambiguous parses.

Input: in.txt

Ambig trees:

$ trparse in.txt --ambig | trtree -a | grep in.txt.33
CSharp 0 in.txt success 0.8978623
in.txt.33: (start_ (compilationUnit (ordinaryCompilationUnit (topLevelClassOrInterfaceDeclaration (classDeclaration (normalClassDeclaration (CLASS "class") (typeIdentifier (Identifier "S")) (classBody (LBRACE "{") (classBodyDeclaration (classMemberDeclaration (fieldDeclaration (unannType (unannPrimitiveType (numericType (integralType (INT "int"))))) (variableDeclaratorList (variableDeclarator (variableDeclaratorId (Identifier "x")) (ASSIGN "=") (variableInitializer (expression (assignmentExpression (conditionalExpression (conditionalOrExpression (conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (literal (IntegerLiteral "0"))))))))))))))))))))))) (SEMI ";")))) (RBRACE "}"))))) (topLevelClassOrInterfaceDeclaration (classDeclaration (normalClassDeclaration (CLASS "class") (typeIdentifier (Identifier "Test1")) (classBody (LBRACE "{") (classBodyDeclaration (classMemberDeclaration (methodDeclaration (methodModifier (PUBLIC "public")) (methodModifier (STATIC "static")) (methodHeader (result (VOID "void")) (methodDeclarator (Identifier "main") (LPAREN "(") (formalParameterList (formalParameter (unannType (unannReferenceType (unannArrayType (unannClassOrInterfaceType (typeIdentifier (Identifier "String"))) (dims (LBRACK "[") (RBRACK "]"))))) (variableDeclaratorId (Identifier "args")))) (RPAREN ")"))) (methodBody (block (LBRACE "{") (blockStatements (blockStatement (localVariableDeclarationStatement (localVariableDeclaration (localVariableType (unannType (unannReferenceType (unannClassOrInterfaceType (typeIdentifier (Identifier "S")))))) (variableDeclaratorList (variableDeclarator (variableDeclaratorId (Identifier "s")) (ASSIGN "=") (variableInitializer (expression (assignmentExpression (conditionalExpression (conditionalOrExpression 
(conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (unqualifiedClassInstanceCreationExpression (NEW "new") (classOrInterfaceTypeToInstantiate (Identifier "S")) (LPAREN "(") (RPAREN ")")))))))))))))))))))))))) (SEMI ";"))) (blockStatement (statement (statementWithoutTrailingSubstatement (expressionStatement (statementExpression (methodInvocation (typeName (packageName (Identifier "System") (DOT ".") (packageName (Identifier "out")))) (DOT ".") (Identifier "println") (LPAREN "(") (argumentList (expression (assignmentExpression (conditionalExpression (conditionalOrExpression (conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (literal (StringLiteral "\"s.x=\""))))))))) (ADD "+") (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (expressionName (ambiguousName (Identifier "s")) (DOT ".") (Identifier "x"))))))))))))))))))) (RPAREN ")"))) (SEMI ";")))))) (RBRACE "}")))))) (RBRACE "}"))))))) (EOF ""))
in.txt.33: (start_ (compilationUnit (ordinaryCompilationUnit (topLevelClassOrInterfaceDeclaration (classDeclaration (normalClassDeclaration (CLASS "class") (typeIdentifier (Identifier "S")) (classBody (LBRACE "{") (classBodyDeclaration (classMemberDeclaration (fieldDeclaration (unannType (unannPrimitiveType (numericType (integralType (INT "int"))))) (variableDeclaratorList (variableDeclarator (variableDeclaratorId (Identifier "x")) (ASSIGN "=") (variableInitializer (expression (assignmentExpression (conditionalExpression (conditionalOrExpression (conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (literal (IntegerLiteral "0"))))))))))))))))))))))) (SEMI ";")))) (RBRACE "}"))))) (topLevelClassOrInterfaceDeclaration (classDeclaration (normalClassDeclaration (CLASS "class") (typeIdentifier (Identifier "Test1")) (classBody (LBRACE "{") (classBodyDeclaration (classMemberDeclaration (methodDeclaration (methodModifier (PUBLIC "public")) (methodModifier (STATIC "static")) (methodHeader (result (VOID "void")) (methodDeclarator (Identifier "main") (LPAREN "(") (formalParameterList (formalParameter (unannType (unannReferenceType (unannArrayType (unannClassOrInterfaceType (typeIdentifier (Identifier "String"))) (dims (LBRACK "[") (RBRACK "]"))))) (variableDeclaratorId (Identifier "args")))) (RPAREN ")"))) (methodBody (block (LBRACE "{") (blockStatements (blockStatement (localVariableDeclarationStatement (localVariableDeclaration (localVariableType (unannType (unannReferenceType (unannClassOrInterfaceType (typeIdentifier (Identifier "S")))))) (variableDeclaratorList (variableDeclarator (variableDeclaratorId (Identifier "s")) (ASSIGN "=") (variableInitializer (expression (assignmentExpression (conditionalExpression (conditionalOrExpression 
(conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (unqualifiedClassInstanceCreationExpression (NEW "new") (classOrInterfaceTypeToInstantiate (Identifier "S")) (LPAREN "(") (RPAREN ")")))))))))))))))))))))))) (SEMI ";"))) (blockStatement (statement (statementWithoutTrailingSubstatement (expressionStatement (statementExpression (methodInvocation (typeName (packageName (Identifier "System")) (DOT ".") (typeIdentifier (Identifier "out"))) (DOT ".") (Identifier "println") (LPAREN "(") (argumentList (expression (assignmentExpression (conditionalExpression (conditionalOrExpression (conditionalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (additiveExpression (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (primary (primaryNoNewArray (literal (StringLiteral "\"s.x=\""))))))))) (ADD "+") (multiplicativeExpression (unaryExpression (unaryExpressionNotPlusMinus (postfixExpression (expressionName (ambiguousName (Identifier "s")) (DOT ".") (Identifier "x"))))))))))))))))))) (RPAREN ")"))) (SEMI ";")))))) (RBRACE "}")))))) (RBRACE "}"))))))) (EOF ""))
09/15-10:47:55 ~/issues/g4-current/java/java20/Generated-CSharp
$

typeName/packageName/packageOrTypeName

packageName
    : Identifier ('.' packageName)?
    // left recursion --> right recursion
    ;

typeName
    : packageName ('.' typeIdentifier)?
    ;

packageOrTypeName
    : identifier ('.' packageOrTypeName)?
    // left recursion --> right recursion
    ;

From the JLS20,

Notes

  1. The rule typeName is incorrect. It should reference packageOrTypeName, not packageName.
  2. Although ANTLR does a great job of rewriting left recursion into Kleene operators before running Thompson's construction, it does not rewrite right recursion. Right recursion should be avoided because it is inefficient: it causes a call to a sub-automaton in AdaptivePredict(). E.g., the NFA for packageName is:
    [image: Graphviz rendering of the NFA for packageName]
  3. For input with a method call System.out.println("s.x=" + s.x);, there is an ambiguity about where to attach .out: should it be parsed as part of a packageName or as a typeIdentifier?
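The ambiguity in the last note can be reproduced by hand. The following is a minimal Python sketch, not part of the grammar or of trparse: it enumerates every derivation of the typeName rule as written above over a list of identifiers. The function names package_name and type_name are made up for illustration, tokens are simplified to plain strings, and any restrictions on typeIdentifier are ignored.

```python
from typing import List

def package_name(ids: List[str]) -> List[str]:
    """All trees for  packageName : Identifier ('.' packageName)?  covering ids exactly."""
    if not ids:
        return []
    if len(ids) == 1:
        return [f"(packageName {ids[0]})"]
    # Right-recursive alternative: first identifier, then a nested packageName.
    return [f"(packageName {ids[0]} . {rest})" for rest in package_name(ids[1:])]

def type_name(ids: List[str]) -> List[str]:
    """All trees for  typeName : packageName ('.' typeIdentifier)?  covering ids exactly."""
    # Choice 1: the optional suffix is absent, so packageName takes every identifier.
    trees = [f"(typeName {p})" for p in package_name(ids)]
    # Choice 2: the last identifier is the typeIdentifier (simplified: any identifier).
    if len(ids) >= 2:
        trees += [
            f"(typeName {p} . (typeIdentifier {ids[-1]}))"
            for p in package_name(ids[:-1])
        ]
    return trees

trees = type_name(["System", "out"])
for t in trees:
    print(t)
# (typeName (packageName System . (packageName out)))
# (typeName (packageName System) . (typeIdentifier out))
```

The two printed shapes correspond to the two typeName subtrees that differ between the ambiguous parses reported by trparse.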

We don't have a symbol table for the grammar, so the parser cannot distinguish packages from types. The simplest fix is to get rid of packageName and typeName and define a single rule, dotIdChain: identifier ('.' identifier)*;.
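As a rough illustration of why the flat rule leaves nothing to choose between, here is a small Python sketch of a recognizer for identifier ('.' identifier)*. It is an assumption-laden toy, not generated parser code: the function name dot_id_chain is hypothetical, tokens are plain strings, and Python's str.isidentifier() stands in for the grammar's identifier rule.

```python
from typing import List, Optional

def dot_id_chain(tokens: List[str]) -> Optional[str]:
    """Match  identifier ('.' identifier)*  over all tokens; return a flat tree or None."""
    if not tokens or not tokens[0].isidentifier():
        return None
    parts = [tokens[0]]
    i = 1
    # The Kleene star becomes a plain loop: no recursion and no alternative to pick.
    while i < len(tokens):
        if tokens[i] != "." or i + 1 >= len(tokens) or not tokens[i + 1].isidentifier():
            return None
        parts += [".", tokens[i + 1]]
        i += 2
    return "(dotIdChain " + " ".join(parts) + ")"

print(dot_id_chain(["System", ".", "out"]))
# (dotIdChain System . out)
```

Each token sequence yields at most one tree, so deciding whether a given identifier names a package or a type is left to a later semantic pass.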

KvanTTT added the java label Sep 14, 2024