Grammar railroad diagram #1096

Open
mingodad opened this issue Jun 6, 2021 · 3 comments

@mingodad

mingodad commented Jun 6, 2021

While looking for projects that use Coco/R I found this one. I have written an experimental tool that converts Coco/R grammars to a kind of EBNF understood by https://www.bottlecaps.de/rr/ui, which generates railroad diagrams. Below is the converted OST.atg, with some manual changes so that it can be viewed at https://www.bottlecaps.de/rr/ui. The order of the rules could be changed to give a better view of the railroad diagrams. Copy and paste the EBNF below into the Edit Grammar tab at https://www.bottlecaps.de/rr/ui, then switch to the View Diagram tab.

Cheers !

//"/*" "*/" "//" eol '\t'  '+' '\r'  '+' '\n'
OST ::= SYNC "OST"  ( MAXSPEEDS  )?  ( GRADES  )?  ( FEATURES  )?  ( TYPES  )? "END"
MAXSPEEDS ::= SYNC "MAX" "SPEEDS" MAXSPEED  ( MAXSPEED  )*
MAXSPEED ::= SYNC "SPEED" STRING "=" UINT  ( "km/h"  )?
GRADES ::= SYNC "GRADES" GRADE  ( GRADE  )*
GRADE ::= SYNC "SURFACE" "GRADE" UINT "{"  ( STRING  )* "}"
FEATURES ::= SYNC "FEATURES" FEATURE  ( FEATURE  )*
FEATURE ::= SYNC "FEATURE" IDENT  ( FEATUREDESCS  )?
FEATUREDESCS ::= "DESC" IDENT ":" STRING  ( IDENT ":" STRING  )*
TYPES ::= SYNC "TYPES" TYPE  ( TYPE  )*
TYPE ::= SYNC "TYPE" IDENT  ( "IGNORE"  )? "=" TYPEKINDS "(" TAGCONDITION ")"  ( "OR" TYPEKINDS "(" TAGCONDITION ")"  )*  ( "{"  ( TYPEFEATURE  ( "," TYPEFEATURE  )*  )? "}"  )?  ( SPECIALTYPE  )?  ( TYPEOPTIONS  )?  ( GROUPS  )?  ( TYPEDESCS  )?
TAGCONDITION ::= TAGANDCOND  ( "OR" TAGANDCOND  )*
TAGANDCOND ::= TAGBOOLCOND  ( "AND" TAGBOOLCOND  )*
TAGBOOLCOND ::= TAGBINCOND  | TAGEXISTSCOND  | "(" TAGCONDITION ")"  | "!" TAGBOOLCOND
TAGBINCOND ::= string  ( TAGLESSCOND  | TAGLESSEQUALCOND  | TAGEQUALSCOND  | TAGNOTEQUALSCOND  | TAGGREATERCOND  | TAGGREATEREQUALCOND  | TAGISINCOND  )
TAGLESSCOND ::= "<"  ( STRING  | UINT  )
TAGLESSEQUALCOND ::= "<="  ( STRING  | UINT  )
TAGEQUALSCOND ::= "=="  ( STRING  | UINT  )
TAGNOTEQUALSCOND ::= "!="  ( STRING  | UINT  )
TAGGREATEREQUALCOND ::= ">="  ( STRING  | UINT  )
TAGGREATERCOND ::= ">"  ( STRING  | UINT  )
TAGISINCOND ::= "IN" "[" string  ( "," string  )* "]"
TAGEXISTSCOND ::= "EXISTS" string
TYPEKINDS ::= TYPEKIND  (  ( ","  )? TYPEKIND  )*
TYPEKIND ::= "NODE"  | "WAY"  | "AREA"  | "RELATION"
TYPEFEATURE ::= IDENT
SPECIALTYPE ::= "MULTIPOLYGON"  | "ROUTE_MASTER"  | "ROUTE"
TYPEOPTIONS ::= TYPEOPTION  ( TYPEOPTION  )*
TYPEOPTION ::= PATH  | "LOCATION"  | "ADMIN_REGION"  | "ADDRESS"  | "POI"  | "OPTIMIZE_LOW_ZOOM"  | "PIN_WAY"  | "MERGE_AREAS"  | "IGNORESEALAND"  | LANES
PATH ::= "PATH"  ( "["  ( "FOOT"  )?  ( "BICYCLE"  )?  ( "CAR"  )? "]"  )?
LANES ::= "LANES" "[" UINT8 UINT8 "]"
GROUPS ::= "GROUP" IDENT  ( "," IDENT  )*
TYPEDESCS ::= "DESC" IDENT ":" STRING  ( IDENT ":" STRING  )*
IDENT ::= ident
STRING ::= string
UINT ::= number
UINT8 ::= number

letter ::= 'a'  .. 'z'  '+' 'A'  .. 'Z'
digit ::= '0'  .. '9'
eol ::= '\n'
stringchar ::= ANY  '-' '"'
quotchar ::= ANY
ident  ::= letter  ( letter  | digit  | '_'  )*
number  ::= digit  ( digit  )*
string  ::= '"'  ( stringchar  | '\' quotchar  )* '"'

@Framstag
Owner

Framstag commented Jun 6, 2021

This is interesting. SYNC is a Coco/R keyword for error recovery/handling (IMHO there should also be "WEAK" or similar). For your purpose you should filter it out.
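
Filtering these keywords out of the converted EBNF could be done as a small post-processing step. The following is only a sketch in Python, not part of the converter posted in this thread; the function name `strip_recovery_keywords` is made up for illustration. It assumes one rule per line and drops bare SYNC and WEAK symbols while leaving quoted literals such as "SYNC" untouched.

```python
import re

# Coco/R error-recovery keywords; they have no meaning in a railroad diagram.
RECOVERY_KEYWORDS = {"SYNC", "WEAK"}

# A token is a double-quoted literal, a single-quoted literal,
# or any other run of non-whitespace characters.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|\S+')


def strip_recovery_keywords(ebnf: str) -> str:
    """Remove bare SYNC/WEAK symbols from every line of an EBNF text.

    Quoted literals are matched as single tokens, so a terminal spelled
    "SYNC" (with quotes) is kept. Runs of spaces collapse to one space.
    """
    cleaned = []
    for line in ebnf.splitlines():
        tokens = TOKEN.findall(line)
        kept = [tok for tok in tokens if tok not in RECOVERY_KEYWORDS]
        cleaned.append(" ".join(kept))
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = 'MAXSPEED ::= SYNC "SPEED" STRING "=" UINT  ( "km/h"  )?'
    print(strip_recovery_keywords(sample))
```

Working on tokens rather than a plain text replace avoids touching rule names or literals that merely contain the letters SYNC.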

I'm a fan of Coco and have used it in various versions over the years. I like it very much because it makes it rather simple to create simple grammars and robust lexers and parsers. It is sad that it is not as well known as lex, bison, ANTLR or similar tools. From time to time I thought about switching to a better-known parser generator, but there was just no real benefit.

I would like to make use of your work by adding such a processing step and image to the documentation. See https://github.com/Framstag/libosmscout/blob/master/.github/workflows/webpage.yml for the current documentation build.

Is there any way to automate things with a small effort?

Is your interest in libosmscout or Coco/R?

@mingodad
Author

mingodad commented Jun 6, 2021

Here is the parser I use for the conversion. It requires a bit of manual work afterwards, but with a little more effort it could probably do everything on its own. I run it with a scripting language whose syntax is very close to C/C++/Java/C# (https://github.com/mingodad/squilu). It is basically the original Coco.atg with very few semantic actions.

/*-------------------------------------------------------------------------
Coco.ATG -- Attributed Grammar
Compiler Generator Coco/R,
Copyright (c) 1990, 2004 Hanspeter Moessenboeck, University of Linz
extended by M. Loeberbauer & A. Woess, Univ. of Linz
with improvements by Pat Terry, Rhodes University

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

As an exception, it is allowed to write an extension of Coco/R that is
used as a plugin in non-free software.

If not otherwise stated, any source code generated by Coco/R (other than
Coco/R itself) does not fall under the GNU General Public License.
-------------------------------------------------------------------------*/
/*-------------------------------------------------------------------------
 compile with:
   Coco Coco.ATG -namespace at.jku.ssw.Coco
-------------------------------------------------------------------------*/
#include "Scanner.nut"
#include "DFA.nut"

COMPILER Coco

	string checkEscaped(string s) {
		if( s == "'\\\\'") return "'\\'";
		else if( s == "'\\''") return "\"'\"";
		return s;
	}

CHARACTERS
	letter    = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_".
	digit     = "0123456789".
	cr        = '\r'.
	lf        = '\n'.
	tab       = '\t'.
	stringCh  = ANY - '"' - '\\' - cr - lf.
	charCh    = ANY - '\'' - '\\' - cr - lf.
	printable = '\u0020' .. '\u007e'.
	hex       = "0123456789abcdef".

TOKENS
	ident     = letter { letter | digit }.
	number    = digit { digit }.
	string    = '"' { stringCh | '\\' printable } '"'.
	badString = '"' { stringCh | '\\' printable } (cr | lf).
	char      = '\'' ( charCh | '\\' printable { hex } ) '\''.

PRAGMAS
	ddtSym    = '$' { digit | letter }.

	optionSym = '$' letter { letter } '='
	            { digit | letter
	            | '-' | '.' | ':'
	            }.


COMMENTS FROM "/*" TO "*/" NESTED
COMMENTS FROM "//" TO lf

IGNORE cr + lf + tab

/*-------------------------------------------------------------------------*/

PRODUCTIONS

Coco
=
  [ // using statements
    ANY
    { ANY }
  ]
  "COMPILER"
  ident
  { ANY }
  [ "IGNORECASE"                 ]   /* pdt */
  [ "CHARACTERS" { SetDecl }]
  [ "TOKENS"  { TokenDecl }]
  [ "PRAGMAS" { TokenDecl }]
  { "COMMENTS"
    "FROM" TokenExpr
    "TO" TokenExpr
    [ "NESTED"
    ]
  }
  { "IGNORE" Set
  }

  SYNC
  "PRODUCTIONS"
  { ident (. printf("%s ::= ", t.val); .)
    [ AttrDecl ]
    [ SemText ] WEAK
    '='
    Expression
                                WEAK
    '.' (. print(""); .)
  }
  "END" ident
  '.'  (. print(""); .)
.

/*------------------------------------------------------------------------------------*/

SetDecl
=
  ident (. printf("%s ::= ", t.val); .)
  '='  Set
  '.' (. print(""); .)
.

/*------------------------------------------------------------------------------------*/

Set
=
  SimSet (. printf("%s ", checkEscaped(t.val)); .)
  { '+' (. printf(" '+' "); .) SimSet (. printf("%s ", checkEscaped(t.val)); .)
  | '-'  (. printf(" '-' "); .) SimSet (. printf("%s ", checkEscaped(t.val)); .)
  }
.

/*------------------------------------------------------------------------------------*/

SimSet
=
( ident
| string
| Char  	(. if( la.val == ".." ) printf("%s ", t.val); .)
  [ ".." (. printf(" .. "); .) Char
  ]
| "ANY"
)
.

/*--------------------------------------------------------------------------------------*/

Char
=
  char
.

/*------------------------------------------------------------------------------------*/

TokenDecl
=
  Sym
  SYNC
  ( '='  (. printf(" ::= "); .) TokenExpr '.'   (. print(""); .)
  |
  )
  [ SemText
  ]
.

/*------------------------------------------------------------------------------------*/

AttrDecl
=
  '<'
  { ANY
  | badString
  }
  '>'
| "<."
  { ANY
  | badString
  }
  ".>"
.

/*------------------------------------------------------------------------------------*/

Expression
=
  Term
  {                             WEAK
    '|' (. printf(" | "); .)
    Term
  }
.

/*------------------------------------------------------------------------------------*/

Term
=
( [
    Resolver
  ]
  Factor
  { Factor
  }
|
)
.

/*------------------------------------------------------------------------------------*/

Factor
=
( [ "WEAK" (. printf("WEAK "); .)
  ]
  Sym
  [ Attribs
  ]
| '(' (. printf(" ( "); .) Expression ')' (. printf(" ) "); .)
| '['  (. printf(" ( "); .) Expression ']' (. printf(" )? "); .)
| '{' (. printf(" ( "); .) Expression '}' (. printf(" )* "); .)
| SemText
| "ANY" (. printf("ANY "); .)
| "SYNC" (. printf("SYNC "); .)
)
.

/*------------------------------------------------------------------------------------*/

Resolver
=
  "IF" "("
  Condition
.

/*------------------------------------------------------------------------------------*/

Condition = { "(" Condition | ANY } ")" .

/*------------------------------------------------------------------------------------*/

TokenExpr
=
  TokenTerm
  {                             WEAK
    '|' (. printf(" | "); .)
    TokenTerm
  }
.

/*------------------------------------------------------------------------------------*/

TokenTerm
=
  TokenFactor
  { TokenFactor
  }
  [ "CONTEXT" (. printf("CONTEXT "); .)
    '(' (. printf(" ( "); .) TokenExpr
    ')' (. printf(" ) "); .)
  ]
.

/*------------------------------------------------------------------------------------*/

TokenFactor
=

( Sym
| '(' (. printf(" ( "); .) TokenExpr ')' (. printf(" ) "); .)
| '[' (. printf(" ( "); .) TokenExpr ']' (. printf(" )? "); .)
| '{' (. printf(" ( "); .) TokenExpr '}' (. printf(" )* "); .)
)
.

/*------------------------------------------------------------------------------------*/

Sym
=
( ident (. printf("%s ", t.val); .)
| (string 	(. printf("%s ",  t.val); .)
  | char (. printf("%s ", checkEscaped(t.val)); .)
  )
)
.

/*------------------------------------------------------------------------------------*/

Attribs
=
  '<'
  { ANY
  | badString
  }
  '>'
| "<."
  { ANY
  | badString
  }
  ".>"
.

/*------------------------------------------------------------------------------------*/

SemText
=
  "(."
  { ANY
  | badString
  | "(."
  }
  ".)"
.

END Coco.
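
The core of the conversion is the bracket mapping done by the semantic actions in Factor and TokenFactor: `[ x ]` becomes `( x )?`, `{ x }` becomes `( x )*`, and `( x )` stays a group. As a rough illustration only, here is the same mapping as a standalone Python sketch. It is not the Squilu tool above: the function name `coco_expr_to_rr` is hypothetical, it works on a single right-hand side without the trailing period, and it does not handle semantic actions `(. .)`, attributes, comments, or the `checkEscaped` rewriting.

```python
import re

# Quoted literals are matched first so that brackets inside "..." or '...'
# are treated as terminals and never rewritten.
TOKEN = re.compile(r"""
    "(?:[^"\\]|\\.)*"      # double-quoted string literal
  | '(?:[^'\\]|\\.)*'      # single-quoted character literal
  | [\[\]{}()|]            # grouping brackets and the alternation bar
  | [^\s\[\]{}()|"']+      # identifiers and anything else
""", re.VERBOSE)

# Coco/R option and iteration brackets map to grouped EBNF with a suffix.
OPEN = {"[": "(", "{": "(", "(": "("}
CLOSE = {"]": ")?", "}": ")*", ")": ")"}


def coco_expr_to_rr(expr: str) -> str:
    """Rewrite the brackets of one Coco/R expression into rr-style EBNF."""
    out = []
    for tok in TOKEN.findall(expr):
        if tok in OPEN:
            out.append(OPEN[tok])
        elif tok in CLOSE:
            out.append(CLOSE[tok])
        else:
            out.append(tok)
    return " ".join(out)


if __name__ == "__main__":
    # Illustrative input reconstructed from the EBNF shown earlier,
    # not copied from OST.atg.
    print(coco_expr_to_rr('"SURFACE" "GRADE" UINT "{" { STRING } "}"'))
```

Note how the quoted "{" and "}" terminals pass through unchanged while the bare braces around STRING turn into a starred group.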

@mingodad
Author

mingodad commented Jun 8, 2021

After thinking a bit about the point quoted below, I implemented the EBNF generation in https://github.com/SSW-CocoR/CocoR-CPP; it is enabled with the command-line option -genRREBNF. See below the new result for OST.atg.

> This is interesting. SYNC is a Coco/R keyword for error recovery/handling (IMHO there should also be "WEAK" or similar). For your purpose you should filter it out.
>
> I'm a fan of Coco and have used it in various versions over the years. I like it very much because it makes it rather simple to create simple grammars and robust lexers and parsers. It is sad that it is not as well known as lex, bison, ANTLR or similar tools. From time to time I thought about switching to a better-known parser generator, but there was just no real benefit.
>
> I would like to make use of your work by adding such a processing step and image to the documentation. See https://github.com/Framstag/libosmscout/blob/master/.github/workflows/webpage.yml for the current documentation build.
>
> Is there any way to automate things with a small effort?

//
// EBNF generated by CocoR parser generator to be viewed with https://www.bottlecaps.de/rr/ui
//

//
// productions
//

OST ::= "OST" ( MAXSPEEDS )? ( GRADES )? ( FEATURES )? ( TYPES )? "END" 
MAXSPEEDS ::= "MAX" "SPEEDS" MAXSPEED ( MAXSPEED )* 
GRADES ::= "GRADES" GRADE ( GRADE )* 
FEATURES ::= "FEATURES" FEATURE ( FEATURE )* 
TYPES ::= "TYPES" TYPE ( TYPE )* 
MAXSPEED ::= "SPEED" STRING "=" UINT ( "km/h" )? 
STRING ::= string 
UINT ::= number 
GRADE ::= "SURFACE" "GRADE" UINT "{" ( STRING )* "}" 
FEATURE ::= "FEATURE" IDENT ( FEATUREDESCS )? 
IDENT ::= ident 
FEATUREDESCS ::= "DESC" IDENT ":" STRING ( IDENT ":" STRING )* 
TYPE ::= "TYPE" IDENT ( "IGNORE" )? "=" TYPEKINDS "(" TAGCONDITION ")" ( "OR" TYPEKINDS "(" TAGCONDITION ")" )* ( "{" ( TYPEFEATURE ( "," TYPEFEATURE )* )? "}" )? ( SPECIALTYPE )? ( TYPEOPTIONS )? ( GROUPS )? ( TYPEDESCS )? 
TYPEKINDS ::= TYPEKIND ( ( "," )? TYPEKIND )* 
TAGCONDITION ::= TAGANDCOND ( "OR" TAGANDCOND )* 
TYPEFEATURE ::= IDENT 
SPECIALTYPE ::= ( "MULTIPOLYGON" | "ROUTE_MASTER" | "ROUTE" ) 
TYPEOPTIONS ::= TYPEOPTION ( TYPEOPTION )* 
GROUPS ::= "GROUP" IDENT ( "," IDENT )* 
TYPEDESCS ::= "DESC" IDENT ":" STRING ( IDENT ":" STRING )* 
TAGANDCOND ::= TAGBOOLCOND ( "AND" TAGBOOLCOND )* 
TAGBOOLCOND ::= ( TAGBINCOND | TAGEXISTSCOND | "(" TAGCONDITION ")" | "!" TAGBOOLCOND ) 
TAGBINCOND ::= string ( TAGLESSCOND | TAGLESSEQUALCOND | TAGEQUALSCOND | TAGNOTEQUALSCOND | TAGGREATERCOND | TAGGREATEREQUALCOND | TAGISINCOND ) 
TAGEXISTSCOND ::= "EXISTS" string 
TAGLESSCOND ::= "<" ( STRING | UINT ) 
TAGLESSEQUALCOND ::= "<=" ( STRING | UINT ) 
TAGEQUALSCOND ::= "==" ( STRING | UINT ) 
TAGNOTEQUALSCOND ::= "!=" ( STRING | UINT ) 
TAGGREATERCOND ::= ">" ( STRING | UINT ) 
TAGGREATEREQUALCOND ::= ">=" ( STRING | UINT ) 
TAGISINCOND ::= "IN" "[" string ( "," string )* "]" 
TYPEKIND ::= ( "NODE" | "WAY" | "AREA" | "RELATION" ) 
TYPEOPTION ::= ( PATH | "LOCATION" | "ADMIN_REGION" | "ADDRESS" | "POI" | "OPTIMIZE_LOW_ZOOM" | "PIN_WAY" | "MERGE_AREAS" | "IGNORESEALAND" | LANES ) 
PATH ::= "PATH" ( "[" ( "FOOT" )? ( "BICYCLE" )? ( "CAR" )? "]" )? 
LANES ::= "LANES" "[" UINT8 UINT8 "]" 
UINT8 ::= number 

//
// tokens
//
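
In the generated output above the tokens section is empty, so symbols such as ident, string and number are referenced by productions but not defined anywhere in the text. A quick way to list such symbols before pasting the grammar into the diagram tool could look like the sketch below. It is Python, unrelated to the -genRREBNF implementation, assumes one rule per line with `::=`, and the function name `undefined_symbols` is made up for illustration.

```python
import re

# Quoted literals are matched first so their contents are ignored;
# everything else of interest is an identifier.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|[A-Za-z_][A-Za-z0-9_]*')


def undefined_symbols(ebnf: str) -> list[str]:
    """Return symbols used on a right-hand side but never defined."""
    defined, used = set(), set()
    for line in ebnf.splitlines():
        line = line.strip()
        # Skip blank lines, // comments and anything that is not a rule.
        if not line or line.startswith("//") or "::=" not in line:
            continue
        name, body = line.split("::=", 1)
        defined.add(name.strip())
        for tok in TOKEN.findall(body):
            if tok[0] not in "\"'":  # ignore quoted terminals
                used.add(tok)
    return sorted(used - defined)


if __name__ == "__main__":
    sample = "\n".join([
        "// productions",
        "STRING ::= string",
        'GRADE ::= "SURFACE" "GRADE" UINT "{" ( STRING )* "}"',
        "UINT ::= number",
    ])
    print(undefined_symbols(sample))
```

For the full listing above this would be expected to report the lexer-level names, which can then be defined by hand as in the first version of the EBNF.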
