Rule syntax grammar

Irina Dragoste edited this page Oct 20, 2022 · 6 revisions

Introduction

This document defines a textual rule syntax, which is one of the input formats Rulewerk accepts for representing data sources, rules, and facts.

Preliminaries from Logic

Term, Ground Term, (existential and universal) Variable, Atom, Ground Atom, Fact, and Rule are defined as usual. We avoid the word Literal to prevent confusion with RDFLiterals. For standard definitions and semantics, we refer the reader to the books Foundations of Databases and Logic for Computer Science.

Preliminaries from the Semantic Web

We use the IRIREF (currently without UCHAR characters), PrefixedName, and PNAME_NS definitions from RDF 1.1 Turtle. These correspond to IRI_REF, PrefixedName, and PNAME_NS, respectively, from the SPARQL Query Language for RDF. PrefixedNames are resolved following the Internationalized Resource Identifiers (IRIs) protocol.

An IRI is an IRIREF or a PrefixedName.

IRI                    ::= IRIREF | PrefixedName

RLS program

A valid RLS program consists of at most one BASE declaration at the beginning of the file, followed by zero or more PREFIX declarations, zero or more SOURCE declarations, and zero or more STATEMENT declarations, in that order.

PROGRAM                ::= BASE? PREFIX* SOURCE* STATEMENT*
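The ordering constraint of PROGRAM can be sketched in a few lines of Python. This is only an illustration, not part of Rulewerk: the helper `check_declaration_order` and the sample declarations are hypothetical, and the check looks only at the leading keyword of each declaration rather than parsing it.

```python
# Hypothetical helper: checks only the ordering rule
# PROGRAM ::= BASE? PREFIX* SOURCE* STATEMENT*, not the full syntax.
RANKS = {"@base": 0, "@prefix": 1, "@source": 2}
STATEMENT_RANK = 3  # anything without one of the keywords above is a STATEMENT


def check_declaration_order(declarations):
    """Return True if the declarations appear in the order required by PROGRAM."""
    previous_rank = 0
    base_count = 0
    for text in declarations:
        words = text.split()
        if not words:
            continue  # ignore blank entries
        rank = RANKS.get(words[0], STATEMENT_RANK)
        if rank == 0:
            base_count += 1
            if base_count > 1:
                return False  # at most one BASE declaration
        if rank < previous_rank:
            return False  # e.g. a PREFIX after a SOURCE or a STATEMENT
        previous_rank = rank
    return True


program = [
    '@base <http://example.org/> .',
    '@prefix ex: <http://example.org/ns#> .',
    '@source parent[2]: load-csv("parent.csv") .',
    'ex:ancestor(?X, ?Y) :- parent(?X, ?Y) .',
]
print(check_declaration_order(program))
```

A real parser would reject a misplaced declaration with a ParserError; the sketch simply returns False.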

Base Declaration

A BASE declaration may occur at most once, at the beginning of the document, using the keyword @base. Note the dot at the end of the declaration. More than one BASE declaration results in a ParserError.

BASE                   ::= '@base' IRIREF '.'

Prefix Declarations

PREFIX declarations may occur after the BASE declaration and before any SOURCE and STATEMENT declarations. They use the keyword @prefix. Declaring the same PNAME_NS more than once results in a ParserError.

PREFIX                 ::= '@prefix' PNAME_NS IRIREF '.'
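As an illustration of how prefix declarations are used, the following Python sketch collects declarations of the form `@prefix ex: <http://example.org/> .` into a table and expands a PrefixedName by concatenating the namespace IRI with the local part, as in Turtle. The names `collect_prefixes` and `expand` are hypothetical, the prefix label is simplified to ASCII letters and digits, and a `ValueError` stands in for the ParserError raised on a repeated prefix.

```python
import re

# Simplified shape of a PREFIX declaration; the prefix label may be empty.
PREFIX_RE = re.compile(r"@prefix\s+([A-Za-z][A-Za-z0-9]*)?:\s*<([^<>\s]*)>\s*\.")


def collect_prefixes(declarations):
    """Build a {label: namespace IRI} table, rejecting repeated labels."""
    prefixes = {}
    for text in declarations:
        match = PREFIX_RE.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"not a prefix declaration: {text!r}")
        label = match.group(1) or ""
        if label in prefixes:
            # Stand-in for the ParserError described above.
            raise ValueError(f"prefix {label!r} declared more than once")
        prefixes[label] = match.group(2)
    return prefixes


def expand(prefixed_name, prefixes):
    """Expand a PrefixedName such as ex:knows into a full IRI string."""
    label, local = prefixed_name.split(":", 1)
    return prefixes[label] + local


table = collect_prefixes([
    '@prefix ex: <http://example.org/> .',
    '@prefix : <http://default.example/> .',
])
print(expand("ex:knows", table))
```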

Source declarations

Source declarations provide a simple, compact way to declare the different data sources used in Rulewerk.

SOURCE                       ::= '@source' PREDICATENAME '[' INTEGER ']' ':' DATASOURCE '.'
INTEGER                      ::= [1-9][0-9]*
DATASOURCE                   ::= CSVFILEDATASOURCE | RDFFILEDATASOURCE | SPARQLQUERYRESULTDATASOURCE
CSVFILEDATASOURCE            ::= 'load-csv' '(' STRING ')'
RDFFILEDATASOURCE            ::= 'load-rdf' '(' STRING ')'
SPARQLQUERYRESULTDATASOURCE  ::= 'sparql' '(' IRI ',' STRING ',' STRING ')'

Note that STRING refers to the String production rule from Turtle. In CSVFILEDATASOURCE and RDFFILEDATASOURCE declarations it is used as the file name; in SPARQLQUERYRESULTDATASOURCE declarations the two STRINGs are used as a list of variables and a path query, respectively.
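The shape of a source declaration can be illustrated with a small Python sketch. It is deliberately simplified and not Rulewerk's parser: the function `parse_source` is hypothetical, only PREDNAME-style predicate names are accepted (not IRIs), the arguments are returned as raw text without parsing the STRINGs, and the bracketed INTEGER is read here as the arity of the predicate.

```python
import re

# Simplified pattern for SOURCE; assumes a PREDNAME-style predicate name.
SOURCE_RE = re.compile(
    r"@source\s+(?P<predicate>[a-zA-Z][a-zA-Z0-9]*)"
    r"\[(?P<arity>[1-9][0-9]*)\]\s*:\s*"
    r"(?P<kind>load-csv|load-rdf|sparql)\((?P<args>.*)\)\s*\."
)


def parse_source(text):
    """Split a source declaration into predicate, arity, kind, and raw arguments."""
    match = SOURCE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a source declaration: {text!r}")
    return {
        "predicate": match["predicate"],
        "arity": int(match["arity"]),
        "kind": match["kind"],
        "args": match["args"].strip(),  # left unparsed in this sketch
    }


print(parse_source('@source triple[3]: load-rdf("data.nt") .'))
```

Because INTEGER is defined as `[1-9][0-9]*`, a declaration such as `@source p[0]: ...` does not match.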

Statement declaration

A STATEMENT declaration can be a FACT or a RULE.

STATEMENT              ::= FACT | RULE

FACT

A FACT consists of a PREDICATENAME, a list of one or more GROUNDTERMs, and a final dot. A PREDICATENAME can be an IRI or a PREDNAME. A PREDNAME is any alphanumeric string that starts with a letter. A GROUNDTERM is an IRI, a NumericLiteral, or an RDFLiteral.

FACT                   ::= PREDICATENAME '(' GROUNDTERMS ')' '.'
PREDICATENAME          ::= IRI | PREDNAME
PREDNAME               ::= [a-zA-Z][a-zA-Z0-9]*
GROUNDTERMS            ::= GROUNDTERM (',' GROUNDTERM)*
GROUNDTERM             ::= IRI | NumericLiteral | RDFLiteral
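To make the PREDNAME rule concrete, here is a minimal Python sketch that tests candidate names against the pattern `[a-zA-Z][a-zA-Z0-9]*`. The helper `is_predname` and the candidate names are hypothetical and serve only as an illustration.

```python
import re

# Direct transcription of PREDNAME ::= [a-zA-Z][a-zA-Z0-9]*
PREDNAME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")


def is_predname(name):
    """Return True if name is a letter followed by letters and digits only."""
    return PREDNAME_RE.fullmatch(name) is not None


for candidate in ["hasPart", "p1", "1p", "has_part", ""]:
    print(repr(candidate), is_predname(candidate))
```

A name that fails this check, such as one containing an underscore, may still be usable as a PREDICATENAME if it is written as an IRI.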

(Existential) Rule

A RULE consists of a non-empty, comma-separated list of POSITIVEATOMs, called the HEAD; an arrow, written :- ; a non-empty, comma-separated list of ATOMs, called the BODY; and a dot.

An ATOM is a POSITIVEATOM or a NEGATIVEATOM. A NEGATIVEATOM is a POSITIVEATOM preceded by a ~.

A POSITIVEATOM consists of a PREDICATENAME and a non-empty, comma-separated list of TERMs. TERMs can be GROUNDTERMs or VARIABLEs. A VARIABLE can be (1) an EXISTENTIALVARIABLE, i.e. a VARNAME preceded by !, or (2) a UNIVERSALVARIABLE, i.e. a VARNAME preceded by ?. A VARNAME is any alphanumeric string that starts with a letter.

RULE                   ::= HEAD ':-' BODY '.'
HEAD                   ::= POSITIVEATOM (',' POSITIVEATOM)*
BODY                   ::= ATOM (',' ATOM)*
ATOM                   ::= POSITIVEATOM | NEGATIVEATOM
NEGATIVEATOM           ::= '~' POSITIVEATOM
POSITIVEATOM           ::= PREDICATENAME '(' TERMS ')'
TERMS                  ::= TERM (',' TERM)*
TERM                   ::= VARIABLE | GROUNDTERM
VARIABLE               ::= EXISTENTIALVARIABLE | UNIVERSALVARIABLE
EXISTENTIALVARIABLE    ::= '!' VARNAME
UNIVERSALVARIABLE      ::= '?' VARNAME
VARNAME                ::= [a-zA-Z][a-zA-Z0-9]*

The scope of a VARNAME is the rule in which it appears. Within a rule, a VARNAME can be used either as an EXISTENTIALVARIABLE or as a UNIVERSALVARIABLE, but not both. An EXISTENTIALVARIABLE can occur only in the HEAD. Multiple occurrences of the same VARNAME are allowed only if they satisfy these criteria. All other cases result in a ParserError.
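The variable restrictions above can be sketched as a simple check in Python. This is an illustration under stated assumptions, not Rulewerk's implementation: the function `check_rule_variables` is hypothetical, it finds variables with a regular expression instead of parsing the rule, and it assumes the rule contains no string literals or IRIs with `?` or `!` in them.

```python
import re

# A variable is '!' or '?' followed by a VARNAME ([a-zA-Z][a-zA-Z0-9]*).
VARIABLE_RE = re.compile(r"([!?])([a-zA-Z][a-zA-Z0-9]*)")


def check_rule_variables(rule):
    """Return a list of violations of the variable restrictions in one rule."""
    head, body = rule.split(":-", 1)
    head_vars = VARIABLE_RE.findall(head)
    body_vars = VARIABLE_RE.findall(body)
    all_vars = head_vars + body_vars

    existential = {name for mark, name in all_vars if mark == "!"}
    universal = {name for mark, name in all_vars if mark == "?"}

    errors = []
    # A VARNAME must not be both existential and universal in the same rule.
    for name in sorted(existential & universal):
        errors.append(f"{name} is used both as !{name} and ?{name}")
    # Existential variables may occur only in the HEAD.
    for name in sorted({name for mark, name in body_vars if mark == "!"}):
        errors.append(f"!{name} occurs in the body")
    return errors


print(check_rule_variables("hasParent(?X, !Y) :- person(?X) ."))
print(check_rule_variables("p(?X, !X) :- q(?X) ."))
print(check_rule_variables("p(?X) :- q(?X, !Y) ."))
```

The first rule satisfies both restrictions and yields an empty list; the other two each yield one violation, which a parser would report as a ParserError.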