forked from jeffreykegler/Marpa-web-site
-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.html
235 lines (235 loc) · 10.4 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
<html>
<head>
<link rel="alternate" title="Ocean of Awareness RSS" type="application/rss+xml" title="RSS" href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog/index.rss" />
<title>
The Marpa parser
</title>
<style type="text/css">
strong {font-weight: 700;}
</style>
</head>
<body>
<div
style="color:white;background-color:#38B0C0;padding:1em;clear:left;text-align:center;">
<h1>The Marpa parser</h1>
<!--
marpa_r2_html_fmt --no-added-tag-comment --no-ws-ok-after-start-tag
-->
</div>
<div style="margin:0;padding:10px 30px 10px 10px;width:150px;float:left;border-right:2px solid #38B0C0">
<h2>This is Jeffrey Kegler's website for Marpa, a parsing algorithm</h2>
<h3>Resources</h3>
<p>
<a href="http://savage.net.au/Marpa.html">The official Marpa
starting page</a>.
</p>
<p>
<a href="https://twitter.com/jeffreykegler" class="twitter-follow-button" data-show-count="false">
Follow @jeffreykegler</a>
</p>
<p style="text-align:center">
<!-- Place this code where you want the badge to render. -->
<a href="//plus.google.com/101567692867247957860?prsrc=3" rel="publisher" style="text-decoration:none;">
<img src="//ssl.gstatic.com/images/icons/gplus-32.png" alt="Google+" style="border:0;width:32px;height:32px;"/></a>
</p>
<p>
The Ocean of Awareness blog:
<a href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog">
home page</a>,
<a href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog/metapages/chronological.html">
chronological index</a>,
and
<a href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog/metapages/annotated.html">
annotated index</a>.
</p>
<p>
Marpa::R2 (active distribution):
<a href="http://search.cpan.org/dist/Marpa-R2/">
CPAN</a>
|
<a href="https://metacpan.org/module/Marpa::R2">
MetaCPAN</a></p>
<p>
<a href="https://groups.google.com/forum/?hl=en&fromgroups#%21forum/marpa-parser">
Mailing List</a>
</p>
<p>
<a href="https://github.com/jeffreykegler/Marpa--R2">
Github repository</a></p>
<p>
<a href="http://www.jeffreykegler.com/">
Jeffrey Kegler's personal website
</a>
</p>
</div>
<div style="margin-left:190px;border-left:2px solid #38B0C0;padding:25px;">
<p>Marpa
is a parsing algorithm.
It is new, but very much based
on earlier work by Jay Earley, Joop Leo, John Aycock and R. Nigel Horspool.
Marpa is intended to replace, and to go well beyond,
recursive descent and the yacc family of parsers.
</p><ul>
<li>
Marpa is fast. It parses in linear time:
<ul>
<li>all the grammar classes that recursive descent parses;</li>
<li>the grammar class that the yacc family parses;</li>
<li>in fact, any unambiguous grammars,
with a couple of exceptions that are not likely to
be an issue in practice (see <a href="#quibbles">quibbles</a>); and
</li>
<li>all
ambiguous grammars that are unions of a finite set of any of the above grammars.</li>
</ul>
</li>
<li>
Marpa is powerful. Marpa will parse anything that can be
written in BNF.
This includes any mixture of left, right and middle recursions.
</li>
<li>Marpa is convenient.
Unlike recursive descent, you do not have to write a parser --
Marpa generates one from BNF.
Unlike PEG or yacc, parser generation is unrestricted and exact.
Marpa converts any grammar which can be written as BNF
into a parser which recognizes everything
in the language described by that BNF, and which rejects everything that is
not in that language.
The programmer is not forced to make arbitrary choices while parsing.
If a rule has several alternatives,
all of the alternatives are considered for as long as they might yield a valid parse.
</li>
<li>
Marpa is flexible. Like recursive descent, Marpa allows you to stop and
do your own custom processing. Unlike recursive descent, Marpa makes available
to you detailed information about the parse so far --
which rules and symbols have been recognized, with their locations,
and which rules and symbols are expected next.
</li>
</ul>
<h2>Learning about Marpa</h2>
<p>
What you are looking at is the web site maintained by the author of Marpa
(Jeffrey Kegler).
It is
<b>NOT</b>
the best page for starting to learn about Marpa.
Good places to do that are:
</p><ul>
<li><a href="http://savage.net.au/Marpa.html">Marpa's official starting page</a>,
which is maintained by Ron Savage.
</li><li>The documentation of
<a href="https://metacpan.org/module/Marpa::R2">
Marpa::R2</a>,
Marpa's current stable release.
</li></ul>
<h2>Other Marpa resources</h2>
<p>
Discussion of Marpa currently centers around
<a href="http://groups.google.com/group/marpa-parser">
the "marpa parser" Google Group</a>
and the IRC channel:
<tt>#marpa</tt>
on
<tt>irc.freenode.net</tt>.
</p><p>
Most of the posts on
<a href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog">
Ocean of Awareness</a>,
my blog,
are about Marpa.
To get oriented in my blog,
start at its
<a href="http://jeffreykegler.github.com/Ocean-of-Awareness-blog/metapages/annotated.html">
annotated list of the most interesting Marpa posts</a>.
</p><p>If you are interested in tutorials,
</p><ul>
<li>My blog contains
<a href="http://jeffreykegler.github.io/Ocean-of-Awareness-blog/metapages/annotated.html#TUTORIAL">
several tutorials</a>.</li>
<li>Peter Stuifzand has written another as part of
<a href="http://marpa-guide.github.io/index.html">
the Marpa Guide</a>.</li>
<li>And amon has written this
<a href="http://longanswers.blogspot.de/2013/06/transforming-syntax.html">
one for Stackoverflow</a>.</li>
</ul>
<h2>Theory</h2>
<p>
For those interested in the mathematics behind Marpa, there's
<a href="https://docs.google.com/file/d/0B9_mR_M2zOc4Ni1zSW5IYzk3TGc/edit?usp=sharing" target="_blank">
a paper with pseudocode, and proofs of correctness and of my complexity claims</a>.
</p>
<h2>Marpa internals</h2>
<p>
<a href="http://jeffreykegler.github.com/Marpa-web-site/libmarpa.html">
Libmarpa</a>
is a C library, and is the core of Marpa.
</p>
<p>
<a href="http://jeffreykegler.github.com/Marpa-web-site/developer.html">
Marpa internals</a>:
These are resources of interest only
to those working on the internals of Marpa itself --
"bleeding edge" documentation, etc.
</p>
<h2>Quibbles</h2><a name="quibbles">
<p>I mentioned above that Marpa parses unambiguous grammars in linear time,
with a couple of exceptions,
and claimed that those were unlikely to be bothersome in practice.
Here are the details.
<p>
For an unambiguous grammar to be parsed in linear time,
it must
<ul>
<li>be free of unmarked middle recursions; and
<li>be free of ambiguous right recursions.
</ul>
<h3>Unmarked middle recursions?</h3>
Unmarked middle recursions are what they sound like:
recursions that are not left and right, but in the middle of
a rule, and for which there is no "marker".
What's a marker?
That gets tricky.
<p>The marker of a middle recursion is anything that allows the parser to find the middle.
Since you can encode a Turing machine into the marker
<span style="color:red">No,no,no, I can't encode machine into marker. Because I don't understood for now what is "marker" and how i can encode machine into unknown. And there is no link.</span>
, that means the exact question
of what a marker is, is undecidable.
But, for practical purposes, if you can spot the middle by eyeball, the middle
recursion is "marked".
If you can't, the middle recursion might be unmarked.
<h3>Ambiguous right recursions</h3>
How does an unambiguous grammar manage to include an ambiguous right recursion?
The answer is not very easily, but you can sneak an ambiguous right recursion into
an unambigious grammar,
by having two different right recursive rules,
both of which recurse on the same symbol.
I call these ambiguities right recursive symches -- "symches"
because they are ambiguous due to a choice between symbols.
A right recursion can also be ambiguous because it is a "factoring" --
that is, it divides the input up differently
among its symbols.
But a factoring will make the whole grammar ambiguous, while a symch does not necessarily do so.
<p>Right recursive symches are very easy to avoid.
You just rewrite the rules so that they recurse on different symbols.
Preserving the semantics is no problem in this case --
you simply make sure both of the new symbols
have the same semantics as the original one.
</div>
<div id="footer" style="border-top:thick solid #38B0C0;clear:left;padding:1em;">
</div>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");</script>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-33430331-1");
pageTracker._trackPageview();
} catch(err) {}
</script>
</body>
</html>