Matthew Goode | 28 Oct 22:47 2014

New version of AustenX (1.1)

A new version of AustenX is now available at . AustenX is a Java-based PEG parser generator that supports "advanced" handling of left recursion (how it works I still have not explained), optional/selective memoisation, and other useful extensions to PEGs. It works with a tokenisation step.

New for version 1.1 are extensions that allow for indentation-sensitive parsing (e.g. for languages like Python), as well as some other useful features.
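(A side note for readers new to the idea: indentation-sensitive parsing is usually made PEG-friendly at the tokenisation step, by emitting synthetic INDENT/DEDENT tokens, Python-style. The sketch below illustrates the general trick only; it is not AustenX's actual code.)

```python
# Sketch of the usual INDENT/DEDENT trick (generic, not AustenX's code):
# compare each line's leading-space count against a stack of open indents
# and emit synthetic tokens so the grammar itself can stay context-free.
def tokenize_indents(text):
    tokens, stack = [], [0]
    for line in text.splitlines():
        if not line.strip():              # skip blank lines entirely
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent > stack[-1]:            # deeper: open one block
            stack.append(indent)
            tokens.append("INDENT")
        while indent < stack[-1]:         # shallower: close blocks
            stack.pop()
            tokens.append("DEDENT")
        tokens.append(line.strip())
    while len(stack) > 1:                 # close any still-open blocks at EOF
        stack.pop()
        tokens.append("DEDENT")
    return tokens
```

The parser then only ever sees INDENT/DEDENT terminals, exactly like braces.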

Roman Redz | 27 Oct 17:43 2014

PEG for Java 1.8

Just to let you know that I have posted a preliminary version of PEG for Java 1.8 (Java 8?) at .
Best regards - Roman
PEG mailing list
PEG <at>
Abdelkader Belloundja | 23 Sep 14:10 2014

integer encoding // computational model

Hi everybody,

Maybe some of you, if not all, have heard of things like Church Numerals that use lambda calculus to model numbers and operators, or Peano Numbers that use axioms. This is more or less related to computational models.

I'm interested in finding a way to do the same with PEG / backtracking. However, I don't have any clue where to start.

I guess the grammar should play the role of the axioms, and the input should therefore be the calculus to be computed / verified, but I can't figure out how it would actually work or even look.
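As a concrete (if naive) illustration of what "grammar as axioms, input as calculus" might mean: a PEG can recognize exactly the valid unary-addition equations s^m 0 + s^n 0 = s^(m+n) 0 by pairing each 's' on the left of '=' with one on the right, much like the classic a^n b a^n grammar. This is only a sketch of one possible starting point, not an established encoding:

```python
# Recognizer for the language  s^m 0 + s^n 0 = s^(m+n) 0  (unary addition),
# written as a hand-rolled PEG-style recursive descent.  The grammar:
#   G <- M '0' !.
#   M <- 's' M 's' / '0' '+' N    # pairs left-operand s's with result s's
#   N <- 's' N 's' / '0' '='      # pairs right-operand s's with result s's

def lit(s, i, ch):
    """Match a single literal character; return new position or None."""
    return i + 1 if i < len(s) and s[i] == ch else None

def N(s, i):
    j = lit(s, i, 's')                 # alt 1: 's' N 's'
    if j is not None:
        j2 = N(s, j)
        if j2 is not None:
            j3 = lit(s, j2, 's')
            if j3 is not None:
                return j3
    j = lit(s, i, '0')                 # alt 2: '0' '='
    return lit(s, j, '=') if j is not None else None

def M(s, i):
    j = lit(s, i, 's')                 # alt 1: 's' M 's'
    if j is not None:
        j2 = M(s, j)
        if j2 is not None:
            j3 = lit(s, j2, 's')
            if j3 is not None:
                return j3
    j = lit(s, i, '0')                 # alt 2: '0' '+' N
    if j is not None:
        j = lit(s, j, '+')
        if j is not None:
            return N(s, j)
    return None

def accepts(s):
    j = M(s, 0)
    if j is None:
        return False
    j = lit(s, j, '0')
    return j is not None and j == len(s)   # !. : whole input consumed
```

Here the grammar encodes the Peano axiom s(m) + n = s(m + n) structurally; an input string is "proved" by being accepted.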

Anyway, I would be glad to have any piece of data on the matter. Feel free to send me anything you think may help.

Thanks !

Vitaly S. Lugovskiy | 6 Sep 14:27 2014

MBase 1.0 released under QPL

MBase is a simple metaprogramming framework for the .NET platform, featuring, among other compiler building blocks, an extensible Packrat+Pratt parser frontend with transparent left-recursion support.
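(MBase's own mechanism may well differ; the best-known way to get transparent left recursion in a packrat parser is the "seed growing" of Warth et al. 2008: plant a failure in the memo table for the left-recursive call, then re-evaluate the rule repeatedly, keeping the longest match. A toy sketch of that technique:)

```python
# Toy "seed growing" for direct left recursion (Warth et al. 2008 style);
# shown for the grammar:  expr <- expr '-' num / num
FAIL = None
memo = {}

def num(s, i):
    return (int(s[i]), i + 1) if i < len(s) and s[i].isdigit() else FAIL

def expr(s, i):
    key = ('expr', i)
    if key in memo:
        return memo[key]
    memo[key] = FAIL                 # plant the seed: recursive call fails
    while True:
        res = expr_body(s, i)
        # keep growing only while each pass consumes more input
        if res is FAIL or (memo[key] is not FAIL and res[1] <= memo[key][1]):
            break
        memo[key] = res
    return memo[key]

def expr_body(s, i):
    left = expr(s, i)                # left-recursive reference hits the memo
    if left is not FAIL:
        val, j = left
        if j < len(s) and s[j] == '-':
            right = num(s, j + 1)
            if right is not FAIL:
                return (val - right[0], right[1])
    return num(s, i)                 # ordered-choice fallback

def parse(s):
    memo.clear()
    res = expr(s, 0)
    return res[0] if res is not FAIL and res[1] == len(s) else FAIL
```

Note that the growing loop naturally yields left-associative subtraction: parse("9-2-3") computes (9-2)-3.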

The PEG-related code is in src/l/lib/parsing/ and src/l/lib/pfront/

See (a somewhat outdated) demo here: 

mathew palsberg | 4 Sep 19:54 2014

random program/string generator for PEG

I recently built a small language with PEG. I now want to test my language with randomly generated (not necessarily meaningful) programs. Does anybody know of a tool that, given a PEG grammar, generates random strings that satisfy it?
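I don't know of a ready-made tool, but a naive generator is easy to sketch: read the grammar CFG-style and expand nonterminals randomly. One caveat: for a true PEG this over-generates, since ordered choice and prefix capture mean a string derivable under the CFG reading is not always accepted by the PEG, so the output should be filtered through the actual parser. A toy sketch with a hypothetical grammar format:

```python
import random

# Hypothetical grammar format: each nonterminal maps to a list of
# alternatives; each alternative is a sequence of terminals/nonterminals.
GRAMMAR = {
    "Expr": [["Term", "+", "Expr"], ["Term"]],
    "Term": [["(", "Expr", ")"], ["Num"]],
    "Num":  [["1"], ["2"], ["3"]],
}

def generate(sym, depth=0, max_depth=8):
    if sym not in GRAMMAR:                    # terminal: emit as-is
        return sym
    alts = GRAMMAR[sym]
    if depth >= max_depth:                    # force termination: shortest alt
        alts = [min(alts, key=len)]
    alt = random.choice(alts)
    return "".join(generate(s, depth + 1, max_depth) for s in alt)
```

Running the generated strings back through the real PEG parser (keeping only the accepted ones) would turn this into a usable fuzzer.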


Mayo Kaze | 14 Aug 03:13 2014

Is there any PEG implementation in ActionScript?

Hi mates:

I'd like to ask: is there any PEG implementation in ActionScript? If there isn't, I am planning to create one myself.

Kind regards,
Marcus Brinkmann | 12 Aug 00:22 2014

Re: Breaking the Mizushima(2010) cut operator


I reported this in August 2013 to the grako bug tracker:

I think you will find that your example is similar to mine.

The cut operator in grako was implemented like you describe.  But note 
that this is _not_ the Mizushima cut operator.  The Mizushima paper is 
very careful to limit application of the cut operator to an empty 
backtracking stack, which avoids the problem: "Generally, when the 
backtracking stack of a parser is empty at position n, the parser can 
discard all regions in the memoization table whose column indices are 
less than n."

Later the Mizushima paper gives a negative example: "Because the 
backtracking stack is not empty and the position 0 is pushed to the 
stack, the parser cannot discard space for memoization at this point."

You ignored the condition (empty backtracking stack), and thus your 
example showed non-linear behavior.  It is an easy mistake to make.

When I reported this to Juancarlo, he told me (again, see the 
bugtracker): "When I read Kota's paper I empirically found that deleting 
the memoization cache left speed unchanged, but dramatically decreased 
memory use, so I left it in in the know that swiping the whole cache was 
overdoing it :-"

As I wrote at the time, it seems difficult (at least for me!) to come up 
with a proper example that actually shows non-linear behaviour on 
arbitrarily long input (rather than a small example grammar that only 
accepts a finite set of sentences).  If you succeed with that, I'd be 
interested!  I described my difficulties like this: "It's surprisingly 
hard to come up with this example, and I can't find an example that has 
non-linear runtime on the input, as the cutting rules themselves are 
memoized, and thus the memoization deletions are in some sense 
"self-healing."" and "I can not prove that this can break the linear 
time promise of PEG parsing, and at the moment I suspect that due to the 
fact that the number of cuts in any grammar is constant and independent 
of the input length, the linear time promise is still preserved. 
However, I can't prove that either :P"
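The condition quoted above can be made concrete: a packrat parser keeps a stack of positions that enclosing choices may still rewind to, and a cut is only allowed to prune the memo table while that stack is empty. A schematic sketch of the bookkeeping (not grako's actual implementation):

```python
# Schematic packrat bookkeeping for the Mizushima (2010) pruning rule;
# a sketch of the quoted condition, not grako's actual code.
class Parser:
    def __init__(self, text):
        self.text = text
        self.memo = {}          # (rule, pos) -> result
        self.backtrack = []     # positions an enclosing choice may rewind to

    def choice(self, pos, *alternatives):
        """Ordered choice: remember pos so alternatives can rewind to it."""
        self.backtrack.append(pos)
        try:
            for alt in alternatives:
                result = alt(pos)
                if result is not None:
                    return result
            return None
        finally:
            self.backtrack.pop()

    def cut(self, pos):
        # Safe pruning: only when no enclosing choice can rewind past pos
        # may memo entries whose column lies to its left be discarded.
        if not self.backtrack:
            self.memo = {k: v for k, v in self.memo.items() if k[1] >= pos}
```

With the `if not self.backtrack` guard removed you get exactly the over-eager variant under discussion.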

Roman Redziejowski | 23 Jul 14:09 2014

Breaking cut operator

I believe Ger Hobbelt discovered a problem.

The "cut point", in the sense of a point after which there is no use of backtracking, is the property of a specific selection operator.

Once B in Ger's examples has successfully consumed A, it indeed does not need to backtrack (as that must fail).
But G in the second example has to backtrack after failing B, even if B passed its cut point. So B has no right to erase the history, because it may be needed by the containing expression.
So the cut points should somehow be considered in the entire hierarchy of parsing expressions.

I recently wrote a note about cut points, to be presented at the CSP workshop this year. It was inspired by a section in the paper by Maidl et al. about producing error messages: apparently the relevant failures are those in the 'no return' areas defined by cut points.
What I missed there is the fact that cut points are local.

(My note can be found at . Remember: it is preliminary and not yet peer-reviewed.)

Ger Hobbelt | 10 Jul 16:14 2014

Breaking the Mizushima(2010) cut operator


I'm looking for feedback, i.e. to be corrected about any mistakes, more or less obvious, because I think I ran into an issue, and after having double-checked what I consider to be the cause several times, I can't find any goofs in it anymore.

The observation below is written as a fact, but I know I sometimes overlook the obvious elephant in the room. :'-(  And I want to make sure I have got this one nailed.



# Intro #

Take grammar1:

G = A + B / B               (1)
B = A + C / C               (2)
A = a
C = c 

Matching Inputs:

c                            (1)
a+c                          (2)
a+a+c                        (3)

Application of Mizushima's cut operator rules will *not* add a cut (marked '#') in rule G(1), as the ExtFirst(e) sets for the choices A+B and B both include the 'a' prefix (full common 'first' prefix: 'a+').
It does add an automatic cut operator to rule B(2) as there the union of the ExtFirst of both choices is empty:

B = A # + C / C               (2)

And this automatic cut insertion is correct as it does not alter the behaviour of the grammar in any way.

If we were instead using BITES (Redziejowski, 2011), as suggested in that paper, rather than the FIRST building block used by Mizushima et al. (2010), things would not change: the inferred ExtBites(e) would deliver 'a+' rather than just 'a', but that's about it for this example grammar.

# A Breaking Grammar #

Take grammar2, a minimal alteration of grammar1 that accepts the same input set ('language') as grammar1:

G = B / A + B               (1)
B = A + C / C               (2)
A = a
C = c 

Matching Inputs:

c                            (1)
a+c                          (2)
a+a+c                        (3)

Note that rule G has 'flipped' the two choices but still parses the same input.
Mizushima's algorithm does not mind, so, once again, only rule B gets a cut:

G = B / A + B               (1)
B = A # + C / C             (2)
A = a
C = c 

as the ExtFirst() sets turn up the same results: B-choice1 = {'a'}, B-choice2 = {'c'}. Upgrading to Redziejowski's BITES does not change things here either.

## The Problem ##

However, the cut in B will now kill the memoization cache entry for the initial 'a' parsed by B in the inputs 'a+c' and 'a+a+c'. Inspecting the latter: the first choice of B fails, the parser backtracks to G, and the input is matched by the second choice of G, but this time without the memoization help for the initial 'a' that it would have received in grammar1.

In other words: for the input 'a+a+c', the second evaluation of the leading matching 'A = a' at input column 0 (now in rule G(1); in grammar1 the second evaluation happens in the first choice of rule B(2)) no longer receives memoization support.
This means that using the cut operator, with the Mizushima/Redziejowski ExtFirst/ExtBites approach, to clear out and reduce the memoization cache footprint unintentionally breaks (i.e. removes) that memoization support, so the grammar no longer exhibits guaranteed linear-time behaviour.
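The effect is easy to reproduce with a small instrumented packrat parser: count how often A is evaluated at column 0 on input 'a+a+c', with and without the cut. This is a simplified sketch (the cut here naively clears every memo entry left of the cut position and commits B to its first alternative), not any particular tool's implementation:

```python
# Instrumented packrat sketch reproducing the lost memoization on grammar2:
#   G = B / A '+' B
#   B = A # '+' C / C        A = 'a'   C = 'c'
def run(text, with_cut=True):
    memo = {}
    a_evals_at_0 = 0            # how often A is (re)evaluated at column 0

    def A(i):
        nonlocal a_evals_at_0
        if ('A', i) in memo:
            return memo[('A', i)]
        if i == 0:
            a_evals_at_0 += 1
        r = i + 1 if i < len(text) and text[i] == 'a' else None
        memo[('A', i)] = r
        return r

    def C(i):
        return i + 1 if i < len(text) and text[i] == 'c' else None

    def B(i):
        j = A(i)
        if j is not None:
            if with_cut:        # the '#' fires: prune memo columns < j
                for k in [k for k in memo if k[1] < j]:
                    del memo[k]
            if j < len(text) and text[j] == '+':
                j2 = C(j + 1)
                if j2 is not None:
                    return j2
            if with_cut:        # committed: the C alternative is off-limits
                return None
        return C(i)             # ordered-choice fallback (no cut taken)

    def G(i):
        j = B(i)                # first choice: B
        if j is not None:
            return j
        j = A(i)                # second choice: A '+' B
        if j is not None and j < len(text) and text[j] == '+':
            return B(j + 1)
        return None

    return G(0) == len(text), a_evals_at_0
```

On 'a+a+c' this returns (True, 2) with the cut and (True, 1) without it: the cut is precisely what forces the re-evaluation of A at column 0.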

## Side Notes ##

This is the minimal grammar example that I can devise to showcase the issue. 
Having rule G as

G = B / A + B               (1)

looks suspicious to the human eye once you are aware of 'prefix hiding', even though the notation as such does *not* imply prefix hiding, as we are looking at non-terminals here. That suspicion, however, can take a long time to develop when the construct occurs in a production grammar along the lines of grammar3 below:

G = B / D                   (1)
B = A # + C / C             (2)
D = A + B                   (3)
A = a
C = c 

where rule D 'obfuscates' the common 'A +' prefix of both choices in rule G(1). Of course, in a production grammar the B(2) and D(3) definitions will very probably be further apart, if the issue has not already been caught indirectly as 'probable trouble' in the visual patterns of the grammar definition itself.
('Probable trouble' being a human-brain thing, not something a machine would use. Nor is it sure-fire: it's just that when you look at grammars long and often enough, you may get the feeling that something is odd with a particular one. And you *may* be right. Then again, you may not.)

# Conclusion #

As the Mizushima cut operator only looks into the future (ExtFirst) and does not consider the history/context of the rule currently under scrutiny (i.e. rule G's behaviour when we determine that rule B(2) can have a cut inserted), I expect a solution/fix for this issue would be derived from propagating the invoking rule's FIRST/BITES info into the rule under scrutiny, adding the outer context to the condition set for automatic insertion of the cut operator. (To be addressed further in another email.)

# References #

Mizushima et al, 2010: Packrat Parsers can handle practical grammars in mostly constant space (Mizushima, Maeda, Yamaguchi, 2010)

Redziejowski, 2011: BITES instead of FIRST for parsing expression grammar (Redziejowski, 2011) -- 'upgrading' Mizushima's algorithm (ExtFirst -> ExtBites) is already suggested in its Ch. 1 (Introduction).


I have searched the net to the best of my ability but wasn't able to find an earlier mention of this issue; if I missed it, I'd love to hear about it and get a reference. :-)  (Because if I missed this one, I'm bound to have missed others in that vicinity too.)

Thank you for bearing with me.

Met vriendelijke groeten / Best regards,

Ger Hobbelt

mail:   ger <at>
mobile: +31-6-11 120 978
Mackenzie High | 10 Jul 10:14 2014

First Official Release of Snowflake: A Graphical PEG Parser

Greetings everyone,

I would like to officially announce the release of the Java-based Snowflake project, which allows users to test PEG grammars on sample input as they write them. In addition, the project is a parser generator. 

I envision that this program would especially be useful for students who are learning about parsing, since it allows one to immediately see the parse-tree that results from a given grammar and sample input. 

I hope someone finds this program useful. I would be happy to hear any of your questions, comments, or bug reports, regarding this project. 

Mackenzie High
