sudhir.parisaboyina | 22 Aug 14:45 2007

Query on YAPP Usage

Hi,

We have a question about using YAPP.

Our requirement, in brief:

We need to validate data against a BNF rule, i.e. check whether the data conforms to that rule. For example, the data 3.45 should be accepted under a rule such as "Fraction Number".

Our understanding of YAPP:

From the documentation available at http://www.o-xml.org/yapp/ we understand that YAPP can convert a BNF rule into XML and XSLT, and that the generated parser can be called with call-template from another XSL stylesheet.

We added a few small BNF rules to the existing xpath-grammar.bnf, inside the <bnf> element, as follows:

number ::= [.0123456789];
Expression ::= AdditiveExpression end;
AdditiveExpression ::= number | AdditiveExpression minus number | AdditiveExpression plus number;

After running make, we see an <xsl:template name="p:AdditiveExpression"> ... </xsl:template> in the bnf-parser.xsl file.

When we call this named template from an external XSL stylesheet, we get a compilation error.

Please let us know whether our understanding of YAPP is correct, and how the bnf-parser.xsl file should be used. It would also be a great help if you could point us to detailed YAPP documentation or a user guide, so that we can understand its specification and usage thoroughly.

Below is a snippet of our XSL stylesheet:

<xsl:template match="/">
  ...
  <xsl:call-template name="p:Expression">
    <xsl:with-param name="in" select="'123 + 456 - 789'"/>
  </xsl:call-template>
  ...
</xsl:template>

Any help would be highly appreciated.

Thanks & Regards,
Pnvd Sudhir

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments.

WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email.

www.wipro.com
_______________________________________________
o-xml mailing list
o-xml@...
http://lists.pingdynasty.com/mailman/listinfo/o-xml
Martin Klang | 22 Aug 15:34 2007

Re: Query on YAPP Usage

Hi there,

Thanks for using YAPP, and apologies that the documentation and
examples aren't quite up to scratch. If you would like to contribute,
e.g. with code examples or docs, please feel free.

Your grammar contains a left-recursive construct, which makes it
unparseable by recursive descent parsers; see for example
http://en.wikipedia.org/wiki/Left_recursion for more details.

YAPP generates recursive descent parsers, hence your grammar as
written doesn't produce a valid parser.
The solution is quite simple: YAPP comes with a stylesheet that will
eliminate left recursion.
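
To illustrate what eliminating left recursion means, here is a minimal Python sketch of the textbook transformation for immediate left recursion. It is not YAPP's eliminator.xsl; the function name and the list-of-symbols representation are assumptions made for this example, and the "-rest" suffix simply mirrors the naming that shows up in YAPP's output.

```python
def eliminate_left_recursion(name, alternatives):
    """Rewrite  A ::= A x | A y | b   as
    A ::= b A-rest ;  A-rest ::= x A-rest | y A-rest | (empty)."""
    rest = name + "-rest"
    # Alternatives that start with the rule itself, minus that leading symbol.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Alternatives that do not start with the rule itself.
    base = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}  # nothing to fix
    return {
        name: [alt + [rest] for alt in base],
        rest: [tail + [rest] for tail in recursive] + [[]],  # [] is the empty alternative
    }


rules = eliminate_left_recursion("AdditiveExpression", [
    ["number"],
    ["AdditiveExpression", "minus", "number"],
    ["AdditiveExpression", "plus", "number"],
])
for lhs, alts in rules.items():
    print(lhs, "::=", " | ".join(" ".join(alt) or "(empty)" for alt in alts), ";")
```

The rewritten grammar accepts the same input, but each rule now consumes a token before it recurses, so a recursive descent parser terminates.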

I managed to get your example working with the following process:

1) create the YAPP BNF parser and lexer:
xalan -XSL generator.xsl -IN bnf-grammar.xml > bnf-parser.xsl
xalan -XSL tokenizer.xsl -IN bnf-grammar.xml > bnf-lexer.xsl

2) generate the XML grammar from the calculator BNF grammar:
xalan -XSL bnf-parser.xsl -IN calculator-grammar.bnf > calculator-grammar.xml

3) fix the left recursion problem:
xalan -XSL eliminator.xsl -IN calculator-grammar.xml > fixed-calculator-grammar.xml

4) generate the parser and lexer:
xalan -XSL generator.xsl -IN fixed-calculator-grammar.xml > calculator-parser.xsl
xalan -XSL tokenizer.xsl -IN fixed-calculator-grammar.xml > calculator-lexer.xsl
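
If you want to script the steps above, a small driver along these lines might do. It is only a sketch: the function names are made up for this example, and it assumes an executable called xalan on the PATH that accepts -XSL and -IN and writes the result to standard output, exactly as in the commands above.

```python
import subprocess


def build_steps(name="calculator"):
    """Return (stylesheet, input, output) triples in the order listed above."""
    return [
        ("generator.xsl", "bnf-grammar.xml", "bnf-parser.xsl"),
        ("tokenizer.xsl", "bnf-grammar.xml", "bnf-lexer.xsl"),
        ("bnf-parser.xsl", f"{name}-grammar.bnf", f"{name}-grammar.xml"),
        ("eliminator.xsl", f"{name}-grammar.xml", f"fixed-{name}-grammar.xml"),
        ("generator.xsl", f"fixed-{name}-grammar.xml", f"{name}-parser.xsl"),
        ("tokenizer.xsl", f"fixed-{name}-grammar.xml", f"{name}-lexer.xsl"),
    ]


def run_pipeline(name="calculator", xalan="xalan"):
    """Run every step, redirecting standard output to the target file."""
    for stylesheet, source, target in build_steps(name):
        with open(target, "wb") as out:
            # check=True stops the pipeline at the first failing transformation.
            subprocess.run([xalan, "-XSL", stylesheet, "-IN", source],
                           stdout=out, check=True)
```

Calling run_pipeline() from the YAPP directory would then regenerate calculator-parser.xsl and calculator-lexer.xsl in one go.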

All files apart from calculator-grammar.bnf are supplied with YAPP. I
created that file from your example as follows:
<grammar>
   <terminal name="end">
     <end/>
   </terminal>
   <ignore char=" "/>
   <bnf>
minus ::= '-' ;
plus ::= '+' ;
number ::= [.0123456789] ;
Expression ::= AdditiveExpression end ;
AdditiveExpression ::= number | AdditiveExpression minus number | AdditiveExpression plus number ;
   </bnf>
</grammar>

I then called the resulting parser with this stylesheet:
<xsl:stylesheet version="1.0"
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:p="http://www.pingdynasty.com/namespaces/parser">
   <xsl:import href="calculator-lexer.xsl"/>
   <xsl:import href="calculator-parser.xsl"/>
   <xsl:template match="/">
     <output>
       <xsl:call-template name="p:Expression">
         <xsl:with-param name="in" select="'123 + 456 - 789'"/>
       </xsl:call-template>
     </output>
   </xsl:template>
</xsl:stylesheet>

... and I got this result:
<output>
   <term name="Expression">
     <term name="AdditiveExpression">
       <term name="number">123</term>
       <term name="AdditiveExpression-rest">
         <term name="plus">+</term>
         <term name="number">456</term>
         <term name="AdditiveExpression-rest">
           <term name="minus">-</term>
           <term name="number">789</term>
           <term name="AdditiveExpression-rest"/>
         </term>
       </term>
     </term>
     <term name="end"/>
   </term>
   <remainder/>
</output>

The construct called AdditiveExpression-rest was created by YAPP when
it eliminated the left recursion. It can be ignored, or filtered out
with another XSL stylesheet.
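
As a rough illustration of that filtering, here is a Python sketch that splices the children of every "-rest" term into its parent. It uses the standard xml.etree.ElementTree module rather than XSL, the function name is made up for this example, and it assumes the term/name structure shown in the result above.

```python
import xml.etree.ElementTree as ET


def flatten_rest(term):
    """Return a copy of `term` in which every *-rest wrapper is replaced by its children."""
    copy = ET.Element(term.tag, term.attrib)
    copy.text = term.text
    for child in term:
        flat = flatten_rest(child)
        if flat.get("name", "").endswith("-rest"):
            copy.extend(list(flat))  # splice the children in place of the wrapper
        else:
            copy.append(flat)
    return copy


parsed = ET.fromstring(
    '<term name="AdditiveExpression">'
    '<term name="number">123</term>'
    '<term name="AdditiveExpression-rest">'
    '<term name="plus">+</term><term name="number">456</term>'
    '<term name="AdditiveExpression-rest">'
    '<term name="minus">-</term><term name="number">789</term>'
    '<term name="AdditiveExpression-rest"/>'
    '</term></term></term>'
)
flat = flatten_rest(parsed)
print([(t.get("name"), t.text) for t in flat])
```

The flattened AdditiveExpression then holds number, plus, number, minus, number as direct children, which is closer to what the original left-recursive grammar suggests.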

I hope this helps, let me know how it goes!

/m
sudhir.parisaboyina | 23 Aug 13:34 2007

RE: Query on YAPP Usage


Hi,
Thanks a zillion for showing us how to use YAPP. I am now able to
generate the parser XSL file and call it from my own stylesheet,
exactly as you illustrated. I do have one observation, though. Since
my project requirement is to validate data against a BNF rule, I tried
passing invalid data to the resulting YAPP parser. I set the "in"
param of the call-template to '123 + abc', and I got this output:

<term name="AdditiveExpression">
<term name="number">123</term>
<term name="AdditiveExpression-rest">
<term name="plus">+</term>
<term name="number">abc</term>
<term name="AdditiveExpression-rest"/>

whereas my BNF rules are as follows:

minus ::= '-' ;
plus ::= '+' ;
number ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
Expression ::= AdditiveExpression end ;
AdditiveExpression ::= number | AdditiveExpression minus number | AdditiveExpression plus number ;

After going through the mailing list archive, I gather that YAPP does
not support error detection and reporting. I would like to confirm
whether this is true, and whether YAPP is going to get such
enhancements in the future. If not, could you please give us some
guidelines on how we can make YAPP detect and report errors?

Thanks & Regards,
Pnvd Sudhir 


Martin Klang | 25 Aug 15:16 2007

Re: Query on YAPP Usage

You're welcome!

Recursive descent parsers are notoriously bad at error reporting.
Generally, if no valid production can be found, the parser returns an
empty result; it can't tell you where or why it failed.
One thing you can try is to extend your grammar with error branches,
e.g.
Operator ::= PlusOperator | MinusOperator | InvalidOperatorError ;
Be careful, though, that you don't end up with false matches higher
up. I think this only works if Operator is the last match in any other
production. It's quite tricky to define error productions that don't
interfere with normal parsing.
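
To make the idea of an error branch concrete, here is a small hand-written recursive descent parser in Python for the calculator grammar. It is a sketch only, not YAPP-generated code: the function names are invented, tokens are simply split on whitespace, and a number is any run of digits.

```python
def parse_rest(tokens, pos, error_branch):
    """AdditiveExpression-rest ::= Operator number AdditiveExpression-rest | (empty)."""
    if pos + 1 < len(tokens) and tokens[pos + 1].isdigit():
        op = tokens[pos]
        if op == "+":
            kind = "plus"
        elif op == "-":
            kind = "minus"
        elif error_branch:
            kind = "InvalidOperatorError"  # error branch: any other token
        else:
            kind = None
        if kind is not None:
            rest, end = parse_rest(tokens, pos + 2, error_branch)
            return [(kind, op), ("number", tokens[pos + 1])] + rest, end
    return [], pos  # the empty alternative


def parse_expression(text, error_branch=False):
    """Expression ::= number AdditiveExpression-rest end. Returns a tree or None."""
    tokens = text.split()
    if not tokens or not tokens[0].isdigit():
        return None
    rest, pos = parse_rest(tokens, 1, error_branch)
    if pos != len(tokens):  # 'end' not reached: no valid production
        return None
    return [("number", tokens[0])] + rest


print(parse_expression("123 * 456"))                     # None
print(parse_expression("123 * 456", error_branch=True))  # tree with InvalidOperatorError
print(parse_expression("123 + abc", error_branch=True))  # None
```

Without the error branch, "123 * 456" just yields None with no hint of where parsing stopped. With it, the bad operator appears in the tree, but "123 + abc" still fails silently because no error branch covers an invalid number.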

I had a look at the problem you were experiencing, with 'abc' wrongly
accepted as a number. It's a bug, or at least a limitation, in the
YAPP lexer: in order to find the best match efficiently, it uses an
exclusion string. For each terminal, the characters of all other
terminals are removed from the token, and what's left is considered a
match. The exclusion string is therefore made up only of characters
that occur in other terminals, and since no other terminal in your
grammar contains letters, 'abc' is wrongly matched as a number.
You can fix this behaviour by editing tokenizer.xsl and changing the
variable t:allChars to a static list of 'matchable' characters, e.g.:
   <xsl:variable name="t:allChars">
     <![CDATA[ abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890-_=+`~,<.>/?;:'"\|[{]}§±!@£$%^&*()]]>
   </xsl:variable>
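
The effect of that change can be sketched with a simplified Python model of the exclusion-string test. This is an assumption about how the check behaves, based only on the description above, not a port of tokenizer.xsl; the function and variable names are invented for the example.

```python
def matches_terminal(token, terminal_chars, all_chars):
    """Simplified model of the exclusion-string test described above."""
    # Exclusion string: every known character that is NOT part of this terminal.
    exclusion = "".join(c for c in all_chars if c not in terminal_chars)
    # Like XSLT translate(token, exclusion, ''): delete the excluded characters.
    remaining = token.translate(str.maketrans("", "", exclusion))
    # The token matches if nothing had to be removed.
    return token != "" and remaining == token


DIGITS = "0123456789"
# Derived from the grammar's terminals only (minus, plus, number): no letters.
derived_all_chars = "-+" + DIGITS
# Static list of matchable characters, as in the fix (shortened here).
static_all_chars = "abcdefghijklmnopqrstuvwxyz" + derived_all_chars

print(matches_terminal("abc", DIGITS, derived_all_chars))  # True: the bug
print(matches_terminal("abc", DIGITS, static_all_chars))   # False
print(matches_terminal("123", DIGITS, static_all_chars))   # True
```

With the derived list, the exclusion string for number is just "-+", so nothing is stripped from 'abc' and it passes. Once letters are in the list, they end up in the exclusion string and 'abc' is rejected.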

Better still, if you use XSLT 2.0 you can rewrite the tokenizer to  
use XSLT 2.0 regular expressions, and support Extended BNF.

hope this helps,

/m

