Re: Thinking out loud...
Eric M. Ludlam <eric <at> siege-engine.com>
2008-02-04 11:40:38 GMT
Using Ripper sounds like a good way to tackle the problem. If you
are looking to font-lock a buffer, it could certainly generate font
locking code directly without using any semantic tools.
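To make the font-lock idea a bit more concrete, here is a minimal sketch in Ruby (1.9 or later, where Ripper ships with the interpreter). It runs the source through Ripper.lex and prints one Lisp-readable list per token worth highlighting. The FACES table, the face_spans helper, and the (LINE COLUMN LENGTH FACE) output shape are all my own assumptions for illustration, not anything Ripper or font-lock defines. The Emacs side would still have to read these lists and apply the faces.

```ruby
require 'ripper'

# Hypothetical mapping from Ripper scanner events to Emacs face names.
# Only a handful of events are covered; everything else is left unhighlighted.
FACES = {
  on_kw: 'font-lock-keyword-face',
  on_comment: 'font-lock-comment-face',
  on_const: 'font-lock-type-face',
  on_tstring_content: 'font-lock-string-face'
}.freeze

# Returns one string per highlighted token, shaped like (LINE COLUMN LENGTH FACE).
# Ripper.lex yields [[line, column], event, token] entries (newer Rubies append
# a lexer state, which the splat ignores).
def face_spans(source)
  Ripper.lex(source).map do |(line, col), event, tok, *_rest|
    face = FACES[event]
    # Length is counted in characters, which matches columns for ASCII source.
    "(#{line} #{col} #{tok.length} #{face})" if face
  end.compact
end

puts face_spans("def moose\n  \"hi\" # greet\nend\n")
```

Since the output is plain text on stdout, the same script could sit behind a pipe the way ispell does.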
As a Semantic parser, it would probably work well too. If it spit
out the code to make raw Semantic tags, where START and END show up at
the tail of a new tag, they could be "cooked" into usable tags, e.g.:
(semantic-tag-new-function "moose" "int" :attrib1 "a" start end)
Those tags then get expanded. See semantic-texi-expand-tag as an
example.
As with the wisent parser in CEDET, the handy thing about generating
tags, or even font lock code, is that it is not required to be
complete, or even terribly accurate. Accepting bogus syntax in tough
spots is fine for what Semantic or font lock might do with it.
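A rough sketch of the raw-tag idea, again in Ruby, and deliberately incomplete in the spirit of the paragraph above. It subclasses Ripper and turns each plain "def" into a line of Lisp text with the start and end at the tail. The RawTagEmitter class is hypothetical, the nil placeholders for the type and argument list are my assumption, and line numbers stand in for START and END. Real Semantic tags want buffer positions, so a conversion step would still be needed on the Emacs side. Singleton methods (def self.foo), operator methods, classes, and modules are ignored.

```ruby
require 'ripper'

# Hypothetical emitter: collects one raw-tag string per plain `def`.
class RawTagEmitter < Ripper
  attr_reader :tags

  def initialize(*)
    super
    @tags = []
  end

  # Scanner event: return the identifier together with where it starts,
  # so the parser events that receive it can see its position.
  def on_ident(tok)
    [tok, lineno, column]
  end

  # Parser event: fired when a whole `def ... end` has been reduced.
  # `name` is whatever on_ident returned for the method name.
  def on_def(name, _params, _body)
    ident, start_line, _start_col = name
    # lineno is read when the reduction happens, i.e. at the end of the
    # definition, and serves here as the END value.
    @tags << %((semantic-tag-new-function "#{ident}" nil nil #{start_line} #{lineno}))
    nil
  end
end

emitter = RawTagEmitter.new("def moose(a)\n  a + 1\nend\n")
emitter.parse
puts emitter.tags
```

Anything the emitter does not understand simply produces no tag, which is fine for this purpose.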
Sounds like there are some interesting options to look at. Good
luck.
>>> Perry Smith <pedz <at> easesoftware.com> seems to think that:
>I got up today to start looking into this more. I wanted first to
>understand the existing Ruby parser. Ruby 1.9 was just released and
>there is much talk about version 2. Most people are running 1.8.6.
>The changes in parse.y between 1.8.6 and 1.9 are huge. Matz is
>introducing new syntax and removing things he doesn't like. That
>didn't scare me too badly. But then I noticed that yylex is 1286 lines
>of code and very intimately tied into the parser. And I saw some
>comment which I didn't understand that "such-n-such syntax is
>impossible to do with bison". But currently they are using bison.
>So, I don't know exactly what this means. This was on a blog site
>documenting the changes between 1.8 and 1.9. Anyhow, the more I
>looked, the more I was convinced I didn't want to write a parser from
>scratch.
>I've looked at Daniel's work a few times. It would be really hard for
>me to get "right". Ruby has so many odd things that I want to parse
>properly that Daniel did not address. I can understand why now.
>Eventually, after much poking and hoping, I came across Ripper. It is
>a Ruby class that is built into 1.9. parse.y has these funny comments
>in it which I hoped were there for a reason. In fact, I found Ripper
>by grepping all the Ruby source for the funny comment string: %%%
>They process Ruby's parse.y into ripper.y, wrap it in a Ruby class,
>and create a "SAX-oriented" lexer/parser. I guess they are referring
>to SAX, the event-driven XML parsing API. (So, all of this part of it
>is in C.)
>The parser and lexer generate events that are sent to a class built up
>from Ripper (which is written in Ruby) through registered functions.
>They have a trivial example that simply recreates the parse tree. The
>lexer events give you the token and the line and column number. The
>parser events give you how the tokens fit into the parse. The parser
>events map directly to the actions in the bison C code, which are
>basically the parser's reduce actions.
>The beauty here is that the parser really does parse Ruby -- by
>definition. I timed the parse of the largest Ruby file I could
>find -- 2000+ lines. And it took 0m0.098s real time. While
>backporting Ripper to 1.8.6 may not be easy (or worthwhile), it appears
>as if it is in Ruby to stay, so anyone on 1.9 or later will be able to
>use Ripper. And, since it is built from the Ruby source, it will
>track the syntax of the version of Ruby that the user is using. I'm
>really ecstatic about this.
>So, my thought at this point is to hook up a parser as a separate
>program much like cscope or ispell. Send text to it, retrieve the
>output, and then process it. Since the functions that catch the parse
>events can do anything they want, they could produce Lisp directly
>which would then just be executed -- maybe. It could also produce
>Semantic tags. And, I want cross-referencing like cscope. That
>seems plausible at this point too.
[ ... ]
Eric Ludlam: eric <at> siege-engine.com
Siege: www.siege-engine.com Emacs: http://cedet.sourceforge.net