Stephan Maka | 1 Oct 02:39

Re: parsing xml (xmpp) with ruby

Eric Will wrote:
> It doesn't complain about a partial file. If I pass it a partial
> stanza, with say the end tag only halfway finished, like "</prese" it
> raises an exception. The exception tells me nothing that I can use to
> isolate the incomplete stanza and save it to be appended onto later.
> 
> I feed it data that I pull out of the socket's read() method. It has
> listeners for start tags, text, and end tags. The method that handles
> the start tag sets everything up, and starts to build a tree. The end
> tag method passes the tree to a handler function.

If you stick to REXML (which you shouldn't if you aim for performance)
you should read XMPP4R's streamparser.rb. Its use of REXML's SAX2
parser is really straightforward.

If you're using libxml-ruby then I recognize your problem. The key is
using the parser in "push mode". Take a look at the last patch I sent to
their list.

Don't explain so much; share your code instead.

Stephan
_______________________________________________
JDev mailing list
FAQ: http://www.jabber.org/discussion-lists/jdev-faq
Forum: http://www.jabberforum.org/forumdisplay.php?f=20
Info: http://mail.jabber.org/mailman/listinfo/jdev
Unsubscribe: JDev-unsubscribe <at> jabber.org
_______________________________________________


Jonathan Dickinson | 1 Oct 07:29

Re: parsing xml (xmpp) with ruby

I can give you an XML parser in C# that you can port. I have had no
problems with it so far, and I have been using it for a couple of months.


Eric Will | 1 Oct 13:50

Re: parsing xml (xmpp) with ruby

On Tue, Sep 30, 2008 at 6:22 PM, Lucas Nussbaum
<lucas <at> lucas-nussbaum.net> wrote:
> We don't deal with it, REXML does it for us by not pushing incomplete
> events to us through the SAX2Parser.

I guess this is because you guys use it attached to a socket? I don't
do that. I read from the socket and pass that value to a new parser
every time. I use select, not threads. I don't like Ruby's threads.


Re: parsing xml (xmpp) with ruby

Hello

On Wed, Oct 01, 2008 at 07:50:26AM -0400, Eric Will wrote:
> On Tue, Sep 30, 2008 at 6:22 PM, Lucas Nussbaum
> <lucas <at> lucas-nussbaum.net> wrote:
> > We don't deal with it, REXML does it for us by not pushing incomplete
> > events to us through the SAX2Parser.
> 
> I guess this is because you guys use it attached to a socket? I don't
> do that. I read from the socket and pass that value to a new parser
> every time. I use select, not threads. I don't like Ruby's threads.

You can not use a new parser every time. You need it to know the
beginning of the stream.

However, I do not have it attached directly to the socket. I read from
the socket and feed the data into the parser manually.

Have a nice day

-- 
Something is obviously wrong. The thing works.

Michal 'vorner' Vaner

Peter Saint-Andre | 1 Oct 16:37

special-purpose xmpp.org lists

Over the last few months I have created a number of special-purpose
mailing lists for more focused discussion of XMPP-related topics, most
of them a combination of protocol and development topics. Since some
people don't seem to know about these lists, here is a reminder (the
purpose of each list is probably self-explanatory from its name).

1. bosh <at> xmpp.org

2. jingle <at> xmpp.org

3. mobile <at> xmpp.org

4. muc <at> xmpp.org

5. pubsub <at> xmpp.org

6. security <at> xmpp.org

7. social <at> xmpp.org

8. sip-xmpp <at> xmpp.org

9. ws-xmpp <at> xmpp.org

You can find more information, including sign-up links, here:

http://xmpp.org/about/discuss.shtml

Peter


Eric Will | 1 Oct 17:04

Re: parsing xml (xmpp) with ruby

On Wed, Oct 1, 2008 at 8:21 AM, Michal 'vorner' Vaner <vorner <at> ucw.cz> wrote:
> You can not use a new parser every time. You need it to know the
> beginning of the stream.

Why? It works fine without knowing it. How can I reuse the same parser
when the source changes?

That is, I have to do parser = REXML::Parsers::SAX2Parser.new(source)
every time, and feed it the source. Is there a way to feed an existing
SAX2Parser a new source?

> --
> Something is obviously wrong. The thing works.
>
> Michal 'vorner' Vaner

-- Eric Will // rakaur --


Re: parsing xml (xmpp) with ruby

Hello

On Wed, Oct 01, 2008 at 11:04:11AM -0400, Eric Will wrote:
> On Wed, Oct 1, 2008 at 8:21 AM, Michal 'vorner' Vaner <vorner <at> ucw.cz> wrote:
> > You can not use a new parser every time. You need it to know the
> > beginning of the stream.
> 
> Why? It works fine without knowing it. How can I reuse the same parser
> when the source changes?

If you take <stream thenamespace etc><first stanza/> and put it into
the first parser, then <second stanza/><third stanza> into a second,
and </third stanza> into yet another, you get a mess, not data. Or do
you reuse it in some other way that I am missing?

When a stanza gets split into two chunks, you get even more mess.

> That is, I have to do parser = REXML::Parsers::SAX2Parser.new(source)
> every time, and feed it the source. Is there a way to feed an existing
> SAX2Parser a new source?

This is my code for when data arrives. It is C++ with Qt, but you
should get the idea:

source.setData( text );   // hand the newly received chunk to the input source
reader.parseContinue();   // resume the same incremental parse

-- 
"Don't worry about people stealing your ideas. If your ideas are any good,
you'll have to ram them down people's throats."
	-- Howard Aiken

Eric Will | 1 Oct 17:33

Re: parsing xml (xmpp) with ruby

On Wed, Oct 1, 2008 at 11:15 AM, Michal 'vorner' Vaner <vorner <at> ucw.cz> wrote:

> If you take <stream thenamespace etc><first stanza/> and put it into
> the first parser, then <second stanza/><third stanza> into a second,
> and </third stanza> into yet another, you get a mess, not data. Or do
> you reuse it in some other way that I am missing?

I'm using a SAX parser. It doesn't care about the structure of the
overall document. I build the nodes by myself, a tag at a time.

> When a stanza gets split into two chunks, you get even more mess.

I handle this at the moment, but not in the best way. When my parser
gets a partial stanza, it processes everything up to the partial part
and then does one of two bad things. In the first case I get half a tag
or something similar, and it raises an exception saying the XML is
invalid. In the second case the chunk ends inside an open element:
everything so far is well-formed, but there's no closing tag. Here it
parses as far as it can, but without closing tags (which is where I
fire my events) it doesn't DO anything, so it appears to ignore the
data... I'm not sure how to fix this.

> This is my code when data come. It is C++ and Qt, but you might see:
>
> source.setData( text );
> reader.parseContinue();

REXML doesn't have this. There's no way to change the source except to
make a new parser instance.


Remko Tronçon | 1 Oct 17:45

Re: parsing xml (xmpp) with ruby

> I'm using a SAX parser. It doesn't care about the structure of the
> overall document. I build the nodes by myself, a tag at a time.

That doesn't really make much sense. Parsing
    <document xmlns="bla"><foo>...
in one pass, and parsing
    <document xmlns="bla">
and then
    <foo>
with a separate parser are entirely different things (foo will have a
different namespace).
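
A minimal sketch of the namespace point, assuming Ruby's bundled REXML.
It uses REXML's tree API rather than SAX purely to keep it short; the
scoping rule is the same. The helper name innermost_namespace is
hypothetical, not something from this thread.

```ruby
require 'rexml/document'

# Hypothetical helper: follow first children down to the innermost
# element and return the namespace URI that applies to it.
def innermost_namespace(xml)
  element = REXML::Document.new(xml).root
  element = element.elements[1] while element.elements[1]
  element.namespace
end

# <foo> parsed together with its parent inherits the parent's xmlns.
puts innermost_namespace('<document xmlns="bla"><foo/></document>').inspect

# <foo> handed to a parser that never saw <document> has nothing to inherit.
puts innermost_namespace('<foo/>').inspect
```

The two calls report different namespaces for the same <foo/> text,
which is why the chunks cannot be parsed in isolation.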

A SAX parser *does* care about the document structure. The only
difference is that you don't need to parse the whole document before
you know anything about its structure.

Creating a new parser for every piece of data doesn't work. If the
difference above isn't clear, consider the input "<foo><" followed by
"/foo>". Fed chunk by chunk to one parser, it parses fine. Handed to a
new parser, the second chunk will not parse: it only makes sense to a
parser that has already seen "<foo><".
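
A rough sketch of the "one parser, many chunks" idea with REXML,
assuming its SAX2Parser is given an IO to pull from. The pipe stands in
for the socket, and the helper name events_from_one_parser is
hypothetical. The parser blocks while waiting for input, so this sketch
runs it in a thread; a select-based design would need a different
arrangement.

```ruby
require 'rexml/document'
require 'rexml/parsers/sax2parser'

# Hypothetical helper: push every chunk through a single SAX2Parser and
# return the events it reported. The pipe is a stand-in for the socket.
def events_from_one_parser(chunks)
  reader, writer = IO.pipe
  events = []

  # The parser is created inside the thread because it may start
  # reading from its source as soon as it is constructed.
  worker = Thread.new do
    parser = REXML::Parsers::SAX2Parser.new(reader)
    parser.listen(:start_element) { |_uri, _local, qname, _attrs| events << [:start, qname] }
    parser.listen(:end_element)   { |_uri, _local, qname| events << [:end, qname] }
    parser.parse
  end

  chunks.each { |chunk| writer.write(chunk) }  # e.g. each result of socket.read
  writer.close                                 # end of stream
  worker.join
  events
ensure
  writer.close if writer && !writer.closed?
  reader.close if reader && !reader.closed?
end

p events_from_one_parser(["<foo><", "/foo>"])
```

Because the same parser sees both chunks, the split end tag is simply
completed when the second chunk arrives.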

cheers,
Remko



Re: parsing xml (xmpp) with ruby

Hello

On Wed, Oct 01, 2008 at 11:33:44AM -0400, Eric Will wrote:
> On Wed, Oct 1, 2008 at 11:15 AM, Michal 'vorner' Vaner <vorner <at> ucw.cz> wrote:
> 
> > If you take <stream thenamespace etc><first stanza/> and put it into
> > the first parser, then <second stanza/><third stanza> into a second,
> > and </third stanza> into yet another, you get a mess, not data. Or do
> > you reuse it in some other way that I am missing?
> 
> I'm using a SAX parser. It doesn't care about the structure of the
> overall document. I build the nodes by myself, a tag at a time.

You don't get it. SAX does not need to load the whole document into
memory, but it still needs some information from the parent nodes
(depth, namespace declarations, etc.). You can't start parsing from the
middle.
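
To make the "depth" part concrete, here is a minimal sketch of a
stanza builder on top of REXML's SAX2Parser. It assumes one long-lived
parser per stream and treats every direct child of the root element as
a stanza. The helper name each_stanza is hypothetical. With a real
socket as the IO, parse blocks until data arrives, so it would normally
run in its own thread.

```ruby
require 'rexml/document'
require 'rexml/parsers/sax2parser'
require 'stringio'

# Hypothetical helper: parse one XML stream from `io` and yield every
# completed depth-1 child (a stanza) as a REXML::Element.
def each_stanza(io)
  stack = []  # currently open elements; stack[0] is the stream root
  parser = REXML::Parsers::SAX2Parser.new(io)

  parser.listen(:start_element) do |_uri, _local, qname, attrs|
    element = REXML::Element.new(qname)
    element.add_attributes(attrs)
    # Stanzas are not attached to the root, so a long-lived stream
    # does not accumulate every stanza it has ever seen.
    stack.last.add_element(element) if stack.size > 1
    stack.push(element)
  end

  parser.listen(:characters) do |text|
    stack.last.add_text(text) if stack.size > 1
  end

  parser.listen(:end_element) do |_uri, _local, _qname|
    element = stack.pop
    yield element if stack.size == 1  # only the root is still open
  end

  parser.parse
end

xml = '<stream><presence from="a@b"/><message><body>hi</body></message></stream>'
each_stanza(StringIO.new(xml)) { |stanza| puts stanza.name }
```

The stack is exactly the parent information a fresh parser would not
have: without it there is no way to tell where a stanza starts or ends.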

> > When a stanza gets split into two chunks, you get even more mess.
> 
> I handle this at the moment, but not in the best way. When my parser
> gets a partial stanza, it processes everything up to the partial part
> and then does one of two bad things. In the first case I get half a tag
> or something similar, and it raises an exception saying the XML is
> invalid. In the second case the chunk ends inside an open element:
> everything so far is well-formed, but there's no closing tag. Here it
> parses as far as it can, but without closing tags (which is where I
> fire my events) it doesn't DO anything, so it appears to ignore the
> data... I'm not sure how to fix this.

That is the „more mess“ I was talking about. You need to set up the parser so
[...]

