Re: [Moses-developers] SGML/XML markup
Philipp Koehn <pkoehn@...>
2007-01-10 15:37:11 GMT
Hi,
I have to say that I am not entirely happy with this, especially
because I don't quite get what it would be useful for.
I don't think our planned XML efforts have evolved to any degree at
this point...
-phi
On 1/10/07, Chris Dyer <redpony@...> wrote:
> The code isn't terribly fragile in the running sense (although a
> malicious user might be able to find a way to make it behave somewhat
> oddly), but it is fragile in the 'I want to extend this "parser" to be
> a general SGML parser' sense. So I wouldn't be overly worried about
> it "breaking" per se. But your point is well taken about having clear
> documentation and warnings about the code.
>
> I could put this in a separate class, but I'm a little torn as to
> whether it might be more appropriate to extend Sentence rather than
> IOStream. IOStream is used for both CN input and normal sentences,
> but how to mark up CN input with SGML is a little less clear, and, in
> any case, would need to be handled quite differently from the way we
> handle it for regular sentences. Because of this need for input-type
> specific handling, it seems like it belongs in Sentence (or in an
> extension thereof) rather than IOStream. If this is the case, I
> suppose my inclination would be not to introduce an
> XMLMarkedSentence, since this isn't qualitatively different from a
> normal sentence: a normal sentence would just be an XMLMarkedSentence
> where the markup was NULL.
>
> Chris
>
> On 1/10/07, Hieu Hoang <hieuhoang1972@...> wrote:
> > ps. why don't you inherit from IOStream and override GetInput() ?
> >
> > this is kinda the reason why IO was separated out into a separate class.
> >
> >
> > Hieu Hoang <hieuhoang1972@...> wrote:
> > i don't mind a bit of hackish code. put it in if u need it - it might be
> > food for thought for someone thinking of doing xml
> >
> > it'll be good if u can make it as readable as possible so people won't be
> > confused about what it's doing
> >
> > if it's ugly and fragile, expect it to break sooner rather than later
> >
> > Chris Dyer <redpony@...> wrote:
> > Hello Mosians-
> >
> > In a moment of desperation, I had to add some code to moses to allow
> > it to respond properly to input that looks like the following:
> >
> > translate this!
> >
> > This allows you to set the translationId (which is currently only
> > produced in the nbest list output) for a sentence to something other
> > than moses's internal counter. I've hacked in a hand-written and
> > quite fragile parser for "seg" tags. It's ugly, but it's localized
> > and will be easily replaceable when we decide what should be done to
> > really handle SGML/XML (what's the status of this anyway?).
> >
> > My question is: would it be okay for me to check this in, or are there
> > objections to this?
> >
> > Best-
> > Chris
> > _______________________________________________
> > Moses-developers mailing list
> > Moses-developers@...
> > http://mailman.mit.edu/mailman/listinfo/moses-developers
> >
> >
> >
> > Hieu Hoang
> > http//www.hoang.co.uk/hieu
>
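
For readers following the thread, here is a minimal sketch of the kind of hand-written "seg" tag handling Chris describes. It is not the code that was proposed for check-in. The input format (`<seg id=42>translate this!</seg>`, with optional double quotes around the id), the `SegInput` struct, and the `ParseSegLine` function are all assumptions made for illustration; the thread does not confirm the exact markup or any of these names. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: the id taken from the tag (if any) plus the bare text.
struct SegInput {
    std::optional<long long> translationId;  // empty when the line has no <seg> wrapper
    std::string text;
};

// Parse a single input line.
// Assumed format (not confirmed by the thread):
//   <seg id=42>translate this!</seg>   or   <seg id="42">translate this!</seg>
// Any line that does not match exactly is returned unchanged as plain text,
// so malformed markup degrades to ordinary input instead of failing.
SegInput ParseSegLine(const std::string& line) {
    const SegInput plain{std::nullopt, line};
    const std::string open = "<seg id=";
    const std::string close = "</seg>";

    // The line must start with the opening tag prefix.
    if (line.compare(0, open.size(), open) != 0) return plain;

    // Find the end of the opening tag.
    const std::size_t tagEnd = line.find('>', open.size());
    if (tagEnd == std::string::npos) return plain;

    // Extract the id and strip optional surrounding double quotes.
    std::string idStr = line.substr(open.size(), tagEnd - open.size());
    if (idStr.size() >= 2 && idStr.front() == '"' && idStr.back() == '"') {
        idStr = idStr.substr(1, idStr.size() - 2);
    }

    // Accept only a short run of decimal digits (keeps std::stoll from throwing).
    if (idStr.empty() || idStr.size() > 18 ||
        idStr.find_first_not_of("0123456789") != std::string::npos) {
        return plain;
    }

    // The closing tag must be the very end of the line.
    const std::size_t closePos = line.rfind(close);
    if (closePos == std::string::npos || closePos < tagEnd + 1 ||
        closePos + close.size() != line.size()) {
        return plain;
    }

    return SegInput{std::stoll(idStr),
                    line.substr(tagEnd + 1, closePos - tagEnd - 1)};
}
```

A caller would use `translationId` when it is present and fall back to the internal counter otherwise. The empty optional plays the role of the "markup was NULL" case from the discussion above: an unmarked sentence is just a marked sentence with no id.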