Re: baseline, tuning, bleu, parameters, prob
Sergey Protasov <svp@...>
2007-06-03 09:29:54 GMT
Thank you, Hieu!
1. I found that the tuning stage adds only 0.4-3.4 points to the BLEU score:
http://hermes.itc.it/people/bertoldi/JHU-CLSP/met-result.pdf
2. Each tuning iteration of the baseline system takes one day on my
machine (P4, 2 GHz, 2 GB), so the whole tuning stage would take 14-15
days, which is a very long time.
3. OK, I will try it.
4. For my language pair (English-Russian) it is difficult to get
parallel corpora like the Europarl transcripts.
I have a lot of text (literature, fiction, fairy tales) that I need to
align, and then I need to remove all the bad pairs.
So I need some criterion to sort the sentence pairs and remove those
with low scores.
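
One possible criterion, as a rough sketch only: rank the aligned pairs by a per-token score and drop the tail. The sketch assumes each pair already has a log-domain score from some model (for example a decoder's total score); how that score is obtained is not shown here. The names ScoredPair, normalized_score, filter_pairs and keep_fraction are made up for illustration and are not part of Moses.

```python
from typing import List, Tuple

# (source sentence, target sentence, log-domain score).
# Assumption: the score comes from an external model; it is not computed here.
ScoredPair = Tuple[str, str, float]


def normalized_score(pair: ScoredPair) -> float:
    """Divide the score by the target length in tokens, so that long
    sentences are not ranked low just because they are long."""
    _, target, log_score = pair
    n_tokens = max(len(target.split()), 1)  # guard against empty targets
    return log_score / n_tokens


def filter_pairs(pairs: List[ScoredPair], keep_fraction: float = 0.8) -> List[ScoredPair]:
    """Sort pairs from best to worst normalized score and keep the top fraction."""
    ranked = sorted(pairs, key=normalized_score, reverse=True)
    n_keep = int(len(ranked) * keep_fraction)
    return ranked[:n_keep]
```

The fraction to keep (or an absolute threshold instead) would have to be chosen by inspecting a sample of the ranked pairs by hand.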
2007/6/2, Hieu Hoang <h.hoang@...>:
> Hi Sergey,
>
> Saw your last email. I don't really know the answers, but I'll try my best:
>
> 1. You must do parameter tuning; it helps a lot.
> 2. MERT should take about a dozen iterations.
> 3. The 7 distortion parameters:
> 1st - distance-based distortion penalty
> next 6 parameters - lexicalised re-ordering
> You can reduce the distortion parameters to 1 by not doing
> lexicalised re-ordering.
> The 5 translation parameters are detailed here:
> http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
> You can reduce these to 1 by using only 1 translation score; I suppose
> the best one to keep would be phi(e|f). After doing the MERT, you can
> also reduce them from 5 to 1 by combining all the parameters together,
> since you now have the optimal weights.
>
> 4. Not sure what you mean. However, in general, the decoder deals with
> scores, or 'feature functions'. You can't convert the scores to a
> probability.
> You can get the score for any translation by calling the function
> GetTotalScore() in the hypothesis class.
>
> Hieu Hoang
> www.hoang.co.uk/hieu
>
>
> -----Original Message-----
> From: moses-support-bounces@...
> [mailto:moses-support-bounces@...] On Behalf Of Sergey Protasov
> Sent: 01 June 2007 21:00
> To: moses-support@...
> Subject: [Moses-support] baseline, tuning, bleu, parameters, prob
>
>
> Dear moses experts,
>
> I have some questions; please help.
>
> 1. Baseline French-to-English system on the Europarl corpus.
> What BLEU scores should we get with and without the tuning stage? Does
> the tuning stage help?
>
> 2. Tuning stage and the MERT algorithm. How long does it take for the
> baseline system? How many iterations? What is the best BLEU score at
> each iteration? I get 0.03, 0.04, 0.11 on the first 3 iterations...
>
> 3. We have 7 parameters for the distortion model (-d) and 5 parameters
> for the translation model (-tm) in the baseline system. Where do these
> numbers come from? Can I reduce the number of these parameters to 1 (-d)
> and 1 (-tm) for better robustness?
>
> 4. For a given Moses decoder (translation model "f->e", language model
> e, distortion model, word penalty) and a given foreign sentence "f",
> how can I compute (using mosesdecoder) the probability (cross-entropy
> per sentence) of a given sentence "e" that is a translation of "f"?
> For example, I have a mosesdecoder "f->e" and I would like to align
> sentences more exactly and remove bad alignments with low translation
> probability. How can I compute that probability?
>
>
> Thanks in advance!
>
> http://sz.ru/parser/
> _______________________________________________
> Moses-support mailing list
> Moses-support@...
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>