Panos Kanavos | 2 Oct 2011 11:12

create unfactored model from tagged files

Hi all,

Is it possible to train an unfactored model when one of the files is tagged
with POS tags? I don't want to use factors when I train with the tagged file
as the source language. I have tried adding the parameters
"--alignment-factors 0-0", "--reordering-factors 0-0" and
"--translation-factors 0-0", but the resulting tables always contain both
factors (words and tags) on the source side.

Thanks.

Panos
Hieu Hoang | 2 Oct 2011 16:37

Re: create unfactored model from tagged files

I used to do this quite often and it worked OK; however, the training
scripts may have changed slightly since then.

You can also create a new corpus file of just the factors you need by 
running the following script from the toolkit
     reduce_combine.pl
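
For illustration only, here is a minimal Python sketch of the same idea: keep just the surface word from every token of a tagged corpus. It is not reduce_combine.pl and says nothing about that script's arguments. It assumes the usual Moses factored format, where the factors of a token are joined with "|" (for example "house|NN"); the function name keep_factors is made up for this example.

```python
def keep_factors(line, indices=(0,), sep="|"):
    """Return the line with only the selected factors kept for every token.

    Assumes tokens are whitespace-separated and factors are joined by `sep`.
    """
    reduced = []
    for token in line.split():
        factors = token.split(sep)
        # Re-join the requested factors in the requested order.
        reduced.append(sep.join(factors[i] for i in indices))
    return " ".join(reduced)


# Small demonstration on a hand-made tagged sentence.
tagged = "the|DT house|NN is|VBZ small|JJ"
print(keep_factors(tagged))               # surface words only
print(keep_factors(tagged, indices=(1,)))  # POS tags only
```

Running every line of the tagged file through such a filter gives a plain corpus that can be trained on without any factor options.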

_______________________________________________
Moses-support mailing list
Moses-support@...
http://mailman.mit.edu/mailman/listinfo/moses-support
Panagiotis Kanavos | 2 Oct 2011 17:33

Re: create unfactored model from tagged files

Thanks a lot Hieu, I will use this script then.

Kaveh Taghipour | 2 Oct 2011 19:52

Calculating final scores for sentences in an N-Best list

Hi,

I have generated an N-best list with moses, but I do not know how to compute the final score. I tried a log-linear model but came up with a wrong number. For example:

moses.ini:
----------------------------------------
# distortion (reordering) weight
[weight-d]
0.120662

# language model weights
[weight-l]
0.251853

# translation model weights
[weight-t]
0.0675335
0.103843
0.0720954
0.00630806
0.101934

# word penalty
[weight-w]
-0.275771

Assuming a log-linear model, the score for the following sentence should be -54.076 but the decoder prints:

0 ||| ....  ||| d: 0 lm: -219.367 w: -6 tm: -3.17999 -4.43847 -2.63049 -3.87513 3.99959 ||| -254.076

Thank you in advance for helping.

Cheers,
Kaveh
Bill Tang | 2 Oct 2011 20:04

Question on model for the demo system

I tried the Moses online demo and found it to be very good at translating French to English. Is the model for the demo available online? I have compiled mosesdecoder, irstlm and srilm, but my local build appears to handle only simple sentences. I am new to Moses. Which language model does the online demo use, and what kind of training data was used?

Cheers,

Bill Tang

Christian Hardmeier | 2 Oct 2011 20:23

Re: Calculating final scores for sentences in an N-Best list

I think you're on the right track. For some reason, moses doesn't report the OOV penalty feature, which adds
a hardcoded penalty of -100 to the total score for each input word that was copied to the output because no
suitable translation was found in the phrase table. Your test sentence probably contains two unknown
words that account for the -200 difference between the score you calculated and the one output by the
decoder. Does that make sense?
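
As a rough check of that arithmetic, the Python sketch below recomputes the weighted sum from the moses.ini weights and the n-best feature values quoted in this thread, then divides the remaining gap by the -100 penalty. The variable and function names are invented for this example, and the assumption that the whole gap comes from the OOV penalty is exactly the hypothesis being tested, not something the decoder output states.

```python
# Weights copied from the moses.ini excerpt quoted in this thread.
WEIGHTS = {
    "d": [0.120662],
    "lm": [0.251853],
    "w": [-0.275771],
    "tm": [0.0675335, 0.103843, 0.0720954, 0.00630806, 0.101934],
}

# Feature values copied from the quoted n-best entry.
FEATURES = {
    "d": [0.0],
    "lm": [-219.367],
    "w": [-6.0],
    "tm": [-3.17999, -4.43847, -2.63049, -3.87513, 3.99959],
}

REPORTED_TOTAL = -254.076  # last field of the n-best entry
OOV_PENALTY = -100.0       # per unknown word, as described above


def weighted_sum(weights, features):
    """Log-linear score: sum of weight * feature value over all features."""
    total = 0.0
    for name, values in features.items():
        if len(weights[name]) != len(values):
            raise ValueError(f"weight/feature count mismatch for {name!r}")
        total += sum(w * v for w, v in zip(weights[name], values))
    return total


visible = weighted_sum(WEIGHTS, FEATURES)
gap = REPORTED_TOTAL - visible
oov_count = round(gap / OOV_PENALTY)

print(f"score from reported features: {visible:.3f}")
print(f"unexplained gap:              {gap:.3f}")
print(f"implied unknown words:        {oov_count}")
```

The visible features add up to roughly the -54.076 that Kaveh computed, and the remaining gap of about -200 corresponds to two unknown words.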

Best,
Christian 

Philipp Koehn | 2 Oct 2011 21:58

Re: Question on model for the demo system

Hi,

the French-English system was trained from data available
for the WMT 2011 workshop ( http://www.statmt.org/ ). Fairly
standard settings were used. In fact, if you check out the
example configuration files for experiment.perl, then you
should be able to pretty much reproduce the model.

-phi

Bill Tang | 3 Oct 2011 02:02

Re: Question on model for the demo system

Thank you so much. I will try it.

Cheers,

Bill Tang


Jehan Pages | 3 Oct 2011 06:25

Re: Compiling Moses with IRSTLM

Hi,

On Mon, Sep 26, 2011 at 6:44 PM, Kenneth Heafield <moses@...> wrote:
> Hi,
>
>    Since the sample language models are provided for you, it is no
> longer necessary to compile SRILM or IRSTLM (though you can if you want
> to use the specific features they provide; otherwise they're just
> slower).  I've updated the getting started documentation.

Thanks. It works better now that I know where to configure the LM
toolkit. :-) One small mistake crept in when you updated the page: it
seems you removed the command for running Moses with the samples:
../moses-cmd/src/moses -f phrase-model/moses.ini < phrase-model/in > out

(Note: this has to be run from sample-models/, since it apparently opens
lm/europarl.srilm.gz relative to the current directory.)

Anyway thanks for the fast answers! Now that the basic test seems to
work well, I will try to follow the more comprehensive training guide.

Jehan

> Kenneth
>
> On 09/26/11 09:32, Jehan Pages wrote:
>> Hi,
>>
>> On Mon, Sep 26, 2011 at 3:48 PM, Nicola Bertoldi <bertoldi@...> wrote:
>>> I am going to release (very soon) a new version of Moses including  new LM types
>>> Stay tuned on IRSTLM webpage
>>>
>>> If you need immediately, get the code from the IRSTLM SF repository
>>>
>>> you can download revision 452, which properly interfaces with the latest revision of Moses
>> Thanks for the answer. As right now, I am mainly testing this engine,
>> the development version from the repo suits me ok. Anyway Moses
>> compiled fine using revision 452 of IRSTLM. So that's great. Thanks
>> again!
>>
>> Also just to be sure, in the "getting started" page, the sample models
>> which are linked are only for SRILM, right? Because I wanted to test
>> as explained in the page, and I get:
>>
>> [...]
>> Start loading LanguageModel lm/europarl.srilm.gz : [0.000] seconds
>> ERROR:Language model type unknown. Probably not compiled into library
>> Segmentation fault
>>
>>
>> Seeing the srilm.gz extension, I guess that won't work with only
>> IRSTLM compiled in. That information may be worth being updated into
>> the "Getting started" page. :-)
>> I guess I'll have to test directly with more complete data.
>>
>> Jehan
Kaveh Taghipour | 3 Oct 2011 12:28

Re: Calculating final scores for sentences in an N-Best list

Hi Christian,

Thanks, you are right. But I don't see the need for such a penalty: all candidates for a given source sentence contain the same number of OOVs, so the penalty cannot change their ranking. Do you know why it is there?

Cheers,
Kaveh



