Peter Reutemann | 1 Apr 2007 02:35

Re: Problem re-opening experiment xml file in Experimenter: "Illegal options"

> i'm sure i'm doing something wrong here...
>
> i downloaded the source from the cvs and compiled it without any problems.
> then i made sure my classpath points to the new distribution. however, it
> turns out i still can't open the experiment definition file; weka gives me
> the exact same cause for failure as it did previously.
>
> how can i make sure that the code i downloaded contains the bug fix you
> submitted?

Grabbing the code from CVS is the guarantee that you have the bug fix. The
problem is in your saved experiment file, since that still contains the
wrong order of the parameters. Delete the "-P 100" there and you'll be
able to open/save it again from then on.

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
Peter Reutemann | 1 Apr 2007 02:42

Re: creating arff file: problem with minus number

> Please help me solve a problem I have encountered when creating an
> ARFF file. I have negative values (e.g. -120, -45) in each variable.
> When I load the ARFF file into Weka, it says:
> File "Tc.arff" not recognised as an arff file. Reason: premature end of
> line, read Token(EOL), line 17.

Without seeing the file it's hard to tell. It would help if you could
post the first 20-30 lines of your "Tc.arff" file, since the error occurs
in line 17. The EOL (= end of line) error normally happens when Weka
expects to read more attribute values than there are in that line. This
can be caused by too few values in that line or by wrongly quoted
strings/nominal values (which also results in a wrong number of values).
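
One way to locate such a line is to count the values in each data row and
compare that count with the number of @attribute declarations. The
following is only a rough sketch in Python, not Weka's parser: the function
name find_short_rows and the sample data are made up for illustration, and
it assumes dense (non-sparse) rows with single-quoted strings.

```python
import csv


def find_short_rows(arff_text):
    """Return (line_number, found, expected) for data rows with a wrong value count."""
    n_attributes = 0
    in_data = False
    problems = []
    for line_no, raw in enumerate(arff_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            lowered = line.lower()
            if lowered.startswith("@attribute"):
                n_attributes += 1
            elif lowered.startswith("@data"):
                in_data = True
            continue
        # Assumption: dense rows, comma-separated, strings quoted with '...'
        values = next(csv.reader([line], quotechar="'", skipinitialspace=True))
        if len(values) != n_attributes:
            problems.append((line_no, len(values), n_attributes))
    return problems


# Hypothetical file content: the last row is missing one value.
sample = """@relation Tc
@attribute temp numeric
@attribute delta numeric
@attribute class {low,high}
@data
-120,-45,low
-30,high
"""
print(find_short_rows(sample))  # [(7, 2, 3)]
```

For the made-up sample it reports line 7, which has two values where three
are expected. The negative numbers themselves are not a problem.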

> Loading any *.arff file that only has values above 0 works fine. I am
> quite sure that the problem is caused by the negative values in my
> training set.

What type of attribute are these negative numbers declared as: numeric or
nominal?

> Any idea how to tell Weka to accept those negative numbers?

If it's a numeric attribute, nothing needs to be done. A nominal attribute
must list all of its possible values in the header, including the negative
ones.
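
To illustrate the nominal case, here is a small Python sketch (again not
Weka code) that reports data values which are missing from a nominal
attribute's declared label list. The function name, the regular expression
and the sample header are assumptions made for this example, and the row
splitting is naive: it assumes no quoted values containing commas.

```python
import re

# Matches nominal declarations such as: @attribute shift {-120,-45,0}
NOMINAL = re.compile(r"@attribute\s+(\S+)\s+\{(.*)\}", re.IGNORECASE)


def undeclared_nominal_values(arff_text):
    """Return (line_number, attribute, value) for values not listed in the header."""
    attributes = []  # (name, set of labels) for nominal, (name, None) otherwise
    in_data = False
    problems = []
    for line_no, raw in enumerate(arff_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            match = NOMINAL.match(line)
            if match:
                labels = {label.strip() for label in match.group(2).split(",")}
                attributes.append((match.group(1), labels))
            elif line.lower().startswith("@attribute"):
                attributes.append((line.split()[1], None))
            elif line.lower().startswith("@data"):
                in_data = True
            continue
        # Assumption: plain comma-separated rows without quoted commas.
        for (name, labels), value in zip(attributes, line.split(",")):
            value = value.strip()
            if labels is not None and value != "?" and value not in labels:
                problems.append((line_no, name, value))
    return problems


# Hypothetical file: "temp" is numeric, "shift" is nominal.
sample = """@relation Tc
@attribute temp numeric
@attribute shift {-120,-45,0}
@data
-3.5,-120
-7,-30
"""
print(undeclared_nominal_values(sample))  # [(6, 'shift', '-30')]
```

In the made-up sample, the numeric column accepts -3.5 and -7 as they are,
while -30 is flagged because it is not among the labels declared for the
nominal attribute.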

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

Peter Reutemann | 1 Apr 2007 02:45

Re: A nominal attribute (Moment_3) cannot have duplicate labels ('(0-0]').

> I tried to make feature selection using Gain Ratio on a dataset and I got
> the error message :
> A nominal attribute ... cannot have duplicate labels ('(0-0]').
>
> It's an error that people have encountered before, though I didn't see
> any solutions.
> Does anyone have a specific solution? (I have a dataset with very small
> numbers)

Weka saves numbers with at most 6 digits after the decimal point. If
the Discretize filter (which is used internally) produces bins whose
boundaries are smaller than 1E-6 (10^-6), then this exception is thrown,
because the boundaries are written out identically and one ends up adding
a label that is identical to one added previously. Scaling that attribute
as a preprocessing step should help.
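
The effect can be reproduced outside Weka. The Python sketch below is only
an illustration under stated assumptions: the label layout "(low-high]" is
modelled on the error message, the six-decimal formatting stands in for
how the numbers are written out, and all function names and cut points are
made up.

```python
def format_bound(value):
    """Print a number with at most six decimals, dropping trailing zeros."""
    text = f"{value:.6f}".rstrip("0").rstrip(".")
    return "0" if text == "-0" else text


def interval_labels(cut_points):
    """Build one label per bin from sorted cut points, e.g. '(0.5-1.5]'."""
    bounds = ["-inf"] + [format_bound(c) for c in cut_points] + ["inf"]
    labels = []
    for index in range(len(bounds) - 1):
        closing = ")" if index == len(bounds) - 2 else "]"
        labels.append(f"({bounds[index]}-{bounds[index + 1]}{closing}")
    return labels


def duplicated(labels):
    """Return the labels that occur more than once, in first-seen order."""
    seen, repeats = set(), []
    for label in labels:
        if label in seen and label not in repeats:
            repeats.append(label)
        seen.add(label)
    return repeats


# Hypothetical cut points far below 1E-6: they all print as "0".
tiny_cuts = [1e-7, 2e-7, 3e-7]
print(duplicated(interval_labels(tiny_cuts)))  # ['(0-0]']

# Scaling the attribute first (here by 1e7) keeps the bounds distinct.
scaled_cuts = [c * 1e7 for c in tiny_cuts]
print(duplicated(interval_labels(scaled_cuts)))  # []
```

With the tiny cut points, two bins get the same label '(0-0]'. After
scaling, the labels become '(1-2]', '(2-3]' and so on, and no duplicates
remain.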

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
Peter Reutemann | 1 Apr 2007 07:24

Re: how to set up an iteration in experimenter advanced setup

> In the book "Data Mining: Practical Machine Learning Tools and
> Techniques, Second Edition" written by Ian H. Witten and Eibe Frank, it
> is said "For example, in advanced mode you can set up an iteration to
> test an algorithm with a succession of different parameter values..."
> (top of p. 443), but it doesn't explain the details of how to do it. So
> how to set it up and how to generate learning curves?

Start the Experimenter
-> click on "new" experiment
-> select "advanced"
-> choose the result generator that you wanna use
   e.g., CrossValidationResultProducer
-> edit the result generator by clicking on it
-> edit the "splitEvaluator" property
-> select the classifier that you wanna try several options for,
   e.g., J48
-> accept all dialogs
-> set "Generator properties" to "enabled"
-> choose the classifier property that you wanna traverse
   (open "splitEvaluator" and then "classifier"),
   e.g., the "seed" property
-> add different seed values, e.g., 1, 2 and 3
-> run the experiment

Drawback (maybe bug?):
In the analysis you can't distinguish between those three classifier
setups (I didn't find any field I could use in addition to get a unique
[...]

Peter Reutemann | 1 Apr 2007 07:32

Re: adaboosting.weightThreshold

>  Hi,  I think the parameter "weightThreshold" of AdaBoosting algorithm can
> be set from 0 to 100, but when I searched "weightThreshold", I found it
> was set to 1000 or even 10000.  Am I right?

Yes, you're right. If you take a look at the code, you'll see that only
values below 100 lead to a subset of the training data being selected. If
the value is 100 or above, the full training set is used, so 1000 or 10000
behaves just like 100.
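
The Python sketch below only illustrates the behaviour described above; it
is not Weka's code. How the subset is chosen for values below 100 is an
assumption here (heaviest instances first, until the requested percentage
of the total weight is covered), and the function name and weights are
made up.

```python
def select_training_subset(weights, weight_threshold):
    """Return the indices of the instances to train on.

    weight_threshold is a percentage; 100 or more means "use everything".
    """
    if weight_threshold >= 100:
        return list(range(len(weights)))  # full training set

    # Assumption: keep the heaviest instances until they cover the
    # requested share of the total weight.
    target = sum(weights) * weight_threshold / 100.0
    by_weight = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen, mass = [], 0.0
    for index in by_weight:
        if mass >= target:
            break
        chosen.append(index)
        mass += weights[index]
    return sorted(chosen)


weights = [5.0, 1.0, 3.0, 1.0]  # hypothetical instance weights
print(select_training_subset(weights, 80))     # [0, 2]
print(select_training_subset(weights, 100))    # [0, 1, 2, 3]
print(select_training_subset(weights, 10000))  # [0, 1, 2, 3]
```

Any value of 100 or more takes the first branch, which is why 1000 and
10000 give the same result as 100.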

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
clint | 1 Apr 2007 08:11

Re: A nominal attribute (Moment_3) cannot have duplicate labels ('(0-0]').


Thank you


Fredrik Olsson | 1 Apr 2007 08:52

Re: Problem re-opening experiment xml file in Experimenter: "Illegal options"



On 4/1/07, Peter Reutemann <fracpete <at> cs.waikato.ac.nz> wrote:
>> how can i make sure that the code i downloaded contains the bug fix you
>> submitted?
>
> Grabbing the code from CVS is the guarantee that you have the bug fix. The
> problem is in your saved experiment file, since that still contains the
> wrong order of the parameters. Delete the "-P 100" there and you'll be
> able to open/save it again from then on.


but of course, i should've thought about that;)

thanks again!

best,

fredrik


_____________________
fredrik olsson
http://smudo.org

_______________________________________________
Wekalist mailing list
Wekalist <at> list.scms.waikato.ac.nz
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
Le Thi Kim | 1 Apr 2007 15:22

RE: creating arff file: problem with minus number

Many thanks, Peter. I have double checked. Yes, you are right, it works
now. 

davidl | 1 Apr 2007 19:18

Graph Size

Hi,

I'm running J48 with around 20 values per attribute for 5 attributes and
am very pleased with your product. When piping the results into the graph
viewer, however, the tree it produces appears a bit crammed together and
unreadable. Is there any way of manually stretching/increasing the size of
the tree so as to increase readability?

best regards,

David
Peter Reutemann | 1 Apr 2007 23:20

Re: Graph Size

> I'm running J48 with around 20 values per attribute for 5 attributes and
> am very pleased with your product. When piping the results into the graph
> viewer, however, the tree it produces appears a bit crammed together and
> unreadable. Is there any way of manually stretching/increasing the size of
> the tree so as to increase readability?

There's no "manual" way, only automatic. If you right-click (or
alt-left-click), you get a popup menu: select either "fit on screen" or
"auto scale". If you left-click on the graph and hold, you can move it
around.
If you alt-shift-left-click, you can also save the tree to a file. With
the latest snapshot (available from the Download section on the Weka
homepage) you can also specify the size of the image; until now, you only
got what was visible on the screen.

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
