Peter Reutemann | 1 Jul 2005 01:11

Re: Issues with Weka GUI

> I've just tried to install weka3-4-4 on a new winXP pc. jdk1.5.0_04 /
> jre1.5.0_04 installed and pc has been rebooted since installation.
> 
> The GUI chooser displays after selecting 'RunWeka.bat' but then if I
> select the 'Explorer' the subsequent screen opens but there are no
> graphical icons and it certainly doesn't look the same as it did on my
> other PC.
> 
> When I select open file to load a file there are numerous errors
> reported about inability to find java.awt etc...

Hmmm... I just installed 1.5.0_04 on my XP machine and it works fine...

Is the 1.5.0_04 the only java version you installed? You might wanna 
check what the default java version on your system is. Open a DOS Box 
(Start -> Run -> cmd.exe <Enter>) and type "java -version" followed by 
<Enter>. You should get something like this:

   java version "1.5.0_04"
   Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
   Java HotSpot(TM) Client VM (build 1.5.0_04-b05, mixed mode)

If this displays a different version, you might need to check (and 
modify) your registry (Start -> Run -> regedit.exe <Enter>). Go to the 
following key:

   HKEY_LOCAL_MACHINE\SOFTWARE\JavaSoft\Java Runtime Environment

There, check what version is listed under the value "CurrentVersion". It 
should be "1.5". Then go to the subkey "1.5" and check the "JavaHome" value 
[...]

Paul | 1 Jul 2005 01:23

Re: Issues with Weka GUI

Thanks for the suggestions; however, still no luck.

Response from java -version was as expected. Hardcoding the java path
into the .bat file didn't improve things. I've the latest JVM - was
the suggestion to try an older version?

In the past I've installed weka onto winXP boxes with no hassles whatsoever...

Any other suggestions ...

/paul

On 7/1/05, Peter Reutemann <fracpete <at> waikato.ac.nz> wrote:
> > I've just tried to install weka3-4-4 on a new winXP pc. jdk1.5.0_04 /
> > jre1.5.0_04 installed and pc has been rebooted since installation.
> >
> > The GUI chooser displays after selecting 'RunWeka.bat' but then if I
> > select the 'Explorer' the subsequent screen opens but there are no
> > graphical icons and it certainly doesn't look the same as it did on my
> > other PC.
> >
> > When I select open file to load a file there are numerous errors
> > reported about inability to find java.awt etc...
> 
> Hmmm... I just installed 1.5.0_04 on my XP machine and it works fine...
> 
> Is the 1.5.0_04 the only java version you installed? You might wanna
> check what the default java version on your system is. Open a DOS Box
> (Start -> Run -> cmd.exe <Enter>) and type "java -version" followed by
> <Enter>. You should get something like this:

Peter Reutemann | 1 Jul 2005 01:44

Re: Issues with Weka GUI

> Thanks for the suggestions; however, still no luck.
> 
> Response from java -version was as expected. Hardcoding the java path
> into the .bat file didn't improve things. I've the latest JVM - was
> the suggestion to try an older version?

Hmm... The only thing I can think of, would be to check your JAVA_HOME 
environment variable. Does that point to your JDK dir, just above the 
"bin" directory?

Otherwise try to compile the attached class and run it with no 
parameters. It outputs some Java system properties. Check whether the 
following properties are correct (i.e., pointing to the correct 
directories):
   java.library.path
   java.home
   java.ext.dirs
   sun.boot.class.path
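
For readers without the attachment, a minimal sketch along the same lines is given here. It is not the class attached to the original mail; the class name ShowJavaProps and its collect() helper are made up for illustration. It only uses System.getProperty, and a property that the running JVM does not define is reported as null.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; NOT the class attached to the mail.
final class ShowJavaProps {

    // The four properties named in the message above.
    static final List<String> KEYS = Arrays.asList(
        "java.library.path", "java.home", "java.ext.dirs", "sun.boot.class.path");

    // Looks up each property; one that this JVM does not define maps to null.
    static Map<String, String> collect() {
        Map<String, String> out = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            out.put(key, System.getProperty(key));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Saved as ShowJavaProps.java, it can be compiled with "javac ShowJavaProps.java" and run with "java ShowJavaProps"; each printed path should point into the JDK/JRE you expect Weka to use.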

Sort of running out of ideas... Did you try re-installing the JDK?

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/     +64 (7) 838-4466 Ext. 5174
import java.util.Properties;
import java.util.Enumeration;

[...]

Santi Planet | 1 Jul 2005 11:34

IBk and IB1

Hello,
 
I don't fully understand the differences between IBk and IB1. I found this on the wekalist:
 
"There are a couple of differences. The biggest one is probably that IBk
will keep extending the list of neighbours when several instances are
equally far away. This happens quite frequently on datasets with
nominal attributes (e.g. more than one neighbour might be used for
prediction if k=1). IB1 doesn't do this and uses the first nearest
neighbour it finds."
 
Thinking about nominal attributes, I understand that IBk takes "several instances if they are equally far away" and then picks the majority class of these instances, while in the same case IB1 only takes "the first nearest neighbour it finds". But suppose you had three instances equally far away, two of them belonging to class A and the other to class B: which instance would be chosen? All three instances are equally far away, so I suppose that IBk would use all three and the assigned class would be class A. But in IB1, is the choice of the first nearest neighbour a random choice? Is this the difference?
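
The three-instance case can be sketched in plain Java. This only illustrates the behaviour described in the quoted text; it is not Weka's implementation, and the class and method names are made up. It assumes that "the first nearest neighbour it finds" means the first one in scan order (a strict "<" comparison), which would make the IB1-style result depend on the order of the training data rather than on chance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the two tie-handling behaviours; not Weka code.
final class TieHandlingSketch {

    // IB1-style (as quoted): keep the first instance with the smallest
    // distance. The strict "<" means later ties never replace it.
    // Assumes at least one training instance.
    static String firstNearest(double[] dist, String[] cls) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] < dist[best]) {
                best = i;
            }
        }
        return cls[best];
    }

    // IBk-style with k=1 (as quoted): keep every instance tied at the
    // minimum distance, then return the majority class among them.
    static String tiedMajority(double[] dist, String[] cls) {
        double min = dist[0];
        for (double d : dist) {
            min = Math.min(min, d);
        }
        Map<String, Integer> votes = new LinkedHashMap<String, Integer>();
        for (int i = 0; i < dist.length; i++) {
            if (dist[i] == min) {
                Integer count = votes.get(cls[i]);
                votes.put(cls[i], count == null ? 1 : count + 1);
            }
        }
        String winner = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                winner = e.getKey();
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        double[] dist = {1.0, 1.0, 1.0};   // three equally distant instances
        String[] cls = {"B", "A", "A"};    // two of class A, one of class B
        System.out.println(firstNearest(dist, cls)); // prints B
        System.out.println(tiedMajority(dist, cls)); // prints A
    }
}
```

With the class-B instance listed first, the first-nearest rule returns B while the tie-extending rule returns A; reordering the data so that an A instance comes first makes both return A.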
 
Thanks,
 
Santi.
_______________________________________________
Wekalist mailing list
Wekalist <at> list.scms.waikato.ac.nz
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
P. Klaas-Welter | 1 Jul 2005 11:35

Re: error measures of nominal attributes

Dear Eibe, dear Weka-users,

 

Thank you very much! You were right: I was mistaken about the vectors. But I've finally got it :-)

 

So how should the values of the error measures be interpreted? Here are some assumptions based on what I have understood (from your book, this mailing list, etc.):
*) If the root mean squared error differs greatly from the mean absolute error, this could indicate that the data has large and/or many outliers.
*) The relative absolute error compares the actual result to the result of a “simple” calculation (ZeroR for nominals).
If this value is >100%, the simple calculation does better.
*) If the root relative squared error is >100% and the relative absolute error is <100%, this indicates that the actual algorithm has more problems
with outliers than the “simple” calculation (ZeroR for nominals).
Am I right with these assumptions?

 

But some points are still not clear to me. Please have a look at my example:

Decision Table: Options: -R -I
Number of training instances: 17177
Number of Rules : 5049
Non matches covered by IB1.
Best first search for feature set,
terminated after 5 non improving subsets.
Evaluation (for feature selection): CV (leave one out)
Feature set: 1,2,3,4,5

Correctly Classified Instances        5531               32.2019 %
Incorrectly Classified Instances     11645               67.7981 %
Kappa statistic                          0.1378
Mean absolute error                      0.0073
Root mean squared error                  0.0663
Relative absolute error                 88.7076 %
Root relative squared error            103.7543 %
Total Number of Instances            17176

About every third instance was correctly classified. But the Kappa statistic is quite bad (possible best value: 1, worst value: 0).
On the other hand, the mean absolute error is quite good (possible best value: 0, worst value: 1). How should this be interpreted?

With best wishes, Petra


Eibe Frank <eibe <at> cs.waikato.ac.nz> wrote on 30.06.05 23:10:27:


The formula for the sum is correct but it seems like you are
misunderstanding how the two vectors are computed. One of the vectors
contains the predicted class probabilities that are output by the model
for a particular instance, the other vector contains the observed class
probabilities for that particular instance. The latter(!) vector has
one element that is 1 (the one for the actual class of the instance)
and all other elements are 0.
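
As an illustration of the two vectors described above, here is a small plain-Java sketch. It is not Weka's Evaluation code; the class and method names are made up, and it follows the description in this thread only: per-instance losses are summed over the class values and then averaged over the instances. Weka's own implementation may normalise these sums differently, so the numbers need not match the summary output exactly.

```java
// Illustrative sketch only; not Weka's Evaluation code.
final class NominalErrorSketch {

    // Absolute loss for one instance: sum over classes of |p_i - a_i|,
    // where a_i is 1 for the actual class and 0 for every other class.
    static double absoluteLoss(double[] predicted, int actualClass) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double observed = (i == actualClass) ? 1.0 : 0.0;
            sum += Math.abs(predicted[i] - observed);
        }
        return sum;
    }

    // Quadratic loss for one instance: sum over classes of (p_i - a_i)^2.
    static double quadraticLoss(double[] predicted, int actualClass) {
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double observed = (i == actualClass) ? 1.0 : 0.0;
            double diff = predicted[i] - observed;
            sum += diff * diff;
        }
        return sum;
    }

    // Mean of the per-instance absolute losses.
    static double meanAbsoluteError(double[][] predicted, int[] actual) {
        double total = 0.0;
        for (int j = 0; j < actual.length; j++) {
            total += absoluteLoss(predicted[j], actual[j]);
        }
        return total / actual.length;
    }

    // Square root of the mean of the per-instance quadratic losses.
    static double rootMeanSquaredError(double[][] predicted, int[] actual) {
        double total = 0.0;
        for (int j = 0; j < actual.length; j++) {
            total += quadraticLoss(predicted[j], actual[j]);
        }
        return Math.sqrt(total / actual.length);
    }

    public static void main(String[] args) {
        // Two instances, three class values; rows are predicted probabilities.
        double[][] predicted = {{0.7, 0.2, 0.1}, {0.1, 0.6, 0.3}};
        int[] actual = {0, 2}; // index of the observed class per instance
        System.out.println("MAE  = " + meanAbsoluteError(predicted, actual));
        System.out.println("RMSE = " + rootMeanSquaredError(predicted, actual));
    }
}
```

For the two example instances the per-instance absolute losses are 0.6 and 1.4 and the quadratic losses are 0.14 and 0.86, so the program prints values of about 1.0 and 0.71.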

Cheers,
Eibe

On Jul 1, 2005, at 1:47 AM, P. Klaas-Welter wrote:

> I just noticed that the formula for the sum is not readable, therefore:
>
> d_j = \sum_{i=1}^{m} |p_i - a_{ji}|  means: the sum from i=1 to m over
> |p_i - a_{ji}|
>
>
> "P. Klaas-Welter" <P.Klaas-Welter <at> web.de> wrote on 30.06.05 12:06:43:
>
>
> Dear Eibe,
>
> thank you very much! This was very helpful (and now I also found the
> right point in the book ;-)
>
> Just to be sure, and because the error measures are so important, I
> would like to describe the other error values and ask you to check
> whether I'm right:
> Let the nominal attribute have m different values. Let the vector P
> contain all probabilities p_i that the nominal attribute has the value
> i. Those probabilities come from the frequencies of each value i. Let
> the vector A_j be the result from the model for the instance j. When k
> is the value that the model computed for instance j, then all entries
> in A_j are zero except a_{jk}, which is 1.
> To compute the mean absolute error you have to compute the absolute
> difference between vector P and vector A_j for each instance. This is
> done component-wise and is then summed up:
> d_j = \sum_{i=1}^{m} |p_i - a_{ji}|.
> These differences d_j (for the single instances) are then summed up
> over all instances and then divided by the number of instances.
>
> And for the root mean squared error, in d_j you don't take the absolute
> value but the square. And before dividing by the number of
> instances you take the square root.
>
> Thank you very much! And with best regards, Petra
>
>  
>
>
> Eibe Frank wrote on 29.06.05 23:35:59:
> >
> >
> > On Jun 30, 2005, at 12:50 AM, P. Klaas-Welter wrote:
> >
> > > Could someone please help me to understand the mean absolute error,
> > > root mean squared error,
> > > relative absolute error and root relative squared error of nominal
> > > attributes?
> > >
> > > I know that one can find this question several times in this
> > > mailing-list. But none of these
> > > could really help me. Or does someone know where to find a
> > > comprehensive explanation?
> > >
> > > As far as I understood (with help from what I read from Eibe Frank):
> > > root relative squared error: Let Y be the root mean squared error
> > > that is computed for the single class prior probabilities
> > > (frequencies). These probabilities are estimated from the training
> > > data with a simple Laplace estimator. Let X be the root mean squared
> > > error that came from the prediction of the model. Then the ~ is
> > > 100 * X / Y.
> > > So what is done with the mean value for numerical classes is done
> > > with estimated probabilities for nominal classes. The same for the
> > > relative absolute error.
> >
> > Yes, that's correct. Y is the error obtained from the probability
> > estimates generated by ZeroR (which just estimates the prior
> > probabilities).
> >
> > The squared error for a particular instance is given by the
> > "quadratic loss" function mentioned in our book (where we talk about
> > evaluating probability estimates). It's the sum of the squared
> > differences between the predicted class probabilities for a
> > particular instance and the observed class probabilities for that
> > instance (which are either 0 or 1). The absolute error is computed in
> > the same way by taking the absolute value of each difference instead
> > of the square.
> >
> > Cheers,
> > Eibe
> >
>



Stanley, Yemin Shi | 1 Jul 2005 12:32

Dear All,

Hi all, I need some help with the dataset output.
After discretization, I need to output the discretized features. For example, a feature ranging from 1 to 100 could be divided into 1-15, 16-28, 29-100. Can I output the features as 1, 2 and 3?
 
Thank you
Stanley, Yemin Shi

Stanley, Yemin Shi | 1 Jul 2005 14:51

how to output the discretized features

Let me rephrase.
My question is how to output the discretized features in a format that can be imported by matlab.
The discretized attributes, ranging from 1 to 100, are represented by '1-15','16-28','29-100'. I would like to convert them into 1,2,3 instead of ranges.
Does anyone know how to handle that?
 
Best
Stanley


"Stanley, Yemin Shi" <shi_yemin <at> yahoo.com> wrote:
Hi, All, I need some help on the dataset output.
After discretization, I need to output the discretized features. For example, a feature ranges from 1 to 100, could be divided into 1-15,16-28,29-100. Can I output features as 1,2 and 3?
 
Thank you
Stanley, Yemin Shi

Peter Reutemann | 2 Jul 2005 09:15

Re: how to output the discretized features

> Well, I will rephrase in this way.
> My question is how to output the discretized features into some format 
> that could be imported by matlab.
> Since the discretized attributes, ranging from 1 to 100, are represented 
> by '1-15','16-28','29-100'. I would like to transfer them into 1,2,3 
> instead of ranges,

This link might help you:
   http://weka.sourceforge.net/wiki/index.php/Rename_Attribute_Values
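
Independently of the linked page (whose approach is not reproduced here), the mapping itself can be sketched in plain Java without any Weka classes. The class and method names are made up, and the bin labels and their order are assumptions taken from the example in this thread; the labels Weka actually writes for discretized attributes may look different.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; plain Java, no Weka classes involved.
final class BinIndexSketch {

    // Returns the 1-based position of a bin label in the ordered label list,
    // or -1 if the label is not one of the known bins.
    static int binNumber(List<String> orderedLabels, String label) {
        int index = orderedLabels.indexOf(label);
        return index < 0 ? -1 : index + 1;
    }

    public static void main(String[] args) {
        // Bin labels from the example in this thread (assumed order).
        List<String> bins = Arrays.asList("1-15", "16-28", "29-100");
        String[] column = {"16-28", "1-15", "29-100"};

        // Emit the column as comma-separated bin numbers.
        StringBuilder line = new StringBuilder();
        for (String value : column) {
            if (line.length() > 0) {
                line.append(',');
            }
            line.append(binNumber(bins, value));
        }
        System.out.println(line); // prints 2,1,3
    }
}
```

A purely numeric comma-separated file of this kind is straightforward to load in matlab.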

Cheers, Peter

-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/     +64 (7) 838-4466 Ext. 5174
Andreia Vieira | 2 Jul 2005 23:11

Model tree cpu.arff

Can you help me?
Looking at LM1 and LM2, how can we judge whether the results are good? I
don't understand what the percentage means. How is it calculated?

The correlation coefficient is high; even so, the % for LM1 is very
low. Can we say that in this example the results were good?

LMMAX <= 14000 : LM1 (141/4.178%)
LMMAX >  14000 : LM2 (68/50.073%)
Correlation coefficient                  0.9766
Correlation coefficient                  0.9766
Mean absolute error                     13.6917
Root mean squared error                 35.3003
Relative absolute error                 15.6194 %
Root relative squared error             22.8092 %
Total Number of Instances              209

Thank you!!
Paul | 3 Jul 2005 23:29

Re: Model tree cpu.arff

To interpret your question: you're asking what the meaning of 
"Relative absolute error" is and whether the results you posted are
'good' or not.

What seems to be a good explanation can be found here:
http://grb.mnsu.edu/grbts/doc/manual/Error_Measurements.html

For something more Weka-related, look here:
https://list.scms.waikato.ac.nz/mailman/htdig/wekalist/2004-August/002817.html

/paul

On 7/3/05, Andreia Vieira <andreia <at> cpatc.embrapa.br> wrote:
> Can you help me?
> Looking at LM1 and LM2, how can we judge whether the results are good? I
> don't understand what the percentage means. How is it calculated?
> 
> The correlation coefficient is high; even so, the % for LM1 is very
> low. Can we say that in this example the results were good?
> 
> LMMAX <= 14000 : LM1 (141/4.178%)
> LMMAX >  14000 : LM2 (68/50.073%)
> Correlation coefficient                  0.9766
> Correlation coefficient                  0.9766
> Mean absolute error                     13.6917
> Root mean squared error                 35.3003
> Relative absolute error                 15.6194 %
> Root relative squared error             22.8092 %
> Total Number of Instances              209
> 
> 
> Thank you!!
