Thomas Vestskov Terney | 1 Dec 2005 13:51

Macro and micro-averaging (repost)

Hi Weka people,

Sorry for posting in HTML last time; no one replied.

I can't figure out how to calculate macro- vs. micro-averaging. The principle is easy to understand, but I
always seem to get the same precision and recall when performing micro-averaging. Can you please help me
get from a confusion matrix like this:

	A	B	C
A	4	1	0
B	0	1	1
C	0	1	2

to a 2x2 contingency table for performing micro-averaging? I get macro-averaged precision of
(4/5 + 1/2 + 2/3)/3 = 66% and recall of (4/4 + 1/3 + 2/3)/3 = 67%, and micro-averaged precision and
recall both of 7/10. (If this is right, please change the numbers so that they illustrate a confusion
matrix that yields different precision and recall under micro-averaging.)

Cheers,
Thomas

PS. Does Weka perform this kind of calculation for me?
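
The arithmetic in the question above can be sketched as follows. This is only an illustration, not Weka code: the class and method names are made up, and it assumes rows are predicted classes and columns are actual classes, which is the reading that reproduces the 4/5, 1/2, 2/3 precision terms. Each class is treated one-vs-rest and the per-class counts are summed into a single 2x2 table; for the matrix above that gives TP = 7, FP = 3, FN = 3, TN = 17. Every off-diagonal cell is a false positive for one class and a false negative for another, so the pooled FP and FN totals are always equal. That would explain why micro-averaged precision and recall coincide for any single-label confusion matrix, whatever numbers are used.

```java
// Illustrative sketch only; not Weka code. Class and method names are hypothetical.
final class AveragingSketch {

    // Assumption: rows are predicted classes, columns are actual classes.
    // Returns the pooled one-vs-rest counts as {TP, FP, FN, TN}.
    static int[] pooledCounts(int[][] m) {
        int n = m.length;
        int total = 0;
        for (int[] row : m) {
            for (int v : row) {
                total += v;
            }
        }
        int tp = 0, fp = 0, fn = 0, tn = 0;
        for (int c = 0; c < n; c++) {
            int rowSum = 0, colSum = 0;
            for (int k = 0; k < n; k++) {
                rowSum += m[c][k];
                colSum += m[k][c];
            }
            tp += m[c][c];
            fp += rowSum - m[c][c];
            fn += colSum - m[c][c];
            tn += total - rowSum - colSum + m[c][c];
        }
        return new int[] {tp, fp, fn, tn};
    }

    // Macro-averaging: compute the measure per class, then take the plain mean.
    // A class with an empty row yields NaN here; real code should handle that case.
    static double macroPrecision(int[][] m) {
        double sum = 0.0;
        for (int c = 0; c < m.length; c++) {
            int rowSum = 0;
            for (int k = 0; k < m.length; k++) {
                rowSum += m[c][k];
            }
            sum += (double) m[c][c] / rowSum;
        }
        return sum / m.length;
    }

    // Same idea for recall, dividing by the column sum (again NaN for an empty column).
    static double macroRecall(int[][] m) {
        double sum = 0.0;
        for (int c = 0; c < m.length; c++) {
            int colSum = 0;
            for (int k = 0; k < m.length; k++) {
                colSum += m[k][c];
            }
            sum += (double) m[c][c] / colSum;
        }
        return sum / m.length;
    }

    // Micro-averaging: pool the counts first, then compute the measure once.
    static double microPrecision(int[][] m) {
        int[] p = pooledCounts(m);
        return (double) p[0] / (p[0] + p[1]);
    }

    static double microRecall(int[][] m) {
        int[] p = pooledCounts(m);
        return (double) p[0] / (p[0] + p[2]);
    }

    public static void main(String[] args) {
        int[][] m = {{4, 1, 0}, {0, 1, 1}, {0, 1, 2}};
        int[] p = pooledCounts(m);
        System.out.printf("TP=%d FP=%d FN=%d TN=%d%n", p[0], p[1], p[2], p[3]);
        System.out.printf("macro P=%.3f R=%.3f%n", macroPrecision(m), macroRecall(m));
        System.out.printf("micro P=%.3f R=%.3f%n", microPrecision(m), microRecall(m));
    }
}
```

Macro-averaging gives every class equal weight, while micro-averaging gives every instance equal weight, so the two differ most when class sizes are unbalanced.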
Susana Ferreiro | 1 Dec 2005 13:59

3-classes FalsePositives, FalseNegatives, TruePositives, TrueNegatives

Skipped content of type multipart/alternative
Susana Ferreiro | 1 Dec 2005 14:02

using cost-matrix

Skipped content of type multipart/alternative
Stefanos Maragos | 1 Dec 2005 19:02

Re: Percentage Split - from command line

Thank you, Peter, for your interest,
but could you be more specific about the RemovePercentage filter?
For example, if I want to use J48 with a percentage split of 40% on the
input file dataset.arff, are the following commands right?

java weka.filters.unsupervised.instance.RemovePercentage -P 40 -i dataset.arff -o train.arff
java weka.filters.unsupervised.instance.RemovePercentage -P 60 -i dataset.arff -o test.arff
java weka.classifiers.trees.J48 -t train.arff -T test.arff

Thanks in advance

On 11/24/05, Peter Reutemann <fracpete <at> waikato.ac.nz> wrote:
>  > I am writing a program, that uses Weka from command-line, and I don't
>  > know what option i have to use to modify  " Percentage Split " on
>  > input file.
>
> You will have to write it yourself or use the RemovePercentage filter
> (weka.filters.unsupervised.instance.RemovePercentage).
>
> The following code fragment from the Explorer
> (weka.gui.explorer.ClassifierPanel) generates two datasets called
> "train" and "test":
>
>    inst.randomize(new Random(rnd));
>    int trainSize = inst.numInstances() * percent / 100;
>    int testSize = inst.numInstances() - trainSize;
>    Instances train = new Instances(inst, 0, trainSize);
>    Instances test = new Instances(inst, trainSize, testSize);

Peter Reutemann | 1 Dec 2005 21:05

Re: Percentage Split - from command line

> For example if I want to use J48 with percentage split 40% in the
> input file: dataset.arff , are the follow commands right??:
> 
> java weka.filters.unsupervised.instance.RemovePercentage -P 40 -i
> dataset.arff  -o  train.arff
> java weka.filters.unsupervised.instance.RemovePercentage -P 60 -i
> dataset.arff  -o  test.arff
> java weka.classifiers.trees.J48 -t train.arff -T test.arff

The second call of RemovePercentage should be the following:

java weka.filters.unsupervised.instance.RemovePercentage -P 40 -i dataset.arff -o test.arff -V

The "-V" inverts the matching, i.e., instead of removing the first 40
percent it keeps them and removes the remaining 60 percent.

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/     +64 (7) 838-4466 Ext. 5174
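
To see why the "-V" matters, here is a small model of the behaviour described above. It is only a sketch and does not call Weka: the class and method names are hypothetical, and the cutoff uses the same integer arithmetic as the Explorer fragment quoted earlier, which may not match the filter's own rounding. It lists which row indices survive each call. Under this model, "-P 40" and "-P 60" without "-V" both keep rows from the end of the file, so the test set would overlap the training set, whereas "-P 40" and "-P 40 -V" produce disjoint sets.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; it does not call Weka. Names are hypothetical.
final class SplitSketch {

    // Returns the row indices that remain after the filter, as described in the thread:
    // invert = false removes the first `percent` of the rows and keeps the rest,
    // invert = true keeps the first `percent` of the rows and removes the rest.
    static List<Integer> keptIndices(int n, int percent, boolean invert) {
        int cutoff = n * percent / 100; // integer arithmetic, as in the quoted fragment
        int from = invert ? 0 : cutoff;
        int to = invert ? cutoff : n;
        List<Integer> kept = new ArrayList<>();
        for (int i = from; i < to; i++) {
            kept.add(i);
        }
        return kept;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println("-P 40    keeps " + keptIndices(n, 40, false));
        System.out.println("-P 40 -V keeps " + keptIndices(n, 40, true));
        System.out.println("-P 60    keeps " + keptIndices(n, 60, false));
    }
}
```

Two things follow from this model. The file written without "-V" holds 60 percent of the data, so if the training set should be the 40 percent portion, the two output file names would need to be swapped. Also, the rows are taken in file order, whereas the Explorer fragment randomizes the data first, so a file sorted by class may need shuffling beforehand.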
Peter Reutemann | 2 Dec 2005 01:22

Re: RE: how to locate misclassified instances?

> I performed the operation I described using shell scripts. Though with the 
> exception of adding the RecID -- I did it manually, though there may be a 
> filter which would do it for you -- the rest of it can definitely be done 
> in explorer.

I've added a new filter to the CVS that adds a unique ID to each instance:
   weka.filters.unsupervised.attribute.AddID

More about CVS access:
   http://weka.sourceforge.net/wiki/index.php/CVS

HTH

Cheers, Peter
-- 
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/     +64 (7) 838-4466 Ext. 5174
TUED (Tue Deleuran) | 2 Dec 2005 08:13

RE: RE: how to locate misclassified instances?

Thanks a lot,
I will use that next time. For now it works as Andrew described. It is a
bit more difficult, but it works.
Tue

-----Original Message-----
From: Peter Reutemann [mailto:fracpete <at> waikato.ac.nz] 
Sent: 2. december 2005 01:23
To: Andrew Rosenberg
Cc: TUED (Tue Deleuran); WekaList
Subject: Re: [Wekalist] RE: how to locate misclassified instances?

> I performed the operation I described using shell scripts. Though with the
> exception of adding the RecID -- I did it manually, though there may be a
> filter which would do it for you -- the rest of it can definitely be done
> in explorer.

I've added a new filter to the CVS that adds a unique ID to each
instance:
   weka.filters.unsupervised.attribute.AddID

More about CVS access:
   http://weka.sourceforge.net/wiki/index.php/CVS

HTH


Andrew Rosenberg | 2 Dec 2005 14:17

Re: RE: how to locate misclassified instances?

Talk about service with a smile!

Thanks a ton, Peter.

-a.

On Fri, 2 Dec 2005, Peter Reutemann wrote:

> > I performed the operation I described using shell scripts. Though with the 
> > exception of adding the RecID -- I did it manually, though there may be a 
> > filter which would do it for you -- the rest of it can definitely be done 
> > in explorer.
> 
> I've added a new filter to the CVS that adds a unique ID to each instance:
>    weka.filters.unsupervised.attribute.AddID
> 
> More about CVS access:
>    http://weka.sourceforge.net/wiki/index.php/CVS
> 
> HTH
> 
> Cheers, Peter
> 
Jianye GE | 2 Dec 2005 20:57

Maximum Likelihood V.S. Machine Learning

I have found that in many cases machine learning methods beat the maximum
likelihood method. Can anyone explain theoretically why machine learning
would be better than the maximum likelihood method?

I would appreciate it!

Jianye Ge
Chu Tan | 2 Dec 2005 21:10

Generate instances from BN

I realized that the generateInstances method will not be correct if
the BN contains edges that do not obey the node ordering.

Here's how I came to that conclusion:

    public void generateInstances(){
        // Iterate # of instances to generate
        for (int iInstance = 0; iInstance < m_nNrOfInstances; iInstance++) {
            ... ...
            // Assign attributes sequentially
            for (int iAtt = 0; iAtt < nNrOfAtts; iAtt++) {
                ... ...
                // Look for parents of attribute
                for (int iParent = 0; iParent < m_ParentSets[iAtt].getNrOfParents(); iParent++) {
                  int nParent = m_ParentSets[iAtt].getParent(iParent);

                  // ERROR: What if instance.value(nParent) is not yet generated?
                  iCPT = iCPT * m_Instances.attribute(nParent).numValues() + instance.value(nParent);
                }

             ... ... the rest
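
One possible way around the problem described above would be to visit the attributes in a topological order of the network, so that every parent has a value before any of its children is sampled. The following is a minimal sketch using Kahn's algorithm. It is not Weka's implementation: the class name is hypothetical, and the int[][] parent lists are an assumed stand-in for m_ParentSets, where parents[i] holds the parent indices of node i with no duplicates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only; not Weka code. Names and data layout are hypothetical.
final class TopoOrderSketch {

    // Returns the node indices in an order where every parent precedes its children.
    // Throws IllegalArgumentException if the parent lists describe a cyclic graph.
    static int[] topologicalOrder(int[][] parents) {
        int n = parents.length;
        int[] remaining = new int[n]; // number of parents not yet placed in the order
        List<List<Integer>> children = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            children.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            remaining[i] = parents[i].length;
            for (int p : parents[i]) {
                children.get(p).add(i);
            }
        }

        // Start with the nodes that have no parents at all.
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0) {
                ready.add(i);
            }
        }

        int[] order = new int[n];
        int filled = 0;
        while (!ready.isEmpty()) {
            int node = ready.remove();
            order[filled++] = node;
            // A child becomes ready once all of its parents have been placed.
            for (int child : children.get(node)) {
                if (--remaining[child] == 0) {
                    ready.add(child);
                }
            }
        }
        if (filled != n) {
            throw new IllegalArgumentException("graph has a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        // Node 0 has parent 1, which violates the natural ordering 0, 1, 2.
        int[][] parents = {{1}, {}, {0, 1}};
        for (int node : topologicalOrder(parents)) {
            System.out.println(node);
        }
    }
}
```

The inner attribute loop would then iterate over this order instead of 0 to nNrOfAtts - 1, and the order only needs to be computed once per network rather than once per instance.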
