help me | 1 Jun 01:24 2005

From many columns to a few attributes in .arff?

If I declare all of my Transkip, IPS and KRS data as attributes, the .arff
file will have 101 attributes:
  @ATTRIBUTE TRANSKIP_1 {here there are 9 different values}
  @ATTRIBUTE TRANSKIP_2 {here there are 9 different values}
  ...
  @ATTRIBUTE TRANSKIP_50 {here there are 9 different values}
  @ATTRIBUTE IPS {...}
  @ATTRIBUTE KRS_1 {YA}
  @ATTRIBUTE KRS_2 {YA}
  ...
  @ATTRIBUTE KRS_50 {YA}
The data has the following meaning.
TRANSKIP_1 through TRANSKIP_50 all have the same nominal values, although they
are different attributes.
The nominal values of IPS may differ from those of the other attributes or be
the same.
Transkip and KRS differ in position here, but their meaning is much the same;
only the data content differs, so that Transkip, IPS and KRS can be analysed.
In practice a record has at most 9 KRS entries filled in, but which KRS_X are
filled in differs from one record to the next.
  @DATA ?,?,...,? etc.
With data 1 I do not run into any problem. With data 2 I get
  java.lang.OutOfMemoryError
I tried to work around it with
  java -Xmx1000M -jar weka.jar
but the real problem is that I do not have enough memory: according to
Peter Reutemann it requires 850 MB, and unfortunately the computer I use
has 128 MB.
So I need to restructure my attributes, but I am not sure how.
I am therefore asking for help from WEKA users who often convert their data
to .arff. Of course [...]

Mark Hall | 1 Jun 01:57 2005

Re: WrapperSubsetEval and GeneticSearch from command line

Hi Terry,

Try something like:

java weka.attributeSelection.WrapperSubsetEval -I iris.arff -B
weka.classifiers.trees.J48 -S "weka.attributeSelection.GeneticSearch
-Z 20 -G 10" -- -C 0.2

This shows setting some options for the GeneticSearch and some for the
base classifier (J48). Note that the options for J48 come last on the
command line following a "--".
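One way to picture the "--" convention is as a split of the argument list:
what comes before it goes to the evaluator and search method, what comes
after it goes to the base classifier. The following is only a minimal,
self-contained sketch of that idea and is not Weka's own option-parsing
code; the class name OptionSplitSketch and the method splitAtDoubleDash
are made up for illustration.

```java
import java.util.Arrays;

final class OptionSplitSketch {
    /**
     * Returns two arrays: the arguments before the first "--" and the
     * arguments after it. If there is no "--", the second array is empty.
     */
    static String[][] splitAtDoubleDash(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(args, 0, i),
                    Arrays.copyOfRange(args, i + 1, args.length)
                };
            }
        }
        return new String[][] { args.clone(), new String[0] };
    }

    public static void main(String[] unused) {
        // The same arguments as in the command above, as the shell would pass them.
        String[] args = {
            "-I", "iris.arff",
            "-B", "weka.classifiers.trees.J48",
            "-S", "weka.attributeSelection.GeneticSearch -Z 20 -G 10",
            "--", "-C", "0.2"
        };
        String[][] parts = splitAtDoubleDash(args);
        System.out.println("evaluator/search options: " + Arrays.toString(parts[0]));
        System.out.println("base classifier options:  " + Arrays.toString(parts[1]));
    }
}
```

Note that the quoted GeneticSearch specification stays a single argument,
which is why its own options (-Z, -G) do not get mixed up with the others.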

Some more examples of using Weka's attribute selection package from
the command line can be found at:

http://www.laps.ufpa.br/aldebaro/weka/feature_selection.html

Cheers,
Mark.

-- 
Mark Hall
Department of Computer Science
University of Waikato
Hamilton
New Zealand
www.cs.waikato.ac.nz

Terry Letsche | 1 Jun 04:41 2005

Re: WrapperSubsetEval and GeneticSearch from command line

Thanks, Mark. I'll give it a go!

Terry

Ashraf Kibriya | 1 Jun 05:09 2005

RE: Question of the identification of each instance in test dataset

Hi Jenny,
What sort of output exactly do you mean? Currently ComplementNaiveBayes simply aggregates the
counts/frequencies of all the attributes over all the instances of each class and outputs the
weight of each attribute given the class (see the toString() method in ComplementNaiveBayes). The weight of an
attribute i for class c is calculated from the aggregate of the counts/frequencies of attribute i in all the
training instances belonging to class c.
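To make the idea of per-class aggregated counts and weights concrete, here
is a small self-contained sketch. It follows the general complement naive
Bayes formulation (the weight of attribute i for class c is the log of the
smoothed share of attribute i among the counts of all classes other than
c, and the class with the smallest score wins). It is not the Weka source:
the class name, the method names and the smoothing scheme are assumptions
made for illustration.

```java
final class ComplementWeightsSketch {
    /**
     * counts[c][i] is the total frequency of attribute i over the training
     * instances of class c. Returns w[c][i], the log of the smoothed share of
     * attribute i among all attribute occurrences in classes other than c.
     */
    static double[][] complementWeights(double[][] counts, double smoothing) {
        int numClasses = counts.length;
        int numAttrs = counts[0].length;
        double[][] w = new double[numClasses][numAttrs];
        for (int c = 0; c < numClasses; c++) {
            double[] other = new double[numAttrs];
            double otherTotal = 0.0;
            for (int c2 = 0; c2 < numClasses; c2++) {
                if (c2 == c) {
                    continue; // skip the class itself: only the complement counts
                }
                for (int i = 0; i < numAttrs; i++) {
                    other[i] += counts[c2][i];
                    otherTotal += counts[c2][i];
                }
            }
            for (int i = 0; i < numAttrs; i++) {
                w[c][i] = Math.log((other[i] + smoothing)
                        / (otherTotal + smoothing * numAttrs));
            }
        }
        return w;
    }

    /** Returns the class whose score sum_i freq[i] * w[c][i] is smallest. */
    static int classify(double[][] w, double[] freq) {
        int best = 0;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < w.length; c++) {
            double score = 0.0;
            for (int i = 0; i < freq.length; i++) {
                score += freq[i] * w[c][i];
            }
            if (score < bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up counts: attribute 0 is frequent in class 0, attribute 1 in class 1.
        double[][] counts = { {8, 2}, {1, 9} };
        double[][] w = complementWeights(counts, 1.0);
        System.out.println(classify(w, new double[] {3, 0}));
        System.out.println(classify(w, new double[] {0, 3}));
    }
}
```

In this sketch the per-class scores are sums of logarithms, so they are
negative numbers rather than probabilities between 0 and 1.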

How are you doing the evaluation? Are you using separate test and training instances? In that case the order
of input is the same as in the arff file. However, if you are using cross-validation then it is not.

Hope this helps.

Regards,
Ashraf

------------------------------

Message: 5
Date: Tue, 31 May 2005 18:05:41 +0800
From: "Jenny Wang" <m924020004 <at> student.nsysu.edu.tw>
Subject: [Wekalist] Question of the identification of each instance in
	test	dataset
To: wekalist <at> list.scms.waikato.ac.nz
Message-ID: <20050531095727.M83628 <at> student.nsysu.edu.tw>
Content-Type: text/plain;	charset=big5

Dear all,
I would like to know how to get the identification of each instance when 
using weka.classifiers.bayes.ComplementNaiveBayes.
Is the order of output the same as the order of instances in arff file?

Mark Hall | 1 Jun 05:24 2005

Re: cluster visualize stars?

Hi Ankie,

If you get little boxes when you visualize the result of clustering
then you must have done a "classes to clusters" evaluation. This
process does a minimum error assignment of class labels to clusters
and the boxes indicate which examples have been misclassified.
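As a rough illustration of what a minimum error assignment means, the
sketch below takes a table of counts (cluster by class), tries every way
of giving each cluster a distinct class label, and keeps the assignment
with the fewest misclassified examples. This is a brute-force sketch and
not Weka's implementation; the names are hypothetical, and it assumes
there are no more clusters than classes and that both numbers are small.

```java
import java.util.Arrays;

final class ClassesToClustersSketch {
    /**
     * counts[k][c] is the number of instances in cluster k whose class is c.
     * Returns, for each cluster, the class label of the best assignment.
     */
    static int[] bestAssignment(int[][] counts) {
        int[] current = new int[counts.length];
        int[] best = new int[counts.length];
        int[] bestCorrect = { -1 };
        search(counts, 0, new boolean[counts[0].length], current, 0, best, bestCorrect);
        return best;
    }

    // Depth-first search over all assignments of distinct classes to clusters.
    // Maximising the correctly labelled count is the same as minimising errors.
    private static void search(int[][] counts, int k, boolean[] used, int[] current,
                               int correct, int[] best, int[] bestCorrect) {
        if (k == counts.length) {
            if (correct > bestCorrect[0]) {
                bestCorrect[0] = correct;
                System.arraycopy(current, 0, best, 0, current.length);
            }
            return;
        }
        for (int c = 0; c < used.length; c++) {
            if (used[c]) {
                continue;
            }
            used[c] = true;
            current[k] = c;
            search(counts, k + 1, used, current, correct + counts[k][c], best, bestCorrect);
            used[c] = false;
        }
    }

    /** Number of instances whose class differs from the label of their cluster. */
    static int errors(int[][] counts, int[] assignment) {
        int total = 0;
        int correct = 0;
        for (int k = 0; k < counts.length; k++) {
            for (int c = 0; c < counts[k].length; c++) {
                total += counts[k][c];
            }
            correct += counts[k][assignment[k]];
        }
        return total - correct;
    }

    public static void main(String[] args) {
        // Made-up counts for three clusters and three classes.
        int[][] counts = { {0, 50, 0}, {48, 0, 14}, {2, 0, 36} };
        int[] best = bestAssignment(counts);
        System.out.println("cluster -> class: " + Arrays.toString(best));
        System.out.println("misclassified:    " + errors(counts, best));
    }
}
```

The examples counted by errors() correspond to the ones drawn as boxes.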

Cheers,
Mark.

-- 
Mark Hall
Department of Computer Science
University of Waikato
Hamilton
New Zealand
www.cs.waikato.ac.nz

Eibe Frank | 1 Jun 06:12 2005

Re: EM and dependent attributes

It assumes that the attributes are independent per cluster, not over 
the whole space.
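A small numerical sketch may help with the difference. If two attributes
are independent inside every cluster, they can still be correlated over
the whole dataset, simply because the cluster means differ. The code below
computes the overall covariance for such a mixture using the law of total
covariance. The weights and means are made-up numbers, and the class and
method names are hypothetical.

```java
final class MixtureCovarianceSketch {
    /**
     * Overall covariance of two attributes X and Y in a mixture of clusters,
     * assuming the covariance of X and Y inside each cluster is zero.
     * Only the spread of the cluster means contributes in that case.
     */
    static double overallCovariance(double[] weights, double[] meanX, double[] meanY) {
        double mx = 0.0;
        double my = 0.0;
        double mxy = 0.0;
        for (int k = 0; k < weights.length; k++) {
            mx += weights[k] * meanX[k];
            my += weights[k] * meanY[k];
            mxy += weights[k] * meanX[k] * meanY[k];
        }
        return mxy - mx * my;
    }

    public static void main(String[] args) {
        double[] weights = {0.5, 0.5};
        // Two clusters with different means: X and Y are correlated overall.
        System.out.println(overallCovariance(weights,
                new double[] {0, 10}, new double[] {0, 10}));
        // Same means in both clusters: no overall covariance either.
        System.out.println(overallCovariance(weights,
                new double[] {5, 5}, new double[] {5, 5}));
    }
}
```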

Cheers,
Eibe

On May 31, 2005, at 9:58 PM, Zanden, Ankie van der (50134) wrote:

> Hello again,
>
> I have one other question about the EM clustering.
> Does EM assume that all attributes are independent?
> In other words: should I delete attributes beforehand if I know that 
> they are correlated with another attribute in the dataset?
>
> Thanks!
> Ankie

Jenny Wang | 1 Jun 07:44 2005

RE: Question of the identification of each instance in test dataset

Dear Ms.,
I added the following code to the "classifyInstance" method in order to get the 
P(c|d) of each instance when testing.
// System.out.print(instance.dataset().numInstances()+"|"+"="+minidx+"|"+ 
valueForClass[minidx]+"\t");
I am not sure whether "valueForClass[c]" is the P(c|d) in testing.

However, after adding the code, it returns two sets of results. 
The second is the result for each instance under 10-fold cross-validation.
The first part looks like information about the training data, but I don't 
understand what probability is being returned.
Another question: are the probabilities given as logarithms?

> What sort of output do you exactly mean? Currently 
> ComplementNaiveBayes simply aggregates the counts/frequencies of all 
> the attributes for all the instances of each class and simply 
> outputs the weight of each attribute given class (the toString() 
> method in ComplementNaiveBayes). The weight of an attribute i for 
> class c is calculated from the aggregate of counts/frequencies of 
> attribute i in all the training instances belong to class c.

For the evaluation, I would like to prepare my dataset and split it 
myself.
I think the key point for me is how to get the identification of each 
instance when using ComplementNaiveBayes.
If I want to change the core function of ComplementNaiveBayes (change it to: 
min[P(c|d)* f]) in testing,
should I modify ComplementNaiveBayes.java or Evaluation.java?


RE: EM and dependent attributes

Thanks, but I do not know whether to delete some attributes or not. My situation:
I have a dataset in which each record is a unique person.
I have an attribute that tells in which city the person lives, and an attribute that tells the size of the city
the person lives in. So these are dependent (everyone who lives in Amsterdam lives in a big city).
Now I get 7 clusters. Most of them are obviously based on the cities. But isn't that trivial, because I already
know that there is a relation? Should I delete one?
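The kind of dependency described here (city size follows entirely from the
city) can be checked mechanically. The sketch below tests whether one
nominal attribute fully determines another. It uses made-up example values
and hypothetical names, and it says nothing about which attribute, if any,
should be removed.

```java
import java.util.HashMap;
import java.util.Map;

final class FunctionalDependencySketch {
    /**
     * Returns true if every value of a[] always occurs together with one and
     * the same value of b[], i.e. b can be looked up from a.
     */
    static boolean determines(String[] a, String[] b) {
        Map<String, String> seen = new HashMap<>();
        for (int i = 0; i < a.length; i++) {
            String previous = seen.putIfAbsent(a[i], b[i]);
            if (previous != null && !previous.equals(b[i])) {
                return false; // the same a-value was seen with two different b-values
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical records: city and size of that city.
        String[] city = {"Amsterdam", "Rotterdam", "Delft", "Amsterdam"};
        String[] size = {"big", "big", "small", "big"};
        System.out.println("city determines size: " + determines(city, size));
        System.out.println("size determines city: " + determines(size, city));
    }
}
```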

Thanks in advance,
Ankie

cibelle | 1 Jun 11:47 2005

SMO Weka

Hello all,

I am using a dataset of 490 instances with 5 attributes with the SMO classifier,
and I get 50% correctly classified instances with a polynomial kernel of degree 2.
Is there a way I can improve my results?
Which matters most for my classification: accuracy or precision?
There are three attributes with zero precision; is that possible, or does
every attribute have a precision?
In a linear SMO, is the optimal separating hyperplane described by the smallest
attribute weights?
Does anyone have a paper or tutorial on the SMO classifier that explains its
filters (normalize and standardize training data) and test options?
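For readers unsure about the terms used in these questions, the sketch
below shows the usual definitions: accuracy and per-class precision
computed from a confusion matrix, min-max normalization to the range
[0, 1], and standardization to zero mean and unit standard deviation. It
is a plain illustration with hypothetical names, not Weka code. Returning
0 as the precision of a class that is never predicted, and using the
sample standard deviation (dividing by n - 1, so at least two values are
needed), are assumptions of this sketch.

```java
final class SmoQuestionsSketch {
    /** confusion[actual][predicted]; accuracy = correct / all instances. */
    static double accuracy(int[][] confusion) {
        int correct = 0;
        int total = 0;
        for (int a = 0; a < confusion.length; a++) {
            for (int p = 0; p < confusion[a].length; p++) {
                total += confusion[a][p];
                if (a == p) {
                    correct += confusion[a][p];
                }
            }
        }
        return (double) correct / total;
    }

    /** Precision of class c = correctly predicted as c / all predicted as c. */
    static double precision(int[][] confusion, int c) {
        int predictedAsC = 0;
        for (int a = 0; a < confusion.length; a++) {
            predictedAsC += confusion[a][c];
        }
        // Assumption: report 0 when the class is never predicted.
        return predictedAsC == 0 ? 0.0 : (double) confusion[c][c] / predictedAsC;
    }

    /** Min-max normalization: rescales the values to the range [0, 1]. */
    static double[] normalize(double[] v) {
        double min = v[0];
        double max = v[0];
        for (double x : v) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = max == min ? 0.0 : (v[i] - min) / (max - min);
        }
        return out;
    }

    /** Standardization: subtract the mean, divide by the sample standard deviation. */
    static double[] standardize(double[] v) {
        double mean = 0.0;
        for (double x : v) {
            mean += x;
        }
        mean /= v.length;
        double sumSquares = 0.0;
        for (double x : v) {
            sumSquares += (x - mean) * (x - mean);
        }
        double sd = Math.sqrt(sumSquares / (v.length - 1));
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = sd == 0.0 ? 0.0 : (v[i] - mean) / sd;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up two-class result in which class 1 is never predicted.
        int[][] confusion = { {5, 0}, {5, 0} };
        System.out.println("accuracy:            " + accuracy(confusion));
        System.out.println("precision of class 1: " + precision(confusion, 1));
    }
}
```

The made-up matrix shows one way a zero precision can arise: if the
classifier never predicts a class, nothing is ever correctly predicted as
that class.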

Please send it to me!

Thank you for any help.

Simone
