Daniel Arndt | 1 Sep 2010 01:59

Re: Help: model output

Hi Amy,

The three commands should output the same model. See page 38 of the manual (the manual is referring to the Explorer, but the same principle should apply):

"Note: No matter which evaluation method is used, the model that is output is
always the one build from all the training data"

If you're still curious, read the output your commands produce. It should include a human-readable representation of the tree under the heading "J48 pruned tree".

Or, the explanation of the -d option may help:

-d
The model after training can be saved via this parameter. Each classifier has a different binary format for the model, so it can only be read back by the exact same classifier on a compatible dataset. Only the model on the training set is saved, not the multiple models generated via cross-validation.

Best of luck,

--
Daniel Arndt


2010/8/31 TanAmy <amytanxy <at> hotmail.com>:
> Hi All,
>
> I was a little confused about the model output when I was using Weka
> to build classifiers.
>
> I generated J48 classifiers by using the following three commands. I'd like
> to know whether models 1, 2 and 3 are the same or not. What are the differences
> between them? Thanks!
>
>  
> weka.classifiers.trees.J48 -t input.arff -d 1.model
>
> weka.classifiers.trees.J48 -t input.arff -d 2.model -T test.arff
>
> weka.classifiers.trees.J48 -t input.arff -d 3.model -split-percentage 66
>
>
>
>
>
> Best Regards!
> Amy
>
>
> _______________________________________________
> Wekalist mailing list
> Send posts to: Wekalist <at> list.scms.waikato.ac.nz
> List info and subscription status:
> https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette:
> http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
>
>




_______________________________________________
Wekalist mailing list
Send posts to: Wekalist <at> list.scms.waikato.ac.nz
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Daniel Arndt | 1 Sep 2010 02:22

Re: Help: CostSensitve Feature Selection vs CostsSensiveClassifier

Hi again Amy,

Can't say I'm an expert on cost-sensitive learning, but until the
other side of the world wakes up you may want to look at the Resample
class:

http://weka.sourceforge.net/doc.dev/weka/filters/supervised/instance/Resample.html

The Resample class will let you make a balanced training set --
however, at the cost of either repeated information (up-sampling) or
discarded information (down-sampling).
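
To make the trade-off concrete, here is a minimal sketch in plain Java
with no Weka dependency. It is not how Resample is implemented; the
class name BalanceSketch, the method names and the 95/5 counts are made
up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of the two ways to balance a two-class data set.
final class BalanceSketch {

    /** Down-sampling: keep all minority rows plus an equally sized random subset of majority rows. */
    static <T> List<T> downSample(List<T> majority, List<T> minority, long seed) {
        List<T> shuffled = new ArrayList<>(majority);
        Collections.shuffle(shuffled, new Random(seed));
        int keep = Math.min(minority.size(), shuffled.size());
        List<T> out = new ArrayList<>(minority);
        out.addAll(shuffled.subList(0, keep)); // the remaining majority rows are thrown away
        return out;
    }

    /** Up-sampling: keep all majority rows and draw minority rows with replacement until both match. */
    static <T> List<T> upSample(List<T> majority, List<T> minority, long seed) {
        if (minority.isEmpty()) {
            throw new IllegalArgumentException("minority class must not be empty");
        }
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>(majority);
        for (int i = 0; i < majority.size(); i++) {
            out.add(minority.get(rnd.nextInt(minority.size()))); // the same row can be repeated
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> class0 = Collections.nCopies(95, "class0");
        List<String> class1 = Collections.nCopies(5, "class1");
        System.out.println("down-sampled size: " + downSample(class0, class1, 1L).size());
        System.out.println("up-sampled size:   " + upSample(class0, class1, 1L).size());
    }
}
```

With 95 rows of one class and 5 of the other, down-sampling leaves 10
rows in total and up-sampling leaves 190, of which 95 are repeats drawn
from the same 5 minority rows.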

Given that you have 500 features, feature selection is probably a good
choice (many of them are likely unnecessary) but perhaps someone else
can educate you more on that.

Best of luck,

--
Daniel Arndt
Dalhousie University, Nova Scotia, Canada

2010/8/31 TanAmy <amytanxy <at> hotmail.com>
>
> Hi All,
>
> I am dealing with an unbalanced data set in a binary classification problem, in which
> Class 1 is only around 5% and Class 0 is about 95% of the whole data set. However,
> detecting Class 1 instances is more important.
>
> After doing some research, I learned that cost-sensitive learning might help in this
> case. Because each instance contains more than 500 attributes, an appropriate feature
> selection may be needed here.
>
> I am wondering if I should run cost-sensitive feature selection first, followed by
> normal classification, OR run normal feature selection first, followed by
> CostSensitiveClassifier training, OR something else.
>
> If you have any experience on this, please advise me. Thanks!
>
> Best Regards!
>
> Amy


Mark Hall | 1 Sep 2010 06:35

Re: P-values of the variables in classification

On 28/08/10 4:53 AM, daniel d'andrea wrote:
> Dear all,
>
> I ran a classification on a dataset (i.e. with the scheme
> weka.classifiers.functions.Logistic). WEKA returns the B values of the
> coefficients, but I need the corresponding p-values (or significance). Can I
> obtain them with WEKA?

I'm afraid that Weka's Logistic class does not report p-values. It does 
give you the odds ratios for the variables which is an indication of the 
importance of each variable with respect to predicting the target.
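
For reference, in logistic regression the odds ratio of a variable is
the exponential of its coefficient: a value above 1 means the odds of
the target class go up as the variable increases, a value below 1 means
they go down. A minimal sketch in plain Java, where the class name
OddsRatioSketch and the coefficient values are invented for
illustration and do not come from any Weka run:

```java
// Hypothetical illustration: odds ratio = exp(coefficient).
final class OddsRatioSketch {

    /** Converts a logistic regression coefficient (B value) into an odds ratio. */
    static double oddsRatio(double coefficient) {
        return Math.exp(coefficient);
    }

    public static void main(String[] args) {
        double[] coefficients = {0.0, 0.6931, -1.2}; // made-up B values
        for (double b : coefficients) {
            System.out.printf("B = %7.4f  odds ratio = %.4f%n", b, oddsRatio(b));
        }
    }
}
```

Note that an odds ratio describes the size of an effect, not its
statistical significance, so it is not a substitute for a p-value.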

Cheers,
Mark.


Mark Hall | 1 Sep 2010 06:51

Re: Manual threshold for ThresholdSelect

On 28/08/10 8:30 AM, Yonghee Shin wrote:
>
>
>>> Hi,
>>>
>>> Would you please let me know what is the meaning of setting a manual
>>> threshold with ThresholdSelector?
>>> As far as I know, Weka classifiers classify an instance as positive if
>>> the predicted probability of "positive" is greater than 0.5.
>>> If I want to set the threshold to 0.8, would the following code work?
>>>
>>> ts.setMeasure(new SelectedTag(ThresholdSelector.TRUE_POS,
>>> ThresholdSelector.TAGS_MEASURE));
>>> ts.setManualThresholdValue(0.8);
>>>
>>> If I use RECALL instead of TRUE_POS for the performance measure to be
>>> optimized as below, what's the meaning of the manual threshold value,
>>> say 0.8, for RECALL?
>>>
>>> ts.setMeasure(new SelectedTag(ThresholdSelector.RECALL,
>>> ThresholdSelector.TAGS_MEASURE));
>>> ts.setManualThresholdValue(0.8);
>>
>>  The threshold is always applied to the predicted probability, so it
>>  doesn't matter which measure you choose when using a manually set
>>  threshold. If you don't use a manually set threshold then the method
>>  finds the optimal threshold value with respect to the measure of choice.
>
>>  Cheers,
>>  Mark.
>
> Thanks a lot for your answer.
> Then I guess I should set either a manual threshold or a measure, not both.
> If I happen to set both, the measure will be ignored and only the manual
> threshold will be applied.
>
> Still, a strange thing is that if I set the threshold above 0.5, I expect
> recall to become lower and precision to become higher.
> However, in my classification, recall became higher and precision became lower.
> Am I misunderstanding something?

I'm not sure. A test on the German credit data from UCI using Naive 
Bayes as the base classifier shows that precision does increase as the 
threshold increases over 0.5. Did you set the correct class value to 
consider with the -C option?
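
The expected behaviour can be checked by hand. Below is a minimal
sketch in plain Java with no Weka dependency; the class name
ThresholdSketch, the probabilities and the labels are invented for
illustration. It applies a threshold to the predicted probability of
the positive class and computes precision and recall for that class.

```java
// Hypothetical illustration: precision and recall of the positive class at a given threshold.
final class ThresholdSketch {

    // Made-up predicted probabilities of "positive" and the true labels.
    static final double[] PROB_POSITIVE = {0.95, 0.85, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2};
    static final boolean[] IS_POSITIVE = {true, true, false, true, false, true, false, false};

    /** Returns {precision, recall}; a value is NaN when its denominator is zero. */
    static double[] precisionRecall(double[] probPositive, boolean[] isPositive, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probPositive.length; i++) {
            boolean predictedPositive = probPositive[i] > threshold;
            if (predictedPositive && isPositive[i]) {
                tp++;
            } else if (predictedPositive) {
                fp++;
            } else if (isPositive[i]) {
                fn++;
            }
        }
        double precision = (tp + fp) == 0 ? Double.NaN : (double) tp / (tp + fp);
        double recall = (tp + fn) == 0 ? Double.NaN : (double) tp / (tp + fn);
        return new double[] {precision, recall};
    }

    public static void main(String[] args) {
        for (double t : new double[] {0.5, 0.8}) {
            double[] pr = precisionRecall(PROB_POSITIVE, IS_POSITIVE, t);
            System.out.printf("threshold %.1f: precision %.2f, recall %.2f%n", t, pr[0], pr[1]);
        }
    }
}
```

On this toy data, moving the threshold from 0.5 to 0.8 raises precision
from 0.60 to 1.00 and lowers recall from 0.75 to 0.50. If the threshold
is applied to one class while precision and recall are reported for the
other, the trend reverses, which is one possible explanation for the
behaviour described above.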

Cheers,
Mark.


Mark Hall | 1 Sep 2010 10:02

Re: Weka web-based call

On 8/28/10 8:51 AM, Rawad Hammad wrote:
> Hi,
> I am going to use Weka to cluster student data from a web-based e-learning
> system (Moodle). Is it possible to embed code in Moodle that can call
> Weka and run the clustering process?
> Thanks a lot

I don't know anything about Moodle, but if you can execute a Java 
program from Moodle via a system call or script, or if the app server 
that Moodle runs in can handle JSPs, servlets, EJBs etc., then you can 
call out to Weka to process some data.

Cheers,
Mark.

Mark Hall | 1 Sep 2010 10:10

Re: change attributes immediately

On 8/31/10 2:04 AM, Hsiang-Min Yu wrote:
> hi there:
>
> I'm now implementing collective classification with ICA (the Iterative
> Classification Algorithm),
> so I need to change the attributes of each instance immediately when
> the label of a related instance changes,
> then classify the instance with the new attributes.
>
> I'm wondering:
> is there any way to do this without writing an ARFF file and reading it
> back in to classify each time?
> Or any hints?

Take a look at the javadoc for the Instance class:

http://weka.sourceforge.net/doc.stable/weka/core/Instance.html

Cheers,
Mark.


Mark Hall | 1 Sep 2010 10:25

Re: modifying weka algorithm with Eclipse

On 8/31/10 11:07 AM, loperam wrote:
>
> Hello
> Hope someone is able to help me.
> I'm trying to modify a weka algorithm with Eclipse but I keep getting an
> error.
>
> According to this page:
> http://weka.wikispaces.com/Eclipse+3.4.x+%28weka-src.jar%29
> I should use ant exejar to extract the source code. However, when I use ANT
> I get this error:
>
> Buildfile: /home/user/temp/weka/weka-3-7-2/weka-src/build.xml
> BUILD FAILED
> /home/user/temp/weka/weka-3-7-2/weka-src/build.xml:74:
> /home/user/temp/weka/weka-3-7-2/packages/internal does not exist.
> Total time: 0 seconds

Line 74 (<property name="pack.cp" refid="package.class.path"/>) is not 
needed and can be removed or commented out in the build file. Once this 
is done, you should be able to compile the core of Weka with ant without 
needing the packages.


Mark Hall | 1 Sep 2010 10:31

Re: sample java code for selecting attributes using ClassifierSubsetEval and WrapperSubsetEval

On 8/31/10 1:55 PM, kck08 wrote:
>
>
> Mark Hall-9 wrote:
>>
>> On 8/20/10 6:11 PM, kck08 wrote:
>>> hi, do you have sample Java code for selecting attributes using
>>> ClassifierSubsetEval and WrapperSubsetEval? i'm able to code the filter
>>> approach but am facing problems with the above two. thank you. regards, kck
>>
>> Have you taken a look at this Wiki page?:
>>
>> http://weka.wikispaces.com/Use+Weka+in+your+Java+code#Attribute selection
>>
>> Cheers,
>> Mark.
>
> thanks for your reply. i have checked your link.
> however, i still have a problem with WrapperSubsetEval.
> the following is my code and it gives me some output. but the output i get
> is different from the output generated using the weka explorer and i'm not sure
> what is wrong.

You do realize that the search() method used with any Weka subset 
evaluator returns a zero-based array of attribute indexes? The Explorer 
outputs a one-based list of selected attributes.
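
To compare the two outputs, shift the indexes by one. A minimal sketch
in plain Java; the class name IndexBaseSketch and the sample indexes
are made up and do not come from a real search:

```java
import java.util.Arrays;

// Hypothetical illustration: convert zero-based attribute indexes to one-based numbering.
final class IndexBaseSketch {

    /** Returns a new array in which every index is shifted up by one. */
    static int[] toOneBased(int[] zeroBased) {
        int[] oneBased = new int[zeroBased.length];
        for (int i = 0; i < zeroBased.length; i++) {
            oneBased[i] = zeroBased[i] + 1;
        }
        return oneBased;
    }

    public static void main(String[] args) {
        int[] fromCode = {0, 3, 7}; // made-up zero-based indexes
        System.out.println("zero-based: " + Arrays.toString(fromCode));
        System.out.println("one-based:  " + Arrays.toString(toOneBased(fromCode)));
    }
}
```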

Cheers,
Mark.


Mark Hall | 1 Sep 2010 10:35

Re: correspondence between class index and class value

On 8/31/10 9:52 PM, giuseppe morgese wrote:
>> Take a look at the javadoc for the Attribute class:
>
>> http://weka.sourceforge.net/doc.stable/weka/core/Attribute.html
>
>> There is a method to get the (string) value of a nominal or string
>> attribute given a value index.
>
>
> I know that the Attribute class lets me do this, but I prefer the double
> value to the string value of the class.
> That is why I asked my question. The Attribute class doesn't have anything
> like that for the double value.

I guess I don't understand what you are asking (or trying to do). Given 
a double value index you can get the corresponding String label for any 
nominal attribute (class or otherwise).
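
The mapping works in both directions because Weka stores a nominal
value internally as a double holding the position of its label in the
attribute's list of declared values. In the Attribute class, value(int)
returns the label for an index and indexOfValue(String) returns the
index for a label. The idea, sketched in plain Java with no Weka
dependency (the class name NominalValueSketch and the labels "no" and
"yes" are made up):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of how a double value index maps to a nominal label and back.
final class NominalValueSketch {

    /** Label for a value index stored as a double. */
    static String label(List<String> declaredLabels, double valueIndex) {
        return declaredLabels.get((int) valueIndex);
    }

    /** Value index (as a double) for a label, or -1 if the label is not declared. */
    static double valueIndex(List<String> declaredLabels, String label) {
        return declaredLabels.indexOf(label);
    }

    public static void main(String[] args) {
        List<String> classLabels = Arrays.asList("no", "yes"); // declaration order matters
        double predicted = 1.0; // what a classifier would return for the second label
        System.out.println(predicted + " -> " + label(classLabels, predicted));
        System.out.println("no -> " + valueIndex(classLabels, "no"));
    }
}
```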

Cheers,
Mark.


Mark Hall | 1 Sep 2010 10:45

Re: Help: CostSensitve Feature Selection vs CostsSensiveClassifier

On 9/1/10 9:58 AM, TanAmy wrote:
> Hi All,
> 
> I am dealing with an unbalanced data set in a binary classification 
> problem, in which Class 1 is only around 5% and Class 0 is about 95% 
> of the whole data set. However, detecting Class 1 instances is more 
> important.
> 
> After doing some research, I learned that cost-sensitive learning might 
> help in this case. Because each instance contains more than 500 
> attributes, an appropriate feature selection may be needed here.
> 
> I am wondering if I should run cost-sensitive feature selection first, 
> followed by normal classification, OR run normal feature selection first, 
> followed by CostSensitiveClassifier training, OR something else.
> 
> If you have any experience on this, please advise me. Thanks!

What about trying the AttributeSelectedClassifier as the base classifier
for the CostSensitiveClassifier (using cost-sensitive learning, not the
minimum expected cost method)? This way the attribute selection process
gets the re-weighted training data created by the CostSensitiveClassifier.
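
As a rough picture of what re-weighting by cost means, here is a
minimal sketch in plain Java with no Weka dependency. It shows one
common scheme, scaling each class by its misclassification cost while
keeping the total weight unchanged; it is not claimed to be exactly
what CostSensitiveClassifier does internally. The class name
CostReweightSketch, the 95/5 class counts and the costs of 1 and 19 are
assumptions made for illustration.

```java
// Hypothetical illustration: per-class instance weights derived from misclassification costs.
final class CostReweightSketch {

    /**
     * classCounts[i]  : number of training instances of class i, each with weight 1.
     * misclassCost[i] : cost of misclassifying an instance whose true class is i.
     * Returns the weight to give every instance of class i, scaled so that the
     * total weight of the data set stays the same.
     */
    static double[] weightFactors(int[] classCounts, double[] misclassCost) {
        double totalWeight = 0.0;
        double costWeightedTotal = 0.0;
        for (int i = 0; i < classCounts.length; i++) {
            totalWeight += classCounts[i];
            costWeightedTotal += misclassCost[i] * classCounts[i];
        }
        double[] factors = new double[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            factors[i] = misclassCost[i] * totalWeight / costWeightedTotal;
        }
        return factors;
    }

    public static void main(String[] args) {
        int[] counts = {95, 5};      // made-up class distribution (class 0, class 1)
        double[] costs = {1.0, 19.0}; // made-up costs: missing class 1 is 19 times worse
        double[] factors = weightFactors(counts, costs);
        for (int i = 0; i < factors.length; i++) {
            System.out.printf("class %d: weight per instance %.3f, total class weight %.1f%n",
                    i, factors[i], factors[i] * counts[i]);
        }
    }
}
```

With these numbers each class 1 instance gets weight 10 and each class
0 instance about 0.526, so both classes carry a total weight of 50. An
attribute selection step that takes instance weights into account would
then see the two classes as equally important.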

Cheers,
Mark.
