JRIP rules and n-fold Crossvalidation
spaniard81 <raveesh.tmh <at> gmail.com>
2015-07-29 13:54:35 GMT
When using JRIP with 10-fold cross-validation, a new set of rules is learned
and tested in each iteration, so a total of 10 rule-sets are learned.
However, the Weka Explorer shows just one rule-set.
Where does this rule-set come from? Is it the best one among the 10, or a
rule-set learned from an 80-20 split of the entire dataset? (In fact, the
rule-set is printed in the Explorer even before the 10 folds are completed.)
If the latter is true, then what does the performance summary report:
i) the average of the figures from the 10 iterations of cross-validation
(this would not make sense), or ii) the figures of the 10 folds evaluated on
just this one rule-set that was learned from the entire data?
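To make the two possible readings of the performance summary concrete, here
is a minimal sketch in plain Java. It does not use Weka or JRIP: the learner
is a toy single-threshold rule, and the class name, method names, fold
assignment (index modulo k) and data are all made up for illustration. It
only shows how the two numbers would be computed, not which one the Explorer
actually reports.

```java
import java.util.ArrayList;
import java.util.List;

class CvReadings {
    // Toy stand-in for a rule learner (NOT JRIP): learns the single rule
    // "x >= t => positive", with t halfway between the two class means.
    // Assumes the training indices contain at least one example of each class.
    static double learnThreshold(double[] x, boolean[] y, List<Integer> train) {
        double sumPos = 0, sumNeg = 0;
        int nPos = 0, nNeg = 0;
        for (int i : train) {
            if (y[i]) { sumPos += x[i]; nPos++; } else { sumNeg += x[i]; nNeg++; }
        }
        return (sumPos / nPos + sumNeg / nNeg) / 2.0;
    }

    // Reading 1: each fold is scored by the rule learned on the other folds,
    // and the correct predictions are pooled over all folds. With equal-sized
    // folds this equals the average of the per-fold accuracies.
    static double perFoldModelAccuracy(double[] x, boolean[] y, int k) {
        int correct = 0;
        for (int fold = 0; fold < k; fold++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int i = 0; i < x.length; i++) {
                if (i % k == fold) { test.add(i); } else { train.add(i); }
            }
            double t = learnThreshold(x, y, train);
            for (int i : test) {
                if ((x[i] >= t) == y[i]) correct++;
            }
        }
        return correct / (double) x.length;
    }

    // Reading 2: one rule is learned from all the data and then scored on
    // every fold, i.e. on instances it has already seen during training.
    static double fullModelAccuracy(double[] x, boolean[] y) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < x.length; i++) all.add(i);
        double t = learnThreshold(x, y, all);
        int correct = 0;
        for (int i : all) {
            if ((x[i] >= t) == y[i]) correct++;
        }
        return correct / (double) x.length;
    }

    public static void main(String[] args) {
        // Made-up data: one numeric attribute and a binary class.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        boolean[] y = {false, false, false, true, false,
                       false, true, true, true, true};
        System.out.println("per-fold models, pooled:  " + perFoldModelAccuracy(x, y, 5));
        System.out.println("full-data model on folds: " + fullModelAccuracy(x, y));
    }
}
```

On this toy data the two readings give different accuracies (0.7 for the
per-fold models, 0.8 for the single full-data model), which is why it matters
which of them the summary corresponds to.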
I am tweaking the JRIP code. I want to learn rules that achieve high
precision without worrying about recall, i.e. I am only interested in
learning rules that precisely describe the class of interest. I have not yet
figured out a solution, as it seems the pruning and optimization steps
already rely on accuracy. That is work in progress. However, I do not know
how to print on my console the rule-set that I see in the Weka Explorer.
What I see on my console are the rules from each iteration of the
cross-validation.
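To spell out what "high precision without worrying about recall" means for a
single rule, here is a small plain-Java sketch. It is not taken from the JRIP
code; the class, the method names, the acceptance threshold and the counts
are hypothetical and only illustrate the criterion I have in mind.

```java
class RulePrecision {
    // covered:         number of instances the rule fires on
    // coveredPositive: how many of those belong to the class of interest
    // totalPositive:   all instances of the class of interest in the data

    // Fraction of covered instances that really belong to the class.
    static double precision(int coveredPositive, int covered) {
        return covered == 0 ? 0.0 : coveredPositive / (double) covered;
    }

    // Fraction of the class that the rule manages to cover.
    static double recall(int coveredPositive, int totalPositive) {
        return totalPositive == 0 ? 0.0 : coveredPositive / (double) totalPositive;
    }

    // Hypothetical acceptance test: keep a rule on precision alone,
    // regardless of how little of the class it covers.
    static boolean keepRule(int coveredPositive, int covered, double minPrecision) {
        return precision(coveredPositive, covered) >= minPrecision;
    }

    public static void main(String[] args) {
        // Made-up counts: the rule fires on 20 instances, 19 of them in the
        // class of interest, out of 100 instances of that class overall.
        System.out.println("precision = " + precision(19, 20));
        System.out.println("recall    = " + recall(19, 100));
        System.out.println("keep      = " + keepRule(19, 20, 0.9));
    }
}
```

A rule like the one in this example would be kept under a 0.9 precision
threshold even though it covers only about a fifth of the class.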
Any clarification about the rule-set shown in the Weka Explorer, and how to
obtain it through the Java API, would be appreciated.