Statistics of classification's results
Alban Levy <pmxal9 <at> nottingham.ac.uk>
2014-10-24 22:48:01 GMT
Dear list members,
Greetings from Nottingham.
Is there a known relationship between the statistics of a dataset and the evaluation statistics (kappa statistic, etc.) obtained by running various classification algorithms on it?
More precisely: after running many classification algorithms with 10-fold cross-validation on 6 datasets (each preprocessed in various ways), we normalised several evaluation scores (percentage of correctly classified instances, K&B (Kononenko & Bratko) mean information, kappa, and weighted area under the ROC curve) and accumulated the values obtained from each classification run (see attached picture).
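To make the processing described above more concrete, here is a minimal Python sketch of one possible reading of it. It is not the code we actually ran: the confusion matrices are hypothetical toy values, the helper names (accuracy, cohen_kappa, min_max_normalise) are made up for illustration, and min-max scaling followed by a running sum over sorted values is only an assumed interpretation of "normalised" and "accumulated".

```python
from itertools import accumulate


def accuracy(cm):
    """Fraction of correctly classified instances in a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Chance agreement is estimated from the row (actual) and column
    (predicted) marginals. Undefined (division by zero) when chance
    agreement is exactly 1; that case is not handled in this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    p_chance = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


def min_max_normalise(values):
    """Rescale values linearly to [0, 1]; all-equal input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical confusion matrices (rows = actual class, columns = predicted),
# one per classifier run on the same imbalanced two-class dataset.
runs = [
    [[90, 0], [10, 0]],  # always predicts the majority class
    [[80, 10], [5, 5]],
    [[85, 5], [2, 8]],
]

acc = [accuracy(cm) for cm in runs]
kap = [cohen_kappa(cm) for cm in runs]

# Assumed reading of "accumulated": running sum of the sorted, normalised scores.
acc_curve = list(accumulate(sorted(min_max_normalise(acc))))
kap_curve = list(accumulate(sorted(min_max_normalise(kap))))

print("accuracy:", acc, "curve:", acc_curve)
print("kappa:   ", kap, "curve:", kap_curve)
```

In this toy setup the first run has high accuracy but a kappa of zero, because kappa discounts agreement that the class marginals alone would produce. This is one way two scores computed from the same runs can rank them differently.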
The surprise came from the differing behaviours of the curves: I couldn't find a satisfying explanation of why, for example, the blue curve (K&B mean information) is on top for some datasets, while the violet curve (weighted area under ROC) is on top for others. Is there one?
As this was rather puzzling, any lead would be appreciated.
Thanks for reading.