1 Mar 2010 02:13
Re: cluster model validation
Mark Hall <mhall <at> pentaho.com>
2010-03-01 01:13:55 GMT
On 26/02/10 5:59 AM, wessel van persie wrote:
> Dear All,
>
> How to estimate the performance of models built by unsupervised
> learning algorithms in WEKA?
> I'm talking about algorithms which can be found in the associate or cluster tab.
>
> Is it possible to do a "data reproduction validation" in WEKA?
> This validation is like "classes to cluster validation" but without a
> fixed class.
>
> // X percent of the test data will be removed and validated on
> global X
>
> // pseudo code "data reproduction validation"
> for each record in testset:
>     randomly remove X values from record
>     using model(trainingset) reproduce these removed values
> success rate = number of values correctly reproduced

There is no general evaluation process to do this in Weka. I guess the within cluster sum of squares (computed by SimpleKMeans on the training data) is sort of similar to this.

Cheers,
Mark.

_______________________________________________
Wekalist mailing list
Send posts to: Wekalist <at> list.scms.waikato.ac.nz
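
For illustration only, the quoted pseudocode and the within-cluster sum of squares could be sketched roughly as below. This is plain Python, not Weka's API, and it makes several assumptions that are not in the thread: all attributes are numeric, hidden values are reproduced from the nearest k-means centroid, and a value counts as "correctly reproduced" when it lies within a tolerance `tol`. The names `kmeans`, `reproduction_rate`, `n_hidden` and `tol` are hypothetical, and the farthest-first initialisation is just a simplification to keep the toy example deterministic.

```python
import random

def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the attribute indices in dims."""
    return sum((a[d] - b[d]) ** 2 for d in dims)

def kmeans(rows, k, iters=20):
    """Plain Lloyd's k-means; returns (centroids, within-cluster sum of squares)."""
    dims = range(len(rows[0]))
    # Assumption: deterministic farthest-first initialisation.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c, dims) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for r in rows:
            best = min(range(k), key=lambda c: sq_dist(r, centroids[c], dims))
            clusters[best].append(r)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(min(sq_dist(r, c, dims) for c in centroids) for r in rows)
    return centroids, wcss

def reproduction_rate(centroids, test_rows, n_hidden=1, tol=0.5, seed=0):
    """Hide n_hidden values per test record, reproduce them from the nearest
    centroid (matched on the visible values only), and return the fraction
    of hidden values reproduced to within tol."""
    rng = random.Random(seed)
    n_attrs = len(test_rows[0])
    hits = total = 0
    for row in test_rows:
        hidden = rng.sample(range(n_attrs), n_hidden)
        visible = [d for d in range(n_attrs) if d not in hidden]
        nearest = min(centroids, key=lambda c: sq_dist(row, c, visible))
        for d in hidden:
            total += 1
            hits += abs(nearest[d] - row[d]) <= tol
    return hits / total

# Toy data: two well-separated groups of three records each.
train = [(0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (0.0, 0.0, 0.1),
         (5.0, 5.1, 5.0), (5.1, 5.0, 4.9), (4.9, 5.0, 5.0)]
test = [(0.1, 0.1, 0.0), (5.0, 4.9, 5.1)]

centroids, wcss = kmeans(train, k=2)
rate = reproduction_rate(centroids, test, n_hidden=1, tol=0.5)
print("within-cluster sum of squares:", wcss)
print("reproduction success rate:", rate)
```

The two numbers answer different questions: the sum of squares measures how tightly the training data sits around its centroids, while the success rate measures how well held-out values can be recovered from a test record's remaining attributes. The choice of `tol` drives the second figure, so it would need to be set per attribute scale on real data.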