2 Oct 08:14 2014

### Differences in using WEKA versions for SpreadSubSample filter

Dear All

Can someone clarify whether the implementation of the SpreadSubSample filter changed between version 3-6-5 and later versions? I did not find anything in the CHANGELOG, but there seems to be a difference in the outputs I get.

I applied the SpreadSubSample filter to create a data subset containing an equal number of instances of each class. I then used SMO to classify the dataset. Classification accuracy differs by 1.5% between 3-6-5 and the later versions (3-6-10 and 3-6-11). The last two versions give the same result, but 3-6-5 gives a different one. The seed value for the filter was the same in all cases.
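To make the setup concrete, here is a minimal plain-Java sketch of what I mean by a seeded, class-balanced subsample. It is not Weka's filter code: the class `BalancedSubsample`, its `sample` method, and the label array are hypothetical and only illustrate the idea of keeping as many instances per class as the rarest class has.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration only; this is not how Weka implements its filter.
final class BalancedSubsample {

    // Returns the sorted indices of a subset in which every class keeps
    // as many instances as the rarest class has.
    static List<Integer> sample(int[] labels, long seed) {
        // Group instance indices by class label.
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], k -> new ArrayList<>()).add(i);
        }

        // Size of the smallest class.
        int min = Integer.MAX_VALUE;
        for (List<Integer> indices : byClass.values()) {
            min = Math.min(min, indices.size());
        }

        // Shuffle each class with one seeded generator and keep the first `min` indices.
        Random rng = new Random(seed);
        List<Integer> chosen = new ArrayList<>();
        for (List<Integer> indices : byClass.values()) {
            Collections.shuffle(indices, rng);
            chosen.addAll(indices.subList(0, min));
        }
        Collections.sort(chosen);
        return chosen;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 0, 0, 1, 1, 2, 2, 2}; // made-up class labels
        System.out.println(sample(labels, 1L));
    }
}
```

In a scheme like this, the selected subset depends on the seed and also on the order in which random numbers are consumed, so two implementations could pick different instances from the same seed.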

This is not an issue with the SMO implementation, because without SpreadSubSample I get the same accuracy with SMO in all three versions.

If this additional detail is useful: I used 3-6-5 and 3-6-11 on MacOSX and 3-6-10 on Ubuntu.

Thanks.
Sowmya.

1 Oct 20:53 2014

### Setting Weka as the Default Program for .ARFF Files

Hello,

While trying to open an .arff file in Notepad, my default settings were changed so that I can no longer open .arff files in Weka from Windows Explorer. When I double-click an .arff file in Explorer, a command prompt flashes on the screen and nothing else happens. Previously, double-clicking an .arff file opened it in Weka automatically. When I tried to reset Weka as the default program, the default became RunWeka, which is a batch file. Weka 3-6 is installed in Program Files (x86), and I am using Windows 8. Explorer used to show the Weka icon next to .arff files; now it shows a blank page icon.

How do I set the .arff files to open automatically in Weka?

Thank you.

1 Oct 12:55 2014

### In Re: Weka Logistic function

Hello All,

I have this challenge, and I am just wondering if there could be a way out.

I experimented with a particular dataset (which has many IDs) using various classifiers on a
classification problem, and I found some of them to be very good.
Now I need to save the resulting model as a template for testing other classification problems,
to see which class(es) a new dataset belongs to. I tried this in Weka, but the new dataset
contains only one ID, so it cannot be used.
What can I do to solve this problem?

Thanks
Richard

1 Oct 08:42 2014

### Weka does not free memory when using Explorer

Hello

Weka does not free memory when it no longer needs it. I have tested this on Weka 3.6.11 and 3.7.11 (under Windows) in many cases. For example, when you load multiple datasets one after another, Weka does not free the memory even when the last one is loaded. The same problem occurs with classifiers and clustering algorithms. In short, Weka's memory usage only ever increases.

23 Sep 09:11 2014

### customize the result from SVM

Dear All,

Is it possible to customize the result from SVM? For example, the SVM library receives a training set (x_1, y_1), ..., (x_m, y_m) where x_i \in R^n and y_i \in {-1, 1}.

The result is a vector W \in R^n. I want to put some constraints on the vector W. For example, for some variables v_i, I want the coefficient W_i = 0 or W_i != 0. Or I want to limit the number of coefficients in W that are not equal to 0.
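Written out, and assuming the usual soft-margin linear SVM, the problem I have in mind would look roughly as follows. The bias b, the slack variables \xi_i, the penalty C, the index set Z of coefficients forced to zero, and the bound k on the number of non-zero coefficients are my own notation for illustration, not taken from any library:

```latex
\begin{aligned}
\min_{W \in \mathbb{R}^n,\; b,\; \xi}\quad
  & \tfrac{1}{2}\lVert W \rVert_2^2 + C \sum_{i=1}^{m} \xi_i \\
\text{s.t.}\quad
  & y_i \,(W^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, m, \\
  & W_j = 0 \quad \text{for } j \in Z, \\
  & \lVert W \rVert_0 \le k .
\end{aligned}
```

The constraint W_j = 0 amounts to dropping attribute j from the input, while the cardinality constraint on \lVert W \rVert_0 makes the problem non-convex.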

Can we do this in Weka?

Thanks and best regards,
Nguyen Truong Khanh

24 Sep 11:17 2014

### How to define confusion matrix of classification rules

How do I define the confusion matrix for the database and the classification rules
shown below, and how do I calculate precision and recall?

How do I define the components of the confusion matrix?

<http://weka.8497.n7.nabble.com/file/n32287/1.jpg>
TP: the number of samples of class c that are correctly classified into class c
FP: the number of samples not belonging to class c that are misclassified into class c
TN: the number of samples not belonging to class c that are (correctly) not classified into class c
FN: the number of samples of class c that are misclassified into other classes
How do I determine TP, FP, TN and FN?
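To make the question concrete, this is how I understand the four counts would be derived for one class c from a multi-class confusion matrix. It is only a sketch: the class `ConfusionCounts` is hypothetical, the matrix values are made up (they are not the data in the image), and I assume rows are actual classes and columns are predicted classes.

```java
// Hypothetical helper: per-class TP/FP/FN/TN from a square confusion matrix,
// assuming matrix[actual][predicted] holds the number of samples.
final class ConfusionCounts {
    final long tp, fp, fn, tn;

    ConfusionCounts(long tp, long fp, long fn, long tn) {
        this.tp = tp;
        this.fp = fp;
        this.fn = fn;
        this.tn = tn;
    }

    static ConfusionCounts forClass(long[][] matrix, int c) {
        long tp = matrix[c][c];
        long fp = 0, fn = 0, total = 0;
        for (int actual = 0; actual < matrix.length; actual++) {
            for (int predicted = 0; predicted < matrix.length; predicted++) {
                long n = matrix[actual][predicted];
                total += n;
                if (actual != c && predicted == c) fp += n; // wrongly assigned to c
                if (actual == c && predicted != c) fn += n; // c assigned elsewhere
            }
        }
        long tn = total - tp - fp - fn; // everything that does not involve c
        return new ConfusionCounts(tp, fp, fn, tn);
    }

    // precision = TP / (TP + FP); defined as 0 when nothing was predicted as c
    double precision() {
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    // recall = TP / (TP + FN); defined as 0 when class c has no samples
    double recall() {
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    public static void main(String[] args) {
        long[][] matrix = { {5, 1, 0}, {2, 6, 1}, {0, 1, 4} }; // made-up counts
        for (int c = 0; c < matrix.length; c++) {
            ConfusionCounts k = forClass(matrix, c);
            System.out.printf("class %d: TP=%d FP=%d FN=%d TN=%d precision=%.3f recall=%.3f%n",
                    c, k.tp, k.fp, k.fn, k.tn, k.precision(), k.recall());
        }
    }
}
```

With more than two classes, each class gets its own set of four counts, treating that class as "positive" and all other classes together as "negative".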
Thank you.

25 Sep 18:56 2014

### MultilayerPerceptron outputs constant prediction

Hi,

I am currently working on a time series + overlay data forecaster. However,
if I choose the MLP algorithm, whatever the model is, I get constant
predictions.

I moved to the Classify tab in the Explorer to check whether it was perhaps due
to the forecasting package, but the same happens there when I choose my
class value and use the previously selected overlay variables for a
regression without lagged data.

I was wondering if anyone knows what I am doing wrong, or whether this is a bug
in Weka.
My version is 3.7.10, by the way.

example output:

=== Run information ===

Scheme:       weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 1000 -V 0 -S 0 -E 20 -H a -B -I
Relation:     NormalizedData-weka.filters.unsupervised.attribute.ClassAssigner-Clast
Instances:    12767
Attributes:   17
              Date
              Month
              Day
              Hour
              Weekday
              Total_TotPMaxAva10W
              Coal
              Fuel
              Gas
              Nuclear
              Water
              Wind
              Other
              Load
              SolarForecast
              WindForecast
              Value
Test mode:    split 66.0% train, remainder test

=== Classifier model (full training set) ===

Linear Node 0
Inputs    Weights
Threshold    0.16641924752243192
Node 1    0.16570806076365163
Node 2    0.023089908491941452
Node 3    0.029211377402233618
Node 4    0.03773641533232357
Node 5    0.017526716672682213
...

Sigmoid Node 8
Inputs    Weights
Threshold    -0.0012640390989610423
Attrib Date    -0.03218897577106701
Attrib Month    0.022069341299714176
Attrib Day    0.022707638189608914
Attrib Hour    -0.00464508228825368
Attrib Weekday    -0.04873298988322117
Attrib Total_TotPMaxAva10W    0.04158387595414298
Attrib Coal    -0.011209728638398475
Attrib Fuel    -0.03547959718548589
Attrib Gas    0.007847219600438674
Attrib Nuclear    0.0031969196415047377
Attrib Water    -0.03797063613118149
Attrib Wind    -0.02257505428592338
Attrib Other    0.04571861017759031
Attrib Load    -0.025678409355419207
Attrib SolarForecast    0.01987154673307641
Attrib WindForecast    0.025851666129866238
Class
Input
Node 0

Time taken to build model: 113.99 seconds

=== Predictions on test split ===

inst#     actual  predicted      error
1     37.4       51.377     13.977
2     31.62      51.377     19.757
3     31.81      51.377     19.567
4     39.18      51.377     12.197
5     48.2       51.377      3.177
6     58.45      51.377     -7.073
7     54.39      51.377     -3.013
8     56.5       51.377     -5.123

Does anyone have an idea what I am doing wrong?

Thanks

30 Sep 07:08 2014

### Weka GUI KnowledgeFlow

Hi Everyone,

I am using the KnowledgeFlow in the Weka 3.7.11 GUI.

I want a classifier (e.g. NaiveBayes) to classify my instances, and then I want to save a 'new' dataset that consists of the original dataset plus an additional attribute: the 'new' classification. (It works fine when I use the Explorer.)

My problem is that the CSVSaver saves empty files.
Does anyone have an example of a knowledge flow that is doing something similar?

Thanks

Tali

24 Sep 17:05 2014

### Forecast package / Time series / Different results in GUI than in code

Hi,

I am new to the forum. I am trying to build a forecaster using the forecast
plugin.

The following is my code:
//////////////////////////////////////////////
String pathToForecastData = "//Raw Data//NewModellingARFF.arff";
Instances data = ConverterUtils.DataSource.read(pathToForecastData);

WekaForecaster forecaster = new WekaForecaster();

forecaster.setFieldsToForecast("Value");

LinearRegression r = new LinearRegression();
r.setOptions(weka.core.Utils.splitOptions("-S 0 -R 1.0E-8"));
forecaster.setBaseForecaster(r);

forecaster.getTSLagMaker().setTimeStampField("Date"); // date time stamp

forecaster.getTSLagMaker().setMinLag(1);
int maxHoursPast = 168;
forecaster.getTSLagMaker().setMaxLag(maxHoursPast); // monthly data

forecaster.setOverlayFields("Year,Month,Day,Hour,Weekday,Total_TotPMaxAva10W,Coal,Fuel,Gas,Nuclear,Water,Wind,Other,Load,SolarForecast,WindForecast");

// build the model
Instances TrainingData = new Instances(data, 0, data.size() - 168);
forecaster.buildForecaster(TrainingData, System.out);

// This code is used to print forecasts for a week (168 hours) by
// consecutively adding 24 hours of data and then predicting the next 24,
// which is the scenario in which I will use the forecaster.
// However, currently I just predict the next 24 hours, and the result is
// different from the GUI. The results in the GUI are better...
for (int i = 0; i < 1; i++) {
    forecaster.primeForecaster(new Instances(data, 0, data.size() - 168 + 24 * i));
    List<List<NumericPrediction>> forecast = forecaster.forecast(24,
            new Instances(data, data.size() - 168 + 24 * i, 24), System.out);
    printForecasts(forecast);
    forecaster.primeForecaster(new Instances(data, data.size() - 168 + 24 * (i + 1)));
}
///////

I took screenshots from the GUI and uploaded them to imgur here:

http://imgur.com/a/2gGSG

Does anyone have any idea why the values would be different?

Thanks already!
Florian

25 Sep 06:44 2014

### Re: Help required in Time Series Forecasting

Can someone help me in this regard?

30 Sep 11:37 2014

### Parameter estimation for general Bayesian Networks

Hello,

I would like to use Weka for the parameter estimation of a Bayesian network. The structure of the network is known in advance, and I have a set of instances with values for the variables in the network (including missing values for hidden variables). There is no class attribute and the structure of the net is more complex than Naive Bayes.

I managed to create a BIF file with the structure of the net and an ARFF file containing the instances, and used them with the BayesNet object. However, the result I get is a Naive Bayes network with the last attribute set as the class. I wonder whether it is possible at all to use BayesNet for purposes other than classification, and if so, how do I do that?

Thanks in advance!

Vered
