1 Apr 2012 02:38

Algorithm for Generating Artificial Data Set from Bayesian Network

Hello,

There are several approximate inference algorithms, such as logic sampling, likelihood weighting, importance sampling, backward sampling, and stochastic sampling methods (which include Gibbs sampling), all of which can be used to generate sample data from a given Bayesian network.

Which algorithm is implemented in WEKA to generate artificial data from a Bayes net? I would also appreciate a literature reference.

S. I. Zimit
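
For readers unfamiliar with the methods listed above, here is a minimal sketch of logic (forward) sampling, the simplest of them: each node is sampled after its parents, using the conditional probability for the parent values already drawn. The two-node network, its probabilities, and the class name `ForwardSampling` are made up for illustration; nothing here is taken from WEKA, and the sketch does not say which algorithm WEKA uses.

```java
import java.util.Random;

public class ForwardSampling {
    // Hypothetical two-node network Rain -> WetGrass; all numbers are made up.
    static final double P_RAIN = 0.2;
    // P(WetGrass = true | Rain): index 0 = no rain, index 1 = rain.
    static final double[] P_WET_GIVEN_RAIN = {0.1, 0.9};

    /** Draws one joint sample {rain, wet}, sampling the parent before the child. */
    public static boolean[] sample(Random rng) {
        boolean rain = rng.nextDouble() < P_RAIN;
        boolean wet = rng.nextDouble() < P_WET_GIVEN_RAIN[rain ? 1 : 0];
        return new boolean[] {rain, wet};
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int n = 100_000;
        int rainCount = 0;
        int wetCount = 0;
        for (int i = 0; i < n; i++) {
            boolean[] s = sample(rng);
            if (s[0]) rainCount++;
            if (s[1]) wetCount++;
        }
        // Frequencies should approach P(rain) = 0.2 and
        // P(wet) = 0.2 * 0.9 + 0.8 * 0.1 = 0.26 as n grows.
        System.out.printf("rain: %.3f, wet: %.3f%n",
                (double) rainCount / n, (double) wetCount / n);
    }
}
```

Because no evidence is fixed, every sample is kept; the other methods in the list matter mainly when some variables are observed.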

```_______________________________________________
Wekalist mailing list
Send posts to: Wekalist <at> list.scms.waikato.ac.nz
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
```
1 Apr 2012 05:51

RemoveRange()

Hello

I'm stuck on code that should implement the logic of the example below. Could anyone please help me out?

Example: I have 82 instances, and I used RemoveRange(1-8) to remove instances 1-8 and store the remaining 9-82 as the training set. I also inverted the selection of RemoveRange() so that the removed instances 1-8 are stored as the test set. So now a training set with 9-82 and a test set with 1-8 are ready.

Next I selected RemoveRange(9-16) so that I can use the remaining instances (1-8 together with 17-82) as the training set and 9-16 as the test set. Finally, I get different training and test sets for all 10 folds.

Here is the code I've written with RemoveRange() in Java:

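/* Use RemoveRange to generate different train and test sets. */
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.instance.RemoveRange;

// Training set: drop instances 1-8 (indices are 1-based), keep 9-82.
RemoveRange trainFilter = new RemoveRange();
trainFilter.setOptions(new String[] {"-R", "1-8"});
trainFilter.setInputFormat(randData);
Instances train = Filter.useFilter(randData, trainFilter);

// Test set: -V inverts the selection, so only instances 1-8 are kept.
RemoveRange testFilter = new RemoveRange();
testFilter.setOptions(new String[] {"-R", "1-8", "-V"});
testFilter.setInputFormat(randData);
Instances test = Filter.useFilter(randData, testFilter);

train.setClassIndex(0);
test.setClassIndex(0);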
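
The fold-by-fold procedure described above can be sketched as a loop that builds one range string per test fold. This is plain Java with no WEKA dependency; the class name `FoldRanges` and the choice to let the last fold absorb the remainder (82 is not divisible by 10) are assumptions made for illustration. Each string is what would be handed to a RemoveRange filter as its 1-based range, once as-is to produce the training set and once with the selection inverted to produce the test set.

```java
import java.util.ArrayList;
import java.util.List;

public class FoldRanges {
    /**
     * Returns 1-based, inclusive range strings such as "1-8", one per fold.
     * Every fold has numInstances / numFolds instances; the last fold also
     * takes whatever remainder is left over.
     */
    public static List<String> foldRanges(int numInstances, int numFolds) {
        if (numFolds < 1 || numFolds > numInstances) {
            throw new IllegalArgumentException("need 1 <= numFolds <= numInstances");
        }
        int foldSize = numInstances / numFolds;
        List<String> ranges = new ArrayList<>();
        for (int fold = 0; fold < numFolds; fold++) {
            int first = fold * foldSize + 1;
            int last = (fold == numFolds - 1) ? numInstances : first + foldSize - 1;
            ranges.add(first + "-" + last);
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 82 instances and 10 folds, as in the example above.
        for (String range : foldRanges(82, 10)) {
            System.out.println("test fold: " + range);
        }
    }
}
```

With 82 instances and 10 folds, the first folds are 1-8 and 9-16 as in the example, and the last fold covers 73-82.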
1 Apr 2012 10:28

Hi,
Oleg.
1 Apr 2012 11:40

Re: Weka Error: Problem Evaluating Classifier

```On 28/03/12 2:18 AM, Arthur Gwatidzo wrote:
> I am trying to perform one class classification.
>
> I add a one class dataset for training and multiclass dataset for testing.
>
> Most of the algorithms are disabled
> And using the default ZeroR algorithm, an error pops up which says:
> *Problem evaluating classifier : Train and Test Set not compatible*
>
> How can I perform one class classification? The one class classifier
> algorithm recommended by Eibe Frank's paper on one class classification
> does not exist in either WEKA 3.6.5 or Weka 3.7.X

The algorithm is available as an installable package for Weka >= 3.7.2:

Cheers,
Mark.

```
1 Apr 2012 11:41

Re: Cost sensitive classification and probabilities

```On 28/03/12 4:39 AM, PiccoloBuddha wrote:
>
> Hi,
> I'm using J48 to develop a model for a binary class dataset with class
> imbalance problems. To solve it I decided to use cost-sensitive
> classification (so I use CostSensitiveClassifier with minimized expected
> misclassification cost). If I use the standard J48 algorithm and output the
> probability distribution, I obtain reasonable numbers (e.g. 0.984 and 0.016);
> when I use the CostSensitiveClassifier, every probability for test set
> instances is either 0 or 1. How is this possible? If I'm not wrong,
> CostSensitiveClassifier should adjust probabilities according to the cost
> matrix but not completely flatten each probability!

Answered over in the Pentaho forums:

Cheers,
Mark.

```
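
As background to the question above, here is a minimal sketch of a minimum-expected-cost decision rule. It is an illustration only and is not taken from WEKA's source: the class name `MinExpectedCost`, the cost values, and the convention that `cost[i][j]` is the cost of predicting class j when the true class is i are all assumptions. It shows how a rule of this kind produces a hard decision, which, if reported as a distribution, contains only 0 and 1.

```java
public class MinExpectedCost {
    /**
     * Assumed convention: cost[i][j] is the cost of predicting class j
     * when the true class is i. Returns the prediction with the lowest
     * expected cost under the given class probabilities.
     */
    public static int decide(double[] probs, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int j = 0; j < probs.length; j++) {
            double expected = 0.0;
            for (int i = 0; i < probs.length; i++) {
                expected += probs[i] * cost[i][j];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = j;
            }
        }
        return best;
    }

    /** Encodes a hard decision as a distribution of zeros and a single one. */
    public static double[] oneHot(int index, int numClasses) {
        double[] dist = new double[numClasses];
        dist[index] = 1.0;
        return dist;
    }

    public static void main(String[] args) {
        double[] probs = {0.984, 0.016};         // base classifier output
        double[][] cost = {{0, 1}, {100, 0}};    // hypothetical: missing class 1 is expensive
        int decision = decide(probs, cost);
        double[] dist = oneHot(decision, probs.length);
        System.out.println("decision: " + decision + ", output: " + dist[0] + " / " + dist[1]);
    }
}
```

With these made-up costs, predicting class 0 has expected cost 0.016 * 100 = 1.6 and predicting class 1 has expected cost 0.984 * 1 = 0.984, so class 1 is chosen even though the base classifier gives it a probability of only 0.016.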
1 Apr 2012 11:44

Re: weka.classifiers.trees.j48.C45PruneableClassifierTree

```On 28/03/12 7:47 AM, dam_all1 <at> libero.it wrote:
> Hi to all
>
> I use a dataset about car evaluation with these attributes:
>
> @ATTRIBUTE maint		STRING
> @ATTRIBUTE doors		STRING
> @ATTRIBUTE persons	STRING
> @ATTRIBUTE lug_boot	STRING
> @ATTRIBUTE safety	        STRING
>
> I need to run an experiment on it with J48, but when I execute "run" I
> get this message:
>
> "weka.classifiers.trees.j48.C45PruneableClassifierTree: Cannot handle string
> attributes!"
>
> How can I use my dataset with string attributes and the J48 algorithm?
>

Most algorithms in Weka operate on numeric and nominal attributes - see:

http://weka.wikispaces.com/ARFF+%28stable+version%29

Cheers,
Mark.

```
1 Apr 2012 16:44

(no subject)

Mark:

How can we find the best attributes in an .arff file? Do they vary with the evaluation method, the search method, or the classifier?

Swetha
1 Apr 2012 22:57

Hi, I need to know which Weka algorithm is most suitable for my type of problem. My problem consists of establishing the types of workers into which young people aged 18 to 29 with higher education in northwestern Argentina are classified. I am working with social data from the Permanent Household Survey of the National Institute of Statistics and Censuses of Argentina. We considered 35 numeric and nominal variables, with a few errors in certain variables. The total number of records is 11360.
I know that for this type of problem the model is descriptive, and that association rule or classification algorithms are needed. But which Weka algorithm? Thanks for your help.

**************   :) Smile, it looks good on you :):):):): amy cgc  **************************

2 Apr 2012 08:09

Re: How to get a mutiple level J48 generated tree?

```On 28/03/12 6:24 PM, Bruce Dou wrote:
> Hi All,
>
> When I use J48 to generate the decision tree, I always get only a
> one-level tree. How can I get a multiple-level tree?
>
> For example:
>
> cat test.arff
>
> @RELATION test
>
> @ATTRIBUTE score  NUMERIC
> @ATTRIBUTE score2 NUMERIC
> @ATTRIBUTE class {Good, Poor}
>
> @DATA
> 11, 1, Poor
> 43, 0, Good
> 99, 0, Good
> 33, 1, Poor
> 59, 1, Poor
> 100, 0, Good
> 61, 0, Good
> 96, 0, Good
> 100, 1, Poor
> 77, 1, Poor
>
> java -cp weka.jar weka.classifiers.trees.J48 -t test.arff -x 10 -g
>
> digraph J48Tree {
> N0 [label="score2" ]
> N0->N1 [label="<= 0"]
> N1 [label="Good (5.0)" shape=box style=filled ]
> N0->N2 [label="> 0"]
> N2 [label="Poor (5.0)" shape=box style=filled ]
> }
>
> What I want is the first level split by score2, then a split by score.
>
> --
> A decathlon Developer & programmer
> http://blog.eood.cn/

With only a few instances in your data, J48 has decided that further
splitting isn't statistically supported by the data. If you turn off
pruning you will probably get further splits though.

Cheers,
Mark.

```
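
For the ten rows quoted in the question, the split on score2 already leaves both branches pure, which matches the `Good (5.0)` and `Poor (5.0)` leaves in the tree output. The sketch below (class name `SplitPurity` is hypothetical) computes the class entropy before and after that split. It is a simplified illustration and not J48's actual split criterion, which is based on gain ratio; the point is only that a branch with zero entropy leaves nothing for a further split on score to separate in this particular sample.

```java
public class SplitPurity {
    /** Shannon entropy (in bits) of a two-class count pair. */
    public static double entropy(int a, int b) {
        int n = a + b;
        double h = 0.0;
        for (int c : new int[] {a, b}) {
            if (c > 0) {
                double p = (double) c / n;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Class counts {Good, Poor} taken from the ten rows quoted above.
        System.out.println("root:       " + entropy(5, 5));
        System.out.println("score2 = 0: " + entropy(5, 0));
        System.out.println("score2 = 1: " + entropy(0, 5));
    }
}
```

A data set in which score2 does not separate the classes perfectly would be needed to see a second level in the tree.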
2 Apr 2012 08:11

Re: OneClassClassifier problems

```On 28/03/12 10:27 PM, George Mamalakis wrote:
> Hello everybody,
>
> I am trying to run an experiment in weka (3.7.5-snapshot) that involves
> anomaly detection. I read that OneClassClassifier would be the algorithm
> of choice with regard to such problems. Since my dataset contains
> strings, I had to use an algorithm that supported strings when
> configuring the OneClassClassifier. So, first I chose
> OneClassClassifier, I then left the Bagging as its classifier, and then
> I edited its options to include NaiveBayesMultinomialText as its
> classifier, which is capable of handling text data. When I ran the
> algorithm I got the following error:
>
> 11:58:37: Started weka.classifiers.meta.OneClassClassifier
> 11:58:37: Command: weka.classifiers.meta.OneClassClassifier -num
> "weka.classifiers.meta.generators.GaussianGenerator -S 1 -M 0.0 -SD 1.0"
> -nom "weka.classifiers.meta.generators.NominalGenerator -S 1" -trr 0.1
> -tcl good -cvr 10 -cvf 10.0 -P 0.5 -S 1 -W weka.classifiers.meta.Bagging
> -- -P 100 -S 1 -num-slots 1 -I 10 -W
> weka.classifiers.bayes.NaiveBayesMultinomialText -- -P 0 -M 3.0 -norm
> 1.0 -lnorm 2.0 -tokenizer "weka.core.tokenizers.WordTokenizer
> -delimiters \" \\r\\n\\t.,;:\\\'\\\"()?!\"" -stemmer
> weka.core.stemmers.NullStemmer
> 11:58:37: Index: 11, Size: 11
>
> Where 11 is somehow related to the size of my dataset (I've tried
> running the same experiment with different sizes of my dataset in order
> to troubleshoot my problem). I've included a small sample of my dataset
> as an attachment.
>
> Is the problem related to the fact that OneClassClassifier is unable to
> handle string data in general? Because it seems like a bug to me.
>
>
> mamalos

This was answered over on the Pentaho forums:

```_______________________________________________