Mark Hall | 1 Jul 2007 09:36

Re: resample -no-replacement: problem filtering instances: n must be positive

Hi there,

Are you using the latest version of Weka? I don't seem to get this
problem when running a quick test on the iris data. What does your
data look like?

Cheers,
Mark.

On Sat, Jun 30, 2007 at 12:14:19PM +0100, Konstantinos Pachopoulos wrote:
> Hi,
> When I use the Resample filter, the "-no-replacement"
> argument causes a "problem filtering instances: n must
> be positive" message to appear.
> 
> Ideas?
> 
> Thanks
> 
> _______________________________________________
> Wekalist mailing list
> Wekalist <at> list.scms.waikato.ac.nz
> https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist

Ning Yu | 1 Jul 2007 15:56

how to use weka.core.tokenizers.NGramTokenizer from command line

Hello,

I keep getting an "Illegal options" error when I use the following option:
-tokenizer weka.core.tokenizers.NGramTokenizer
in a weka.filters.unsupervised.attribute.StringToWordVector command.

Any help on how to set the tokenizer via the command line would be appreciated.

Thanks,
 Ning

Mark Hall | 2 Jul 2007 00:28

Re: how to use weka.core.tokenizers.NGramTokenizer from command line

Hi,

Are you using the latest version of Weka (3.5.6)? This works fine for
me.

Cheers,
Mark.

On Sun, Jul 01, 2007 at 09:56:05AM -0400, Ning Yu wrote:
>    Hello,
> 
>    I kept getting an "Illegal options" error when I use the following option:
>    -tokenizer weka.core.tokenizers.NGramTokenizer
>    in a weka.filters.unsupervised.attribute.StringToWordVector command.
> 
>    Any help on how to set tokenizer via command line will be appreciated.
> 
>    Thanks,
>     Ning


-- 
Mark Hall
Senior Lecturer
Department of Computer Science
University of Waikato
Hamilton
New Zealand
www.cs.waikato.ac.nz/~mhall

Ning Yu | 2 Jul 2007 01:22

Re: how to use weka.core.tokenizers.NGramTokenizer from command line

Sorry, it works now. I installed 3.5.6 and changed my .profile, but I guess I had to log out of the server completely and log back in to make it work. (I did source .profile, but apparently that didn't work.)

Thanks!
 Ning

Ning Yu | 2 Jul 2007 03:32

Re: how to use weka.core.tokenizers.NGramTokenizer from command line

Sorry to bother you again.

How do I set the minimum and maximum n-gram size when using the NGramTokenizer?
Below is the command line I used; it didn't work.

java weka.filters.unsupervised.attribute.StringToWordVector
   -b -i str_training.arff -o training.arff -r str_test.arff -s test.arff
   -R 2 -W 5000 -C -T -I -N 1 -L -M 2
  -tokenizer weka.core.tokenizers.NGramTokenizer -min 2 -max 3

Thank you very much.
 Ning

Mark Hall | 2 Jul 2007 04:28

Re: how to use weka.core.tokenizers.NGramTokenizer from command line

On Sun, Jul 01, 2007 at 09:32:40PM -0400, Ning Yu wrote:
>    Sorry to bother you again.
> 
>    I wonder how to set the minimum and maximum size of the ngram by using the
>    NGramTokenizer.
>    Below is how I write the command line and it didn't work.
> 
>    java weka.filters.unsupervised.attribute.StringToWordVector
>       -b -i str_training.arff -o training.arff -r str_test.arff -s test.arff
>       -R 2 -W 5000 -C -T -I -N 1 -L -M 2
>      -tokenizer weka.core.tokenizers.NGramTokenizer -min 2 -max 3

Try placing quotes around the tokenizer spec, e.g.:

-tokenizer "weka.core.tokenizers.NGramTokenizer -min 2 -max 3"
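
The effect of the quotes can be checked without Weka. The sketch below uses a
hypothetical helper, count_args (not part of Weka), that reports how many
separate arguments the shell passes along. Without quotes the shell splits the
spec, so -min and -max presumably reach StringToWordVector as top-level options
it does not recognise; with quotes the class name and its options arrive as a
single argument.

```shell
# Hypothetical helper: print how many arguments the shell hands over.
count_args() { echo "$#"; }

# Unquoted: the spec is split, so -min/-max become top-level arguments.
unquoted=$(count_args -tokenizer weka.core.tokenizers.NGramTokenizer -min 2 -max 3)

# Quoted: the class name and its options travel together as one argument.
quoted=$(count_args -tokenizer "weka.core.tokenizers.NGramTokenizer -min 2 -max 3")

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
# prints "unquoted: 6 arguments, quoted: 2 arguments"
```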

Cheers,
Mark.

Remco Bouckaert | 2 Jul 2007 05:05

Re: K2 in learning Bayes net

On Fri, 29 Jun 2007 10:31, WN wrote:
> Hi everyone,
>
> I was using Weka (Version 3.5.3) for a Bayes net application.  The search
> algorithm I chose is the K2 algorithm, with other parameters set as
>
> InitialAsNaiveBayes = False
> MarkovBlanketClassifier = True
> ...
>
> To my knowledge, K2 depends on the order of the nodes (as well as the 1st
> node, I think).  So I just tried to set as the 1st node an attribute
> variable (say, var5), rather than the class variable.  My question is the
> "Network structure" part in the output, which told me that var5 has a
> parent node var 1 and var 1 has a parent node var5.  So, this means in the
> final network, there is a connection which looks like this: var1 <---> var5
>  (A connection with two arrowheads)
>
> Is this normal?  It seems that K2 should not add a parent node to the 1st
> node.  Am I right?

K2 keeps the order of the attributes, except for the class variable, which is 
placed at the beginning of the ordering. The behavior you describe is not 
what is expected, since it looks like a cycle is introduced somewhere. Which 
dataset did you use? How did you set an attribute as first variable?

Cheers,

Remco
Ning Yu | 2 Jul 2007 05:06
Picon

Re: how to use weka.core.tokenizers.NGramTokenizer from command line

Hi Mark,

It works now!!! Thanks:)

Ning

Christian Schulz | 2 Jul 2007 14:36

preprocessing data


Hi everybody,

I am involved in a project named AMIDA at the DFKI. The goal of the
project is to develop systems for signal processing and knowledge
management of business meetings.

I am using Weka's Bayes Net to classify Dialogue Act boundaries, and I
have a few questions:

1. Which filter is best for converting strings, since Bayes Net cannot
   handle string attributes? Is StringToWordVector the best option?
   (Training then takes very long!) StringToNominal can only handle one
   string attribute per instance, is that right?

2. The listed Bayes Net capabilities include the numeric type for the
   class attribute. But with the simple sample extract below, which has
   the class attribute in the first position, Bayes Net says that it
   cannot handle a numeric class! (By the way, StringToWordVector also
   throws an IndexOutOfBoundsException with this data.)

Thanks and best,
Chris

@relation 'Segment Data Set'

@attribute distance_of_words_to_the_last_segment numeric
@attribute word_of_segment string

@data
0,Okay
0,Right
0,[other]
1,Um
2,well
3,this
4,is
5,the
6,kick-off
7,meeting
8,for
9,our
10,our
11,project
0,Um
1,[laugh]
2,and
...

Marjolein Jansen | 2 Jul 2007 20:08

Error in an ARFF file

Hi Mark,

I want to perform an association task on a data file with many attributes (around 120). When I convert my data from Excel to CSV format and then into ARFF format and load the file in Weka, I get the error: Premature end of line, read token [EOL], line 131. Line 131 is the first line with instances, and each record is 3 lines long. How can I solve this problem?

Thanks in advance!

Marjolein

 
