Chan Ai Ling | 1 Apr 03:49 2004

CHAID algorithm

Hi,

I was wondering whether anyone knows which algorithm in WEKA is similar to the CHAID (Chi-squared Automatic Interaction Detection) algorithm in SPSS?

Thanks a lot,

Ai Ling


_______________________________________________
Wekalist mailing list
Wekalist <at> list.scms.waikato.ac.nz
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
Ari Chanen | 2 Apr 11:55 2004

Ranked classification for multiclass SVM?

Hi

I have a question about multiclass SVM (Weka's SMO classifier) problems.

Say there are K classes in the problem.  My first question is: how does Weka determine which of the K classes "wins" (i.e. gets the #1 ranking) for a given test instance?  How does Weka determine this in the all-pairs case, and is that different from the one-against-the-rest case?

Ideally, what I would really like is to get back from Weka a list of the K classes, ranked by how likely the software thinks it is that the instance belongs to each class.
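
To make the kind of ranking meant here concrete: one common way to resolve the all-pairs case is majority voting, where each of the K(K-1)/2 binary classifiers casts one vote and the classes are sorted by vote count. The sketch below shows only that idea. Whether Weka's SMO does exactly this internally is an assumption, not something confirmed in this thread, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Weka): ranks K classes by all-pairs votes. */
class PairwiseVoteRanker {

    /**
     * decision[i][j], for i < j, is the signed output of the binary classifier
     * that separates class i (positive side) from class j (negative side).
     * Entries with i >= j are ignored.
     */
    static int[] countVotes(double[][] decision) {
        int k = decision.length;
        int[] votes = new int[k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                if (decision[i][j] > 0) {
                    votes[i]++;   // classifier (i, j) prefers class i
                } else {
                    votes[j]++;   // classifier (i, j) prefers class j
                }
            }
        }
        return votes;
    }

    /** Class indices sorted by descending vote count; ties keep index order. */
    static List<Integer> rank(double[][] decision) {
        int[] votes = countVotes(decision);
        List<Integer> order = new ArrayList<>();
        for (int c = 0; c < votes.length; c++) {
            order.add(c);
        }
        // List.sort is stable, so equal vote counts stay in index order.
        order.sort((a, b) -> Integer.compare(votes[b], votes[a]));
        return order;
    }
}
```

Vote counts give only a coarse ranking with many ties when K is small; a finer ranking would need per-class probability estimates, which this sketch does not attempt.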

On a related note, is there an easy way to get the distance of an instance vector from the separating hyperplane?  Is such a measure the way that Weka ranks the different classes?
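
For reference, the distance in question is easy to write down for a linear machine: with weight vector w and bias b, the decision value is f(x) = w.x + b and the signed distance from the hyperplane is f(x) / ||w||. The sketch below assumes w and b are already available for a linear kernel; how to extract them from Weka is not shown, and the names are hypothetical.

```java
/** Hypothetical helper (not part of Weka): distance of x from w.x + b = 0. */
class HyperplaneDistance {

    /** Raw decision value f(x) = w.x + b. */
    static double decisionValue(double[] w, double b, double[] x) {
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    /** Signed geometric distance f(x) / ||w||; the sign says which side x is on. */
    static double signedDistance(double[] w, double b, double[] x) {
        double normSquared = 0.0;
        for (double wi : w) {
            normSquared += wi * wi;
        }
        return decisionValue(w, b, x) / Math.sqrt(normSquared);
    }
}
```

With a non-linear kernel there is no explicit w in input space, so only the decision value f(x) is directly available.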

Any help would be most appreciated.  :-)
--
Thanks!

Ari Chanen 

Ph.D. Scholar
University of Sydney
Sydney Language Technology Research Group

    ,-_|\_
   /      \
   \_,-._*/
         ^

Office: +61 02 9351 5639
Mobile: +61 0439 411 476

"Every moment of your life is infinitely creative, and the universe is
endlessly bountiful.  Just put forth a clear enough request and everything
your heart desires, will come to you.." -Shakti Gawain
Woodward, Vernon L | 2 Apr 17:49 2004

Oracle connection

Hello all,

We're trying to connect Weka Explorer to an Oracle database. We can connect Weka to an Access DB, and the
Access DB can connect to Oracle. We can also get MATLAB to see Oracle. When we try to connect Weka to Oracle, we
get the error: "Can't open database: NULL." We know it is connecting to the Oracle DB because it detects bad user IDs/passwords.

1. Is the Java code that creates / translates the connection string available?
2. Does anyone have an example of the strings they put in the "Open DB" GUI for an Oracle connection?
3. Should I even be using the JDBC-ODBC bridge?
4. Any advice? (Besides switching from Oracle - customer driven decision)
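
On questions 2 and 3, one alternative to the JDBC-ODBC bridge is Oracle's own "thin" JDBC driver, which talks to the listener directly. The sketch below only builds the two values that would go into DatabaseUtils.props (the jdbcDriver and jdbcURL keys shown further down). The host name, port 1521 (the usual Oracle listener default) and SID are placeholders, the helper class is hypothetical, and this has not been tried against Weka 3.4; Oracle's JDBC jar would also have to be on the classpath.

```java
/** Hypothetical helper: builds an Oracle thin-driver JDBC URL. */
class OracleUrlSketch {

    // Driver class shipped in Oracle's JDBC jar.
    static final String DRIVER = "oracle.jdbc.driver.OracleDriver";

    /** Thin-driver URL in the host:port:SID form. */
    static String thinUrl(String host, int port, String sid) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + sid;
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the real host, port and SID.
        System.out.println("jdbcDriver=" + DRIVER);
        System.out.println("jdbcURL=" + thinUrl("dbhost.example.com", 1521, "ORCL"));
    }
}
```

The same URL string would then be entered in the "Open DB" dialog in place of jdbc:odbc:testdata.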

Thank you!
Vern

----------system data---------------------
OS: Win2000P
Weka: V3.4
Oracle: V9i      DB:snook.orl     TNS Service: nook     Table:vwoodward.depot_data2
JDBC: sun.jdbc.odbc.JdbcOdbcDriver
ODBC: whatever MS uses  DSN:testdata
-------------------------------------------

----------Weka output---------------------
D:\Weka-3-4>java -jar weka.jar
Loaded driver: sun.jdbc.odbc.JdbcOdbcDriver
Connecting to jdbc:odbc:testdata
GenericObjectEditor: Problem making backup object
java.io.NotSerializableException: sun.jdbc.odbc.JdbcOdbcConnection
Executing query: SELECT * from depot_table2
-------------------------------------------------

--------------DatabaseUtils.props-------------
jdbcDriver=sun.jdbc.odbc.JdbcOdbcDriver
jdbcURL=jdbc:odbc:testdata
------------------------------------------------------

---------Explorer GUI-----------
jdbc:odbc:testdata
SELECT * from depot_data2
--------------------------------------

The problems that exist in the world today cannot be solved by the level of thinking that created them.
-A.Einstein 
Ali Alkan | 2 Apr 20:41 2004

Oracle connection

Hi,

I advise you to try Cahit Arf v1.0: http://cahitarf.sourceforge.net/

Ali

Peter Waltman | 2 Apr 21:23 2004

error when using arff files with string attributes

When I try to use a classifier on an ARFF file with a string attribute,
I get the following error:

[pwaltman@pwaltman weka]$ java weka.classifiers.trees.j48.J48 -t test.arff
Cannot handle string attributes!

The format of my file is:
@relation lgNaive

@attribute string_ID string
@attribute percentA {1,2,3,4}
@attribute percentC {1,2,3,4,5,6,7,8,9,10}
....  more attributes
@attribute class {Collagen,Elastin,Silk}

@data
NP_892013.1,2,7,4,4,6,5,8,4,9,5,13,15,9,5,4,3,3,2,4,3,2,2,4,4,5,5,4,7,1,3,9,10,1,12,1,10,1,1,2,2,3,7,8,1,7,7,2,1,3,1,3,18,23,7,5,6,6,8,2,4,Collagen
NP_038627.1,1,8,4,4,6,4,9,4,9,6,12,15,6,5,4,3,3,3,7,6,6,4,3,6,6,5,4,9,4,3,2,10,8,17,1,2,1,1,2,3,5,3,3,3,5,12,11,1,3,3,11,18,18,1,16,13,7,22,1,7,Collagen

I have also tried putting the string attribute in double quotes, but
I still get the same error.  Has something changed in Weka's
implementation?  The README file says that it's supposed to support
string attributes as a means of identifying the examples.  I've done a
search of the mailing list archive and see posts from people who have had
similar problems, but can't find any posts explaining how to resolve the
problem.

As for the Weka version, I'm using Weka 3.4.

thanks,

Peter Waltman
Christian Schulz | 4 Apr 18:43 2004

Re: CHAID algorithm

IMHO the decision-tree algorithms from Quinlan are similar.

weka.classifiers.rules.part.PART
weka.classifiers.trees.j48.J48

But you will find a recursive-partitioning algorithm more comparable to CHAID
or CART in R (www.r-project.org), in the rpart package.
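
The main difference is the split criterion: CHAID picks splits with a chi-squared test of independence between a predictor and the class, while Quinlan's trees use information gain (ratio). A minimal sketch of the Pearson chi-squared statistic on a value-by-class contingency table is below. It is not a CHAID implementation (the category merging and significance adjustments are left out), and the class name is made up.

```java
/** Hypothetical helper: Pearson chi-squared statistic for a contingency table. */
class ChiSquaredSketch {

    /** counts[v][c] = number of training rows with attribute value v and class c. */
    static double chiSquared(long[][] counts) {
        int rows = counts.length;
        int cols = counts[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0.0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                rowSum[r] += counts[r][c];
                colSum[c] += counts[r][c];
                total += counts[r][c];
            }
        }
        double statistic = 0.0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Expected count if attribute value and class were independent.
                double expected = rowSum[r] * colSum[c] / total;
                if (expected > 0) {
                    double diff = counts[r][c] - expected;
                    statistic += diff * diff / expected;
                }
            }
        }
        return statistic;
    }
}
```

A larger statistic (for the same table shape) means the attribute is more strongly associated with the class, which is what makes it a candidate split.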

christian

On Thursday, 1 April 2004 03:49, Chan Ai Ling wrote:
> Hi,
>
> I was wondering whether anyone knows which algorithm in WEKA is similar to the
> CHAID (Chi-squared Automatic Interaction Detection) algorithm in SPSS?
>
> Thanks a lot,
>
> Ai Ling
Christian Schulz | 4 Apr 18:51 2004

Re: error when using arff files with string attributes

IMHO, why do you want to classify with what looks to me like an ID?
If it isn't only an ID and carries additional information, recode it as a
nominal attribute.
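
If it is only an ID, the simplest fix is to drop that column before training. The sketch below does this on the raw ARFF text, under loud assumptions: the ID is the first attribute, rows are in dense (not sparse) format, and the ID value contains no commas. The class name is made up. Weka also has an attribute-removal filter (weka.filters.unsupervised.attribute.Remove in the versions I know of) that is the more robust route; I have not checked it against 3.4.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: drops the first attribute from dense ARFF text. */
class DropFirstAttribute {

    static List<String> strip(List<String> arffLines) {
        List<String> out = new ArrayList<>();
        boolean droppedDeclaration = false;
        boolean inData = false;
        for (String line : arffLines) {
            String trimmed = line.trim();
            String lower = trimmed.toLowerCase();
            if (inData) {
                if (trimmed.isEmpty() || trimmed.startsWith("%")) {
                    out.add(line);              // keep blank lines and comments
                    continue;
                }
                // Cut everything up to and including the first comma.
                int comma = trimmed.indexOf(',');
                out.add(comma >= 0 ? trimmed.substring(comma + 1) : trimmed);
            } else if (lower.startsWith("@attribute") && !droppedDeclaration) {
                droppedDeclaration = true;      // skip the first declaration
            } else {
                if (lower.startsWith("@data")) {
                    inData = true;
                }
                out.add(line);
            }
        }
        return out;
    }
}
```

The IDs can be kept in a separate file in the same row order if the predictions need to be matched back to the examples afterwards.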

christian

On Friday, 2 April 2004 21:23, Peter Waltman wrote:
> When I try to use a classifier on an ARFF file with a string attribute,
> I get the following error:
>
> [pwaltman@pwaltman weka]$ java weka.classifiers.trees.j48.J48 -t test.arff
> Cannot handle string attributes!
>
> The format of my file is:
> @relation lgNaive
>
> @attribute string_ID string
> @attribute percentA {1,2,3,4}
> @attribute percentC {1,2,3,4,5,6,7,8,9,10}
> ....  more attributes
> @attribute class {Collagen,Elastin,Silk}
>
> @data
> NP_892013.1,2,7,4,4,6,5,8,4,9,5,13,15,9,5,4,3,3,2,4,3,2,2,4,4,5,5,4,7,1,3,9,10,1,12,1,10,1,1,2,2,3,7,8,1,7,7,2,1,3,1,3,18,23,7,5,6,6,8,2,4,Collagen
> NP_038627.1,1,8,4,4,6,4,9,4,9,6,12,15,6,5,4,3,3,3,7,6,6,4,3,6,6,5,4,9,4,3,2,10,8,17,1,2,1,1,2,3,5,3,3,3,5,12,11,1,3,3,11,18,18,1,16,13,7,22,1,7,Collagen
>
> I have also tried putting the string attribute in double quotes, but
> I still get the same error.  Has something changed in Weka's
> implementation?  The README file says that it's supposed to support
> string attributes as a means of identifying the examples.  I've done a
> search of the mailing list archive and see posts from people who have had
> similar problems, but can't find any posts explaining how to resolve the
> problem.
>
> As for the Weka version, I'm using Weka 3.4.
>
> thanks,
>
> Peter Waltman
Harry Wells | 4 Apr 21:47 2004

a Retail Application with WEKA - Please share your thoughts

Dear All,

I was wondering how data mining might be applied to sales forecasting, for example for shoes.

What I was thinking is to use a number of shoe characteristics (i.e. color, leather type, front shape, retail price, total sales of item) and build a model to predict total sales per item.

What sort of algorithms in WEKA could be used for this analysis? Do you feel there is something I am missing here?

 

Thank you in advance

 

Sionep | 4 Apr 04:11 2004

Re: Retail Application with WEKA - Please share your

Harry Wells wrote:

> Dear All,
> 
> I was wondering how might data mining be applicable to sales forecasting, for example shoes.

I am not sure whether you can do forecasting using WEKA. Perhaps the WEKA
team can clarify this for the list. As I understand the area of
forecasting from my experience in numerical computing, you can
attack this problem with several different methodologies.

- Use system identification techniques (such as ARMA, Autoregressive
   Moving Average)

- Use Kalman filtering techniques

- Use wavelet analysis techniques

- Use hybrid soft-computing techniques such
   as ANFIS (Adaptive Neuro-Fuzzy Inference Systems) and
   CANFIS (Co-Active Neuro-Fuzzy Inference Systems)

- and many other methods
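
To give a flavour of the first item, here is a minimal sketch of the simplest member of the ARMA family, an AR(1) model x[t] = phi * x[t-1] + noise, fitted by least squares and used for a one-step forecast. It assumes a zero-mean series and has no moving-average term. The class and method names are made up, and this is not the Java code mentioned in this message.

```java
/** Hypothetical helper: least-squares AR(1) fit and one-step forecast. */
class Ar1Sketch {

    /** Estimate phi in x[t] = phi * x[t-1] + noise (zero-mean series assumed). */
    static double fitPhi(double[] x) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int t = 1; t < x.length; t++) {
            numerator += x[t] * x[t - 1];
            denominator += x[t - 1] * x[t - 1];
        }
        return numerator / denominator;
    }

    /** One-step-ahead forecast: phi times the last observed value. */
    static double forecastNext(double[] x) {
        return fitPhi(x) * x[x.length - 1];
    }
}
```

A real sales series would at least need its mean (and any trend or seasonality) removed first, and usually more lags than one.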

I have used all of the above in MATLAB. The System ID toolbox in
MATLAB is very sophisticated. ANFIS & CANFIS are available in the Neural
Network & Fuzzy Logic toolbox. I have written some Java classes for
system ID time-series analysis, including forecasting. I included
some wavelet classes to fine-tune the system ID algorithms. I developed
this for a computational finance project I was previously involved in, for
the analysis and forecasting of stock-price movements. Wavelets do
a very good job of decomposing the frequency components of the
time series, which made it easier for the system ID algorithm to
identify trends and shocks and to forecast to a good
degree. I have not done any Java work on Kalman filtering yet, but it is
still on my to-do list. I am currently trying to combine FuzzyJ (Fuzzy
Logic Java Toolkit API, a commercial tool), which can be
downloaded from the internet, with JOONE (Java Object Oriented Neural
Engine), which is an open-source Java project. My aim is to develop a
neuro-fuzzy package in which ANFIS & CANFIS would be available as well.
Neuro-fuzzy computing (soft computing in general) is very popular in
data mining at the moment.

An industrial mathematics week was held at Auckland University, New Zealand,
in January 2004 (from the 26th to the 30th), which I
attended, and there were some presentations on forecasting. One showed how
neuro-fuzzy techniques can predict weather patterns (time series)
for a local power company (Transpower New Zealand). It is vital for this
company to be able to forecast power demand and wind velocity in advance
in order to operate its wind-farm generators optimally.
Another group of mathematicians showed, using the
same data, that Kalman filtering is also a robust method for forecasting.
Another group used ARMA to do forecasting on exactly the same
Transpower data. A local government research institute showed how
to use wavelet techniques to forecast the frequency of earthquakes and
where they are likely to occur. Now enough of that.

JDMAPI (the Java Data Mining API, JSR-73) will most likely implement
forecasting modules in its next version, version 2. This
comment was made to me by the spec lead of JSR-73, Mark Hornick of Oracle.
Mark has also personally invited me to join the expert group for JSR-73,
which is responsible for the development of JDMAPI. If I join this
group (I have not decided yet), my main aim will be to push for:

  - a multivariate statistical analysis sub-package
  - system ID algorithms
  - soft-computing hybrids (neuro-fuzzy, neuro-genetic, fuzzy-Bayesian,
    etc.)
  - numerical computing techniques such as digital filters, wavelets
    and Kalman filtering
  - rough set analysis

I have developed a full Java statistical API (univariate & multivariate)
for my own work, along with some wavelet and system ID modules, and maybe this work
could be modified and used as a basis for the
upcoming version 2 of JSR-73 if I ever join its expert group.

Finally, there may be a way to do forecasting in WEKA, but I will
leave it to the WEKA team to comment on that.

Cheers,
Sione. Palu.
Chanen; Ari | 5 Apr 04:51 2004

Ranked classification for multiclass SVM?

Hi

I have a question about multiclass SVM (Weka's SMO classifier) problems.

Say there are K classes in the problem.  My first question is: how does Weka determine which of the K classes
"wins" (i.e. gets the #1 ranking) for a given test instance?  How does Weka determine this in the
all-pairs case, and is that different from the one-against-the-rest case?

Ideally, what I would really like is to get back from Weka a list of the K classes, ranked by how
likely the software thinks it is that the instance belongs to each class.

On a related note, is there an easy way to get the distance of an instance vector from the separating
hyperplane?  Is such a measure the way that Weka ranks the different classes?

Any help would be most appreciated.  :-)
--

-- 
Thanks!

Ari Chanen 

Ph.D. Scholar
University of Sydney
Sydney Language Technology Research Group
