Hello all,

This may sound extremely "out there", but I am curious whether it is possible to
build MADlib on a Raspberry Pi and bind it into Postgres.

I have run Postgres 9.1, which is installable via apt-get, and am looking for
a way to install 9.2 or 9.3. That may involve building on the Pi, which is
rumored to take all night.
All this is on Raspbian, which is a Debian Wheezy port.

Once I have the requisite Postgres, can MADlib be installed on it using the
usual install from source, or is the ARM platform a non-starter?


Nitin Borwankar

[MADlib-user] summary function integer mfv_frequencies

I seem to be getting incorrect frequency results for integer columns using the summary function.  For
example, if I have a column of unique sequential integers of sufficient cardinality, the frequencies
returned by the summary function become greater than one.
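
As a reference point for what the report above expects, here is a minimal sketch of an exact frequency count done outside the database. It is plain Python, not MADlib code; the helper name `top_frequencies` is made up for illustration. With unique values, every exact frequency is 1.

```python
from collections import Counter

def top_frequencies(values, how_many=10):
    """Return the `how_many` most frequent values with their exact counts.

    Hypothetical helper for illustration only; it is not part of MADlib.
    """
    return Counter(values).most_common(how_many)

# Same shape of data as a column of unique sequential integers 1..10000.
top = top_frequencies(range(1, 10001), how_many=10)

# Every value occurs exactly once, so each reported count is 1.
for value, count in top:
    print(value, count)
```

Any frequency above 1 for such a column therefore cannot come from an exact count.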

Select version();
PostgreSQL 9.2.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit

MADlib 1.6

Here is a script that demonstrates the problem:

--DROP TABLE summary_test;

CREATE TABLE summary_test AS (
SELECT generate_series FROM generate_series(1,10000)
);

-- select * from summary_test;

select madlib.summary( 
    source_table := 'summary_test'::text
  , output_table := 'summary_output'::text
  , target_cols := 'generate_series'::text
  , grouping_cols := NULL::text
  , get_distinct := FALSE
  , get_quartiles := FALSE
  , ntile_array := NULL::FLOAT8[]
  , how_many_mfv := 10

[MADlib-user] nearest neighbours search in madlib

Hi all,

Does MADlib offer any facilities for nearest neighbours lookup on
vector data, approximate or exact?

Googling around, I found this:
But I am not sure if this is current.

Any help would be greatly appreciated!

Cheers, Benjamin.

[MADlib-user] optimal ARIMA order parameters

I would like to know whether MADlib has a function that computes the optimal order for ARIMA (similar to
auto.arima, which checks BIC, AIC, or the smallest MLE using a stepwise selection process).

Please let me know.

[MADlib-user] Using PivotalR with SQL Server

Sorry if I missed this, but I've been looking and haven't seen any mention of how to use PivotalR with SQL Server.
I know it isn't natively supported, but I can't tell whether it's possible and I just haven't figured it out, or whether it
simply isn't possible.

My RODBC equivalent is below.  VF9 is the server name.

>user_id <- "my_user_name"
>password <- "my_password"
>myconn <- odbcConnect("VF9", uid = user_id, pwd = password)
># Execute query
>data_frame <- sqlQuery(myconn, "
>                            select * 
>                            from Finance.dbo.finance_001
>                           ")

Thank you so much in advance.  This looks like literally the perfect package if I can use it with SQL Server. 

[MADlib-user] other flavor of SQL with MADlib

I’m new to MADlib and just started digging into it over the last couple of days. I saw the instructions for running MADlib on top
of PostgreSQL. Is it possible to run MADlib on top of MS SQL Server, MySQL, or Oracle? Could someone share whether they
have done this in the past?

thank you!!

[MADlib-user] store an R data frame in GPDB

I am playing around with the PivotalR package. I am able to read an existing table in GPDB and do some basic
manipulations using R. After that, I need to store the modified data.frame in GPDB as a table. How do I do
that? What is the function/API for storing a data.frame in GPDB as a table?

Re: [MADlib-user] User Digest, Vol 22, Issue 3

Hi Paul,

If you go through the LDA example, you can get the description for each
topic in Step 3 using madlib.lda_get_topic_desc. You can also get the
topic distribution for each document (in fact it gives the topic counts,
which can be converted to a probability distribution through normalization
very easily) as shown in Step 4. To get the most important topics of a
document, you can index sort the topic_count in a descending order and get
the top *k* topics (we have an internal function doing index sorting -

I'd suggest you run the algorithm on a real dataset, such as Reuters-21578,
to get some meaningful results.
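
The normalization and top-k steps described above can be sketched in a few lines. This is plain Python rather than MADlib SQL; the function names, the sample counts, and the 0-based topic indices are assumptions made for illustration, and only the idea (normalize topic_count, then index sort in descending order) comes from the message.

```python
def topic_distribution(topic_count):
    """Convert per-document topic counts into a probability distribution."""
    total = sum(topic_count)
    if total == 0:
        # No words assigned to any topic: return all zeros instead of dividing by zero.
        return [0.0] * len(topic_count)
    return [count / total for count in topic_count]

def top_k_topics(topic_count, k):
    """Index sort in descending order of count; return the first k topic indices."""
    order = sorted(range(len(topic_count)),
                   key=lambda i: topic_count[i],
                   reverse=True)
    return order[:k]

# Hypothetical topic counts for one document with four topics.
counts = [3, 0, 7, 10]

probs = topic_distribution(counts)   # counts divided by their sum (20)
top2 = top_k_topics(counts, 2)       # indices of the two largest counts

print(probs)
print(top2)
```

Ties keep their original index order here because Python's sort is stable; a database-side implementation may break ties differently.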


[MADlib-user] Recommendations for a beginner

Can anyone recommend good tutorial material for getting started with
MADlib?  Not really for installation and setup, but for actually using the
tools and interpreting the results.

I successfully got everything downloaded, set up Postgres, etc. and built
something like the example at  I felt a bit HHGTG
though at the end of it, staring at "42", wondering what the question was...

In this particular case, I was looking to glean topics from text documents,
but I was having a hard time translating the results back into the real
world. I know I have a pretty steep learning curve, but my searching wasn't
giving me much that seemed all that helpful.

[MADlib-user] MADLib version support

I am new to MADlib, Greenplum, etc. and have been asked to install MADlib on Greenplum.  My first question is:
which version of MADlib should I install, and where can I find an installation guide for that release?  We
currently have a multi-node Greenplum 4.1 cluster.

Re: [MADlib-user] User Digest, Vol 21, Issue 3

* To create a new table, you can use "**". It can write a
data.frame / file / db.Rquery into the database, create a table and return
a object. It can also copy an existing table wrapped by
another object. Please read the user doc for more details.

* To append (or insert) a data.frame data to an existing table, you can
specify "append=TRUE" in to append the content of a file
or data.frame to an existing data table.

Note that "append" is not supported by the version on CRAN. You will
need to use the latest version on GitHub. There are detailed
instructions on how to install the latest PivotalR on the GitHub page.

Hope this helps.


