Mag Gam | 1 May 2008 03:32
Picon

Re: columns for count histograms of values

Just curious, how are you planning to display the histogram?
Are you allowed to use application code (C/C++/Perl, etc..) to generate the histogram? Personally, SQL is great for showing the data but not good for making graphs with data you can show.



On Wed, Apr 30, 2008 at 5:01 PM, Alexy Khrabrov <deliverable <at> gmail.com> wrote:
Greetings -- I have a table of the kind

Ratings:
id integer
rating smallint

-- where value can take any value in the range 1 to 5.  Now I want to have a statistical table Stats of the form

id integer
min smallint
max smallint
avg real
r1 integer
r2 integer
r3 integer
r4 integer
r5 integer

-- how can I create it in one pass over Ratings?  I can use min(), max(), avg() for
insert into stats values (id,select min(rating), max(rating), avg(rating), ...) from ratings

-- but what to do for r1,..,r5, short of subselects (select count(rating) from ratings where stats.id=ratings.id) for each, which is an overkill?

Also, if a table Stats already exists with some more columns, and we need to do an update, not insert, for the above, how would that work --

update stats set min=min(ratings), ... from ratings where stats.id=ratings.id -- how do we do the histogram in this case, where the id is fixed explicitly?

Cheers,
Alexy

--
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

chester c young | 1 May 2008 01:12
Picon
Favicon

Re: columns for count histograms of values


--- Alexy Khrabrov <deliverable <at> gmail.com> wrote:

> Greetings -- I have a table of the kind
> 
> Ratings:
> id integer
> rating smallint
> 
> -- where value can take any value in the range 1 to 5.  Now I want to
>  
> have a statistical table Stats of the form
> 
> id integer
> min smallint
> max smallint
> avg real
> r1 integer
> r2 integer
> r3 integer
> r4 integer
> r5 integer
> 
> -- how can I create it in one pass over Ratings? 

select id, min(rating), max(rating), avg(rating),
  sum( case rating = 1 then 1 else 0 end ),
  ...

      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Alexy Khrabrov | 1 May 2008 10:35
Picon
Gravatar

Re: columns for count histograms of values

Chester -- awesome!  Exactly what the doctor ordered.  Just one  
syntactic fix needed on 8.3.1:

case when rating=1 then 1 else 0 end

-- etc.

Cheers,
Alexy

On Apr 30, 2008, at 4:12 PM, chester c young wrote:

>
> --- Alexy Khrabrov <deliverable <at> gmail.com> wrote:
>
>> Greetings -- I have a table of the kind
>>
>> Ratings:
>> id integer
>> rating smallint
>>
>> -- where value can take any value in the range 1 to 5.  Now I want to
>>
>> have a statistical table Stats of the form
>>
>> id integer
>> min smallint
>> max smallint
>> avg real
>> r1 integer
>> r2 integer
>> r3 integer
>> r4 integer
>> r5 integer
>>
>> -- how can I create it in one pass over Ratings? 
>
> select id, min(rating), max(rating), avg(rating),
>  sum( case rating = 1 then 1 else 0 end ),
>  ...
>
>
>
>       
> ____________________________________________________________________________________
> Be a better friend, newshound, and
> know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Alexy Khrabrov | 2 May 2008 23:26
Picon
Gravatar

numbering rows on import from file

Greetings -- I have a huge table of the form  
(integer,integer,smallint,date).  Its origin is an ASCII file which I  
load with \copy.  Now I want to number the rows, adding an id column  
as an autoincrement from a sequence.  How should I do the import now  
for the sequence to work -- should I add the id column last, so it  
will not be filled by copy and presumably autoincrement?

Or, once the table is already in, can I add a column and force it to  
be filled with consecutive numbers, effectively numbering the rows?

Cheers,
Alexy

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

chester c young | 3 May 2008 00:33
Picon
Favicon

Re: numbering rows on import from file


--- Alexy Khrabrov <deliverable <at> gmail.com> wrote:

> Now I want to number the rows, adding an id column 
> as an autoincrement from a sequence.  How should I do the import now 
> for the sequence to work -- should I add the id column last, so it  
> will not be filled by copy and presumably autoincrement?

use a sequence
restart sequence to 1 before copy
have column id default to nextval('seq')

when doing copy don't have a column matching your id (duh)

      ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Scott Marlowe | 3 May 2008 01:00
Picon

Re: numbering rows on import from file

On Fri, May 2, 2008 at 3:26 PM, Alexy Khrabrov <deliverable <at> gmail.com> wrote:
> Greetings -- I have a huge table of the form
> (integer,integer,smallint,date).  Its origin is an ASCII file which I load
> with \copy.  Now I want to number the rows, adding an id column as an
> autoincrement from a sequence.  How should I do the import now for the
> sequence to work -- should I add the id column last, so it will not be
> filled by copy and presumably autoincrement?

Add the sequence to the column.  something like this:

create table mytable (id serial primary key, int1 integer, int2
integer, smallint1 smallint, date1 date);
copy mytable (int1,int2, smallint1, date1) from STDIN;
120,2304,4,'2007-01-01'
204,3204,2,'2007-01-02'
(and so on)
\.

with a lot of other dbs, and a lot of languages, you're taught to
perform discrete steps when operating on your data.  Generally
speaking, PostgreSQL is much better at doing the most NOW, not later.
If you've got derived data you want to put into the table, put all the
data into a loading table, and transfer it to the real table with ONE
insert into select from query.

>  Or, once the table is already in, can I add a column and force it to be
> filled with consecutive numbers, effectively numbering the rows?

Bad idea.  As mentioned before, every update or insert, whether it
succeeds or not will create a dead row in the table.  If you update
(or attempt to update) all rows in a 10,000,000 row table three times,
you now have 30,000,000 dead rows in your table.

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Steve Crawford | 3 May 2008 02:18
Favicon

Re: numbering rows on import from file

Alexy Khrabrov wrote:
> Greetings -- I have a huge table of the form 
> (integer,integer,smallint,date).  Its origin is an ASCII file which I 
> load with \copy.  Now I want to number the rows, adding an id column 
> as an autoincrement from a sequence.  How should I do the import now 
> for the sequence to work -- should I add the id column last, so it 
> will not be filled by copy and presumably autoincrement?
>
> Or, once the table is already in, can I add a column and force it to 
> be filled with consecutive numbers, effectively numbering the rows?
>
Depends on your goal.

Do you just want a unique ID, make the id column a serial. Done. If you 
already have the table, add the column, update the table setting 
id=nextval('the serial columns sequence name') before bringing in 
additional data. If you were using plain \copy, you will now have to 
name the input columns since you don't have data for the id column: 
\copy (int1, int2, smallint3, date4) from ..... This is usually good 
practice anyway as additions to your table or column ordering changes 
won't affect your import.

Do you want the IDs to match row numbers in the source file? Use 
something like nl or whatever scripting language you like to add numbers 
in the source file.

Alternately, if the table is static, you can create a temporary sequence 
to fill the id column on import.

You can also fill the id column after import if necessary by updating 
the table setting id=nextval('yoursequence') but this will generate lots 
of empty space by updating all tuples (not good with a "huge" table) and 
may not associate tuples with source-file line-numbers.

Regardless of how you initially fill the id column, if you add/delete 
records your numbers will not be consecutive. Deletes and rollbacks will 
cause holes.

If you just need consecutive row-numbering on output (not in the table) 
and if the row numbering doesn't need to match the same record each 
time, you can create a temporary sequence and select 
nextval('tempsequence'),.... from yourtable.

Cheers,
Steve

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Alexy Khrabrov | 3 May 2008 10:05
Picon
Gravatar

update with multiple fields as aggregates

I need to fill two columns of a Rats table from an Offset1 table,  
where for each Id row in Rats we need to fill an average offset and  
the sum of all offset from Offset1 with the same Id.  I can create a  
derivative table like this:

create table ofrats as (select customer_id as cid,avg(o),sum(o) from  
offset1 group by cid);

But if I want to insert the two values into the Rats directly, I get  
an error:

netflix=> update rats r1 set of1=s.ao, sumof1=s.so from (select avg(o)  
as ao,sum(o) as so from rats,offset1 o1 where o1.customer_id=r1.id  
group by id) as s;
ERROR:  subquery in FROM cannot refer to other relations of same query  
level

-- is there a way to formulate the subquery for Update properly here  
so it's as efficient as the transfer table above?  Currently I have to  
create that auxiliary table and then transfer values into Rats via Id,  
then drop the table -- is it an idiom too or there's a better way to  
do it?

Cheers,
Alexy

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Volkan YAZICI | 3 May 2008 16:36

Re: update with multiple fields as aggregates

On Sat, 3 May 2008, Alexy Khrabrov <deliverable <at> gmail.com> writes:
> I need to fill two columns of a Rats table from an Offset1 table,
> where for each Id row in Rats we need to fill an average offset and
> the sum of all offset from Offset1 with the same Id.  I can create a
> derivative table like this:
>
> create table ofrats as (select customer_id as cid,avg(o),sum(o) from
> offset1 group by cid);
>
> But if I want to insert the two values into the Rats directly, I get
> an error:
>
> netflix=> update rats r1 set of1=s.ao, sumof1=s.so from (select avg(o)
> as ao,sum(o) as so from rats,offset1 o1 where o1.customer_id=r1.id
> group by id) as s;
> ERROR:  subquery in FROM cannot refer to other relations of same query
> level

I didn't try but, here is my 2 cents:

  UPDATE rats
     SET of1 = tmp.of1,
         sumof1 = tmp.sumof1
    FROM (SELECT id, AVG(o) AS of1, SUM(o) AS sumof1
            FROM rats,
                 offset1
        GROUP BY id)
      AS tmp
   WHERE tmp.id = rats.id;

Regards.

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

seiliki | 4 May 2008 18:28
Picon
Favicon

LEFT OUTER JOIN question

Hi!

I expect the SELECT to return two rows. Would some kind 
soul explain for me why it gives only one row?

TIA

CN
=============
CREATE TABLE x(c1 text,c2 int2);
INSERT INTO x VALUES('a',10);
INSERT INTO x VALUES('b',NULL);

CREATE TABLE y(c1 int2,c2 int2,c3 text);
INSERT INTO y VALUES(10,9,'yyy');

CREATE TABLE z(c1 text,c2 text);
INSERT INTO z VALUES('a','zzz');
INSERT INTO z VALUES('b','zzzz');

SELECT x.c1,y.c3,z.c2
FROM x JOIN z USING (c1)
  LEFT OUTER JOIN y ON (x.c2=y.c1)
WHERE y.c2=9;

 c1 | c3  | c2  
----+-----+-----
 a  | yyy | zzz

=========================
Comment: The following version of SELECT does return two 
rows as expected, however:

SELECT x.c1,y.c3,z.c2
FROM x JOIN z USING (c1)
  LEFT OUTER JOIN y ON (x.c2=y.c1);

 c1 | c3  |  c2  
----+-----+------
 a  | yyy | zzz
 b  |     | zzzz

--

-- 
Sent via pgsql-sql mailing list (pgsql-sql <at> postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Gmane