Bob Pawley | 1 Feb 2006 01:02
Picon
Favicon

Data Conversion

 

 

I’m looking for ideas on the direction I should take.

 

I’m constructing an engineering database. I want to include the capability of converting engineering data to different systems. For instance – Fahrenheit to Celsius, lbs/hr to gph to gpm and to liters per minute.

 

My thinking is bouncing between

            1 – creating a single table of data in the format of the users’ choice, then converting the data en masse as the user requests. Sort of like conversion-on-demand.

 

            2 – creating tables for each format (for temperature that would be one Fahrenheit table and one Celsius table) and do the conversion as the data is entered.

 

Any thoughts on how best to accomplish this task with Postgresql would be appreciated. (I’m on Windows XP)

 

Bob Pawley

Michael Glaesemann | 1 Feb 2006 01:20
Picon

Re: Data Conversion


On Feb 1, 2006, at 9:02 , Bob Pawley wrote:

>             1 – creating a single table of data in the format of  
> the users’ choice, then converting the data en masse as the user  
> requests. Sort of like conversion-on-demand.

I've been thinking about a similar application recently, and leaning  
towards your first option. A rough sketch of the schema that I've  
been considering looks like this:

create table measurement_types
(
	measurement_type text primary key
);

copy measurement_types (measurement_type) from stdin;
length
mass
temperature
\.

create table measurement_units
(
	measurement_unit text primary key
	, measurement_type text not null
		references measurement_types (measurement_type)
);

copy measurement_units (measurement_type, measurement_unit) from stdin;
length	in
length	m
length	miles
temperature	F
temperature	C
mass	kg
\.

create table measurement_conversions
(
	measurement_type text not null
	, measurement_unit_from	text not null
	, measurement_unit_to	text not null
	, conversion_factor	numeric not null
	, unique (measurement_unit_from, measurement_unit_to)
	, foreign key (measurement_type, measurement_unit_from)
		references measurement_units (measurement_type, measurement_unit)
	, foreign key (measurement_type, measurement_unit_to)
		references measurement_units (measurement_type, measurement_unit)
);

-- Haven't thought through yet how to handle conversions in the other  
direction. I'd rather not include, for example, F => C and C => F.  
Also, do I need to include F => F?

create table data_records
(
	measurement_id serial primary key
	, measurement_unit text not null
		references measurement_units (measurement_unit)
	, measurement_value numeric not null

);

>             2 – creating tables for each format (for temperature  
> that would be one Fahrenheit table and one Celsius table) and do  
> the conversion as the data is entered.
This smacks of duplication of data, which relational databases are  
meant to avoid.

Anyway, hope this helps.

Michael Glaesemann
grzm myrealbox com

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Bob Pawley | 1 Feb 2006 01:53
Picon
Favicon

Re: Data Conversion

Two way conversion will be a neccesity. My thought was that dual conversion 
could be not only complex but also have problems with stability.

Option 2 would be less complex and there would be less potential stability 
problems. However, there is some perception of redundancy in having two or 
more tables contain similar information. But, is it only a perception???

Perhaps there is another way???

Bob Pawley

----- Original Message ----- 
From: "Michael Glaesemann" <grzm <at> myrealbox.com>
To: "Bob Pawley" <rjpawley <at> shaw.ca>
Cc: "Postgresql" <pgsql-general <at> postgresql.org>
Sent: Tuesday, January 31, 2006 4:20 PM
Subject: Re: [GENERAL] Data Conversion

On Feb 1, 2006, at 9:02 , Bob Pawley wrote:

>             1 – creating a single table of data in the format of  the 
> users’ choice, then converting the data en masse as the user  requests. 
> Sort of like conversion-on-demand.

I've been thinking about a similar application recently, and leaning
towards your first option. A rough sketch of the schema that I've
been considering looks like this:

create table measurement_types
(
measurement_type text primary key
);

copy measurement_types (measurement_type) from stdin;
length
mass
temperature
\.

create table measurement_units
(
measurement_unit text primary key
, measurement_type text not null
references measurement_types (measurement_type)
);

copy measurement_units (measurement_type, measurement_unit) from stdin;
length in
length m
length miles
temperature F
temperature C
mass kg
\.

create table measurement_conversions
(
measurement_type text not null
, measurement_unit_from text not null
, measurement_unit_to text not null
, conversion_factor numeric not null
, unique (measurement_unit_from, measurement_unit_to)
, foreign key (measurement_type, measurement_unit_from)
references measurement_units (measurement_type, measurement_unit)
, foreign key (measurement_type, measurement_unit_to)
references measurement_units (measurement_type, measurement_unit)
);

-- Haven't thought through yet how to handle conversions in the other
direction. I'd rather not include, for example, F => C and C => F.
Also, do I need to include F => F?

create table data_records
(
measurement_id serial primary key
, measurement_unit text not null
references measurement_units (measurement_unit)
, measurement_value numeric not null

);

>             2 – creating tables for each format (for temperature  that 
> would be one Fahrenheit table and one Celsius table) and do  the 
> conversion as the data is entered.
This smacks of duplication of data, which relational databases are
meant to avoid.

Anyway, hope this helps.

Michael Glaesemann
grzm myrealbox com

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend 

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo <at> postgresql.org so that your
       message can get through to the mailing list cleanly

Dann Corbit | 1 Feb 2006 02:05

Re: Data Conversion

I would create functions.  Store the data in one format, and convert
with functions as needed on the fly.

E.g.:
SELECT Fahrenheit_to_Celcius(Fahrenheit_temp) FROM temperatures;

If that seems kludgy for the users, then create views for them that
perform the needed conversions using the functions.

Whatever the most frequently used units are, store on those units.  Then
the conversions are not needed as often.  E.g.:

CREATE VIEW Celcius_Temperatures AS
SELECT Fahrenheit_to_Celcius(Fahrenheit_temp) AS Celcius_Temp FROM
temperatures;

________________________________________
From: pgsql-general-owner <at> postgresql.org
[mailto:pgsql-general-owner <at> postgresql.org] On Behalf Of Bob Pawley
Sent: Tuesday, January 31, 2006 4:02 PM
To: Postgresql
Subject: [GENERAL] Data Conversion

I'm looking for ideas on the direction I should take.

I'm constructing an engineering database. I want to include the
capability of converting engineering data to different systems. For
instance - Fahrenheit to Celsius, lbs/hr to gph to gpm and to liters per
minute.

My thinking is bouncing between 
	1 - creating a single table of data in the format of the users'
choice, then converting the data en masse as the user requests. Sort of
like conversion-on-demand. 

	2 - creating tables for each format (for temperature that would
be one Fahrenheit table and one Celsius table) and do the conversion as
the data is entered.

Any thoughts on how best to accomplish this task with Postgresql would
be appreciated. (I'm on Windows XP)

Bob Pawley 

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Michael Glaesemann | 1 Feb 2006 02:18
Picon

Re: Data Conversion


On Feb 1, 2006, at 9:53 , Bob Pawley wrote:

> Two way conversion will be a neccesity. My thought was that dual  
> conversion could be not only complex but also have problems with  
> stability.

I'm not sure why it would be a stability issue. As for the  
complexity, I think once it's implemented you wouldn't have to worry  
about it by properly encapsulating that complexity, perhaps in  
procedures. I guess one way to handle the dual conversion issue is to  
produce a view (based on my previous example)

create view measurement_conversions_view as
	select measurement_type
		, measurement_unit_in
		, measurement_unit_out
		, factor
	from measurement_conversions
	union
	select measurement_type
		, measurement_unit_out as measurement_unit_in
		, measurement_unit_in as measurement_unit_out
		, 1::numeric / factor as factor
	from measurement_conversions
	union
	select measurement_type
		, measurement_unit as measurement_unit_in
		, measurement_unit as measurement_unit_out
		, 1 as factor
	from measurement_units

It'd also be good to add a constraint (through a trigger) that  
guarantees that if, for example, the length conversion m => in is the  
measurement_conversions table, the conversion in => m can't be  
inserted. This would prevent duplicates in the  
measurement_conversions_view (and corresponding possible errors  
arising from slightly different conversion results).

> Option 2 would be less complex and there would be less potential  
> stability problems. However, there is some perception of redundancy  
> in having two or more tables contain similar information. But, is  
> it only a perception???

It's not just a perception. You're duplicating the values. You need  
to always make sure that you're inserting into, updating, and  
deleting from all of the relevant tables. I think that would be a  
maintenance nightmare.

Michael Glaesemann
grzm myrealbox com

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo <at> postgresql.org so that your
       message can get through to the mailing list cleanly

Bob Pawley | 1 Feb 2006 02:32
Picon
Favicon

Re: Data Conversion

I'm a little concerned about stability since my Postgresql application has 
failed three times in the last couple of months. It seems to have failed 
when too many things are happening at the same time - mostly things that 
have been instigated by my pointing and clicking.

Bob

----- Original Message ----- 
From: "Michael Glaesemann" <grzm <at> myrealbox.com>
To: "Bob Pawley" <rjpawley <at> shaw.ca>
Cc: "Postgresql" <pgsql-general <at> postgresql.org>
Sent: Tuesday, January 31, 2006 5:18 PM
Subject: Re: [GENERAL] Data Conversion

>
> On Feb 1, 2006, at 9:53 , Bob Pawley wrote:
>
>> Two way conversion will be a neccesity. My thought was that dual 
>> conversion could be not only complex but also have problems with 
>> stability.
>
> I'm not sure why it would be a stability issue. As for the  complexity, I 
> think once it's implemented you wouldn't have to worry  about it by 
> properly encapsulating that complexity, perhaps in  procedures. I guess 
> one way to handle the dual conversion issue is to  produce a view (based 
> on my previous example)
>
> create view measurement_conversions_view as
> select measurement_type
> , measurement_unit_in
> , measurement_unit_out
> , factor
> from measurement_conversions
> union
> select measurement_type
> , measurement_unit_out as measurement_unit_in
> , measurement_unit_in as measurement_unit_out
> , 1::numeric / factor as factor
> from measurement_conversions
> union
> select measurement_type
> , measurement_unit as measurement_unit_in
> , measurement_unit as measurement_unit_out
> , 1 as factor
> from measurement_units
>
> It'd also be good to add a constraint (through a trigger) that  guarantees 
> that if, for example, the length conversion m => in is the 
> measurement_conversions table, the conversion in => m can't be  inserted. 
> This would prevent duplicates in the  measurement_conversions_view (and 
> corresponding possible errors  arising from slightly different conversion 
> results).
>
>> Option 2 would be less complex and there would be less potential 
>> stability problems. However, there is some perception of redundancy  in 
>> having two or more tables contain similar information. But, is  it only a 
>> perception???
>
> It's not just a perception. You're duplicating the values. You need  to 
> always make sure that you're inserting into, updating, and  deleting from 
> all of the relevant tables. I think that would be a  maintenance 
> nightmare.
>
> Michael Glaesemann
> grzm myrealbox com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo <at> postgresql.org so that your
>       message can get through to the mailing list cleanly 

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo <at> postgresql.org so that your
       message can get through to the mailing list cleanly

Michael Glaesemann | 1 Feb 2006 02:41
Picon

Re: Data Conversion


On Feb 1, 2006, at 10:32 , Bob Pawley wrote:

> I'm a little concerned about stability since my Postgresql  
> application has failed three times in the last couple of months. It  
> seems to have failed when too many things are happening at the same  
> time - mostly things that have been instigated by my pointing and  
> clicking.

It sounds like you definitely have something to track down in either  
your application or your PostgreSQL installation. What do you mean by  
failed? Is it the application or the PostgreSQL backend? While  
performance can be affected by your DDL, it shouldn't influence  
anything that could be described as a "failure". Also, what pointing  
and clicking are you doing? Is this in your own app or a PostgreSQL  
administration app such as phpPgAdmin or pgAdmin?

What version of PostgreSQL are you running? What platform? What  
hardware? This information may help others on the list help you nail  
down your "stability" issues.

Michael Glaesemann
grzm myrealbox com

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Bob Pawley | 1 Feb 2006 03:11
Picon
Favicon

Re: Data Conversion

I'm running version 8.1 on XP.

When I point and click on pgadmin tables too quickly, sometimes, the program 
freezes and I get the Windows message about reporting the failure.

If I stay cool and 'deterministic' (in other words - slow) there doesn't 
seem to be a problem.

However, this may not be a Postgresql problem as other applications have 
'failed' at various times as well (Empire Earth being one).

I would look at Mac or others if I didn't need to develop in Windows.

Bob
----- Original Message ----- 
From: "Michael Glaesemann" <grzm <at> myrealbox.com>
To: "Bob Pawley" <rjpawley <at> shaw.ca>
Cc: "Postgresql" <pgsql-general <at> postgresql.org>
Sent: Tuesday, January 31, 2006 5:41 PM
Subject: Re: [GENERAL] Data Conversion

>
> On Feb 1, 2006, at 10:32 , Bob Pawley wrote:
>
>> I'm a little concerned about stability since my Postgresql  application 
>> has failed three times in the last couple of months. It  seems to have 
>> failed when too many things are happening at the same  time - mostly 
>> things that have been instigated by my pointing and  clicking.
>
> It sounds like you definitely have something to track down in either  your 
> application or your PostgreSQL installation. What do you mean by  failed? 
> Is it the application or the PostgreSQL backend? While  performance can be 
> affected by your DDL, it shouldn't influence  anything that could be 
> described as a "failure". Also, what pointing  and clicking are you doing? 
> Is this in your own app or a PostgreSQL  administration app such as 
> phpPgAdmin or pgAdmin?
>
> What version of PostgreSQL are you running? What platform? What  hardware? 
> This information may help others on the list help you nail  down your 
> "stability" issues.
>
> Michael Glaesemann
> grzm myrealbox com
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster 

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Michael Fuhr | 1 Feb 2006 03:44
Favicon

Re: libpq questions

On Wed, Feb 01, 2006 at 09:47:12AM +1100, James Harper wrote:
> > Be careful what you infer from such a scan: not finding any NULLs
> > doesn't necessarily mean a column isn't nullable, it just means the
> > result set didn't contain any NULLs.
> 
> I understand that limitation, but haven't figured out if it matters in
> my situation. The only time it might is if the client wants to infer a
> schema as a result of a query, eg 'SELECT * FROM TableX WHERE 0 = 1'.

Even if such a query did return a "nullable" flag, plenty of other
metadata would be absent that might be just as interesting from a
schema-viewing standpoint (CHECK, PRIMARY KEY, etc.).  A better way
to view the schema is to query the system catalogs or the Information
Schema.

> In the above example, does the database engine assign internally a
> 'nullability' flag? I guess it must do... because how would the
> following be evaluated:
> 
> SELECT f1 + f2 AS f INTO TableY FROM TableX WHERE f1 < 30
> 
> Would the column f in the created table be nullable or not?
> 
> I guess I need to do some testing unless you know off the top of your
> head?

I'm not familiar enough with PostgreSQL internals to comment on
what's happening underneath, but I could tell you from experience
what the above query would do.  But with a quick test you could
figure it out for yourself :-)

> Hmmm... so a select statement with result set of a million rows is going
> to stall for a while before the results are usefully available to the
> client, and is then going to use a significant amount of memory on the
> client...

Correct.

> Is this a limitation of libpq or of the underlying database engine?

The "Incremental results from libpq" thread from a few months ago
might answer your questions:

http://archives.postgresql.org/pgsql-interfaces/2005-11/msg00010.php

> Are there any alternative (but native - eg not ODBC) interfaces to
> postgresql?

What problem do you want the alternative to solve?  If it's just
the libpq-fetches-all-rows problem then you could use a cursor.

--

-- 
Michael Fuhr

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

James Harper | 1 Feb 2006 03:51
Picon

Re: libpq questions

> 
> Even if such a query did return a "nullable" flag, plenty of other
> metadata would be absent that might be just as interesting from a
> schema-viewing standpoint (CHECK, PRIMARY KEY, etc.).  A better way
> to view the schema is to query the system catalogs or the Information
> Schema.

I now know enough about it to agree with you :)

> > In the above example, does the database engine assign internally a
> > 'nullability' flag? I guess it must do... because how would the
> > following be evaluated:
> >
> > SELECT f1 + f2 AS f INTO TableY FROM TableX WHERE f1 < 30
> >
> > Would the column f in the created table be nullable or not?
> >
> > I guess I need to do some testing unless you know off the top of
your
> > head?
> 
> I'm not familiar enough with PostgreSQL internals to comment on
> what's happening underneath, but I could tell you from experience
> what the above query would do.  But with a quick test you could
> figure it out for yourself :-)

Did that once I got access to my postgres server. The f in the created
table is nullable in the above example, and also even if f1 and f2 are
not nullable themselves, so it looks like that as soon as you start to
make a field based on an expression, the 'not null' constraint goes out
the window.

> The "Incremental results from libpq" thread from a few months ago
> might answer your questions:
> 
> http://archives.postgresql.org/pgsql-interfaces/2005-11/msg00010.php

Found that eventually. I'd seen the thread earlier but assumed that the
word 'incremental' in the subject was to do with auto-incrementing
fields. An interesting read.

Thanks

James

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Gmane