Craig Bruce | 1 Apr 2010 01:36
Favicon

Numerical Precision in Proj, WKT, and everywhere

I was reading the following web page and its discussion of numerical
precision written by Frank Warmerdam:

http://home.gdal.org/projects/opengis/wktproblems.html

I agree with what he said, but printf's "%.16g" does have a problem which
manifests itself frequently, such as in the "epsg" CRS definitions in
proj-4.6.0, e.g.:

# Anguilla 1957 / British West Indies Grid
<2000> +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80
+units=m +no_defs  <>

The "+k" parameter value is an eyesore.  I think a better approach is to
generate the "%.16g" number as indicated, but then scan its characters for
the ugly pattern of '01' or '99' in the 15th and 16th significant digits.
If this pattern is found, then the number should be generated again using
"%.15g".

If you want, you could atof() the re-generated string back to a double
and check to see if it is exactly equal to the given value.  If it is,
then the 15-digit value is definitely the correct value.  If not, then
it means that the .0000 and .0001 representations are distinct and the
"ugly" .0001 representation can be used instead, though as noted below,
this presumes that the source value was correct to begin with.  This step
is optional.

This approach gives you nearly full double precision all of the time
without the eyesores.

(Continue reading)

Gerald I. Evenden | 1 Apr 2010 02:26
Picon

Re: Numerical Precision in Proj, WKT, and everywhere

On Wednesday 31 March 2010 7:36:34 pm Craig Bruce wrote:
> I was reading the following web page and its discussion of numerical
> precision written by Frank Warmerdam:
>
> http://home.gdal.org/projects/opengis/wktproblems.html
>
> I agree with what he said, but printf's "%.16g" does have a problem which
> manifests itself frequently, such as in the "epsg" CRS definitions in
> proj-4.6.0, e.g.:
>
> # Anguilla 1957 / British West Indies Grid
> <2000> +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000
> +y_0=0 +ellps=clrk80 +units=m +no_defs  <>

Prey tell, why is someone using %,16g to print out a numbet that is rarely, if 
ever, specified to more 4 to 6 digits? %g would be completely adequate and 
avoid the absurd gyration suggested in the original email.

Of course, the other question is why is the number, apparently from a data 
base, store in binary in the first place?

Sorry for the sarcasm but the nature of this non-problem overwhelms me.

--

-- 
The whole religious complexion of the modern world is due
to the absence from Jerusalem of a lunatic asylum.
-- Havelock Ellis (1859-1939) British psychologist
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
(Continue reading)

strebe | 1 Apr 2010 23:08
Picon
Favicon

Re: Optimal Albers Standard parallels


Apologies for the long delay in responding. Determining the optimal Lambert azimuthal equal-area is equivalent to solving the "minimum covering circle problem" described here:

http://en.wikipedia.org/wiki/Smallest_circle_problem

Apparently it can be solved in linear time by an algorithm due to Meggido. This is unexpectedly (to me) efficient. However, I have not examined the algorithm itself yet; technically the optimal LAEA problem is not QUITE the equivalent to the "minimum covering circle problem" unless the latter requires in its solution not only the radius of the circle, but also its location. The radius is irrelevant to the optimal LAEA problem; it is the center point that is required.

Regards,
— daan Strebe


-----Original Message-----
From: Jan Hartmann <j.l.h.hartmann <at> uva.nl>
To: PROJ.4 and general Projections Discussions <proj <at> lists.maptools.org>
Cc: strebe <strebe <at> aol.com>
Sent: Mon, Feb 22, 2010 3:04 am
Subject: Re: [Proj] Optimal Albers Standard parallels

For me, I certainly would be interested in the algorithm.

Jan

On 21-2-2010 20:41, strebe wrote:

Oscar:


"A simple but useful way of appraising the location of the origin of an azimuthal projection is to plot a series of concentric circles of radii z representing the isograms of maximum Angular Deformation and those of Scale Error at the scale of a convenient atlas map and to shift this overlay about on the map until a good fit is obtained between some of the extreme points of the area of mapped."

While phrased in an unnecessarily complicated way, the recommendation in "Map Projections for Europe" is the equivalent of my recommendation. You have a bunch of points you know are within the area of interest. Whether manually (as described in that publication) or programmatically, you need to find the smallest small circle that circumscribes them all. I don't know of any non-iterative method for doing that, though possibly one could be devised. As an algorithmic process, it is not difficult, but unlike Albers, it is not just a matter of observing the most northerly and most southerly points.

I note that you talk about comparing Albers to LAEA. Once you know the optimal LAEA you can do that, since the distortion values at the extremes of the region of interest can be computed by known formulæ once the parameters for the projection are known.

Can I interpret your inquiry to mean that you want to know (presumably in complete detail) the algorithm for finding the projection center of the optimal LAEA for a region, given a list of points in that region? Or are you just wondering if one could be devised?

Regards,
— daan Strebe


On Feb 21, 2010, at 8:01:56 AM, OvV_HN <ovv <at> hetnet.nl> wrote:
In the article I was referring to, the angular deformation and the scale 
error of the Albers and the LAEA projections are compared for the whole of 
the European Union. The Albers has two standard parallels at 38 and 61 d N, 
with a CM at 9 E. The LAEA has its center at 9 E and 53 N.
I got the impression that the author just shifted the LAEA center up and 
down by trial and error until a reasonable comparison could be made with the 
Albers.
I wondered, could the center of the LAEA projection have been determined 
algorithmically?
Still an academic discussion, of course......


Oscar van Vlijmen




_______________________________________________

Proj mailing list

Proj <at> lists.maptools.org

http://lists.maptools.org/mailman/listinfo/proj
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj
Craig Bruce | 2 Apr 2010 00:32
Favicon

Re: Numerical Precision in Proj, WKT, and everywhere

"Gerald I. Evenden" <geraldi.evenden <at> gmail.com> wrote:

> Prey tell, why is someone using %,16g to print out a numbet that is
> rarely, if ever, specified to more 4 to 6 digits? %g would be completely
> adequate and avoid the absurd gyration suggested in the original email.

Sometimes those 4 to 6 digits aren't round in decimal, so %g truncates the
value quite a lot.  When these values are used for subsquent calculations,
the error they contain grows.  Suppose that tiles are requested from
an OGC WMS using one-minute longitude intervals.  If generated using
%g, an example interval would be from 135.067 to 135.083 degrees.
If the application computes the span of this interval (max - min), the
result has only two significant digits (really, log10(16) = 1.2 digits).
Subsequent calculations have an enormous error.  If all sixteen digits
had been used in the WMS BBOX= parameter, the span result still has 11
significant digits.

When doing things like generating GML output, it's most practical
to truncate the numbers to 7 or so digits to reduce the data volume,
depending on the practical accuracy of the data and any transformations
that were applied.  When generating Shapefiles, you get full doubles
whether you want them or not.

In an projlib CRS definition like (proj-4.6.0, "epsg" file):

# NAD27 / Cuba Sur
<2086> +proj=lcc +lat_1=20.71666666666667 +lat_0=20.71666666666667 +lon_0=-76.83333333333333
+k_0=0.99994848 +x_0=500000 +y_0=229126.939 +ellps=clrk66 +datum=NAD27 +units=m +no_defs  <>

what is the accumulated error in the full transformation if lat_1 is
specified to only 4 to 6 digits in the config file instead of 16?  In a
situation like this, the specified value is infinitely precise (defined by
law), so it would be a shame not to represent it as accurately as feasible.

> Of course, the other question is why is the number, apparently from a
> data base, store in binary in the first place?

It isn't.  It's stored in BCD, base 100.  However, it is converted to a
double for processing inside of application programs.  I was saying that
Oracle bizarrely has an off-by-one error in its conversion, but this bug
is suppressed if the error control is always applied.

OTOH, the "epsg" file contains some bizarre off-by-one errors, such as in:

# NAD83 / Mississippi East (ftUS)
<2254> +proj=tmerc +lat_0=29.5 +lon_0=-88.83333333333333 +k=0.9999500000000001
+x_0=300000.0000000001 +y_0=0 +ellps=GRS80 +datum=NAD83 +to_meter=0.3048006096012192 +no_defs  <>

The x_0 value here presumably should be the integer 300000 which has
an infinitely precise double representation.  I can't imagine how
the .0000000001 got added to it.  Perhaps MS Access has some strange
BCD->double conversion faults.

> Sorry for the sarcasm but the nature of this non-problem overwhelms me.

The issue is extremely general, not just applying to projection parameters.
Most values pass through systems unchanged, except for the arbitrary
truncations that people apply blindly.  Values that are suspected to
have full source precision (e.g., those which haven't been derived from
complex computations) should remain unmolested, and the 16-digit+error
control approach allows this while suppressing off-by-one eyesores in
converting to text.

--------------------------+----------------------+--------------------------
Dr. Craig S. Bruce        | Ph 819-771-8303 x205 |             CubeWerx Inc.
Senior Software Developer |   Fax 819-771-8388   |  Gatineau, Québec, Canada
csbruce <at> cubewerx.com      |  http://csbruce.com/ |  http://www.cubewerx.com/
--------------------------+----------------------+--------------------------
           "XML is like violence; if it doesn't solve the problem,
                            just use more of it."
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj
Gerald I. Evenden | 2 Apr 2010 01:38
Picon

Re: Numerical Precision in Proj, WKT, and everywhere

On Thursday 01 April 2010 6:32:15 pm Craig Bruce wrote:
> "Gerald I. Evenden" <geraldi.evenden <at> gmail.com> wrote:
> > Prey tell, why is someone using %,16g to print out a numbet that is
> > rarely, if ever, specified to more 4 to 6 digits? %g would be completely
> > adequate and avoid the absurd gyration suggested in the original email.
>
> Sometimes those 4 to 6 digits aren't round in decimal, so %g truncates the
> value quite a lot. 

Au contraire. When you have a number with few significant figures that have 
trailing zeroes such as "0.999600....000" any format that has sufficient 
width to contain the significant figures will preserve the significance. 
There is no rounding nor trucation problem.

The only hassle occurs when the the fractions are irrational such as pi, then 
there is a problem. But in these situations (and probably others) the 
continual bouncing back and forth between decimal and binary should be 
avoided. I strongly suspect that there is no good reason for it anyway.

As and aside, all the binary to decimal conversion procedures that I have run 
into always round. The only hassles I have seen is when the precision is 
close to the limits of the floating point binary mantissa.

--

-- 
The whole religious complexion of the modern world is due
to the absence from Jerusalem of a lunatic asylum.
-- Havelock Ellis (1859-1939) British psychologist
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj

Craig Bruce | 2 Apr 2010 02:42
Favicon

Re: Numerical Precision in Proj, WKT, and everywhere

"Gerald I. Evenden" <geraldi.evenden <at> gmail.com> wrote:

> Au contraire. When you have a number with few significant figures that have 
> trailing zeroes such as "0.999600....000" any format that has sufficient 
> width to contain the significant figures will preserve the significance. 
> There is no rounding nor trucation problem.

Obviously, but I'm not talking about specific values that happen to
be round.

> The only hassle occurs when the the fractions are irrational such as pi,
> then there is a problem.

There's also a problem with rational fractions, like 1/60, as I showed
in the WMS example.  You can easily write x = 1/60.0 in a program get a
fully precise value for whatever representation you are using, but you
can't put this expression on the proj command line (AFAIK).

> But in these situations (and probably others) the continual bouncing
> back and forth between decimal and binary should be avoided. I strongly
> suspect that there is no good reason for it anyway.

I agree that it's best to avoid unnecessary conversions, but there are
many interfaces that require text, such as the proj command-line or OGC
web interfaces.  The same issue occurs with the database/application
interface with BCD.  We are forced to obey the interfaces of the various
systems we wish to chain together.

--------------------------+----------------------+--------------------------
Dr. Craig S. Bruce        | Ph 819-771-8303 x205 |             CubeWerx Inc.
Senior Software Developer |   Fax 819-771-8388   |  Gatineau, Québec, Canada
csbruce <at> cubewerx.com      |  http://csbruce.com/ |  http://www.cubewerx.com/
--------------------------+----------------------+--------------------------
  "Committee, n.: A group of people that, when given the task of deciding
  whether to start array indices from either 0 or 1, compromises to declare
                that they shall start from 0.5." -- unknown
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj
Gerald I. Evenden | 2 Apr 2010 16:41
Picon

Re: Numerical Precision in Proj, WKT, and everywhere

On Thursday 01 April 2010 8:42:25 pm Craig Bruce wrote:
> "Gerald I. Evenden" <geraldi.evenden <at> gmail.com> wrote:
> > Au contraire. When you have a number with few significant figures that
> > have trailing zeroes such as "0.999600....000" any format that has
> > sufficient width to contain the significant figures will preserve the
> > significance. There is no rounding nor trucation problem.
>
> Obviously, but I'm not talking about specific values that happen to
> be round.

But this was the example you used in your first example.

> > The only hassle occurs when the the fractions are irrational such as pi,
> > then there is a problem.
>
> There's also a problem with rational fractions, like 1/60, as I showed
> in the WMS example.  You can easily write x = 1/60.0 in a program get a

I'm discussing decimal fractions--- xx.02935729350

> fully precise value for whatever representation you are using, but you
> can't put this expression on the proj command line (AFAIK).

No need to.

> > But in these situations (and probably others) the continual bouncing
> > back and forth between decimal and binary should be avoided. I strongly
> > suspect that there is no good reason for it anyway.
>
> I agree that it's best to avoid unnecessary conversions, but there are
> many interfaces that require text, such as the proj command-line or OGC
> web interfaces.  The same issue occurs with the database/application
> interface with BCD.  We are forced to obey the interfaces of the various
> systems we wish to chain together.

BCD? Binary Coded Decimal? Just that I have not heard the term in maybe 25 
years. Seriously. I am not being sarcastic.

In most cases there is an acceptible precision for representing a number. For 
example pi may only be needed as 3.1415927. It is assumed that zeroes are 
appended indefinitely. As long the the intermediate formating retains a 7 
digit significance, the number will be preserved regardless of binary/decimal 
conversions.

As I recall, the only place you demonstrated precision problems was an example 
that went to full mantissa precision where I admit there may be problems. 
*But* I know of no place in cartography where such precision is required.

Lastly, if one is involved with "round tripping" geographic/Cartesian 
coordinates there will be a problem but this problem more likely to be a 
hassle with the inverse projection nor exactly producing original value in 
addition. Formatting may also create a problem and this is why all my program 
proj versions had a binary pipe so that a round trip could be made without 
going through the formatter. But this round tripping should be avoided at all 
costs.

--

-- 
The whole religious complexion of the modern world is due
to the absence from Jerusalem of a lunatic asylum.
-- Havelock Ellis (1859-1939) British psychologist
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj

Craig Bruce | 2 Apr 2010 21:52
Favicon

Re: Numerical Precision in Proj, WKT, and everywhere

Gerald I. Evenden <geraldi.evenden <at> gmail.com> wrote:

> BCD? Binary Coded Decimal? Just that I have not heard the term in maybe
> 25 years. Seriously. I am not being sarcastic.

All the common database systems use BCD to represent the very common
NUMBER data type.  This allows them to represent terminating decimal
fractions exactly and SQL allows numbers to have up to 38 decimal digits.
See the section "Packed decimal numbers" near the bottom of:

http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.sql.ref.doc/doc/r0001030.html

The down side is that calculations are slower (not preformed directly
by hardware), but this slowness is hidden by the large overhead costs of
database access.

> As I recall, the only place you demonstrated precision problems was
> an example that went to full mantissa precision where I admit there
> may be problems.  *But* I know of no place in cartography where such
> precision is required.

I'm speaking much more generally than cartography.  I'm saying that all
doubles should be displayed in the way I suggest in general by default
unless you know the values have errors in them or the application
domain suggests using fewer digits.  (Most of the code I write is very
general purpose.)  Otherwise, you're introducing arbitrary errors into
the numbers, causing a greater divergence between a binary->binary and
a binary->text->binary round trip if such a round trip can't be avoided.
You need to use 17 digits to get a completely clean round trip, but this
exposes the eyesore patterns in the IEEE encoding.

--------------------------+----------------------+--------------------------
Dr. Craig S. Bruce        | Ph 819-771-8303 x205 |             CubeWerx Inc.
Senior Software Developer |   Fax 819-771-8388   |  Gatineau, Québec, Canada
csbruce <at> cubewerx.com      |  http://csbruce.com/ |  http://www.cubewerx.com/
--------------------------+----------------------+--------------------------
    "The purpose of The Law is to protect criminals from being lynched."
                [What, you though it was to protect *you*?] 
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj
Gerald I. Evenden | 2 Apr 2010 22:44
Picon

Re: Numerical Precision in Proj, WKT, and everywhere

On Friday 02 April 2010 3:52:24 pm Craig Bruce wrote:
> Gerald I. Evenden <geraldi.evenden <at> gmail.com> wrote:
> > BCD? Binary Coded Decimal? Just that I have not heard the term in maybe
> > 25 years. Seriously. I am not being sarcastic.
>
> All the common database systems use BCD to represent the very common
> NUMBER data type.  This allows them to represent terminating decimal
> fractions exactly and SQL allows numbers to have up to 38 decimal digits.
> See the section "Packed decimal numbers" near the bottom of:
>
> http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.i
>bm.db2.luw.sql.ref.doc/doc/r0001030.html
>
> The down side is that calculations are slower (not preformed directly
> by hardware), but this slowness is hidden by the large overhead costs of
> database access.

I have long ago given up worrying about speed when dealing with DBMS issues. 
Gives headaches. Just a good excuse to buy a faster machine.

> > As I recall, the only place you demonstrated precision problems was
> > an example that went to full mantissa precision where I admit there
> > may be problems.  *But* I know of no place in cartography where such
> > precision is required.
>
> I'm speaking much more generally than cartography.  I'm saying that all
> doubles should be displayed in the way I suggest in general by default
> unless you know the values have errors in them or the application
> domain suggests using fewer digits.  (Most of the code I write is very
> general purpose.)  Otherwise, you're introducing arbitrary errors into
> the numbers, causing a greater divergence between a binary->binary and
> a binary->text->binary round trip if such a round trip can't be avoided.
> You need to use 17 digits to get a completely clean round trip, but this
> exposes the eyesore patterns in the IEEE encoding.

I philosophically disagree with you here. Know the characteristics of the data 
you are storing and the expected precision and use appropriate field widths.

If you are worrying about speed, the general case is likely to be very 
wasteful.

It is hypothetically possible that you may encounter extended precision data 
and need to specify 20, 30 or more digit precision. You're stuck if you 
assume a 64 bit float limit of 17 decimal.

My old addage: know your data.

--

-- 
The whole religious complexion of the modern world is due
to the absence from Jerusalem of a lunatic asylum.
-- Havelock Ellis (1859-1939) British psychologist
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj

Glynn Clements | 3 Apr 2010 00:08

Re: Numerical Precision in Proj, WKT, and everywhere


Gerald I. Evenden wrote:

> The only hassle occurs when the the fractions are irrational such as pi, then 
> there is a problem. But in these situations (and probably others) the 
> continual bouncing back and forth between decimal and binary should be 
> avoided.

Such is the reality of processing data using a multitude of tools
connected by textual file formats. Most such file formats use decimal
representation, while most software uses binary (e.g. IEEE-754
floating point) internally.

Coupled with the fact that so few of the programmers who use floating
point have a good understanding of the issues, it's ultimately
necessary to retain some "headroom" in order to minimise the impact of
the inevitable flaws.

--

-- 
Glynn Clements <glynn <at> gclements.plus.com>
_______________________________________________
Proj mailing list
Proj <at> lists.maptools.org
http://lists.maptools.org/mailman/listinfo/proj


Gmane