Opened 13 years ago

Closed 13 years ago

#17736 closed Bug (fixed)

django.contrib.gis.geos.polygon from_bbox loss of floating-point accuracy

Reported by: tdihp@… Owned by: David Eklund
Component: GIS Version: 1.3
Severity: Normal Keywords:
Cc: daek@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

from_bbox use a string formatting to generate a geom instance, this will loose accuracy, when it matters

for example:

>>> import math
>>> from django.contrib.gis.geos import Polygon
>>> l,t,r,b = 0.0,0.0,math.pi,math.pi
>>> Polygon.from_bbox((l,t,r,b)).coords
(((0.0, 0.0), (0.0, 3.14159265359), (3.14159265359, 3.14159265359), (3.14159265359, 0.0), (0.0, 0.0)),)
>>> Polygon(((l,t),(r,t),(r,b),(l,b),(l,t))).coords
(((0.0, 0.0), (3.141592653589793, 0.0), (3.141592653589793,

Attachments (1)

17736_gis_precision.diff (1.6 KB ) - added by David Eklund 13 years ago.
Uses Polygon if the input is numerical. Test added.

Download all attachments as: .zip

Change History (10)

comment:1 by Claude Paroz, 13 years ago

Component: UncategorizedGIS

In any case, whenever you are using floating-point values, rounding will occur at some point. Currently, from_bbox uses the Python built-in str which defaults to 12 significant digits. Of course, you can always do the rounding yourself. For example, to keep 25 significant numbers:

>>> l,t,r,b = 0.0,0.0,format(math.pi, '.25f'), format(math.pi, '.25f')
>>> Polygon.from_bbox((l,t,r,b)).coords
(((0.0, 0.0),
  (0.0, 3.1415926535897931),
  (3.1415926535897931, 3.1415926535897931),
  (3.1415926535897931, 0.0),
  (0.0, 0.0)),)

So even with 25 significant numbers, it seems GEOS rounds to 16. Should from_bbox also default to 16 significant digits?

comment:2 by anonymous, 13 years ago

I think

return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))

would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.

comment:3 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedAccepted

in reply to:  2 comment:4 by David Eklund, 13 years ago

I think this makes sense, I have attached a patch making this change.

Replying to anonymous:

I think

return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))

would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.

comment:5 by David Eklund, 13 years ago

Cc: daek@… added
Has patch: set

As noted above the str() method of floats rounds the input to 12 digits. I guess python floats are double precision (depends on the machine one is using?), which means roughly 16 digits of accuracy. In contrast to str(), the repr() method of floats preserves double precision: it truncates the input to 17 digits. So for floats it might make more sense to use repr() than to use str().

Perhaps one wants to keep the option of passing strings to from_bbox() and also create the polygon using GEOSGeometry directly. In that case, one might consider using repr() if the coordinate is a float and str() otherwise:

    def from_bbox(cls, bbox):
        "Constructs a Polygon from a bounding box (4-tuple)."
        box_string = []
        for z in bbox:
            if isinstance(z, float):
                box_string.append(repr(z))
            else: 
                box_string.append(str(z))
        x0, y0, x1, y1 = box_string
        return GEOSGeometry( 'POLYGON((%s %s, %s %s, %s %s, %s %s, %s %s))' %  (
                x0, y0, x0, y1, x1, y1, x1, y0, x0, y0) )
Version 0, edited 13 years ago by David Eklund (next)

comment:6 by Claude Paroz, 13 years ago

Needs tests: set

I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.

Tests wanted!

in reply to:  6 comment:7 by David Eklund, 13 years ago

Owner: changed from nobody to David Eklund
Status: newassigned

Sounds good. I will work on an implementation along these lines. I'm not sure, but I think Polygon() ends up creating a GEOSGeometry object using a string created from the numerical input, so that going via Polygon() is essentially just a detour (albeit perhaps a convenient one for numerical input). I will look closer at this issue.

Replying to claudep:

I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.

Tests wanted!

by David Eklund, 13 years ago

Attachment: 17736_gis_precision.diff added

Uses Polygon if the input is numerical. Test added.

comment:8 by anonymous, 13 years ago

The attached patch checks first whether the bounding box consists of objects of numeric type (real numbers). If that is the case for all elements of the bounding box, then the polygon is created with Polygon. Otherwise, that is if some element of the bounding box is not a real number, then the bounding box is converted to a string and the polygon is created using GEOSGeometry.

A test has been added. The test runs from_bbox on a particular input and checks the accuracy of one of the coordinates of the output to 14 digits.

comment:9 by Claude Paroz <claude@…>, 13 years ago

Resolution: fixed
Status: assignedclosed

In [17824e2b74ca02b9914e853c818fa512a0f9ef34]:

Fixed #17736 -- Kept maximal floating-point accuracy in from_bbox

When constructing a polygon with Polygon.from_bbox, do not convert
parameters to strings at this stage (str defaults to 12 significant
digits).
Thanks tdihp@… for the report and David Eklung for the patch.

Note: See TracTickets for help on using tickets.
Back to Top