Opened 12 years ago
Last modified 2 years ago
#18887 new Cleanup/optimization
LineString array method (property) returns different data type without and with NumPy installed
Reported by: | mal | Owned by: | |
---|---|---|---|
Component: | GIS | Version: | 1.4 |
Severity: | Normal | Keywords: | GIS, NumPy, LineString |
Cc: | deprince@…, Ubercore | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Apparently LineString (and probably other geometry types) return data in different types depending on whether NumPy is installed.
Simple test case (reproducible in Django 1.3, 1.3.1 and 1.4.1).
Clean virtual env with only Django installed (and ipython).
First - no NumPy installed.
In [1]: from django.contrib.gis.geos import LineString In [2]: line = LineString((0, 0), (3, 3)) In [3]: line.array Out[3]: [(0.0, 0.0), (3.0, 3.0)]
Now - install NumPy and try again.
In [1]: from django.contrib.gis.geos import LineString In [2]: line = LineString((0, 0), (3, 3)) In [3]: line.array Out[3]: array([[ 0., 0.], [ 3., 3.]])
[(0.0, 0.0), (3.0, 3.0)] =! array( 0., 0.],[ 3., 3.)
This is rather serious issue.
Change History (17)
follow-up: 2 comment:1 by , 12 years ago
comment:2 by , 12 years ago
Replying to claudep:
Replying to mal:
This is rather serious issue.
Maybe you could add a real use case demonstrating why you consider this as a serious issue?
Because it returns two different values depending on if a module is installed or not? Any conditionals on the result will be dependant on if numpy is installed or not.
comment:3 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Accepting on the base that this should be at least documented.
I see the potential issues, but apart documentation, do you see any possible improvements?
comment:4 by , 12 years ago
Cc: | added |
---|
Would it be an option to make numpy a dependency for the installation of gis?
comment:5 by , 12 years ago
Scratch that ... you don't want numpy to be required to install django. It seems if the user wants a list they would call list(LineString(...)) instead of LineString(...).array. Perhaps we should issue a warning when the user calls .array without numpy installed?
>>> line = LineString((0,0), (3,3)) >>> line.array array([[ 0., 0.], [ 3., 3.]]) >>> list(line) [(0.0, 0.0), (3.0, 3.0)]
comment:6 by , 12 years ago
Owner: | changed from | to
---|
comment:7 by , 12 years ago
I've confirmed this, going to look into it a little more closely to see if we can make the behavior predictable somehow, without losing the ability to take advantage of numpy if installed.
comment:8 by , 12 years ago
Cc: | added |
---|---|
Triage Stage: | Accepted → Design decision needed |
Yeah, the only reasonable options I see are the ones outlined: warning when array is called, and/or noting this in the docs. The behavior here seems to be intended, not a bug. Marking as design decision needed.
comment:9 by , 12 years ago
Here's a proposed fix if a warning in the absence of numpy is desired.
https://github.com/adamdeprince/django/commit/d02431ac5f15616df4cb3d39592651e59fdde17f
comment:10 by , 12 years ago
Has patch: | set |
---|---|
Status: | new → assigned |
Triage Stage: | Design decision needed → Accepted |
I've submitted a pull request for the following commit, which adds a setting: https://github.com/peterlandry/django/commit/0c9839355a005e8805467aa09d4fcf9795965b76
comment:11 by , 12 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:12 by , 12 years ago
No, this doesn't capture what is really happening.
There is a bug with this: [(0.0, 0.0), (3.0, 3.0)] =! array( 0., 0.],[ 3., 3.?)
This snippet doesn't mean
{{
not ([(0.0, 0.0), (3.0, 3.0)] =! array( 0., 0.],[ 3., 3.?))
}}
it means these types are completely incomparable.
If you compare these two you get completely different behavior than if you compared two lists.
numpy.array([(0.0, 0.0), (3.0, 3.0)]) == [(0.0, 0.0), (3.0, 3.0)]
array([[ True, True],
[ True, True]], dtype=bool
There will be two populations that use this API. One will have numpy installed on their dev machines and one won't. When they release extensions and share code those they share it with may not have the same numpy status.
It seems pretty clear that if you want a list you should call list(LineString(...)) and if you actually want a numpy array you could call .array and expect it. If the system will silently break that promise you should have an exception, but we don't want to break existing code, so I'm proposing we raise a warning instead. This diff doesn't do this. My proposed code does.
We can't have two APIs that depend on the module you have installed; I really feel strongly that .array without numpy installed should warn.
comment:14 by , 12 years ago
Triage Stage: | Ready for checkin → Accepted |
---|
Please do not mark your own patch as RFC.
comment:15 by , 12 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Type: | Bug → Cleanup/optimization |
I'm inclined to agree with claudep here that this is primarily a documentation issue - an admonition probably.
The Numpy related LineString methods aren't documented at all.
A warning would become annoying if you've accepted the non-numpy behavior in your code, and don't plan on changing it. This is as valid as the Numpy approach as there has been no documentation. So why should that user now suffer eternal warnings?
The comment on the property suggests that .array was to return a numpy array, but all the other properties are commented as returning a "list or numpy array"
If the numpy difference is documented clearly - anyone wanting to right portable/sharable code can check from django.contrib.gis.geos.base import numpy
and make sure they standardize the data.
I'm closing the current pull requests as the wrong direction - not RFC.
comment:16 by , 9 years ago
Owner: | changed from | to
---|
I don't see any benefit in returning numpy array from Django, because Django gets the list at first and then just creates numpy array with that list, users can do it easily by themselves. So I propose to deprecate array
property in favor of list(LineString)
.
comment:17 by , 2 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
Replying to mal:
Maybe you could add a real use case demonstrating why you consider this as a serious issue?