-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: optionally retrieve CRS from SRID in from_wkt/from_wkb #2956
base: main
Are you sure you want to change the base?
Conversation
Convert elements to strings if needed
73e5748
to
b9f3b31
Compare
@martinfleis We have a dilemma here: either we have to devise a way to read the SRID from the raw data (hex or string), or convert the WKBElements into strings before passing them off to lib.from_wkb. Should an attempt be made to convert the elements to strings or bytes? I solved it now by converting elements to string when needed. It's not very clean, though. |
Now I've used the shapely get_srid method as suggested in #2952. EWKT reading is not working, with the following:
It seems GEOS is not able to read this format. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some comments on the implementation side.
This will also need a changelog note.
try: | ||
len(data) | ||
except TypeError: | ||
raise TypeError( | ||
"Input must be iterable (e.g. Series, list, array)" | ||
f"found {type(data)} instead." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use pandas.api.types.is_list_like
instead.
first_srid = srid_getter(data[0]) | ||
if all(first_srid == srid_getter(obj) for obj in data): | ||
if first_srid > 0: | ||
shared_crs = first_srid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you can do shapely.get_srid(data)
to get an array of srids without a for loop on Python side, no? And then check they're all the same.
"Geometry elements do not agree on CRS. This may cause problems. " | ||
"CRS could not be inferred from data.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Geometry elements do not agree on CRS. This may cause problems. " | |
"CRS could not be inferred from data.\n", | |
"Geometries contain different SRIDs encoding different coordinate reference systems. " | |
"CRS could not be inferred from data.\n", |
Something along these lines may be better?
warnings.warn( | ||
f"The elements passed share a CRS, but it does not match the " | ||
"CRS passed as parameter.\n\n" | ||
f"CRS passed as parameter: {param_crs}.\n" | ||
f"CRS from data: {shared_crs}.\n" | ||
"CRS could not be inferred from data.\n", | ||
UserWarning, | ||
stacklevel=4, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a custom CRS is passed, we should not try to retrieve it from SRID to avoid calling get_srid
when we don't need to. So that shall be added in the first place.
Then this check is not needed as we would go this path only if param_crs is None.
@@ -174,13 +230,18 @@ def from_wkb(data, crs=None): | |||
---------- | |||
data : array-like | |||
list or array of WKB objects | |||
The 'srid' can be read from geoalchemy2's WKBElement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'srid' can be read from geoalchemy2's WKBElement. |
@@ -174,13 +230,18 @@ def from_wkb(data, crs=None): | |||
---------- | |||
data : array-like | |||
list or array of WKB objects | |||
The 'srid' can be read from geoalchemy2's WKBElement. | |||
crs : value, optional | |||
Coordinate Reference System of the geometry objects. Can be anything accepted by | |||
:meth:`pyproj.CRS.from_user_input() <pyproj.crs.CRS.from_user_input>`, | |||
such as an authority string (eg "EPSG:4326") or a WKT string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
such as an authority string (eg "EPSG:4326") or a WKT string. | |
such as an authority string (eg "EPSG:4326") or a WKT string. If None, geopandas will try to infer CRS from the SRID of geometries (if available). |
I would add this here, instead of that sentence above (note it will need better formatting to avoid linting fail).
converted_data = vectorized.from_wkb(data) | ||
shared_crs = _get_common_crs_from_geometries(converted_data, crs) | ||
if shared_crs is not None: | ||
crs = shared_crs | ||
return GeometryArray(converted_data, crs=crs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converted_data = vectorized.from_wkb(data) | |
shared_crs = _get_common_crs_from_geometries(converted_data, crs) | |
if shared_crs is not None: | |
crs = shared_crs | |
return GeometryArray(converted_data, crs=crs) | |
geoms = vectorized.from_wkb(data) | |
if not crs: | |
crs = _get_common_crs_from_geometries(geoms, crs) | |
return GeometryArray(geoms, crs=crs) |
If _get_common_crs_from_geometries
returns None when nothing useful is found, we can simplify it like this.
@@ -200,12 +261,17 @@ def from_wkt(data, crs=None): | |||
---------- | |||
data : array-like | |||
list or array of WKT objects | |||
The 'srid' can be read from geoalchemy2's WKTElement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above
converted_data = vectorized.from_wkt(data) | ||
shared_crs = _get_common_crs_from_geometries(converted_data, crs) | ||
if shared_crs is not None: | ||
crs = shared_crs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above. Note that you missed adapting return
here.
Could you also check what is the performance hit on this one? I.e. how long does it take to get srids compared to the geometry decoding. We may want to add an option to completely avoid that path, even for |
t1_wkb_elem = WKBElement(self.t1.wkb, srid=srid).as_ewkb() | ||
sq_wkb_elem = WKBElement(self.sq.wkb, srid=srid).as_ewkb() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not depend on geoalchemy2 in basic tests and depend on it only optionally when we want to test it explicitly? We don't have geoalchemy2 in most of our CI envs and I'd like to keep that like that. You can use shapely (set_srid) directly instead.
This PR aims to enable reading geoalchemy WKTs and WKBs with SRID information in them. See documentation here: https://geoalchemy-2.readthedocs.io/en/latest/elements.html