Generalization of compression for spatial targets with GDAL #37
Comments
Thanks for splitting this out. I wanted to open one after closing #4 but didn't get to it yet. I have been tinkering with some implementations for this issue and will have a draft PR in the not-too-distant future.
Is there a reason not to just zip the outputs of all GDAL drivers, even ones that produce a single file? Are there downsides to using /vsizip/? E.g. is it unavailable in some instances?
Having the extra zip layer is a bit weird for formats that are both single-file and include internal compression. There's also the zip layer to read through, so it's less efficient. Note that GDAL added SOZip capability, which cloud-i-fied storing file/s within a zip and made it very fast (not all zips will be as efficient). I don't think you'd want logic that pivots on whether a GeoTIFF is uncompressed; even that has an explosion of option combinations. I think these kinds of choices are out of scope for this project (but I'm very keen to discuss).

Support for it depends on the GDAL version and build, so on CRAN you are currently at the behest of the Windows maintainer's efforts, mostly guided by Roger Bivand in the past, and similarly for Mac, and then the binary installers that align to Linux builds. That's probably a good level to track in order to specify 1) version/s and 2) capabilities, to set some boundaries.

There are a lot of other subtleties too, because files like GeoTIFF and GeoPackage can have sidecar files (for GeoTIFF, for example, that's how GDAL supports Raster Attribute Tables, RAT, for categorical rasters), and there are controls over whether sidecar files are searched for at URLs and in directories ... so, apologies, all I can think of are details and complications.

I think generally it's not a good idea to add a zip or any other layer unless you really need to; it's better to move to, and advise, modern formats (GeoTIFF, (Geo)Parquet, FlatGeobuf, Zarr). But if you need to, the zip container can be a good solution (bundle up one or many shapefiles, or MapInfo files, or CSVs, or many other options). Note there is also virtual file system support for gzip, tar, Azure, AWS, Google storage, and on and on, so I tend to suggest staying as close as possible to what GDAL can do without adding layers (but that's not a straightforward topic without putting some pretty tight boundaries on the scope).
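For readers unfamiliar with the zip-container case mentioned above, here is a minimal sketch of bundling a multi-file dataset (e.g. a shapefile and its sidecars) into one archive and forming the /vsizip/ path GDAL would read it from. It is illustrative only and not this project's implementation: it uses just the Python standard library, does not call GDAL, and the helper names `zip_dataset` and `vsizip_path` are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_dataset(files, archive):
    """Bundle the files of one multi-file dataset into a single zip.

    Hypothetical helper: each file is stored flat (no directories)
    under its own file name.
    """
    archive = Path(archive)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            f = Path(f)
            zf.write(f, arcname=f.name)
    return archive


def vsizip_path(archive, member):
    """Build the GDAL virtual path for `member` inside `archive`.

    GDAL's documented form is /vsizip/<path to zip>/<file in zip>.
    This only builds the string; it does not check that GDAL can open it.
    """
    return f"/vsizip/{Path(archive).as_posix()}/{member}"
```

For instance, `vsizip_path("out/roads.zip", "roads.shp")` gives `/vsizip/out/roads.zip/roads.shp`. A plain deflate zip like this one is not SOZip-optimized, so the efficiency caveat above still applies.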
Oof, yeah, I can foresee us having to do a lot of thinking around this and it might be best for
Just pulling this from #4 as I'm not sure we captured this as an issue?
Generalization of the "multiple file target compression" GDAL /vsizip/ approach to all backends and formats that support it
From @brownag