implementing parquet filetype? #36

Open
njtierney opened this issue Mar 16, 2024 · 2 comments
Comments

@njtierney
Owner

As mentioned in #4, e.g.

tar_sf_vector(filetype="parquet")

@brownag
Contributor

brownag commented Mar 18, 2024

So far, the following works for terra SpatVector objects via the GDAL (Geo)Parquet driver:

library(targets)

tar_script({
    list(
        geotargets::tar_terra_vect(test_terra_parquet,
                                   terra::vect(system.file("ex", "lux.shp", package = "terra")),
                                   filetype = "Parquet")
    )
})

tar_make()
#> Loading required namespace: terra
#> ▶ dispatched target test_terra_parquet
#> ● completed target test_terra_parquet [0.012 seconds]
#> ▶ ended pipeline [0.095 seconds]
x <- tar_read(test_terra_parquet)
x
#>  class       : SpatVector 
#>  geometry    : polygons 
#>  dimensions  : 12, 6  (geometries, attributes)
#>  extent      : 5.74414, 6.528252, 49.44781, 50.18162  (xmin, xmax, ymin, ymax)
#>  source      : test_terra_parquet
#>  coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#>  names       :  ID_1   NAME_1  ID_2   NAME_2  AREA   POP
#>  type        : <num>    <chr> <num>    <chr> <num> <int>
#>  values      :     1 Diekirch     1 Clervaux   312 18081
#>                    1 Diekirch     2 Diekirch   218 32543
#>                    1 Diekirch     3  Redange   259 18664

terra::describe(tar_path_target(test_terra_parquet))
#> [1] "Driver: Parquet/(Geo)Parquet"              
#> [2] "Files: _targets/objects/test_terra_parquet"
#> [3] "Size is 512, 512"                          
#> [4] "Corner Coordinates:"                       
#> [5] "Upper Left  (    0.0,    0.0)"             
#> [6] "Lower Left  (    0.0,  512.0)"             
#> [7] "Upper Right (  512.0,    0.0)"             
#> [8] "Lower Right (  512.0,  512.0)"             
#> [9] "Center      (  256.0,  256.0)"

Still need to implement analogous methods for {sf} objects via #13.
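
For the {sf} side, a minimal sketch of the same GDAL-driver round trip might look like the following. This is not the geotargets implementation, and it assumes the GDAL build linked to {sf} includes the optional Parquet driver, so the example checks for the driver before writing.

```r
# Sketch only: round-trip an sf object through the GDAL (Geo)Parquet driver.
# Assumption: {sf} is installed; the Parquet driver is optional in GDAL builds.
has_sf <- requireNamespace("sf", quietly = TRUE)
has_parquet <- has_sf &&
  "Parquet" %in% sf::st_drivers(what = "vector")$name

if (has_parquet) {
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

  # Write with an explicit driver, as the target file may have no extension
  path <- tempfile(fileext = ".parquet")
  sf::st_write(nc, path, driver = "Parquet", quiet = TRUE)

  # Read back and confirm that no features were lost
  nc2 <- sf::st_read(path, quiet = TRUE)
  stopifnot(nrow(nc2) == nrow(nc))
} else {
  message("sf or the GDAL Parquet driver is not available in this installation")
}
```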

Also, we may want to implement a variant that uses the write methods from {arrow} (re: #2), since this may be more efficient for larger targets. Would be interesting to benchmark GDAL vs. Arrow.
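
A rough sketch of how such a comparison could be timed is below. The helper `time_write()` is hypothetical, and using `sfarrow::st_write_parquet()` as the Arrow-based writer is an assumption, not something decided in #2. The `lux.shp` example is far too small to say anything about larger targets, so a realistic benchmark would need bigger data.

```r
# Hypothetical helper: time repeated writes of `x` to fresh temporary files.
time_write <- function(write_fun, x, reps = 5) {
  vapply(seq_len(reps), function(i) {
    path <- tempfile(fileext = ".parquet")
    on.exit(unlink(path), add = TRUE)
    system.time(write_fun(x, path))[["elapsed"]]
  }, numeric(1))
}

if (requireNamespace("terra", quietly = TRUE)) {
  v <- terra::vect(system.file("ex", "lux.shp", package = "terra"))

  # GDAL route: only possible if this GDAL build has the Parquet driver
  if ("Parquet" %in% terra::gdal(drivers = TRUE)$name) {
    gdal_times <- time_write(
      function(x, path) terra::writeVector(x, path, filetype = "Parquet"),
      v
    )
    print(summary(gdal_times))
  }

  # Arrow route (assumption): convert to sf, then write with {sfarrow}
  if (requireNamespace("sf", quietly = TRUE) &&
      requireNamespace("sfarrow", quietly = TRUE)) {
    s <- sf::st_as_sf(v)
    arrow_times <- time_write(
      function(x, path) sfarrow::st_write_parquet(x, path),
      s
    )
    print(summary(arrow_times))
  }
}
```

Elapsed time covers only the write step. File size (`file.size()`) and read time would need to be measured separately to cover the other tradeoffs.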

@Aariq
Collaborator

Aariq commented Mar 18, 2024

> Would be interesting to benchmark GDAL vs. Arrow

I think benchmarking is definitely part of the plan once things are somewhat stable. Would be good to give users an idea of the tradeoffs in speed, size, and dependency requirements.
