Skip to content

Functionality to clip, subset, plot and write EU-MOHP data

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
Notifications You must be signed in to change notification settings

MxNl/eumohpclipr

Repository files navigation

eumohpclipr

Lifecycle: experimental R-CMD-check Project Status: Active - The project has reached a stable, usable state and is being actively developed. License: MIT + file LICENSE codecov

The goal of eumohpclipr is to provide users of the EU-MOHP data set with the functionality to

  1. eumohp_clip(): Clip the raster .tif files to their custom area of interest and define a required subset of the data.
  2. eumohp_plot(): Plot the clipped and subsetted data relatively fast through using stars proxy objects.
  3. eumohp_write(): To write the clipped and subsetted data to disc as .tif files. This helps to reduce file sizes to the required spatial extent.

The EU-MOHP data set is meant as temporally static and spatially contiguous environmental predictors for the application of predominantly machine learning models for hydrologic and hydrogeological modelling / mapping tasks. It can be used along with other environmental predictors, such as land use and land cover data, soil maps, geological maps, digital elevation models, etc.

Installation

You can install the development version of eumohpclipr from GitHub with:

# install.packages("devtools")
devtools::install_github("MxNl/eumohpclipr")

Example

Prerequisites

In order to use this package with data, it is a necessary to download the EU-MOHP data set from the data hosting platform hydroshare in the latest or required version. After the dowload the zipped .7z files must be unzipped and stored in the same directory.

Load the package

library(here)
#> here() starts at D:/Data/github/eumohpclipr
library(eumohpclipr)

Clipping and Subsetting

Get the directory, where the EU-MOHP Geotiffs (.tif) files are stored.

eumohp_directory <- here::here(
  "..",
  "macro_mohp_feature_test",
  "macro_mohp_feature",
  "output_data"
)

This directory contains all the unzipped downloaded files as described previously on my local computer. This directory needs to be changed according to the directory on your local machine.

Specifying the spatial extent of the clipped result via the argument: countries

eumohp_clipped_countries <- eumohp_clip(
  directory_input = eumohp_directory,
  countries = c("germany", "denmark"),
  buffer = 1E4,
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

The resulting eumohp_clipped object holds a list of clipped and subsetted stars proxy objects. This list can later be fed into the functions eumohp_plot or eumohp_write.

We can have a look at the length of the list eumohp_clipped_countries.

eumohp_clipped_countries |> length()
#> [1] 8

In this case, eumohp_clipped_countries contains 8 stars proxy objects because we requested 4 hydrologic orders (hydrologic_order = 1:4) and 2 measures (abbreviation_measure = c("dsd", "lp")). 4 * 2 = 8.

But there are also other options to specify the area of interest. Specifying the spatial extent of the clipped result via the argument: custom_sf_polygon

eumohp_clipped_customsfpolygon <- eumohp_clip(
  directory_input = eumohp_directory,
  custom_sf_polygon = .test_custom_sf_polygon() |> summarise(),
  buffer = 1E4,
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

Specifying the spatial extent of the clipped result via the argument: region_name_spatcov

eumohp_clipped_regionnamespatcov <- eumohp_clip(
  directory_input = eumohp_directory,
  region_name_spatcov = c("france", "turkey", "italy2"),
  hydrologic_order = 1:4,
  abbreviation_measure = c("dsd", "lp"),
  eumohp_version = "v013.1.1"
)

Here, the argument buffer can not be applied as we are already using the maximum coverage of the EU-MOHP raster files through using the files directly for setting the spatial extent.

Plotting

You can plot the clipped and subsetted data with eumohp_plot().

eumohp_clipped_countries |> 
  eumohp_plot(downsample = 50)
#> Warning: Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).
#> Removed 157332 rows containing missing values (geom_raster).

You don’t have to provide the downsample argument, as it has a default value. But if your area of interest is quite large, a higher value for this argument reduces the time to plot.

Analogous with the second example

eumohp_clipped_customsfpolygon |> 
  eumohp_plot(downsample = 1)
#> Warning: Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).
#> Removed 381835 rows containing missing values (geom_raster).

Analogous with the third example

eumohp_clipped_regionnamespatcov |> 
  eumohp_plot(downsample = 10)
#> Warning: Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).
#> Removed 20654991 rows containing missing values (geom_raster).

Writing Results to Disk

Regarding run time and memory, writing the data is the crucial part. This can be very expensive. This is why it is recommended to run this in parallel mode on a computer with sufficient memory and can be shut on for a few hours or days.

Write the data in sequential mode (not recommended)

eumohp_clipped_countries |>
  eumohp_write(directory_output = here("..", "output_test"))

Write the data in parallel mode (not recommended)

future::plan(future::multisession, 
             workers = ceiling(length(eumohp_clipped_countries) / 3))

eumohp_clipped_countries |>
  eumohp_write(directory_output = here("..", "output_test"),
               parallel = TRUE)

Citation

citation("eumohpclipr")
#> 
#> To cite package 'eumohpclipr' in publications use:
#> 
#>   Maximilian Nölscher (2022). eumohpclipr: Clipping the EU-MOHP data
#>   set to a selected country. R package version 0.0.0.9000.
#> 
#> Ein BibTeX-Eintrag für LaTeX-Benutzer ist
#> 
#>   @Manual{,
#>     title = {eumohpclipr: Clipping the EU-MOHP data set to a selected country},
#>     author = {Maximilian Nölscher},
#>     year = {2022},
#>     note = {R package version 0.0.0.9000},
#>   }

About

Functionality to clip, subset, plot and write EU-MOHP data

Topics

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages