missing data for 2 municipalities #18

Open
83221n4ndr34 opened this issue Jan 19, 2024 · 0 comments


Compared to the ISTAT dataset (https://www.istat.it/it/archivio/222527), the data is missing for 2 municipalities.
According to Wikipedia, one municipality changed its code on 15/05/2023 (https://it.wikipedia.org/wiki/Moransengo-Tonengo), while the other was established on 01/01/2023 by merging 3 municipalities (https://it.wikipedia.org/wiki/Bardello_con_Malgesso_e_Bregano).

Here is some quick R code to verify this. (The gap is also easy to spot graphically, because maps built from the municipalities file have holes.)

#### install the required packages if they are not already present, then load them

packages <- c("dplyr", "sf")
for (pkg in packages) {
  # compare against installed package names only, not every cell of the matrix
  if (!pkg %in% rownames(installed.packages())) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}


### download of the 2 data sets

## download from github

# https://github.com/openpolis/geojson-italy
# GitHub repository url
url_regions <- "https://github.com/openpolis/geojson-italy/raw/master/geojson/limits_IT_regions.geojson"
url_provinces <- "https://github.com/openpolis/geojson-italy/raw/master/geojson/limits_IT_provinces.geojson"
url_municipalities <- "https://github.com/openpolis/geojson-italy/raw/master/geojson/limits_IT_municipalities.geojson"
# reading GeoJSON files
regions_github <- st_read(url_regions)
provinces_github <- st_read(url_provinces)
municipalities_github <- st_read(url_municipalities)

## download from istat website

# https://www.istat.it/it/archivio/222527 -> description: https://www.istat.it/it/files//2018/10/Descrizione-dei-dati-geografici-2020-03-19.pdf
# download file "Administrative boundaries 2023 (zip)" from the link
# setting the correct working directory
setwd("...")
# target CRS: WGS 84 (EPSG:4326)
crs_wgs84 <- st_crs(4326)
# import of the extracted files based on the administrative category,
# transforming the data from CRS WGS 84 / UTM zone 32N to CRS WGS 84 (EPSG:4326)
municipalities <- st_transform(st_read("Limiti01012023/Com01012023/Com01012023_WGS84.shp"), crs_wgs84)
provinces <- st_transform(st_read("Limiti01012023/ProvCM01012023/ProvCM01012023_WGS84.shp"), crs_wgs84)
regions <- st_transform(st_read("Limiti01012023/Reg01012023/Reg01012023_WGS84.shp"), crs_wgs84)


### verification that 2 municipalities are missing

## conversion of spatial data into a normal dataframe
municipalities_df <- as.data.frame(municipalities)
municipalities_github_df <- as.data.frame(municipalities_github)
provinces_df <- as.data.frame(provinces)
provinces_github_df <- as.data.frame(provinces_github)
regions_df <- as.data.frame(regions)
regions_github_df <- as.data.frame(regions_github)

## verification using istat codes

# change of column names to make the comparison easier
municipalities_github_df <- rename(municipalities_github_df, PRO_COM_T = com_istat_code)
provinces_github_df <- rename(provinces_github_df, COD_PROV = prov_istat_code_num)
regions_github_df <- rename(regions_github_df, COD_REG  = reg_istat_code_num)

# anti_join to select the elements of the left df that are not present in the right df
differences_municipalities <- anti_join(municipalities_df, municipalities_github_df, by = "PRO_COM_T")
differences_provinces <- anti_join(provinces_df, provinces_github_df, by = "COD_PROV")
differences_regions <- anti_join(regions_df, regions_github_df, by = "COD_REG")

# print the number of differences found
print(paste("Number of missing municipalities:", nrow(differences_municipalities))) # -> : 2
print(paste("Number of missing provinces:", nrow(differences_provinces))) # -> : 0
print(paste("Number of missing regions:", nrow(differences_regions))) # -> : 0

# print the names of the missing municipalities
print(differences_municipalities$COMUNE)
# ->
# [1] "Bardello con Malgesso e Bregano"
# [2] "Moransengo-Tonengo"
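
A possible follow-up, sketched here in base R on made-up data: running the comparison in both directions would also show which codes in the GitHub file are no longer listed by ISTAT (for example, codes superseded by a merger). The codes and names below are hypothetical placeholders, not real ISTAT values; the first filter is meant to give the same rows as `anti_join(istat, github, by = "PRO_COM_T")`.

```r
# Toy data frames; all codes and names are invented for illustration
istat <- data.frame(
  PRO_COM_T = c("000001", "000002", "000003", "000004"),
  COMUNE = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)
github <- data.frame(
  PRO_COM_T = c("000001", "000002", "000005"),
  stringsAsFactors = FALSE
)

# rows present in ISTAT but missing from GitHub
missing_in_github <- istat[!istat$PRO_COM_T %in% github$PRO_COM_T, ]

# reverse check: rows present in GitHub but no longer in ISTAT
stale_in_github <- github[!github$PRO_COM_T %in% istat$PRO_COM_T, , drop = FALSE]

print(missing_in_github$COMUNE)   # -> "C" "D"
print(stale_in_github$PRO_COM_T)  # -> "000005"
```

With the real data, the two inputs would be `municipalities_df` and the renamed `municipalities_github_df` from the script above.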