Add eTeryt designators if applicable #4

Draft
wants to merge 2 commits into
base: master

@not7cd commented Mar 31, 2020

Motivation

The Italian ministry shares its data here: https://github.com/pcm-dpc/COVID-19
In the region files, they include region codes next to region names.

Poland has a system of unique IDs for cities and administrative units:
http://eteryt.stat.gov.pl/eTeryt/

Because some cities in Poland share the same name, adding these IDs should make further data manipulation easier and less ambiguous.

Proposed change

Add Python modules with a CLI so they integrate seamlessly into the current environment.
They should:

  • Modify files only if there is missing data.
  • Leave a row unmodified if no unique ID was found for it.
  • Be idempotent: running them over the same files multiple times yields exactly the same output.
  • Report which rows need manual investigation.
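
To make these requirements concrete, here is a minimal sketch of how such a module could behave. It is not the code in this PR. The function names (`add_codes`, `process_csv`), the `region` column name, and the lookup table are hypothetical. The sample codes are for illustration only and should be checked against eTeryt.

```python
import csv
import io


def add_codes(rows, lookup, name_col, code_col):
    """Return (new_rows, problems); input rows are not mutated."""
    new_rows, problems = [], []
    for index, row in enumerate(rows):
        row = dict(row)
        if not row.get(code_col):  # fill in only missing data
            key = (row.get(name_col) or "").strip().lower()
            code = lookup.get(key)
            if code is None:
                problems.append(index)  # leave the row as-is, report it
            else:
                row[code_col] = code
        new_rows.append(row)
    return new_rows, problems


def process_csv(text, lookup, name_col="region", code_col="WOJ"):
    """Add code_col to CSV text; return (new_text, problem_row_indexes)."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = list(reader.fieldnames or [])
    if code_col not in fieldnames:
        fieldnames.append(code_col)
    rows, problems = add_codes(list(reader), lookup, name_col, code_col)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), problems


if __name__ == "__main__":
    # Hypothetical lookup table; real values would come from eTeryt data.
    sample_lookup = {"mazowieckie": "14", "lubelskie": "06"}
    sample = "region,cases\nmazowieckie,10\nAtlantis,3\n"
    result, needs_review = process_csv(sample, sample_lookup)
    print(result)
    print("Rows needing manual investigation:", needs_review)
```

Because rows that already have a code are skipped and unmatched rows are left alone, feeding the output back into `process_csv` should produce the same text and the same list of problem rows.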

TODO

  • Script to add WOJ columns
  • Script to add POW columns
  • Script to add SYMPOD columns
  • Script to report problematic rows

I hope this proves helpful in the long run. I'm open to suggestions. This relates to #2.
