Populate internal country database using semantic web #14
Comments
Have you had a look at GeoNames? Lots of Semantic Web goodness if that's your thing; see http://www.geonames.org/ontology/documentation.html
As sources for names and synonyms, there are also the Getty Thesaurus of Geographic Names (http://www.getty.edu/vow/TGNSearchPage.jsp) and GADM (http://www.gadm.org/). For misspellings, I have accumulated nearly 5000 variants of values mapped to the Darwin Core term country, and have provided the corresponding ISO 3166-2 country code for every one where that is possible. This list keeps growing as we pass additional data through validation for VertNet.
Just stumbled upon this tool: http://okfnlabs.org/blog/2013/05/16/nomenklatura-matching-service-reconciliation-made-easy.html Might be of help here.
I think this is worth mentioning: http://community.gbif.org/pg/file/read/34059/
It would be interesting to extend narwhal so that it can build an up-to-date, well-maintained knowledge base of country names, their alternative representations (possibly multilingual), and mappings to known misspellings, using linked open data (Semantic Web).
This could be done using Semantic Web URIs.
Something like: http://dbpedia.org/page/Category:Member_states_of_the_United_Nations
A country could then be identified by a URI such as http://dbpedia.org/resource/Canada
The name of a country in different languages could be populated using "owl:sameAs".
The known misspellings could be handled using SKOS.
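To make the idea concrete, here is a minimal sketch of what one entry might look like once pulled into the library. Everything in it is an assumption for illustration: the `CountryConcept` class, the `normalize`, `build_index` and `resolve` helpers, and the sample misspellings are hypothetical and not part of narwhal. Misspellings are stored the way SKOS `skos:hiddenLabel` is meant to be used (labels that match on search but are never displayed).

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class CountryConcept:
    """Hypothetical in-memory form of one country resource."""
    uri: str
    labels: dict = field(default_factory=dict)       # language tag -> name
    hidden_labels: set = field(default_factory=set)  # known misspellings


def normalize(text):
    """Strip accents, fold case and collapse whitespace before matching."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def build_index(concepts):
    """Map every normalized label or misspelling to its country URI."""
    index = {}
    for concept in concepts:
        names = list(concept.labels.values()) + sorted(concept.hidden_labels)
        for name in names:
            index[normalize(name)] = concept.uri
    return index


# Illustrative data only; the misspellings are made-up examples.
CANADA = CountryConcept(
    uri="http://dbpedia.org/resource/Canada",
    labels={"en": "Canada", "es": "Canadá", "de": "Kanada"},
    hidden_labels={"Cananda", "Canda"},
)
INDEX = build_index([CANADA])


def resolve(raw):
    """Return the country URI for a raw value, or None if unknown."""
    return INDEX.get(normalize(raw))
```

With this shape, `resolve("  CANANDA ")` and `resolve("Kanada")` both come back with the Canada URI, while an unknown value returns `None` and can be reported as an error.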
For performance reasons, we'd like this thesaurus to be embedded in the library, but with the capacity to be periodically refreshed with data pulled from external resources (as is currently the case through the gbif-parser).
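A rough sketch of the "embedded, but refreshable" behaviour, under stated assumptions: `CountryThesaurus`, `EMBEDDED_SNAPSHOT` and the `fetch` callable are hypothetical names, and the snapshot format (a JSON object mapping a lowercase name to a URI) is invented for the example. The point is only that lookups never touch the network and that a failed refresh leaves the embedded data in place.

```python
import json

# Hypothetical snapshot shipped inside the library (name -> country URI).
EMBEDDED_SNAPSHOT = '{"canada": "http://dbpedia.org/resource/Canada"}'


class CountryThesaurus:
    """Serves lookups from memory; refreshing is optional and fail-safe."""

    def __init__(self, snapshot=EMBEDDED_SNAPSHOT):
        self._names = json.loads(snapshot)

    def lookup(self, name):
        """Return the URI for a name, or None if it is not known."""
        return self._names.get(name.strip().casefold())

    def refresh(self, fetch):
        """Replace the data with whatever fetch() returns.

        fetch is any zero-argument callable returning a dict of
        name -> URI (for example, a function that queries an external
        resource). On any failure the current data is kept.
        """
        try:
            fresh = fetch()
        except Exception:
            return False
        if not isinstance(fresh, dict) or not fresh:
            return False
        self._names = {k.strip().casefold(): v for k, v in fresh.items()}
        return True
```

A scheduler or a manual maintenance task could call `refresh` from time to time; if the external resource is unreachable, `refresh` returns `False` and the library keeps answering from the last good data.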
Benefits:
http://dbpedia.org/page/Canada) in different languages.
country name in different languages as error
continent, hemisphere)