NOTE: These docs are out of date and are being updated. The implementation has changed significantly since they were written. Most notably, IGSNs are now managed by DataCite, and the underlying implementation of this service is being modified to use the DataCite API to assist with IGSN resolution.
Performs IGSN resolution by leveraging the DataCite API.
igsn_resolver provides a simple proof of concept for an IGSN resolver service implemented using FastAPI. It uses the DataCite API and, optionally, the handle.net infrastructure, while supporting the expected content negotiation behavior for RDF resources.
The service is composed of two components: the API, which performs the resolution functions, and a minimal Web UI implemented as a Web Component. The UI component has minimal dependencies and may be deployed in any HTML page.
A test instance of the API is deployed on Vercel at https://igsn-resolver.vercel.app/. The UI is deployed using GitHub pages, available at https://datadavev.github.io/igsn_resolver/.
The API supports two endpoints, one for redirection and one for basic metadata. These methods are described in the API documentation at https://igsn-resolver.vercel.app/docs, with some examples below.
Identifiers are provided as strings, and the service will attempt to normalize a provided identifier string prior to lookup. Examples of IGSN identifier strings that are recognized include:
au1234
AU1234
igsn:au1234
10273/au1234
igsn:10273/au1234
Since the service uses the Handle System under the hood, DOI identifier strings are also accepted, for example:
10.1594/PANGAEA.930327
doi:10.1594/PANGAEA.930327
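The normalization described above can be sketched roughly as follows. This is an illustration only, not the service's actual code: the function name `normalize_identifier` and the constant `IGSN_PREFIX` are hypothetical, and the sketch leaves letter case untouched because the handling of case is not specified here.

```python
import re

# Handle prefix used by the IGSN examples above (assumed to be the only one).
IGSN_PREFIX = "10273"


def normalize_identifier(text: str) -> dict:
    """Hypothetical helper: classify an identifier string and normalize it."""
    value = text.strip()
    # Remove a leading scheme label such as "igsn:" or "doi:" if present.
    match = re.match(r"^(igsn|doi):(.*)$", value, flags=re.IGNORECASE)
    scheme_hint = match.group(1).lower() if match else None
    if match:
        value = match.group(2).strip()
    # DOI names begin with "10.<registrant>/".
    if scheme_hint == "doi" or re.match(r"^10\.\d+/", value):
        return {
            "original": text,
            "scheme": "doi",
            "normalized": f"doi:{value}",
            "handle": value,
        }
    # Everything else is treated as an IGSN; add the handle prefix if missing.
    if not value.startswith(IGSN_PREFIX + "/"):
        value = f"{IGSN_PREFIX}/{value}"
    return {
        "original": text,
        "scheme": "igsn",
        "normalized": f"igsn:{value}",
        "handle": value,
    }


if __name__ == "__main__":
    for example in ("au1234", "igsn:10273/au1234", "doi:10.1594/PANGAEA.930327"):
        print(normalize_identifier(example)["normalized"])
```

With this sketch, all IGSN spellings that differ only in scheme label or handle prefix map to the same normalized string.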
The /.info/{identifier} endpoint returns metadata from the Handle System about the identifier. For example:
curl "https://igsn-resolver.vercel.app/.info/au1234" | jq '.'
[
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
}
]
Where:
original - The provided identifier string.
scheme - Recognized identifier scheme, either "igsn" or "doi".
normalized - Normalized representation of the identifier string.
handle - Handle representation of the identifier string.
target - Identifier target as reported by the Handle System.
ttl - Time to live in seconds, as reported by the Handle System.
timestamp - The entry timestamp as reported by the Handle System.
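As an illustration of consuming these fields from a client, the sketch below parses the JSON shown above into typed records. The `InfoRecord` class and `parse_info` function are hypothetical names, and the timestamp format is assumed from the single example value.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class InfoRecord:
    original: str
    scheme: str
    normalized: str
    handle: str
    target: str
    ttl: int
    timestamp: datetime


def parse_info(payload: str) -> list[InfoRecord]:
    """Hypothetical helper: turn an /.info/ JSON response into records."""
    records = []
    for item in json.loads(payload):
        # The example timestamp ends in "Z", so it is interpreted as UTC.
        stamp = datetime.strptime(item["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        records.append(
            InfoRecord(
                original=item["original"],
                scheme=item["scheme"],
                normalized=item["normalized"],
                handle=item["handle"],
                target=item["target"],
                ttl=int(item["ttl"]),
                timestamp=stamp.replace(tzinfo=timezone.utc),
            )
        )
    return records


# Response body copied from the example above.
SAMPLE = """[
  {
    "original": "au1234",
    "scheme": "igsn",
    "normalized": "igsn:10273/au1234",
    "handle": "10273/au1234",
    "target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
    "ttl": 86400,
    "timestamp": "2015-07-22T05:19:38Z"
  }
]"""

if __name__ == "__main__":
    print(parse_info(SAMPLE)[0])
```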
Multiple identifiers (up to 50) may be sent to the /.info/ endpoint using a comma as a delimiter. For example:
curl "https://igsn-resolver.vercel.app/.info/au1234,10.1594/PANGAEA.930327" | jq '.'
[
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
},
{
"original": "10.1594/PANGAEA.930327",
"scheme": "doi",
"normalized": "doi:10.1594/PANGAEA.930327",
"handle": "10.1594/PANGAEA.930327",
"target": "https://doi.pangaea.de/10.1594/PANGAEA.930327",
"ttl": 86400,
"timestamp": "2021-06-10T01:14:56Z"
}
]
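A client with more than 50 identifiers would need to split them across several requests. A minimal sketch of building the request URLs, assuming the test instance base URL above and a hypothetical helper named `info_urls`:

```python
BASE_URL = "https://igsn-resolver.vercel.app"
MAX_PER_REQUEST = 50  # limit stated above


def info_urls(identifiers, base_url=BASE_URL):
    """Hypothetical helper: build /.info/ URLs, at most 50 identifiers each."""
    urls = []
    for start in range(0, len(identifiers), MAX_PER_REQUEST):
        batch = identifiers[start:start + MAX_PER_REQUEST]
        # Identifiers are joined with a comma, as in the curl example.
        urls.append(f"{base_url}/.info/{','.join(batch)}")
    return urls


if __name__ == "__main__":
    print(info_urls(["au1234", "10.1594/PANGAEA.930327"]))
```

The sketch does not percent-encode the identifiers, matching the curl example; identifiers containing characters that are unsafe in a URL path would need extra handling.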
The resolve endpoint /{identifier} accepts a single identifier string and returns a redirect (status code 307) to the target address listed by the Handle System. For example:
curl -v -q "https://igsn-resolver.vercel.app/au1234"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
<https://igsn-resolver.vercel.app/au1234>;
rel="canonical",
</.info/igsn:10273/au1234>;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
<https://hdl.handle.net/au1234/10273>;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: http://www.ga.gov.au/sample-catalogue/10273/AU1234
Note the Link response header, which gives the client a hint about alternate locations and profiles for accessing information about the identified resource, as described below.
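To show how a client might use that hint, here is a small sketch that splits a Link header value into its entries. The function name `parse_link_header` is hypothetical, and the parser is deliberately simple: it only understands parameters with double-quoted values, which is all the example response uses.

```python
import re

# Link header value from the example response, joined onto one line.
EXAMPLE_LINK = (
    '<https://igsn-resolver.vercel.app/au1234>; rel="canonical", '
    '</.info/igsn:10273/au1234>; type="application/json"; rel="alternate"; '
    'profile="https://igsn.org/info", '
    '<https://hdl.handle.net/au1234/10273>; rel="alternate"; '
    'profile="https://schema.datacite.org/"'
)


def parse_link_header(value: str) -> list[dict]:
    """Hypothetical helper: split a Link header into target/parameter dicts."""
    links = []
    # Each entry is "<target>" followed by zero or more ;name="value" pairs.
    pattern = r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)'
    for match in re.finditer(pattern, value):
        params = dict(re.findall(r'([\w-]+)="([^"]*)"', match.group(2)))
        links.append({"target": match.group(1), **params})
    return links


if __name__ == "__main__":
    for link in parse_link_header(EXAMPLE_LINK):
        print(link.get("rel"), link.get("profile"), link["target"])
```

A client could then select the entry whose `profile` matches the representation it wants.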
The behavior of this method can be modified by an optional Accept-Profile header [1] sent by the client.
If the client includes an Accept-Profile header of https://igsn.org/info, the response is the same as a call to the /.info/{identifier} endpoint. For example:
curl -q -H "Accept-Profile: https://igsn.org/info" \
"https://igsn-resolver.vercel.app/au1234"
...
{
"original": "au1234",
"scheme": "igsn",
"normalized": "igsn:10273/au1234",
"handle": "10273/au1234",
"target": "http://www.ga.gov.au/sample-catalogue/10273/AU1234",
"ttl": 86400,
"timestamp": "2015-07-22T05:19:38Z"
}
If the client includes an Accept-Profile header of https://schema.datacite.org/, the redirect response points to the Handle System resolve address, which subsequently returns a redirect to the known target.
This approach enables correct resolution of some resource content types (such as RDF formats) in the DOI system, which otherwise returns metadata about the identifier rather than the identified resource. The IGSN infrastructure is in the process of migrating to DOI infrastructure provided by DataCite, and a service such as this will be necessary for correct resolution of IGSN identifiers when that change is implemented.
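The profile-based behavior described above amounts to a three-way choice, which can be sketched as a plain function. This is not the service's FastAPI code: the name `choose_response` and the returned labels are hypothetical, and the Handle System address is assumed to take the conventional https://hdl.handle.net/{handle} form.

```python
from typing import Optional

INFO_PROFILE = "https://igsn.org/info"
DATACITE_PROFILE = "https://schema.datacite.org/"


def choose_response(accept_profile: Optional[str], record: dict) -> tuple:
    """Hypothetical helper: pick the response for a resolve request.

    `record` is a dict shaped like one /.info/ entry.
    """
    profile = (accept_profile or "").strip()
    if profile == INFO_PROFILE:
        # Same content as the /.info/{identifier} endpoint.
        return ("info", record)
    if profile == DATACITE_PROFILE:
        # Assumed address form; the Handle System then redirects onward.
        return ("redirect", f"https://hdl.handle.net/{record['handle']}")
    # Default: redirect straight to the registered target.
    return ("redirect", record["target"])


if __name__ == "__main__":
    record = {
        "handle": "10.1594/PANGAEA.930327",
        "target": "https://doi.pangaea.de/10.1594/PANGAEA.930327",
    }
    print(choose_response(None, record))
    print(choose_response(DATACITE_PROFILE, record))
```

Note that the choice depends only on Accept-Profile, not on Accept, so a request for an RDF content type without a profile still goes to the registered target.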
For example, the DOI identifier doi:10.1594/PANGAEA.930327 has a target of https://doi.pangaea.de/10.1594/PANGAEA.930327. Resolving this with the Handle System for a content type of text/html results in the expected redirect:
curl -q -v -H "Accept: text/html" "https://hdl.handle.net/10.1594/PANGAEA.930327"
...
< HTTP/2 302
< vary: Accept
< location: https://doi.pangaea.de/10.1594/PANGAEA.930327
If instead a content type of application/ld+json is requested, the location of DataCite metadata is returned instead of the identified resource:
curl -q -v -H "Accept: application/ld+json" "https://hdl.handle.net/10.1594/PANGAEA.930327"
...
< HTTP/2 302
< vary: Accept
< location: https://data.crosscite.org/10.1594%2FPANGAEA.930327
Resolving the same identifier with this igsn-resolver service results in the expected location:
curl -q -v -H "Accept: application/ld+json" "https://igsn-resolver.vercel.app/10.1594/PANGAEA.930327"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
<https://ule1dz.deta.dev/10.1594/PANGAEA.930327>;
rel="canonical",
</.info/doi:10.1594/PANGAEA.930327>;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
<https://hdl.handle.net/PANGAEA.930327/10.1594>;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: https://doi.pangaea.de/10.1594/PANGAEA.930327
The DataCite metadata may be retrieved by specifically requesting that format:
curl -q -v -H "Accept: application/ld+json" \
-H "Accept-Profile: https://schema.datacite.org/" \
"https://igsn-resolver.vercel.app/10.1594/PANGAEA.930327"
...
< HTTP/1.1 307 Temporary Redirect
< Link:
<https://igsn-resolver.vercel.app/10.1594/PANGAEA.930327>;
rel="canonical",
</.info/doi:10.1594/PANGAEA.930327>;
type="application/json";
rel="alternate";
profile="https://igsn.org/info",
<https://hdl.handle.net/PANGAEA.930327/10.1594>;
rel="alternate";
profile="https://schema.datacite.org/"
< Location: https://hdl.handle.net/PANGAEA.930327/10.1594
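The curl requests above can also be prepared from Python with the standard library. The sketch below only constructs the request object; the helper name `build_resolve_request` is hypothetical and the base URL is the test instance mentioned earlier.

```python
import urllib.request

BASE_URL = "https://igsn-resolver.vercel.app"


def build_resolve_request(identifier, accept=None, accept_profile=None,
                          base_url=BASE_URL):
    """Hypothetical helper: prepare a resolve request with optional headers."""
    headers = {}
    if accept:
        headers["Accept"] = accept
    if accept_profile:
        headers["Accept-Profile"] = accept_profile
    return urllib.request.Request(f"{base_url}/{identifier}", headers=headers)


if __name__ == "__main__":
    request = build_resolve_request(
        "10.1594/PANGAEA.930327",
        accept="application/ld+json",
        accept_profile="https://schema.datacite.org/",
    )
    print(request.full_url, request.header_items())
```

Passing the request to `urllib.request.urlopen` would follow the 307 redirect automatically, so a client that wants to inspect the Link and Location headers needs to disable redirect handling.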
After cloning this repo, create a virtual environment and install development dependencies:
pip install -r dev_requirements.txt
Then run a local instance on port 8000 by:
cd app
uvicorn main:app --reload
A push to main on the origin repo will result in a re-deployment to deta.sh and deployment of the web interface to GitHub Pages.
Tests can be run with pytest, e.g.:
pytest
================================ test session starts =================================
platform darwin -- Python 3.10.5, pytest-7.1.2, pluggy-1.0.0
rootdir: /Users/vieglais/Documents/Projects/IGSN/igsn_resolver
plugins: anyio-3.6.1, asyncio-0.19.0
asyncio: mode=strict
collected 13 items
tests/test_igsnresolve.py ............. [100%]
================================= 13 passed in 1.33s =================================
The web component is in the identifier-resolver folder. It is implemented in JavaScript using Lit and may be deployed without building or bundling. See the README in that folder for details.
Footnotes
1. Content Negotiation by Profile is currently a W3C draft, https://www.w3.org/TR/dx-prof-conneg/