Handling dataset redirects #2688

Open
PeterAJansen opened this issue Apr 9, 2024 · 1 comment
Labels
feature request Request for a new feature P1 Not as needed as P0, but still important/wanted

Currently, datasets-server does not appear to handle dataset redirects: querying it with a dataset's old name returns an error instead of following the redirect to the new name.

For example:

  • Querying datasets endpoints with the dataset squad works

  • Querying datasets-server endpoints with the dataset squad gives an "unknown error"

  • Querying datasets-server endpoints with a random dataset name (e.g. xyz2309348) gives the expected response (e.g. dataset unknown or does not exist)

  • Querying datasets-server endpoints with the redirected name (rajpurkar/squad) works.

@severo severo added feature request Request for a new feature P1 Not as needed as P0, but still important/wanted labels Apr 9, 2024

severo commented Apr 9, 2024

When a repository has been renamed one or more times, the dataset viewer only knows its latest name. For example, squad was renamed to rajpurkar/squad. The previous database entries for squad were deleted, and new ones were created for rajpurkar/squad. Since we don't check whether the repo was renamed, asking for dataset=squad returns an error.

To support this, when the dataset is not found in the database, we should ask the Hub for the current name of the repo (in the example: rajpurkar/squad) and then check again whether the dataset exists in the database under that name.

We can get this info by looking at the id field in hfh.dataset_info() (see https://huggingface.co/api/datasets/squad)
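
The lookup described above could be sketched roughly as follows. This is only an illustration, not the dataset viewer's actual code: `resolve_dataset`, `hub_current_name`, and the `exists_in_db` callback are hypothetical names, and the database check is passed in as a plain function. The only real library call is `HfApi().dataset_info(...)`, whose `id` field is assumed to hold the current repo name, as noted above.

```python
from typing import Callable, Optional


def hub_current_name(dataset: str) -> Optional[str]:
    """Ask the Hub for the repo's current id; return None if the repo is unknown."""
    # Imported lazily so the fallback logic below can be exercised offline.
    from huggingface_hub import HfApi
    from huggingface_hub.utils import RepositoryNotFoundError

    try:
        # Assumption (from the comment above): `id` is the current, post-rename name.
        return HfApi().dataset_info(dataset).id
    except RepositoryNotFoundError:
        return None


def resolve_dataset(
    dataset: str,
    exists_in_db: Callable[[str], bool],  # hypothetical database lookup
    get_current_name: Callable[[str], Optional[str]] = hub_current_name,
) -> Optional[str]:
    """Return the name under which the dataset is stored, or None if not found."""
    # Fast path: the requested name is already known to the database.
    if exists_in_db(dataset):
        return dataset
    # Fallback: the repo may have been renamed, so ask for its current name.
    current = get_current_name(dataset)
    if current is not None and current != dataset and exists_in_db(current):
        return current
    return None
```

With an in-memory stand-in for the database and the Hub, `resolve_dataset("squad", {"rajpurkar/squad"}.__contains__, {"squad": "rajpurkar/squad"}.get)` takes the fallback path and returns the renamed entry. Other Hub errors (gated or private repos, network failures) are deliberately left unhandled in this sketch.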


2 participants