Handle training-validation-test splits in NESMusicDatabase #29
Comments
I also need to load a dataset that has splits. I added a part argument to my dataset class:

class Groove2GrooveDataset(muspy.RemoteFolderDataset):
    ...
    def __init__(
        self,
        root: Union[str, Path],
        download_and_extract: bool = False,
        cleanup: bool = False,
        convert: bool = False,
        kind: str = "json",
        n_jobs: int = 1,
        ignore_exceptions: bool = True,
        use_converted: Optional[bool] = None,
        part: str = "train"
    ):
        muspy.RemoteFolderDataset.__init__(
            self, root=root, download_and_extract=download_and_extract,
            cleanup=cleanup, convert=convert, kind=kind, n_jobs=n_jobs,
            ignore_exceptions=ignore_exceptions, use_converted=use_converted)
        # Collect only the files that belong to the requested split.
        path = self.root / 'groove2groove-data-v1.0.0' / 'midi' / part / 'fixed'
        self.raw_filenames = sorted(
            (
                filename
                for filename in path.rglob("*." + self._extension)
            )
        )
        self._filenames = self.raw_filenames

However, this doesn't work:

>>> test_data = Groove2GrooveDataset('/tmp/groove2groove-data', part='test', download_and_extract=True) # OK
>>> test_data.convert() # OK
>>> val_data = Groove2GrooveDataset('/tmp/groove2groove-data', part='val') # OK, reuses downloaded data
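>>> val_data.convert() # not OK, skips conversion as '_converted' already contains the test data

Edit: Overriding converted_dir so that each part gets its own directory:

    @property
    def converted_dir(self):
        # Assumes __init__ stores the split name as self.part.
        return self.root / "_converted_{}".format(self.part)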
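
To illustrate why the shared directory causes the skip, here is a minimal, self-contained sketch. It does not use muspy: ToyDataset, PerPartDataset, and the ".done" marker file are hypothetical stand-ins that only mimic the behavior described above (conversion is skipped once the converted directory has been marked as finished).

```python
import tempfile
from pathlib import Path


class ToyDataset:
    """Hypothetical stand-in for a convertible dataset (not muspy's code)."""

    def __init__(self, root, part="train"):
        self.root = Path(root)
        self.part = part

    @property
    def converted_dir(self):
        # One shared directory for every split, as in the report above.
        return self.root / "_converted"

    def convert(self):
        """Convert this split; return False if conversion was skipped."""
        marker = self.converted_dir / ".done"  # hypothetical marker file
        if marker.exists():
            return False
        self.converted_dir.mkdir(parents=True, exist_ok=True)
        (self.converted_dir / f"{self.part}.json").write_text("{}")
        marker.touch()
        return True


class PerPartDataset(ToyDataset):
    @property
    def converted_dir(self):
        # One directory per split, mirroring the override above.
        return self.root / "_converted_{}".format(self.part)


with tempfile.TemporaryDirectory() as tmp:
    shared = [ToyDataset(tmp, part=p).convert() for p in ("test", "val")]
with tempfile.TemporaryDirectory() as tmp:
    per_part = [PerPartDataset(tmp, part=p).convert() for p in ("test", "val")]

print(shared)    # [True, False]: the 'val' conversion is skipped
print(per_part)  # [True, True]: each split is converted
```

With a shared directory the second split finds the marker left by the first and does nothing; with a per-part directory each split is converted independently, at the cost of one converted folder per split.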

I see your point. We could have the following work:

nes = muspy.NESMusicDatabase("data/nes/")
nes.convert()
training_set = nes.subset("training")

And this won't work:

nes = muspy.NESMusicDatabase("data/nes/")
training_set = nes.subset("training")
training_set.convert() # error
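
A rough sketch of those semantics, assuming a toy class rather than muspy itself: ToyDataset, its is_subset flag, and the "training"/"validation" folder names are all hypothetical and only illustrate the "convert first, then take a subset" ordering.

```python
from pathlib import PurePosixPath


class ToyDataset:
    """Hypothetical dataset with a subset() method (not muspy's API)."""

    def __init__(self, filenames, converted=False, is_subset=False):
        self.filenames = [PurePosixPath(f) for f in filenames]
        self.converted = converted
        self.is_subset = is_subset

    def convert(self):
        # Conversion is only defined on the full dataset.
        if self.is_subset:
            raise RuntimeError("convert the full dataset, then take a subset")
        self.converted = True

    def subset(self, name):
        # Keep the files that have a path component equal to the split name.
        selected = [f for f in self.filenames if name in f.parts]
        return ToyDataset(selected, self.converted, is_subset=True)


files = ["training/a.json", "training/b.json", "validation/c.json"]

nes = ToyDataset(files)
nes.convert()
training_set = nes.subset("training")
print(len(training_set.filenames))  # 2

try:
    ToyDataset(files).subset("training").convert()
except RuntimeError as err:
    print("error:", err)
```

Keeping conversion on the full dataset means all splits share one converted folder, and a subset is just a filtered view of it.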

The current implementation of NESMusicDatabase does not handle the training-validation-test splits provided in the original dataset. To avoid changing the base Dataset class too much, we could add a subset method and achieve something like the following.