How to extract weighted sum SSL representations from an audio dataset? #539

mukherjeesougata · 2024-05-06T07:49:40Z

The weighted sum of the layer representations can be obtained with the Featurizer class. The code snippet for extracting it is as follows:

    >>> import torch
    >>> from s3prl.nn import S3PRLUpstream, Featurizer
    ...
    >>> model = S3PRLUpstream("hubert")
    >>> model.eval()
    ...
    >>> with torch.no_grad():
    ...     wavs = torch.randn(2, 16000 * 2)
    ...     wavs_len = torch.LongTensor([16000 * 1, 16000 * 2])
    ...     all_hs, all_hs_len = model(wavs, wavs_len)
    ...
    >>> featurizer = Featurizer(model)
    >>> hs, hs_len = featurizer(all_hs, all_hs_len)

How should I modify this code so that I get the weighted-sum features/representations hs for each audio file in an audio dataset and store them as the features of the corresponding file?
I think the change needs to be made in the following two lines, which currently generate random input instead of reading real audio:

    ...     wavs = torch.randn(2, 16000 * 2)
    ...     wavs_len = torch.LongTensor([16000 * 1, 16000 * 2])
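
One possible way to do this is sketched below. It is a minimal, untested-on-real-data sketch, not an official s3prl recipe. It replaces the two random-input lines with a real waveform loaded from disk, runs the model and featurizer one file at a time, and saves one tensor per file. The helper names `load_wav_16k`, `extract_to_disk` and `run_hubert`, the `.pt` output format, and the use of torchaudio for loading are all assumptions made for illustration; only `S3PRLUpstream`, `Featurizer` and their call signatures come from the snippet above. The sketch also assumes the upstream expects 16 kHz mono audio, as the `16000` in the original code suggests.

```python
from pathlib import Path

import torch


def load_wav_16k(path):
    """Hypothetical loader: returns a mono 1-D float tensor at 16 kHz (needs torchaudio)."""
    import torchaudio  # imported lazily so the rest of the sketch works without it

    wav, sr = torchaudio.load(str(path))  # shape: (channels, samples)
    wav = wav.mean(dim=0)                 # downmix to mono -> (samples,)
    if sr != 16000:
        wav = torchaudio.functional.resample(wav, sr, 16000)
    return wav


def extract_to_disk(wav_paths, out_dir, model, featurizer, load_fn=load_wav_16k):
    """Save one (frames, dim) weighted-sum tensor per audio file; return the saved paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for path in map(Path, wav_paths):
        wav = load_fn(path)
        wavs = wav.unsqueeze(0)                      # batch of one: (1, samples)
        wavs_len = torch.LongTensor([wav.shape[0]])  # true length in samples
        with torch.no_grad():
            all_hs, all_hs_len = model(wavs, wavs_len)
            hs, hs_len = featurizer(all_hs, all_hs_len)
        # Drop the batch dimension and any frames beyond the valid length.
        feat = hs[0, : int(hs_len[0])].cpu().clone()
        out_path = out_dir / (path.stem + ".pt")     # e.g. utt1.wav -> utt1.pt
        torch.save(feat, out_path)
        saved.append(out_path)
    return saved


def run_hubert(dataset_dir, out_dir):
    """Hypothetical entry point: extract features for every .wav file in dataset_dir."""
    from s3prl.nn import S3PRLUpstream, Featurizer

    model = S3PRLUpstream("hubert")
    model.eval()
    featurizer = Featurizer(model)
    wav_paths = sorted(Path(dataset_dir).glob("*.wav"))
    return extract_to_disk(wav_paths, out_dir, model, featurizer)
```

Calling `run_hubert("my_dataset", "features")` (both directory names are placeholders) would write one file per utterance, which can be read back with `torch.load`. Processing one file at a time avoids padding altogether; batching several files would require padding `wavs` to a common length and passing the true lengths in `wavs_len`, as the original two lines do. Keep in mind that a freshly constructed Featurizer has not been trained on any downstream task, so the layer weights it applies are its initial ones unless trained weights are loaded first.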
