seq2seq classification with AST #117

YSLCoat · 2023-11-30T21:38:48Z

Hi!

Is it trivial to adapt the AST architecture to do sequence to sequence classification? My input data has a label for each audio sample and my goal is to classify each sample in the data.

YuanGongND · 2024-01-07T02:25:39Z

Can you take a look at Figure 1 of this paper https://arxiv.org/pdf/2305.10790.pdf? It shows an example of mean pooling over the frequency dimension to get a representation in temporal order. The code implementation is here:

https://github.com/YuanGongND/ltu/blob/c2d0723c9f31a54eb2c2b62c5cc030b25317dc6f/src/ltu/hf-dev/transformers-main/src/transformers/models/llama/modeling_llama.py#L668-L672

However, that code is for a "no-overlap" patch split; applying it to the "overlapped" patch split used in this repo requires some changes.
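For illustration, here is a minimal sketch of the frequency mean pooling idea, not the linked implementation. It assumes the patch tokens are laid out frequency-major (all time steps for the first frequency band, then the next band, and so on) and that any extra tokens such as cls/distillation tokens have already been dropped. The function name `pool_over_frequency` and the arguments `f_dim` and `t_dim` are hypothetical.

```python
import torch


def pool_over_frequency(tokens: torch.Tensor, f_dim: int, t_dim: int) -> torch.Tensor:
    """Mean-pool patch tokens over frequency to get one vector per time step.

    tokens: (batch, f_dim * t_dim, embed_dim), patch tokens only.
    Assumption: token index = f * t_dim + t (frequency-major layout).
    Returns: (batch, t_dim, embed_dim), ordered in time.
    """
    batch, n_tokens, embed_dim = tokens.shape
    if n_tokens != f_dim * t_dim:
        raise ValueError(f"expected {f_dim * t_dim} tokens, got {n_tokens}")
    # Restore the 2D patch grid, then average across the frequency axis.
    grid = tokens.reshape(batch, f_dim, t_dim, embed_dim)
    return grid.mean(dim=1)


# Example with made-up sizes: 12 frequency patches, 101 time patches.
tokens = torch.randn(2, 12 * 101, 768)
frames = pool_over_frequency(tokens, f_dim=12, t_dim=101)
print(frames.shape)  # torch.Size([2, 101, 768])
```

With an overlapped patch split, the values of `f_dim` and `t_dim` depend on the strides, so they would need to be computed from the model's patch embedding rather than hard-coded.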

You can also check SSAST, which supports naive temporal order representation: https://github.com/YuanGongND/ssast

Once you have a representation in temporal order, you can do seq2seq tasks, e.g., by adding a CTC loss on top of the temporal representations.
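As a rough sketch of what a CTC setup on top of temporal representations could look like, assuming per-time-step features of shape (batch, time, embed_dim) are already available: the `CTCHead` class, the sizes, and the random tensors below are all hypothetical placeholders, and only `torch.nn.CTCLoss` is a real PyTorch API.

```python
import torch
import torch.nn as nn


class CTCHead(nn.Module):
    """Linear projection from per-frame features to label log-probabilities."""

    def __init__(self, embed_dim: int, num_labels: int):
        super().__init__()
        # One extra output for the CTC blank symbol (index 0 here).
        self.proj = nn.Linear(embed_dim, num_labels + 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, embed_dim) -> (batch, time, num_labels + 1)
        return self.proj(frames).log_softmax(dim=-1)


torch.manual_seed(0)
batch, t_dim, embed_dim, num_labels, target_len = 2, 50, 768, 10, 8

frames = torch.randn(batch, t_dim, embed_dim)  # stand-in for encoder output
head = CTCHead(embed_dim, num_labels)
log_probs = head(frames)

# Targets use labels 1..num_labels; 0 is reserved for blank.
targets = torch.randint(1, num_labels + 1, (batch, target_len))
input_lengths = torch.full((batch,), t_dim, dtype=torch.long)
target_lengths = torch.full((batch,), target_len, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
# nn.CTCLoss expects log-probabilities shaped (time, batch, classes).
loss = ctc(log_probs.transpose(0, 1), targets, input_lengths, target_lengths)
loss.backward()
```

If a label is available for every time step, a per-frame cross-entropy loss on the same projected features may be a simpler alternative to CTC.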

-Yuan

YSLCoat · 2024-01-07T14:00:56Z

I got the results I wanted by removing line 184 in https://github.com/YuanGongND/ast/blob/master/src/models/ast_models.py and setting t_stride = 1. I think that should give me a working seq2seq classification. I will take a look at the links you provided as well!

YuanGongND added the question Further information is requested label Jan 7, 2024
