Relationship & diff between models/s4/s4.py & src/models/sequence/kernels/ssm.py #99

Open
jchia opened this issue Apr 27, 2023 · 1 comment

jchia commented Apr 27, 2023

I understand that models/s4/s4.py is a standalone file that can be taken on its own to use the S4 models, not counting the CUDA kernel module. I have some questions about its intent and nature relative to the code in src/models/sequence/kernels/.

Is it the case that models/s4/s4.py is provided mainly for convenience and ease-of-use from having just one source file, and that it is not meant to have all the features available from src/models/sequence/kernels/ssm.py?

In terms of the development process, is it the case that models/s4/s4.py is downstream of src/models/sequence/kernels so that changes go to the latter first and then get manually ported to the former?

In terms of software behavior (the values that are calculated mathematically, ignoring floating-point error and random seed differences), is models/s4/s4.py meant to do the same thing as src/models/sequence/kernels/ssm.py for the use cases that it covers?

Which version (standalone vs non-standalone) of the S4 implementation is generally used for the experiments in the papers?

@albertfgu

> Is it the case that models/s4/s4.py is provided mainly for convenience and ease-of-use from having just one source file, and that it is not meant to have all the features available from src/models/sequence/kernels/ssm.py?

Yes, the standalone files are meant for convenience. The models inside this repo's training infrastructure are very modular, which means they are factored over a large number of files and would be inconvenient to copy to external repositories.

> In terms of the development process, is it the case that models/s4/s4.py is downstream of src/models/sequence/kernels so that changes go to the latter first and then get manually ported to the former?

That's right.

> In terms of software behavior (the values that are calculated mathematically, ignoring floating-point error and random seed differences), is models/s4/s4.py meant to do the same thing as src/models/sequence/kernels/ssm.py for the use cases that it covers?

They should do the exact same thing. It's conceivable that there are edge cases in the standalone because it is much less tested.
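For anyone who wants to check this kind of equivalence themselves, here is a minimal sketch of a comparison harness. It does not import or call anything from this repo: `ssm_recurrent` and `ssm_convolutional` are hypothetical stand-ins (a toy scalar state-space model computed two ways), and the `1e-9` tolerance is an assumption suited to float64. The idea is only to feed identical seeded inputs to two implementations and look at the largest absolute difference.

```python
import random

def ssm_recurrent(a, b, c, u):
    """Toy scalar SSM: x[t] = a*x[t-1] + b*u[t], y[t] = c*x[t], x[-1] = 0."""
    x, ys = 0.0, []
    for u_t in u:
        x = a * x + b * u_t
        ys.append(c * x)
    return ys

def ssm_convolutional(a, b, c, u):
    """Same map written as a causal convolution with kernel k[s] = c * a**s * b."""
    k = [c * a**s * b for s in range(len(u))]
    return [sum(k[s] * u[t - s] for s in range(t + 1)) for t in range(len(u))]

def max_abs_diff(impl_a, impl_b, trials=5, length=64, seed=0):
    """Run both implementations on the same seeded random inputs
    and return the largest elementwise absolute difference seen."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = rng.uniform(-0.9, 0.9)  # keep the toy system stable
        b, c = rng.gauss(0, 1), rng.gauss(0, 1)
        u = [rng.gauss(0, 1) for _ in range(length)]
        ya, yb = impl_a(a, b, c, u), impl_b(a, b, c, u)
        worst = max(worst, max(abs(p - q) for p, q in zip(ya, yb)))
    return worst

diff = max_abs_diff(ssm_recurrent, ssm_convolutional)
agree = diff < 1e-9  # assumed tolerance for float64 arithmetic
```

To compare real modules, one would swap the two stand-ins for wrappers around the standalone and modular layers, copy the same parameters into both, and probably loosen the tolerance for float32.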

> Which version (standalone vs non-standalone) of the S4 implementation is generally used for the experiments in the papers?

All experiments use the original version, not the standalone. There are several READMEs in this repo documenting the full training pipeline and model structure (e.g. https://github.com/HazyResearch/state-spaces/tree/main/src/models/sequence).
