Relationship & diff between models/s4/s4.py & src/models/sequence/kernels/ssm.py #99
Comments
Yes, the standalone files are meant for convenience. The models inside this repo's training infrastructure are very modular, which means they are factored over a large number of files and would be inconvenient to copy to external repositories.
That's right.
They should do the exact same thing. It's conceivable that there are edge cases in the standalone because it is much less tested.
All experiments use the original version, not the standalone. There are several READMEs in this repo documenting the full training pipeline and model structure (e.g. https://github.com/HazyResearch/state-spaces/tree/main/src/models/sequence).
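For anyone who wants to check the "same thing" claim on their own configuration, a sanity check along the following lines may help. This is only a sketch: `reference_impl` and `standalone_impl` are hypothetical toy stand-ins (a scalar linear recurrence and the equivalent causal convolution), not the code from either file in this repo, and `max_abs_diff` is an illustrative helper name. The idea is to feed identical seeded inputs to both code paths and look at the worst-case difference, which should be on the order of floating-point error.

```python
import random


def reference_impl(u, a=0.9, b=0.5, c=2.0):
    """Toy stand-in for the 'original' code path: a scalar linear recurrence.

    x_k = a * x_{k-1} + b * u_k,  y_k = c * x_k,  with x_{-1} = 0.
    """
    x, ys = 0.0, []
    for u_k in u:
        x = a * x + b * u_k
        ys.append(c * x)
    return ys


def standalone_impl(u, a=0.9, b=0.5, c=2.0):
    """Toy stand-in for the 'standalone' code path.

    Computes the same map as a causal convolution with kernel K_j = c * a**j * b.
    """
    kernel = [c * (a ** j) * b for j in range(len(u))]
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1))
        for k in range(len(u))
    ]


def max_abs_diff(f, g, num_trials=20, length=64, seed=0):
    """Largest elementwise |f(u) - g(u)| over several seeded random inputs."""
    rng = random.Random(seed)  # fixed seed so both paths see identical inputs
    worst = 0.0
    for _ in range(num_trials):
        u = [rng.uniform(-1.0, 1.0) for _ in range(length)]
        diff = max(abs(p - q) for p, q in zip(f(u), g(u)))
        worst = max(worst, diff)
    return worst


if __name__ == "__main__":
    # A value near machine precision suggests the two paths agree.
    print(max_abs_diff(reference_impl, standalone_impl))
```

With the real modules, the stand-ins would be replaced by the two actual models built from the same parameters and seed, and a tensor comparison such as `torch.testing.assert_close` would probably be more convenient than the hand-rolled maximum.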
I understand that models/s4/s4.py is a standalone file that can be taken on its own to use the S4 models, not counting the CUDA kernel module. I have some questions about its intent and nature relative to the code in src/models/sequence/kernels/
Is it the case that models/s4/s4.py is provided mainly for convenience and ease-of-use from having just one source file, and that it is not meant to have all the features available from src/models/sequence/kernels/ssm.py?
In terms of the development process, is it the case that models/s4/s4.py is downstream of src/models/sequence/kernels so that changes go to the latter first and then get manually ported to the former?
In terms of software behavior (the values that are calculated mathematically, ignoring floating-point error and random seed differences), is models/s4/s4.py meant to do the same thing as src/models/sequence/kernels/ssm.py for the use cases that it covers?
Which version (standalone vs non-standalone) of the S4 implementation is generally used for the experiments in the papers?