SortingAnalyzer issues #3031

Open
khl02007 opened this issue Jun 14, 2024 · 7 comments
Labels
core Changes to core module

Comments

@khl02007
Contributor

khl02007 commented Jun 14, 2024

@alejoe91 @samuelgarcia
I just started trying the new sorting analyzer and wanted to share some issues that I have encountered:

  • It seems that every time you run sorting_analyzer.compute, you recompute all the extensions. I think it would be better if it first checked whether the extension already exists with the same parameters and skipped the computation if so. Maybe there could be a recompute=True parameter that you can pass.
  • Some of the extensions depend on other extensions. For example, computing PC metrics requires the principal_components extension, but the parameters passed may differ: principal_components could be computed with n_components=5, while a PC metric that depends on it may ask for n_components=10. In that case the user should get at least a warning, or an error message saying that the parameters are not compatible.
  • Another idea is for the user to just be able to compute the quality metrics from the sorting analyzer, and the sorting analyzer to silently compute the extensions if they don't exist. I prefer this, as I don't really want to think about what kind of extensions are required for me to get my metrics.
  • In general from the user's perspective it is very opaque as to how the extensions depend on one another. Perhaps this can be better documented.
  • I feel that si.load_sorting_analyzer should have load_extensions=False by default (rather than load all the extensions by default, which could take a long time)
  • I'm not sure how I can add / attach a recording back to a sorting analyzer. I created a sorting analyzer with a sorting and a recording and computed some extensions. Later I tried to load the sorting analyzer and compute more extensions, but when I loaded it with load_sorting_analyzer and tried to compute, I got an error saying that a recording is required and the sorting analyzer doesn't have one. I can load the recording independently, but I don't know how to associate it with the loaded sorting analyzer again. Would I 'recreate' the sorting analyzer with create_sorting_analyzer in that case? (A minimal sketch of this workflow follows the list.)
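
A minimal sketch of the workflow in the last point (folder name is a placeholder; `sorting` and `recording` are the objects I used to build the analyzer):

```python
import spikeinterface.full as si

# first session: create the analyzer from a sorting + recording and compute some extensions
analyzer = si.create_sorting_analyzer(
    sorting, recording, format="binary_folder", folder="analyzer_folder"
)
analyzer.compute(["random_spikes", "waveforms", "templates"])

# later session: reload the analyzer from disk
analyzer = si.load_sorting_analyzer("analyzer_folder")

# computing a recording-dependent extension now fails for me, because the loaded
# analyzer has no recording attached and I don't see how to re-attach one
analyzer.compute("spike_amplitudes")
```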
@h-mayorquin h-mayorquin added the core Changes to core module label Jun 14, 2024
@h-mayorquin
Collaborator

Just a comment, there is documentation for the extension dependency graph but it might not be easily discoverable:

https://spikeinterface.readthedocs.io/en/latest/modules/postprocessing.html#extensions-as-analyzerextensions

@khl02007
Contributor Author

@h-mayorquin I see, thanks, should have read that

@zm711
Collaborator

zm711 commented Jun 17, 2024

Another idea is for the user to just be able to compute the quality metrics from the sorting analyzer, and the sorting analyzer to silently compute the extensions if they don't exist. I prefer this, as I don't really want to think about what kind of extensions are required for me to get my metrics.

I wonder about this. I too like this idea to some extent, especially since a chunk of users really only want to use defaults for everything and don't care about the fine-tuning...

But I think the idea for the analyzer was that it would force the user to make these choices for two reasons: 1) tunability, so that it is easy to optimize and reproduce, and 2) education, so the user can see how their choices affect quality.

Could I ask what your specific use-case is? You just want to use defaults?

I'm not sure how I can add / attach a recording back to a sorting analyzer. I created a sorting analyzer with a sorting and a recording, and computed some extensions. Then later I tried to load the sorting analyzer and compute more extensions. But when I loaded the sorting analyzer with load_sorting_analyzer and tried to compute the extensions, I got an error saying that it requires a recording and the sorting analyzer doesn't have one. I can load the recording independently, but I don't know how it can be associated with the loaded sorting analyzer again. Would I 'recreate' the sorting analyzer again with create_sorting_analyzer again in that case?

This will be solved with #2785. :) Just not merged yet.

Some of the extensions depend on other extensions. For example, computing PC metrics requires the principal components extension. But the parameters passed may differ. For example, principal_components extension could be computed with n_components=5, but a PC metric that depends on it may ask for n_components=10. In that case it seems like the user should get at least a warning, or an error message saying that the parameters are not compatible.

This is interesting. What is the actual example you're thinking about?

It seems that every time you run sorting_analyzer.compute, you recompute all the extensions. I think that it would be better if it first checks if the extension exists with the same parameters and not re-do it if possible. Maybe there could be a recompute=True parameter that you can pass.

This shouldn't happen. Could you double-check whether this is happening within the context of the dependency graph?
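
For debugging, a quick way to see what is already stored on the analyzer and with which parameters (a sketch using has_extension / get_extension on an existing SortingAnalyzer called `analyzer`):

```python
# list what has already been computed, with its parameters
for name in ["random_spikes", "waveforms", "templates", "principal_components"]:
    if analyzer.has_extension(name):
        print(name, analyzer.get_extension(name).params)
    else:
        print(name, "not computed")
```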

@samuelgarcia
Member

Hi Kyu,
thank you for the feedback.

  • It seems that every time you run sorting_analyzer.compute, you recompute all the extensions. I think that it would be better if it first checks if the extension exists with the same parameters and not re-do it if possible. Maybe there could be a recompute=True parameter that you can pass.

Normally, compute launches the computation only on the targeted list or dict. This is strange.
"compute" is more or less "recompute" when it already exists.

  • Some of the extensions depend on other extensions. For example, computing PC metrics requires the principal components extension. But the parameters passed may differ. For example, principal_components extension could be computed with n_components=5, but a PC metric that depends on it may ask for n_components=10. In that case it seems like the user should get at least a warning, or an error message saying that the parameters are not compatible.

Having several instances of an extension with different parameters would be extremely complicated and also error prone, I think.
For this specific case I would recommend having several analyzers with several parameter sets.
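
For example, something along these lines (just a sketch; folder names are placeholders):

```python
import spikeinterface.full as si

# one analyzer per parameter set, rather than one extension with several parameter sets
analyzer_5pc = si.create_sorting_analyzer(
    sorting, recording, format="binary_folder", folder="analyzer_5pc"
)
analyzer_5pc.compute(["random_spikes", "waveforms", "templates"])
analyzer_5pc.compute("principal_components", n_components=5)

analyzer_10pc = si.create_sorting_analyzer(
    sorting, recording, format="binary_folder", folder="analyzer_10pc"
)
analyzer_10pc.compute(["random_spikes", "waveforms", "templates"])
analyzer_10pc.compute("principal_components", n_components=10)
```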

  • Another idea is for the user to just be able to compute the quality metrics from the sorting analyzer, and the sorting analyzer to silently compute the extensions if they don't exist. I prefer this, as I don't really want to think about what kind of extensions are required for me to get my metrics.

I think I would prefer not.
I am slowly trying to remove all the magic happening under the hood in favor of more explicit steps in spikeinterface.

  • I feel that si.load_sorting_analyzer should have load_extensions=False by default (rather than load all the extensions by default, which could take a long time)

I agree with you, but Alessio convinced me that most end users would prefer to load everything at once.
Loading only a few extensions is a niche optimization, which can be achieved with load_extensions=False.
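
i.e. something like (a sketch, assuming get_extension pulls a saved extension from disk on demand):

```python
import spikeinterface.full as si

# load only the analyzer itself, not every saved extension
analyzer = si.load_sorting_analyzer("analyzer_folder", load_extensions=False)

# pull just the extension you actually need
qm = analyzer.get_extension("quality_metrics")
print(qm.get_data())
```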

@khl02007
Contributor Author

khl02007 commented Jun 17, 2024

Thanks @zm711 and @samuelgarcia

  • If you guys would like to make every parameter explicit for every processing step in general, I think that's fine and I see value in doing so. In that case, better documentation may help.
  • I meant that if you run e.g. sorting_analyzer.compute('waveforms') twice, it will run the processing both times even if it has already been done with the same parameters the first time. Seems like it would be better to avoid that. I'm working with a long recording and accidentally did this, which led to some wasted time.
  • For the clash between extensions issue, what I had in mind was principal_components vs. quality_metrics, specifically the PC-based metrics like nearest neighbor isolation / noise overlap. In this case, the user may give different values for the same parameter (e.g. n_components). I subsequently learned that these metrics cannot be computed anyway (they are disabled because they take too long to compute?), but I think it can still happen elsewhere. Maybe you guys have already thought about this and designed the propagation of the parameters such that there is no duplication. (A hypothetical example is sketched after this list.)
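
The kind of clash I had in mind, as a hypothetical sketch (whether and how n_components is actually forwarded to the nn metrics, and the metric_params name, are assumptions on my part, just to illustrate the potential mismatch):

```python
# principal components computed with 5 components
analyzer.compute("principal_components", n_components=5)

# a PC-based metric later asks for a different number of components
# (the metric_params name and the n_components forwarding are hypothetical)
analyzer.compute(
    "quality_metrics",
    metric_names=["nn_isolation"],
    metric_params={"nn_isolation": {"n_components": 10}},
)
```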

Feel free to close this issue once #2785 is merged.

@zm711
Collaborator

zm711 commented Jun 17, 2024

I meant that if you run e.g. sorting_analyzer.compute('waveforms') twice, it will run the processing both times even if it has already been done with the same parameters the first time. Seems like it would be better to avoid that. I'm working with a long recording and accidentally did this, which led to some wasted time.

The assumption here is that if you run it again, it is because you have changed something and want to re-run it. But I agree, I have made this mistake many times accidentally and it is frustrating. Maybe we could add a check such that if the params are exactly the same then we don't re-run?
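
Until something like that lands in compute itself, a user-side guard could look like this (a sketch; compute_if_needed is just a hypothetical helper name):

```python
def compute_if_needed(analyzer, extension_name, **params):
    # skip the computation if the extension already exists with identical params
    if analyzer.has_extension(extension_name):
        existing = analyzer.get_extension(extension_name).params
        if all(existing.get(k) == v for k, v in params.items()):
            return analyzer.get_extension(extension_name)
    return analyzer.compute(extension_name, **params)

# the second call with the same params would then be a no-op
compute_if_needed(analyzer, "waveforms", ms_before=1.0, ms_after=2.0)
compute_if_needed(analyzer, "waveforms", ms_before=1.0, ms_after=2.0)
```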

I subsequently learned that these metrics cannot be computed anyway (they are disabled because they take too long to compute?)

I think that is just for the defaults... If you look at the code, if you force it by passing metric_names = [nn...], then even though the PC metrics are skipped by default, the fact that it is listed as a desired metric should include it, based on my reading. But I also agree it is a little annoying that we remove something from the defaults. Not sure of the thought process, i.e., was it for testing?

@alejoe91
Member

We removed nn_isolation and nn_overlap because they are extremely slow on high-density probes (e.g. Neuropixels). Maybe we can make this behavior dependent on the number of channels/units? Of course, one can always force the metric_names.
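
i.e. something like (a sketch, assuming the required extensions such as principal_components have already been computed):

```python
# the nn metrics are excluded from the defaults but can still be requested explicitly
analyzer.compute(
    "quality_metrics",
    metric_names=["snr", "isi_violation", "nn_isolation", "nn_noise_overlap"],
)
```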
