[WIP] Adaptive patch2self #2401
base: master
Conversation
Hello @captainnova, thank you for updating! Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated at 2021-09-17 21:24:29 UTC
I forgot to mention that another reason to drop patch_radius is that I added masking, which gives a big speed boost. Altogether, with a few-fold speedup from masking and a couple of dozen more regressors, adaptive_patch2self is only several times slower than patch2self.
@captainnova: I only briefly looked at the code and realized that it could be easily integrated within Patch2Self as an option. We can of course keep the workflow separate -- which looks good to me. I also have some new work on speeding up Patch2Self which would work with your modification too :) I will try the code out ASAP (especially with the data that you have). Can you also add a small example of how I can run the code and reproduce your results, so that we are on the same page? It will help with reviewing the PR too 👍🏽 As a side question: do you get the same result with and without masking? If yes -- that is quite neat!
Hi Shreyas, thanks for the kind words.
That would make sense to me too, and was my original intent, but when the urge to get rid of patch_radius grew too strong I ended up splitting it into a new command, because I didn't want to "break" p2s. As the original author of p2s, though, you have more freedom in choosing what changes to introduce.
I haven't compared different implementations of the regressions or decompositions for speed. Using DGESVD for PCA and sklearn for ICA is a little weird, since sklearn also comes with PCA, and has a simpler/more consistent interface. But I started with PCA, and copied the example in denoise_lpca. When I added ICA I wondered about switching PCA to sklearn, but for all I know dipy has a reason to use DGESVD instead.
Running it is really easy, and very similar to p2s. I added a dipy_denoise_adaptive_patch2self CLI (which I'm not sure I've used, to be honest), but here is the python gist:

```python
import dipy.denoise.adaptive_patch2self as ap2s
from dipy.io.image import load_nifti, save_nifti
import numpy as np

data, affine = load_nifti('raw.nii')
bvals = np.loadtxt('raw.bval')
# I use https://github.com/captainnova/dmri_segmenter to make these
mask, _ = load_nifti('raw_TIV.nii')
dnarr = ap2s.adaptive_patch2self(data, bvals, mask=mask,
                                 out_dtype=np.float32, verbose=True)
save_nifti('dnlinica.nii', dnarr, affine)
```

Properly evaluating the results takes more code, though. It is especially important to look at how the residuals change w.r.t. volume, not just the spatial pattern. Unfortunately the unwanted correlations do not show up very well in either movies of the residuals or time series of individual voxels. They do show up very well, and can be quantified, in maps of Pearson's r vs. signal, but that takes some code. What is the recommended way to share code that is used to investigate how a feature works, but is not needed by the feature itself?
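For concreteness, a map of Pearson's r between residuals and signal can be computed per voxel across volumes along these lines. This is a hedged sketch (the helper name `residual_corr_map` is ours, not part of the PR), assuming `data` and `dnarr` from the gist above:

```python
import numpy as np

def residual_corr_map(data, denoised, mask=None):
    """Per-voxel Pearson's r between residuals (data - denoised) and the
    signal, computed along the last (volume) axis."""
    resid = data - denoised
    d = data - data.mean(axis=-1, keepdims=True)    # centered signal
    r = resid - resid.mean(axis=-1, keepdims=True)  # centered residuals
    num = (d * r).sum(axis=-1)
    den = np.sqrt((d**2).sum(axis=-1) * (r**2).sum(axis=-1))
    with np.errstate(invalid='ignore', divide='ignore'):
        rmap = np.where(den > 0, num / den, 0.0)
    if mask is not None:
        rmap *= mask > 0
    return rmap
```

A large |r| where there is structure indicates the residuals still contain signal, which is the unwanted correlation being discussed.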
Almost - I think the result is slightly better with masking. Without masking, the first component is "is it air?", which is not very interesting for denoising. With masking that component goes away and the others get promoted by one. But with 8-12 component axes being used in total, this does not make a huge difference. It would probably make a bigger difference for the original p2s: without masking it is mostly learning the difference between air and not-air, but with masking it would effectively be one component ahead of where it is now. You could simulate this by running ap2s with and without masking, with n_comps=0.
Hi @captainnova,
Thank you for this new feature!
I would like to know what is the next step concerning this PR @ShreyasFadnavis and @captainnova.
+1 for integrating within Patch2Self as an option
Could you clarify whether this PR is still a work in progress?
If it is not, could you add a short tutorial? It would make it easier to test.
It would be good if you could improve the consistency of your docstrings in general. Sometimes they are really good, and sometimes nonexistent. See below for a really quick review (code style only) until I know more about the status.
Thank you!
```python
def site_weight_beam(U, pca_ind, sign, sigma, growth_func):
    """Return weights that increase going along the sign direction of
    U[:, pca_ind], and fall off Gaussianly going away from that axis.
    """
```
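For illustration, a minimal standalone sketch of a function matching that docstring. The beam geometry (Gaussian falloff on the other components, `growth_func` applied along the axis) is our reading of the docstring, not the PR's implementation:

```python
import numpy as np

def site_weight_beam(U, pca_ind, sign, sigma, growth_func):
    """Weights that grow along sign * U[:, pca_ind] and fall off
    Gaussianly with squared distance from that component axis.
    """
    along = sign * U[:, pca_ind]              # position along the beam axis
    off_axis = np.delete(U, pca_ind, axis=1)  # all other components
    dist2 = (off_axis**2).sum(axis=1)         # squared distance from the axis
    # growth_func (e.g. np.arctan) maps along-axis position to a gain,
    # which may be signed depending on the function chosen
    return growth_func(along) * np.exp(-0.5 * dist2 / sigma**2)
```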
Can you update the docstring by adding Parameters and Returns descriptions? Parameters like growth_func need to take only one argument, and this information is important.
agreed
```python
    return site_weight_beam(U, pca_ind, sign, sigma, np.arctan)


def getRandomState(seed):
```
This is C++ style; can you make the name more Pythonic?
I have no objection to underscored_function_names. @ShreyasFadnavis ?
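A sketch of what the snake_case rename might look like (the helper name and the pass-through for an existing RandomState are our choices, not the PR's code):

```python
import numpy as np

def get_random_state(seed=None):
    """Return a numpy RandomState, accepting None, an int, or an
    existing RandomState (which is returned unchanged)."""
    if isinstance(seed, np.random.RandomState):
        return seed
    return np.random.RandomState(seed)
```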
```python
    return rng


def doSVD(X, n_comps=None):
```
Same comment as above: this is C++ style; can you make the name more Pythonic?
```python
    The principal components in feature space as orthonormal row vectors,
    matching the order of S.
    """
    U, S, Vt = svd(X, *svd_args)[:3]
```
It is strange to have svd_args as a global; we might need to find another alternative.
I agree, but I copied this from localpca.py thinking it was dipy's preferred way of doing PCA/SVD. If there's no performance benefit to doing it that way, using sklearn.decomposition would provide a better interface.
So, good catch! 😄 We need to change this in localpca.py also.
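To illustrate the interface difference under discussion: both routes below compute the same decomposition (components agree up to sign), and which is faster depends on the LAPACK driver and the matrix shape. This comparison is our sketch, not dipy code:

```python
import numpy as np
from scipy.linalg import svd
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 8))
Xc = X - X.mean(axis=0)  # center, as PCA does internally

# route 1: raw SVD, as in localpca.py (lapack_driver selects e.g. DGESVD)
U, S, Vt = svd(Xc, full_matrices=False)

# route 2: sklearn's higher-level interface
pca = PCA(n_components=3).fit(Xc)

# the top components agree up to an overall sign per component
for i in range(3):
    assert np.allclose(np.abs(pca.components_[i]), np.abs(Vt[i]))
```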
```python
    return U, S, Vt


def calcSVDU(X, n_comps=None):
```
Same comments as above:
- C++ style; can you make the name more Pythonic?
- Why a new function, when doSVD could take an argument?
```python
    mod_kwargs : dict
        Any keyword arguments to pass to the models.

    site_weight_func : function(U, pca_ind, sign, sigma) returning array
```
Add `, optional` to this parameter description.
```python
    site_weight_func : function(U, pca_ind, sign, sigma) returning array
        How to weight the samples around each PC's sites.

    site_placer : function(data, n_comps) -> (data.shape[0], n_comps) array
```
Add `, optional` to this parameter description.
```python
        if not n_comps:
            # Assume the number of voxel types that need distinct regression
            # coefficients is a slowly growing function of both the number of
            # voxels and the number of volumes.
            self.n_comps = int(np.ceil(min(data.shape)**0.5))
        else:
            self.n_comps = n_comps
```
Let's be more pythonic:

```python
self.n_comps = n_comps or int(np.ceil(min(data.shape)**0.5))
```
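For what it's worth, the `or` idiom behaves identically to the original `if not n_comps` branch: every falsy value (both None and 0) falls through to the default. A tiny standalone check (the helper name is ours, for illustration only):

```python
import numpy as np

def pick_n_comps(n_comps, data_shape):
    # falsy n_comps (None or 0) selects the slowly-growing default
    return n_comps or int(np.ceil(min(data_shape)**0.5))

assert pick_n_comps(None, (1000, 64)) == 8   # ceil(sqrt(64))
assert pick_n_comps(12, (1000, 64)) == 12    # explicit value wins
assert pick_n_comps(0, (1000, 64)) == 8      # 0 also falls through
```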
```python
    return coefs


def supply_model(model, alpha=1.0, max_iter=50, copy_X=False, mod_kwargs={}):
```
same comment as above concerning mod_kwargs
```python
        logging.info('Denoised volumes saved as %s', odenoised)


class AdaptivePatch2SelfFlow(Workflow):
```
Do we really need a new workflow? Can it be just an option of Patch2selfFlow? It would be easier to maintain.
Also, do not forget to rebase your PR, since there is currently a conflict with the codebase. Thank you!
Hi @skoudoro,
That's exactly what @ShreyasFadnavis is working on. My PR was mostly a feature addition, but because I also took a couple of things out, I implemented it as a separate command. Shreyas agreed that patch2self no longer needs the removed items either, so adaptive_patch2self and patch2self are being merged. I agree with your code style comments, but think it is best to wait for Shreyas to complete his merge, incorporate your comments, and rebase, and then see what remains to be done.
OK, great! Thank you for the information! So let's wait for Shreyas; I will tag this PR as WIP to avoid any review before this update.
+1 to what Rob said! I am testing the code and also working on integrating it with the current implementation.
Hi @captainnova, hi @ShreyasFadnavis, we plan to talk about your PR during the DIPY online meeting tomorrow (October 7th, 2021 | 1pm EST / 7pm CET / 10am PT). It would be great if you could attend this meeting to provide us a short update. More information on #2398. Thank you!
I'm looking forward to it! Rob
Hello, this will probably be most interesting to @ShreyasFadnavis .
Shreyas, I think I figured out why p2s is blurring in diffusion space, and mostly fixed it. The problem is that by training the regressor to predict an entire volume it is assuming that all of the voxels have the same "diffusion response function", i.e. it is like saying they have the same diffusion tensor. The result looks better than that, though, because the regressor is free to give more weight to similar volumes, so even though it is using a one-size-fits-all interpolation kernel, it does not have to interpolate from very far away.
My solution is to train different regressors for different types of voxels, and then for any given voxel use a weighted sum of the regressors. The "types of voxels" comes from either Independent Component Analysis or Principal Component Analysis (ICA is slightly better), and the voxel's position in component space also sets the weights for the regressors. Thus each voxel gets a regressor that is appropriate to it, preserving signal without hurting denoising by going to the extreme of giving each voxel a regressor that is exactly tuned to it.
Since the equivalent of a "patch" is now a collection of voxels with similar intensities for each volume, regardless of their spatial location, I took out patch_radius. I now think p2s really is better working only in diffusion space, and the improvement I was seeing before with larger patches was just from the regressor getting more variables to work with. Because the interface changed, I ended up making adaptive_patch2self its own command instead of trying to insert this into p2s.
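The idea in the last three paragraphs can be shown schematically: embed voxels in component space, fit one regressor per component "site" with voxel weights concentrated near that site, then blend the regressors' predictions per voxel. This is a toy sketch of the concept (softmax site weights, synthetic data), not the PR's actual code:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                            # voxels x other volumes
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=500)  # volume to denoise

n_sites = 3
U = PCA(n_components=n_sites).fit_transform(X)  # voxel coords in component space

# soft assignment of each voxel to each site (softmax over components)
W = np.exp(U - U.max(axis=1, keepdims=True))
W /= W.sum(axis=1, keepdims=True)

preds = np.empty((X.shape[0], n_sites))
for j in range(n_sites):
    # one regressor per site, trained with that site's voxel weights,
    # so it specializes to that "type" of voxel
    reg = Ridge(alpha=1.0).fit(X, y, sample_weight=W[:, j])
    preds[:, j] = reg.predict(X)

# each voxel gets a blend of regressors appropriate to it
denoised = (preds * W).sum(axis=1)
```

The key contrast with plain p2s is that a single global regressor would impose one set of coefficients (one "diffusion response") on every voxel, while the blend varies smoothly with the voxel's position in component space.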
This will probably make more sense with pictures, but I'll submit the pull request for review now and add images later instead of trying to do everything at once.
Best wishes,
Rob