Allow segmentations to reference multiple images #199

Open · wants to merge 14 commits into base: master

Conversation

@hackermd (Collaborator)

Applying the logic of #193 to Segmentation instances.

@hackermd hackermd added the bug Something isn't working label Aug 22, 2022
@hackermd hackermd requested a review from CPBridge August 22, 2022 16:31
@hackermd hackermd self-assigned this Aug 22, 2022
@CPBridge (Collaborator) left a comment:

I'm not so sure about this... There are numerous issues (which are why we did not go down this path before...):

The intended semantics of this are unclear to me and are very different from those of the case of single frame source images, which creates further confusion. Unless there are user-provided plane positions, in the single frame case each SOP Instance in source_images corresponds to one 2D slice of the pixel_array, and in the multiframe case, each frame of the single input SOP instance corresponds to one 2D slice of the pixel_array. Is any framewise correspondence implied by the passing of further multiframe instances?

If so, i.e. framewise correspondence is implied between the pixel array and all source images, then it is also implied among the source images themselves. In this case there need to be significantly more checks to ensure that this is valid (all source images have the same frame of reference UID, number of frames, corresponding spatial locations, etc.). Also, you would probably want to make sure that this correspondence is recorded for all the subsequent source images, beyond just the first, in the DerivationImageSequence/SourceImageSequence as part of the per-frame functional groups. Currently the only effect of your change (as far as I can tell) is to add the second and subsequent source images to the SourceImageSequence attribute at the root of the dataset. Note that there is an important difference here because there is no way to record that a single slice of the input pixel array is derived from multiple single-frame inputs, even though this is in principle allowed within the standard. So we have a major difference between how single-frame and multi-frame inputs are treated.

If framewise correspondences are not implied, then the first image passed to source_images is significantly more important than the rest, as it defines the frame of reference and the locations of individual slices of the input segmentation. This feels weird to me (one would expect all items in a list parameter to be equally important) and again this is very different from the single-frame case. At the very least this would need to be very carefully documented. In this case, I guess your intention is purely that the subsequent source images are recorded in the top-level SourceImageSequence. If so, I would suggest that you create a new "additional_source_images" parameter to allow this in a less confusing way, and handle single-frame source images also.

@hackermd (Collaborator, Author)

Thanks for your feedback @CPBridge.

The use case I am trying to support is the following: We perform segmentation using more than one input image channels (grayscale images). These may be different optical paths in case of VL Whole Slide Microscopy Image instances or different sequences in case of an Enhanced MR Image instance.

> The intended semantics of this are unclear to me and are very different from those of the case of single frame source images, which creates further confusion. Unless there are user-provided plane positions, in the single frame case each SOP Instance in source_images corresponds to one 2D slice of the pixel_array, and in the multiframe case, each frame of the single input SOP instance corresponds to one 2D slice of the pixel_array. Is any framewise correspondence implied by the passing of further multiframe instances?

> If so, i.e. framewise correspondence is implied between the pixel array and all source images, then it is also implied among the source images themselves. In this case there need to be significantly more checks to ensure that this is valid (all source images have the same frame of reference UID, number of frames, corresponding spatial locations, etc.).

Additional multi-frame instances would currently be implied to have the same organization (pixel spacing, image orientation, and plane positions) as the first multi-frame image in source_images. I agree that we should check that assumption.
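A rough sketch of the kind of consistency check under discussion (illustrative only, not highdicom's implementation; `SimpleNamespace` objects stand in for pydicom Datasets, and the attribute names follow standard DICOM keywords):

```python
from types import SimpleNamespace

def check_multiframe_consistency(source_images):
    """Raise if multi-frame source images differ in basic organization.

    Illustrative sketch only; not highdicom's implementation.
    """
    first = source_images[0]
    checked = ('FrameOfReferenceUID', 'NumberOfFrames', 'Rows', 'Columns')
    for img in source_images[1:]:
        for attr in checked:
            if getattr(img, attr, None) != getattr(first, attr, None):
                raise ValueError(f'Source images differ in "{attr}".')

# Two mock multi-frame images with matching organization
a = SimpleNamespace(FrameOfReferenceUID='1.2.3', NumberOfFrames=25,
                    Rows=512, Columns=512)
b = SimpleNamespace(FrameOfReferenceUID='1.2.3', NumberOfFrames=25,
                    Rows=512, Columns=512)
check_multiframe_consistency([a, b])  # consistent: passes

b.NumberOfFrames = 30  # now inconsistent; the check would raise ValueError
```

A full implementation would also compare pixel spacing and per-frame plane positions, as discussed above.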

> Also, you would probably want to make sure that this correspondence is recorded for all the subsequent source images, beyond just the first, in the DerivationImageSequence/SourceImageSequence as part of the per-frame functional groups. Currently the only effect of your change (as far as I can tell) is to add the second and subsequent source images to the SourceImageSequence attribute at the root of the dataset.

Good point! Yes, we will need to include references at the frame level, too.

> Note that there is an important difference here because there is no way to record that a single slice of the input pixel array is derived from multiple single-frame inputs, even though this is in principle allowed within the standard. So we have a major difference between how single-frame and multi-frame inputs are treated.

I defer to your judgement whether it would be useful to extend this logic to single-frame inputs. I can imagine use cases where one performs segmentation using slices from two series of single-frame image instances (e.g., T1- and T2-weighted slices from two series of MR Image instances).

> If framewise correspondences are not implied, then the first image passed to source_images is significantly more important than the rest, as it defines the frame of reference and the locations of individual slices of the input segmentation. This feels weird to me (one would expect all items in a list parameter to be equally important) and again this is very different from the single-frame case.

I agree, we should make that assumption explicit, perform the necessary checks, and document the behavior.

@hackermd (Collaborator, Author)

@CPBridge we already perform several checks (see here) on source_images. We should probably check additional attributes (Image Orientation, Pixel Spacing, etc.). We could also perform a frame-level comparison to ensure that the first source image is representative of all source images, but that may get computationally expensive, and we may just want to document the assumption instead. From a performance perspective, we should probably make that check here, but it would reduce cohesion.

@hackermd (Collaborator, Author)

@CPBridge I added checks on source_images (15a7169) and included references for all source image frames (ead8a30). I also added a unit test to ensure that works properly (fc5b2d1).

@hackermd (Collaborator, Author)

As you pointed out earlier, the behavior is now different for single-frame and multi-frame images. I suggest we also support multiple series of single-frame image instances. To this end, I have updated the logic on how frames of single-frame image instances are referenced (194b1d3). Trying to provide single-frame image instances from multiple series will currently still fail, because we assert that Series Instance UID is unique across images. However, I think we could now relax that. What do you think?

@CPBridge (Collaborator)

Hi @hackermd, sorry: I have only just noticed that this PR was waiting on my input...

This is looking better and I think the behaviour for multiframe images makes sense, although it still needs to be documented in the docstring. Furthermore, do you expect multiple multi-frame source instances to come from the same series or different series? I would expect them to come from different series (the whole point of multiframe images being to get rid of series of multiple objects). Currently the constructor will fail if source images are not from the same series, making this change applicable only in quite unusual situations of series of multiframe instances.

> As you pointed out earlier, the behavior is now different for single-frame and multi-frame images. I suggest we also support multiple series of single-frame image instances. To this end, I have updated the logic on how frames of single-frame image instances are referenced (194b1d3). Trying to provide single-frame image instances from multiple series will currently still fail, because we assert that Series Instance UID is unique across images. However, I think we could now relax that. What do you think?

I support this direction (I have worked on several situations in radiology where a segmentation was derived from multiple input series) but unfortunately what you have done so far is too simple. I imagine that the "multiple series of single-frame instances" case would work as follows. There would be a segmentation pixel array of n frames (i.e. shape (n, h, w)), which has been derived from m different input series, each of which must consist of n single-frame instances. This gives rise to a set of constraints on the input images that we are not currently checking for, i.e. that the number of source images is a positive integer multiple of the number of slices in the segmentation pixel array (m x n), and that these are grouped correctly such that there are m unique series instance UIDs, each with exactly n instances. If we just used what is currently implemented and removed the series UID check, then we have a free-for-all where the above may not apply at all, and the derivation will only be recorded for the first n items of source_images, and the rest will be ignored.
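The grouping constraint described above can be sketched in a few lines (pure Python; the Series Instance UID strings stand in for the attribute that would be read from pydicom Datasets):

```python
from collections import Counter

def check_source_grouping(series_uids, n_slices):
    """Check that a flat list of single-frame source images can be grouped
    into m complete series of exactly n_slices instances each.

    `series_uids` lists the Series Instance UID of each source image.
    Returns m, the number of input series. Illustrative sketch only.
    """
    if n_slices == 0 or len(series_uids) % n_slices != 0:
        raise ValueError(
            'Number of source images must be a positive integer multiple '
            'of the number of slices in the segmentation pixel array.'
        )
    m = len(series_uids) // n_slices
    counts = Counter(series_uids)
    if len(counts) != m or any(c != n_slices for c in counts.values()):
        raise ValueError(
            f'Expected {m} series with exactly {n_slices} instances each.'
        )
    return m

# Two series (say T1 and T2) of 3 instances each, for a 3-slice segmentation
uids = ['1.2.3.T1'] * 3 + ['1.2.3.T2'] * 3
check_source_grouping(uids, n_slices=3)  # returns m = 2
```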

Then we come onto the related problem (again for the "multiple series of single-frame instances" case) of how to group the source instances in order to figure out the correspondence between the source images and the slices of the segmentation pixel array. Currently, it is assumed that source_images[i] is the (single) source image for slice pixel_array[i], but this no longer works when there can be multiple input series. I see two options here:

  • The user continues to pass a flat list of instances, as currently, which can now have a length of any positive integer multiple of n. Items 0 through m-1 are assumed to correspond to pixel_array[0], items m through 2m-1 to pixel_array[1], and so on. Alternatively, you could do this the other way around, with items 0, n, 2n corresponding to pixel_array[0], and items 1, n+1, 2n+1 corresponding to pixel_array[1], etc. There would also need to be some basic checks on patient position (maybe with a tolerance) to enforce this. I think this is probably my preferred option as it is the simplest generalization of the current API, though it is potentially a little confusing...
  • We allow source_images to be either a Sequence[Dataset] (for backwards compatibility) or a Sequence[Sequence[Dataset]], with each inner sequence containing an entire series. Each inner Sequence would have to have the same length.
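The two orderings described in the first option amount to different index arithmetic. A sketch, where n is the number of slices and m the number of series (names are illustrative, not highdicom API):

```python
def source_index_blocked(slice_idx, series_idx, m):
    """Blocked ordering: the m sources for each slice are contiguous,
    so items 0..m-1 correspond to pixel_array[0], and so on."""
    return slice_idx * m + series_idx

def source_index_strided(slice_idx, series_idx, n):
    """Strided ordering: each series is contiguous, so items
    0, n, 2n, ... correspond to pixel_array[0], and so on."""
    return series_idx * n + slice_idx

# With n = 3 slices and m = 2 series:
assert source_index_blocked(0, 1, m=2) == 1  # 2nd source of slice 0
assert source_index_strided(0, 1, n=3) == 3  # slice 0 of the 2nd series
```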

Neither option is great, but some compromise will be needed somewhere if this is going to work. What do you think?

@hackermd (Collaborator, Author)

> Do you expect multiple multi-frame source instances to come from the same series or different series? I would expect them to come from different series (the whole point of multiframe images being to get rid of series of multiple objects). Currently the constructor will fail if source images are not from the same series, making this change applicable only in quite unusual situations of series of multiframe instances.

Good catch! The images may come from different series and we will need to be able to support that use case. Addressed via 88d03c3

> I support this direction (I have worked on several situations in radiology where a segmentation was derived from multiple input series) but unfortunately what you have done so far is too simple. I imagine that the "multiple series of single-frame instances" case would work as follows. There would be a segmentation pixel array of n frames (i.e. shape (n, h, w)), which has been derived from m different input series, each of which must consist of n single-frame instances. This gives rise to a set of constraints on the input images that we are not currently checking for, i.e. that the number of source images is a positive integer multiple of the number of slices in the segmentation pixel array (m x n), and that these are grouped correctly such that there are m unique series instance UIDs, each with exactly n instances. If we just used what is currently implemented and removed the series UID check, then we have a free-for-all where the above may not apply at all, and the derivation will only be recorded for the first n items of source_images, and the rest will be ignored.

Agreed, we should handle this use case more explicitly and put checks in place to ensure assumptions are satisfied. For now, I have added checks to ensure all single-frame images are from the same series (88d03c3).

> We allow source_images to be either a Sequence[Dataset] (for backwards compatibility) or a Sequence[Sequence[Dataset]], with each inner sequence containing an entire series. Each inner Sequence would have to have the same length.

I find this option more intuitive and would thus prefer this approach. However, I defer to your judgement, since you will be dealing with this use case more frequently. In fact, I would prefer if you'd implement that change, given that you have more experience in handling series of single-frame image instances.

src/highdicom/seg/sop.py: two review threads (outdated, resolved)
Comment on lines 379 to 384
if is_multiframe and not is_tiled:
raise ValueError(
'If source images are multiple-frame images that are '
'not tiled, then only a single source image from a single '
'series must be provided.'
)
Collaborator:

I really don't understand why there is different behaviour for tiled versus non-tiled images. Can you please explain?

Collaborator (Author):

My thinking goes as follows: If a multi-frame image is not tiled, then it likely represents an Enhanced CT/MR/PET Image and I wasn't sure how that should be handled. Would we need to check whether the number of frames in the image (or in each of the images, if multiple should be allowed) matches the first dimension of pixel_array?
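The check being asked about could be as simple as comparing each source image's frame count against the first dimension of the pixel array (a sketch, not highdicom's implementation; `SimpleNamespace` objects stand in for pydicom Datasets):

```python
from types import SimpleNamespace

def check_frame_counts(source_images, n_slices):
    """Raise if any multi-frame source image's frame count does not match
    the number of slices in the segmentation pixel array (sketch only)."""
    for img in source_images:
        if int(img.NumberOfFrames) != n_slices:
            raise ValueError(
                f'Source image has {img.NumberOfFrames} frames, '
                f'but the pixel array has {n_slices} slices.'
            )

# Two mock enhanced multi-frame images, 25 frames each
imgs = [SimpleNamespace(NumberOfFrames=25), SimpleNamespace(NumberOfFrames=25)]
check_frame_counts(imgs, n_slices=25)  # consistent: passes
```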

Collaborator (Author):

You are probably right and we can just handle tiled and non-tiled images the same way.

Collaborator (Author):

Addressed via a2c0d1d

Collaborator (Author):

For consistency, we should now also support source_images: Sequence[Sequence[pydicom.Dataset]] to allow users to specify multiple series of single-frame images. However, that will require further thought and work.

src/highdicom/seg/sop.py: review thread (outdated, resolved)
Old: from which the segmentation was derived
New: from which the segmentation was derived. The images must have the
same dimensions (rows, columns) and orientation, have the same frame
of reference, and contain the same number of frames.
Member:

I believe in the general case, it is not required that the geometry of the images being segmented is consistent with the geometry of the segmentation. I was not sure if this constraint is here to simplify the implementation, which would make a lot of sense, or for some other reason.

Collaborator:

To clarify, this means that the source images must have the same "(rows, columns) and orientation, have the same frame of reference, and contain the same number of frames" as each other. We do handle the case where this differs from the geometry of the segmentation (through the plane_positions parameter).

Member:

I see. I don't think that is required either by the standard (other than the same FoR).

Collaborator (Author):

The standard may not place any constraints on the referenced source images. However, in practice the constraints should generally apply. For example, all instances of a series of CT images generally have the same size and orientation.

If this should ever become a problem in practice, we can further relax the constraints. However, this would further complicate mapping segmentation frames to source image frames.

@fedorov (Member) — Oct 12, 2022:

This all makes sense. I want to make sure it is clear that I am not pushing for those features, I just wanted to raise awareness.

Those are not hypothetical situations, however. In practice, it is common that different series within a single MR study (prostate and brain being a few examples) have varying resolution/orientation.

Collaborator (Author):

Thanks @fedorov. That's a good point. In the short term, I think it's unlikely that we will be able to support such use cases. However, we should consider them for the API design.

Collaborator:

Yes, as we open ourselves up to having multiple source series, this becomes relevant as different series may have different values for these attributes. Previously this test was there since we were assuming that they were consistent within a series.

I think at the very least we would still want these checks:

  • Within a series, and
  • Within all images if the user has not provided plane positions for the segmentation
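These two levels of checking might be sketched as follows (illustrative only; plain dicts stand in for pydicom Datasets, and the attribute set is abbreviated):

```python
from itertools import groupby

def check_geometry(images, require_across_all):
    """Check spatial attributes are consistent within each series and,
    optionally, across all source images (illustrative sketch)."""
    key = lambda img: img['SeriesInstanceUID']
    groups = [list(g) for _, g in groupby(sorted(images, key=key), key=key)]
    attrs = ('Rows', 'Columns', 'ImageOrientationPatient')
    # Always check within each series; check across all images only when
    # the user has not provided plane positions for the segmentation.
    scopes = groups + ([list(images)] if require_across_all else [])
    for scope in scopes:
        first = scope[0]
        for img in scope[1:]:
            for a in attrs:
                if img[a] != first[a]:
                    raise ValueError(f'Inconsistent "{a}" among source images.')

# Two mock series with internally consistent but mutually different geometry
t1 = [{'SeriesInstanceUID': '1.T1', 'Rows': 256, 'Columns': 256,
       'ImageOrientationPatient': [1, 0, 0, 0, 1, 0]}] * 2
t2 = [{'SeriesInstanceUID': '1.T2', 'Rows': 128, 'Columns': 128,
       'ImageOrientationPatient': [1, 0, 0, 0, 1, 0]}] * 2
check_geometry(t1 + t2, require_across_all=False)  # passes: each series is consistent
# check_geometry(t1 + t2, require_across_all=True) would raise: Rows differ
```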

@CPBridge (Collaborator) left a comment:

A couple of specific issues, which can be easily fixed. Also, as discussed in person, I think we should shift to requiring each series to be within a sub-Sequence even if the series is just a single multiframe instance:

Segmentation(
    ...
    source_images=[[multiframe1], [multiframe2], [multiframe3]],
    ...,
)

This is going to make grouping things into series much easier when I later add logic to allow multiple single-frame series, which could be passed along with multi-frame instances. It also makes the intent a bit clearer, in my opinion.
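Accepting both the old flat form and the new nested form could be handled by a small normalization step. A sketch of one possible approach (not highdicom's implementation; strings stand in for pydicom Datasets):

```python
def normalize_source_images(source_images):
    """Normalize source_images to a list of series (a list of lists).

    A flat Sequence[Dataset] is treated as a single group for backwards
    compatibility; a Sequence[Sequence[Dataset]] is passed through.
    Illustrative sketch only.
    """
    if len(source_images) == 0:
        raise ValueError('At least one source image is required.')
    if all(isinstance(item, (list, tuple)) for item in source_images):
        return [list(series) for series in source_images]
    return [list(source_images)]

# Nested form: three series, each a single multi-frame instance
nested = normalize_source_images([['mf1'], ['mf2'], ['mf3']])
# Flat form (backwards compatible): one series of single-frame instances
flat = normalize_source_images(['sf1', 'sf2', 'sf3'])
```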

src/highdicom/seg/sop.py: two review threads (outdated, resolved)