Add a mapping function in image_reader.py and image_writer.py #7769

Open. staydelight wants to merge 21 commits into base: dev

Conversation


@staydelight staydelight commented May 14, 2024

Add a function to create a JSON file that maps input and output paths.

Fixes #7557.

Description

A few sentences describing the changes proposed in this pull request.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@@ -148,6 +149,25 @@ def _stack_images(image_list: list, meta_dict: dict):
    return np.stack(image_list, axis=0)


def update_json(input_file=None, output_file=None):
    record_path = "img-label.json"
Contributor

Hi @staydelight, thank you for the PR. I have a few concerns:

  1. Do we really need to make any changes to read images? It seems that we can already support reading paired data using LoadImaged.
  2. I suggest adding a flag to SaveImage to allow users to choose whether or not to write a JSON file.
  3. Can you clarify the purpose of the record_path? We can directly obtain the label path from the input of SaveImage, and for the image path, we can retrieve it from the metadata of the data (since we have introduced MetaTensor).

Let me know if you need further clarification on any of these points.

Member

Hi @staydelight thanks as well but I also have concerns similar to @KumoLiu. In the general case users are not going to want these JSON files being generated so I don't think we should add this to the readers and writers at all. This is very use-case specific so writing some custom transform or doing things in a different way would be the solution.

One other thing to mention is that it's not thread-safe since multiple parallel transforms may be reading/writing the file at the same time. In this case you also cannot rely on the ordering of LoadImage/SaveImage operations to ensure you match the right input with output.

As mentioned you can access original paths with the metadata present in the MetaTensor objects:

trans = monai.transforms.LoadImaged(keys="image")
d = trans({"image": "/path/to/file.nii.gz"})
print(d["image"].meta["filename_or_obj"])

It should be possible to access this value in the postprocessing transform sequence where SaveImage is used, since the network output should be a MetaTensor with these values included. You should be able to define a transform after SaveImage which logs these values to a file. This would be a much more modular approach than adding specific code to the loader/saver classes, so I'd strongly suggest investigating how best to go about that.
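A minimal sketch of the kind of transform described above, for illustration only. `RecordIOMapping` is a hypothetical name and not part of MONAI; it assumes each item exposes a `meta` mapping containing `filename_or_obj`, and that the saved path is available under an assumed `saved_to` key. It uses only the standard library, so it works on any MetaTensor-like object or plain metadata dict.

```python
import json
from collections.abc import Mapping


class RecordIOMapping:
    """Hypothetical post-save transform that collects input/output path pairs.

    Assumptions (not confirmed by the thread): the item carries a ``meta``
    mapping with ``filename_or_obj`` and a ``saved_to`` entry for the output.
    """

    def __init__(self, input_key="filename_or_obj", output_key="saved_to"):
        self.input_key = input_key
        self.output_key = output_key
        self.records = []

    def __call__(self, item):
        # Accept an object with a .meta attribute, or a plain metadata mapping.
        meta = getattr(item, "meta", item)
        if not isinstance(meta, Mapping):
            raise TypeError("expected a mapping or an object with a .meta mapping")
        self.records.append(
            {
                "input": str(meta.get(self.input_key, "(unknown)")),
                "output": str(meta.get(self.output_key, "(unknown)")),
            }
        )
        return item  # pass the data through unchanged

    def dump(self, path):
        # Write all collected pairs once, at the end of the run.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.records, f, indent=4)
```

Placed after the saving step in a postprocessing sequence, an object like this keeps the mapping logic out of the reader and writer classes. Collecting in memory and calling `dump` once only covers a single process; parallel workers would each hold their own list.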

Author

Hi @ericspod @KumoLiu, thank you for your advice. Following what you said, since the network output should be a MetaTensor, would it be possible to add a save_log function to SaveImage like this:

self.log_data.append({
    "input": meta_data.get("filename_or_obj", "(unknown)"),
    "output": filename
})


def save_log(self):
    try:
        with open(self.log_json_path, 'r') as f:
            existing_log_data = json.load(f)
    except FileNotFoundError:
        existing_log_data = []

    with open(self.log_json_path, 'w') as f:
        existing_log_data.extend(self.log_data)
        json.dump(existing_log_data, f, indent=4)

Contributor

It seems like a workable solution, but it involves repeatedly reading and writing JSON files. I haven't thought of a better way yet. We could add a utility function, but it might not be very effective. @ericspod, do you have any better suggestions for saving a mapping?

BTW, the save path can also be recorded in the metadata:

if self.savepath_in_metadict and meta_data is not None:
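One possible way around the repeated read-and-rewrite of the JSON file mentioned above is an append-only JSON Lines log, where each record is one line and nothing already written is read back. This is only a sketch: `append_mapping` and `read_mappings` are hypothetical helpers, not MONAI functions.

```python
import json


def append_mapping(path, input_path, output_path):
    """Append one input/output pair as a single JSON line (hypothetical helper)."""
    record = {"input": input_path, "output": output_path}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_mappings(path):
    """Read every pair back as a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The trade-off is that the file is no longer a single JSON array, so consumers need to parse it line by line. Appending avoids the read-modify-write cycle but does not by itself make concurrent writers from several processes safe.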

Member

This is similar to the change we discussed for input-output mapping file generation in another PR. I think this would be better implemented in a transform which appears after SaveImage(d) in your pipeline and handles saving to file(s). The code here can be put into that transform, keeping SaveImage(d) focused on saving the image data only. You would also have to take multiprocessing into account and write to a separate file for each subprocess; this implementation as-is contains a race condition if multiple processes attempt to save to the same file.
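The per-subprocess file idea above could look roughly like the following. This is a sketch under stated assumptions: the function names and the `mapping_<pid>.jsonl` naming scheme are hypothetical, and each process is assumed to append only to its own file, with the parts merged after all workers have finished.

```python
import glob
import json
import os


def process_log_path(log_dir, prefix="mapping"):
    """Return a log path unique to the calling process (hypothetical naming)."""
    return os.path.join(log_dir, f"{prefix}_{os.getpid()}.jsonl")


def log_pair(log_dir, input_path, output_path, prefix="mapping"):
    """Append one pair to this process's own file, so processes never share a file."""
    record = {"input": input_path, "output": output_path}
    with open(process_log_path(log_dir, prefix), "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def merge_logs(log_dir, merged_path, prefix="mapping"):
    """Combine all per-process files into one JSON array; run after workers exit."""
    pattern = os.path.join(glob.escape(log_dir), f"{prefix}_*.jsonl")
    records = []
    for part in sorted(glob.glob(pattern)):
        with open(part, encoding="utf-8") as f:
            records.extend(json.loads(line) for line in f if line.strip())
    with open(merged_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=4)
    return records
```

Because each record stores both paths, the merged result does not depend on the order in which load and save operations ran. The sketch separates processes only; threads inside one process would still share a file and need their own guard.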

staydelight and others added 21 commits June 13, 2024 23:00
Add a function to create a JSON file that maps input and output paths.

Signed-off-by: staydelight <[email protected]>
Remove changes unrelated to this issue.

Signed-off-by: staydelight <[email protected]>
Remove changes unrelated to this issue.

Signed-off-by: staydelight <[email protected]>
Remove changes unrelated to this issue.

Signed-off-by: staydelight <[email protected]>
Add code for generating a mapping json file.

Signed-off-by: staydelight <[email protected]>
Change mapping_json_path init way.

Signed-off-by: staydelight <[email protected]>
Fixing unsuccessful checks.

Signed-off-by: staydelight <[email protected]>
Fixes unseccessful ckecks. (if mapping_json_path is not None)

Signed-off-by: staydelight <[email protected]>
Development

Successfully merging this pull request may close these issues.

Update the SaveImage transform to support saving input-output mapping.
3 participants