tinymilky/TextSCF

TextSCF: LLM-Enhanced Image Registration Model

This repository hosts the official PyTorch implementation of "Spatially Covariant Image Registration with Text Prompts". TextSCF is a comprehensive library focused on weakly supervised image alignment and registration, equipped with a wide range of tools for in-depth analysis of deformation fields.

Updates

[12/18/2023] - A pretrained model weight for task 3 of the MICCAI 2021 Learn2Reg Challenge is now available. See the usage section below.

[12/06/2023] - The code for textSCF, reproducing our results for task 3 in MICCAI 2021 Learn2Reg Challenge, is now available. See datasets and usage sections below.

[11/30/2023] - We have collected a list of papers exploring the use of LLMs in AI for medicine and healthcare. (Awesome-Medical-LLMs)
[11/30/2023] - We have collected a list of papers centered on image registration models in healthcare. (Awesome-Medical-Image-Registration)

Papers

Spatially Covariant Image Registration with Text Prompts
Hang Zhang, Xiang Chen, Rongguang Wang, Renjiu Hu, Dongdong Liu, and Gaolei Li.
arXiv 2023.

Spatially Covariant Lesion Segmentation
Hang Zhang, Rongguang Wang, Jinwei Zhang, Dongdong Liu, Chao Li, and Jiahao Li.
IJCAI 2023.

Highlights

  • The deformation field generated by textSCF effectively demonstrates its ability to preserve discontinuities across different anatomical regions. As illustrated in the image below, it outlines the stomach in contrast to the adjacent regions.

  • TextSCF is designed to integrate with various architectures, including LKU-Net, LapRIN, VoxelMorph, and TransMorph, showcasing its utility in medical image registration.

Datasets

See Datasets for more details.

Usage

Run the script with the following command in folder ./src to reproduce the results:

python train_brainreg.py -d oasis_pkl -m brainTextSCFComplex -bs 1 --epochs 501 --reg_w 0.1 start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit
  • -d oasis_pkl: Dataset used, specifically 'oasis_pkl'.
  • -m brainTextSCFComplex: Model name, set to 'brainTextSCFComplex'.
  • -bs 1: Batch size, defined as 1.
  • --epochs 501: Total number of epochs for training, set to 501.
  • --reg_w 0.1: Smoothness regularization weight, specified as 0.1.
  • start_channel=64: Number of starting channels (N_s), set to 64.
  • scp_dim=2048: The dimension (C_{\phi}) for the implicit function, set to 2048.
  • diff_int=0: Diffeomorphic integration flag, '0' for not used.
  • clip_backbone=vit: CLIP backbone type, specified as 'vit' (ViT-L/14@336px).
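The command mixes dash-prefixed flags with bare key=value overrides. The sketch below shows one way such a command line could be parsed with the standard library. It is an illustration only, not the repository's parser: the long option names (--dataset, --model, --batch_size) and the function name parse_cli are assumptions, and the actual script may handle these arguments differently.

```python
import argparse


def parse_cli(argv):
    """Split argv into argparse flags and trailing key=value overrides."""
    parser = argparse.ArgumentParser()
    # Long names below are hypothetical; the README only shows -d, -m and -bs.
    parser.add_argument("-d", "--dataset", default="oasis_pkl")
    parser.add_argument("-m", "--model", default="brainTextSCFComplex")
    parser.add_argument("-bs", "--batch_size", type=int, default=1)
    parser.add_argument("--epochs", type=int, default=501)
    parser.add_argument("--reg_w", type=float, default=0.1)
    # Anything argparse does not recognise is returned in `extras`.
    args, extras = parser.parse_known_args(argv)

    overrides = {}
    for item in extras:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        # Keep integers as int; leave everything else as a string.
        overrides[key] = int(value) if value.lstrip("-").isdigit() else value
    return args, overrides


if __name__ == "__main__":
    args, overrides = parse_cli(
        "-d oasis_pkl -m brainTextSCFComplex -bs 1 --epochs 501 --reg_w 0.1 "
        "start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit".split()
    )
    print(args, overrides)
```

Keeping the overrides in a plain dictionary makes it easy to see which model hyperparameters (start_channel, scp_dim, diff_int, clip_backbone) were changed from their defaults.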

Please note that using a starting channel of 64 is computationally intensive. It is recommended to run this on an A100 GPU or higher for optimal performance. Alternatively, you can reduce the starting channel to as low as 8 for increased efficiency, while still achieving a Dice score of approximately 87.
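For readers unfamiliar with the metric, the Dice score between two label maps can be computed per anatomical label as 2|A ∩ B| / (|A| + |B|). The following is a minimal NumPy sketch of that standard definition; the function name dice_per_label is hypothetical, and the repository's own evaluation code may differ (for example in which labels it averages over or whether it reports a percentage).

```python
import numpy as np


def dice_per_label(seg_a, seg_b, labels):
    """Return {label: Dice} for two integer label maps of equal shape."""
    scores = {}
    for label in labels:
        a = seg_a == label
        b = seg_b == label
        denom = a.sum() + b.sum()
        # Skip labels absent from both maps to avoid dividing by zero.
        if denom == 0:
            continue
        scores[label] = 2.0 * np.logical_and(a, b).sum() / denom
    return scores


if __name__ == "__main__":
    warped = np.array([[1, 1, 0], [2, 2, 0]])
    fixed = np.array([[1, 0, 0], [2, 2, 2]])
    print(dice_per_label(warped, fixed, labels=[1, 2, 3]))
```

Averaging the returned values over labels and over image pairs gives a single summary number of the kind quoted above.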

To use the pretrained model, run the script with the following command in folder ./src to get the npz files:

python test_brainreg.py -d oasis_pkl -m brainTextSCFComplex -bs 1 --is_submit 1 --load_ckpt ./../../../checkpoint/oasis_9002_64_2048_0_vit.pth start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit
  • --is_submit: Whether to create npz files for submission to the challenge.
  • --load_ckpt: Which checkpoint to load. 'last' loads the latest checkpoint, 'best' loads the checkpoint with the highest validation score, and any other value is treated as a path to a checkpoint file, such as './../../../checkpoint/oasis_9002_64_2048_0_vit.pth'.
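The three accepted forms of --load_ckpt can be sketched as a small resolver. This is an assumption-laden illustration, not the repository's code: the function name resolve_checkpoint and the file names last.pth and best.pth are hypothetical, since the README does not say how the 'last' and 'best' checkpoints are named on disk.

```python
from pathlib import Path


def resolve_checkpoint(load_ckpt, log_dir):
    """Map a --load_ckpt value to a checkpoint path."""
    if load_ckpt in ("last", "best"):
        # Hypothetical naming scheme: <log_dir>/last.pth or <log_dir>/best.pth.
        return Path(log_dir) / f"{load_ckpt}.pth"
    # Anything else is treated as an explicit path to a .pth file.
    return Path(load_ckpt)


if __name__ == "__main__":
    log_dir = "logs/oasis_pkl/brainTextSCFComplex"
    print(resolve_checkpoint("best", log_dir))
    print(resolve_checkpoint("./../../../checkpoint/oasis_9002_64_2048_0_vit.pth", log_dir))
```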

The npz files will be saved at ./textSCF/src/logs/oasis_pkl/brainTextSCFComplex/ where textSCF is the root of the code repository.
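Before submitting, it can help to check what the generated files contain. The sketch below lists every array in each .npz file of a folder together with its shape and dtype. The array names inside the files are not documented here, so the code reads them from the archive rather than assuming any; the function name summarize_npz is hypothetical.

```python
from pathlib import Path

import numpy as np


def summarize_npz(folder):
    """Return {file name: {array name: (shape, dtype)}} for each .npz in folder."""
    summary = {}
    for path in sorted(Path(folder).glob("*.npz")):
        # NpzFile is a context manager; arrays are loaded lazily on access.
        with np.load(path) as data:
            entry = {}
            for name in data.files:
                arr = data[name]
                entry[name] = (arr.shape, str(arr.dtype))
            summary[path.name] = entry
    return summary


if __name__ == "__main__":
    # Path taken from the README; adjust it to where the repository lives.
    for fname, arrays in summarize_npz("./logs/oasis_pkl/brainTextSCFComplex").items():
        print(fname, arrays)
```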

See Datasets for obtaining pretrained models and to download a complete project setup.

Todo

  • Awesome-Medical-LLMs
  • Awesome-Medical-Image-Registration
  • Core code release
  • Pretrained model release
  • Support of different backbones and datasets
  • Tutorials and periphery code
    • Smoothness and complexity analysis
    • Statistical analysis
    • Discontinuity-preserving deformation field

Citation

If our work has influenced or contributed to your research, please kindly acknowledge it by citing:

@misc{zhang2023spatially,
      title={Spatially Covariant Image Registration with Text Prompts}, 
      author={Hang Zhang and Xiang Chen and Rongguang Wang and Renjiu Hu and Dongdong Liu and Gaolei Li},
      year={2023},
      eprint={2311.15607},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}

@inproceedings{ijcai2023p0190,
  title     = {Spatially Covariant Lesion Segmentation},
  author    = {Zhang, Hang and Wang, Rongguang and Zhang, Jinwei and Liu, Dongdong and Li, Chao and Li, Jiahao},
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on
               Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Edith Elkind},
  pages     = {1713--1721},
  year      = {2023},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2023/190},
  url       = {https://doi.org/10.24963/ijcai.2023/190},
}

Acknowledgment

We extend our gratitude to LKU-Net, LapRIN, VoxelMorph, and TransMorph for their valuable contributions. Portions of the code in this repository have been adapted from these sources.

Keywords

Diffeomorphic image registration, large deformation, convolutional neural networks, vision transformers, large-scale visual language models, spatially covariant filters, text prompts