TextSCF: LLM-Enhanced Image Registration Model

This repository hosts the official PyTorch implementation of "Spatially Covariant Image Registration with Text Prompts". TextSCF is a comprehensive library focused on weakly supervised image alignment and registration, equipped with a wide range of tools for in-depth analysis of deformation fields.

Updates

[12/18/2023] - Pretrained model weights for task 3 of the MICCAI 2021 Learn2Reg Challenge are now available. See the Usage section below.

[12/06/2023] - The textSCF code that reproduces our results for task 3 of the MICCAI 2021 Learn2Reg Challenge is now available. See the Datasets and Usage sections below.

[11/30/2023] - We have compiled a list of papers exploring the use of LLMs in AI for medicine and healthcare. (Awesome-Medical-LLMs)
[11/30/2023] - We have compiled a list of papers centered on image registration models in healthcare. (Awesome-Medical-Image-Registration)

Papers

Spatially Covariant Image Registration with Text Prompts
Hang Zhang, Xiang Chen, Rongguang Wang, Renjiu Hu, Dongdong Liu, and Gaolei Li.
arXiv 2023.

Spatially Covariant Lesion Segmentation
Hang Zhang, Rongguang Wang, Jinwei Zhang, Dongdong Liu, Chao Li, and Jiahao Li.
IJCAI 2023.

Highlights

textSCF (random_cat) ranks 1st on the validation set of task 3 of the MICCAI 2021 Learn2Reg Challenge.

The deformation field generated by textSCF preserves discontinuities across different anatomical regions. In the accompanying figure, for example, it clearly outlines the stomach against the adjacent regions.
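
One simple way to look for such discontinuities in a deformation field is to compute the magnitude of its spatial gradient: it stays near zero where the displacement varies smoothly and spikes where neighbouring regions move differently. The snippet below is a minimal NumPy sketch of that idea, not a tool shipped with this repository. The function name `displacement_gradient_magnitude` and the `(3, D, H, W)` array layout are assumptions made for illustration.

```python
import numpy as np

def displacement_gradient_magnitude(disp):
    """Frobenius norm of the spatial gradient of a displacement field.

    disp: array of shape (3, D, H, W), one channel per displacement
    component (this layout is an assumption for the sketch).
    Returns an array of shape (D, H, W); large values mark abrupt changes.
    """
    sq_sum = np.zeros(disp.shape[1:], dtype=np.float64)
    for component in disp:
        # np.gradient returns one finite-difference array per spatial axis.
        for grad in np.gradient(component.astype(np.float64)):
            sq_sum += grad ** 2
    return np.sqrt(sq_sum)

# Toy field: the left half is static, the right half shifts by 2 voxels along W.
disp = np.zeros((3, 8, 8, 8))
disp[2, :, :, 4:] = 2.0
grad_mag = displacement_gradient_magnitude(disp)
```

In the toy field, `grad_mag` is non-zero only in the two voxel columns next to the boundary between the halves, which is the kind of sharp interface a discontinuity-preserving field would show around an organ.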

TextSCF is designed to integrate with various architectures, including LKU-Net, LapRIN, VoxelMorph, and TransMorph, showcasing its utility in medical image registration.

Datasets

See Datasets for more details.

Usage

To reproduce the results, run the following command from the ./src folder:

python train_brainreg.py -d oasis_pkl -m brainTextSCFComplex -bs 1 --epochs 501 --reg_w 0.1 start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit

-d oasis_pkl: Dataset used, specifically 'oasis_pkl'.
-m brainTextSCFComplex: Model name, set to 'brainTextSCFComplex'.
-bs 1: Batch size, defined as 1.
--epochs 501: Total number of epochs for training, set to 501.
--reg_w 0.1: Smoothness regularization weight, specified as 0.1.
start_channel=64: Number of starting channels (N_s), set to 64.
scp_dim=2048: The dimension (C_{\phi}) for the implicit function, set to 2048.
diff_int=0: Diffeomorphic integration flag, '0' for not used.
clip_backbone=vit: CLIP backbone type, specified as 'vit' (ViT-L/14@336px).
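
The command above mixes dashed flags (-d, -bs, --epochs) with bare key=value tokens (start_channel=64). The sketch below shows one way such a command line could be split into the two groups using argparse. It is an illustration only and is not taken from train_brainreg.py: the function name `parse_command`, the `dest` names, and the choice to keep override values as strings are all assumptions.

```python
import argparse

def parse_command(argv):
    """Split argv into dashed flags and trailing key=value overrides.

    Hypothetical helper: the real training script may parse its
    arguments differently.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", dest="dataset")
    parser.add_argument("-m", dest="model")
    parser.add_argument("-bs", dest="batch_size", type=int, default=1)
    parser.add_argument("--epochs", type=int, default=501)
    parser.add_argument("--reg_w", type=float, default=0.1)
    # Tokens argparse does not recognise (the key=value pairs) land in extras.
    args, extras = parser.parse_known_args(argv)
    overrides = {}
    for token in extras:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        overrides[key] = value  # values are kept as strings in this sketch
    return args, overrides

command = (
    "-d oasis_pkl -m brainTextSCFComplex -bs 1 --epochs 501 --reg_w 0.1 "
    "start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit"
)
args, overrides = parse_command(command.split())
```

After parsing, `args` holds the dashed options and `overrides` holds the four key=value settings, so the two groups can be validated or logged separately.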

Please note that using a starting channel of 64 is computationally intensive. It is recommended to run this on an A100 GPU or higher for optimal performance. Alternatively, you can reduce the starting channel to as low as 8 for increased efficiency, while still achieving a Dice score of approximately 87.
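
For reference, the Dice score compares the warped moving segmentation with the fixed segmentation, label by label. The following is a minimal NumPy sketch of a mean Dice computation, not the evaluation code used for the challenge. The function name `mean_dice` and the toy label maps are illustrative assumptions, and the sketch returns values in [0, 1], so the figure of roughly 87 quoted above would presumably correspond to about 0.87 on this scale.

```python
import numpy as np

def mean_dice(warped_seg, fixed_seg, labels):
    """Average Dice overlap over the given labels (values in [0, 1])."""
    scores = []
    for label in labels:
        a = warped_seg == label
        b = fixed_seg == label
        denom = a.sum() + b.sum()
        if denom == 0:
            continue  # label absent from both volumes: skip it
        scores.append(2.0 * np.logical_and(a, b).sum() / denom)
    return float(np.mean(scores)) if scores else float("nan")

# Toy 2D label maps with two foreground labels (0 is background).
warped = np.array([[1, 1, 0],
                   [2, 2, 0]])
fixed = np.array([[1, 0, 0],
                  [2, 2, 2]])
score = mean_dice(warped, fixed, labels=[1, 2])
```

Here label 1 overlaps with Dice 2/3 and label 2 with Dice 4/5, so `score` is their average.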

To use the pretrained model, run the following command from the ./src folder to generate the npz files:

python test_brainreg.py -d oasis_pkl -m brainTextSCFComplex -bs 1 --is_submit 1 --load_ckpt ./../../../checkpoint/oasis_9002_64_2048_0_vit.pth start_channel=64 scp_dim=2048 diff_int=0 clip_backbone=vit

--is_submit: Whether to create npz files for submission to the challenge.
--load_ckpt: Which checkpoint to load: 'last' for the latest checkpoint, 'best' for the checkpoint with the highest validation score, or a path such as './../../../checkpoint/oasis_9002_64_2048_0_vit.pth' pointing to a checkpoint file.
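
The three accepted forms of --load_ckpt can be pictured as a small dispatch: two keywords that refer to checkpoints kept in the run's log folder, and anything else treated as an explicit path. The sketch below illustrates that logic only. The function `resolve_checkpoint` and the `last.pth` / `best.pth` file names are hypothetical and may not match how test_brainreg.py stores or locates checkpoints.

```python
from pathlib import Path

def resolve_checkpoint(load_ckpt, log_dir):
    """Map 'last', 'best', or an explicit path to a checkpoint file path."""
    if load_ckpt in ("last", "best"):
        # Hypothetical naming scheme; the real script may name files differently.
        return Path(log_dir) / f"{load_ckpt}.pth"
    # Anything else is treated as a path to a checkpoint file.
    return Path(load_ckpt)

best_path = resolve_checkpoint("best", "logs/oasis_pkl/brainTextSCFComplex")
explicit_path = resolve_checkpoint(
    "./../../../checkpoint/oasis_9002_64_2048_0_vit.pth",
    "logs/oasis_pkl/brainTextSCFComplex",
)
```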

The npz files will be saved at ./textSCF/src/logs/oasis_pkl/brainTextSCFComplex/ where textSCF is the root of the code repository.
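
Before submitting, it can help to check what the generated npz files actually contain. The snippet below is a small sketch that lists every array in each file together with its shape and dtype. The helper names `summarize_npz` and `summarize_dir` are illustrative, and no assumption is made about the array names stored by the test script.

```python
from pathlib import Path

import numpy as np

def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz file."""
    with np.load(path) as data:
        return {
            name: (data[name].shape, str(data[name].dtype))
            for name in data.files
        }

def summarize_dir(out_dir):
    """Summarize all .npz files found directly inside out_dir."""
    return {p.name: summarize_npz(p) for p in sorted(Path(out_dir).glob("*.npz"))}
```

For example, calling `summarize_dir("./logs/oasis_pkl/brainTextSCFComplex")` from the ./src folder would return one entry per npz file, which makes it easy to spot files with an unexpected shape or dtype.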

See Datasets to obtain pretrained models and to download a complete project setup.

Todo

Citation

If our work has influenced or contributed to your research, please kindly acknowledge it by citing:

@misc{zhang2023spatially,
      title={Spatially Covariant Image Registration with Text Prompts}, 
      author={Hang Zhang and Xiang Chen and Rongguang Wang and Renjiu Hu and Dongdong Liu and Gaolei Li},
      year={2023},
      eprint={2311.15607},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}

@inproceedings{ijcai2023p0190,
  title     = {Spatially Covariant Lesion Segmentation},
  author    = {Zhang, Hang and Wang, Rongguang and Zhang, Jinwei and Liu, Dongdong and Li, Chao and Li, Jiahao},
  booktitle = {Proceedings of the Thirty-Second International Joint Conference on
               Artificial Intelligence, {IJCAI-23}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Edith Elkind},
  pages     = {1713--1721},
  year      = {2023},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2023/190},
  url       = {https://doi.org/10.24963/ijcai.2023/190},
}

Acknowledgment

We extend our gratitude to LKU-Net, LapRIN, VoxelMorph, and TransMorph for their valuable contributions. Portions of the code in this repository have been adapted from these sources.

Keywords

Diffeomorphic image registration, Large deformation, Convolutional neural networks, Vision transformers, Large-scale visual language models, Spatially covariant filters, Text prompts
