Stage 0 scripts and config #5

Open
Sreyan88 opened this issue Mar 13, 2024 · 1 comment
@Sreyan88

Hi there,

Great work! Could you please provide the pretrain_stage0.sh script or the config file (beyond the log file that is already available)? We would like to reproduce some of the experiments. Thank you!

@yiren-jian
Owner

I used something similar to this (if you find anything here inconsistent with the log, please feel free to replace it). Stage 0 was trained on another server at Northwestern with 3x RTX A6000 GPUs, of which I only kept the log and the pre-trained weights.

model:
  arch: pformer_opt
  model_type: pformer_opt2.7b
  load_pretrained: False
  # initialize stage 2 pretraining from stage 1 pretrained model
  # pretrained: "https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained.pth"
  freeze_vit: True


datasets:
  sentence_dataset:
    text_processor:
        train:
          name: "blip_caption"

run:
  task: image_text_pretrain   ### no need to change
  # runner: runner_iter
  # optimizer
  lr_sched: "linear_warmup_cosine_lr"
  init_lr: 1e-4
  min_lr: 1e-5
  warmup_lr: 1e-6

  weight_decay: 0.05
  max_epoch: 5
  # max_iters: 60000
  # iters_per_inner_epoch: 6000
  batch_size_train: 128
  batch_size_eval: 64
  num_workers: 4
  warmup_steps: 2000

  seed: 42
  output_dir: "output/BLIP-T/Pretrain_stage0"

  amp: True
  resume_ckpt_path: null

  evaluate: False
  train_splits: ["train"]

  device: "cuda"
  world_size: 3
  dist_url: "env://"
  distributed: True
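
For launching, something along these lines should work with the config above. This is only a minimal sketch: it assumes the standard LAVIS train.py entry point, and the config path lavis/projects/blip2/train/pretrain_stage0.yaml is an assumption, not necessarily where the file lived on the original server.

# pretrain_stage0.sh -- hypothetical launch script (assumes a LAVIS-style train.py entry point)
# --nproc_per_node=3 matches world_size: 3 in the config (3x RTX A6000 on a single node)
python -m torch.distributed.run --nproc_per_node=3 train.py \
    --cfg-path lavis/projects/blip2/train/pretrain_stage0.yaml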
