Convert staging job to python for forecast-only mode #2651

Draft: wants to merge 58 commits into develop

KateFriedman-NOAA
Member

Description

This PR converts the staging job from shell to Python. The existing forecast-only stage_ic job now uses Python and YAML files that are rendered through Jinja templates. This PR does not affect the existing ROTDIR symlink population that the setup scripts perform for cycled mode.

Changes in this PR:

  1. Rename scripts/exglobal_stage_ic.sh to scripts/exglobal_stage_ic.py.
  2. Update jobs/JGLOBAL_STAGE_IC to call the .py ex-script. Move the GDATE/gPDY/gcyc settings, the COM* variable declarations, and the MEMDIR[_ARRAY] settings up from the ex-script into the JJOB.
  3. Add PYTHONPATH export to jobs/rocoto/stage_ic.sh.
  4. Create a parm/stage folder to hold new YAML templates (*.yaml.j2), one for each initial condition set currently handled by scripts/exglobal_stage_ic.sh. Templates for warm start (fv3_warm.yaml.j2) and DO_NEST=YES (fv3_nest.yaml.j2) are included to retain the functionality of the existing script, but these configurations have not been tested in forecast-only mode.
  5. Create ush/python/pygfs/task/stage.py to house the staging job's Python functions, which are called from scripts/exglobal_stage_ic.py.
  6. Add export USE_OCN_PERTURB_FILES=".false." to the gfs config.base; the staging job's Python code needs this variable as a key regardless of RUN.
  7. Remove the stage_ic job's rocoto dependencies from the XML. They are not needed, and removing them eliminates an area of duplicate maintenance.

Follow-up PRs will add the stage_ic job to cycled mode and extend its capabilities as needed, as part of issue #2475.

Resolves #2650
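
For readers unfamiliar with the GDATE/gPDY/gcyc settings mentioned in item 2 of the change list, here is a minimal sketch of how a previous-cycle date can be derived from PDY and cyc. It is not the JJOB code. The function name `previous_cycle` and the 6-hour interval are assumptions; the interval is inferred from the stage_ic log in this description, where the 12Z ocean and ice restarts are staged under the 06 directory.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Assumption: cycles are 6 hours apart, as the 12 -> 06 paths in the log suggest.
ASSIM_FREQ_HOURS = 6


def previous_cycle(pdy: str, cyc: str, hours: int = ASSIM_FREQ_HOURS) -> Tuple[str, str, str]:
    """Return (GDATE, gPDY, gcyc) for the cycle `hours` before PDY + cyc.

    Hypothetical helper for illustration; the JJOB sets these in shell.
    """
    current = datetime.strptime(f"{pdy}{cyc}", "%Y%m%d%H")
    previous = current - timedelta(hours=hours)
    return (
        previous.strftime("%Y%m%d%H"),  # GDATE
        previous.strftime("%Y%m%d"),    # gPDY
        previous.strftime("%H"),        # gcyc
    )


gdate, gpdy, gcyc = previous_cycle("20210323", "12")
print(gdate, gpdy, gcyc)  # 2021032306 20210323 06
```

Setting these values once in the JJOB means the ex-script only consumes them and does not need its own date arithmetic.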

Type of change

  • New feature (adds functionality)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO

How has this been tested?

Ran the forecast-only CI tests on Hera:

  • C48_ATM
  • C48_S2SW
  • C48_S2SWA_gefs

Outputs match those of the CI tests run from develop. Logs are available for review.

Snippet from C48 S2SWA GEFS stage_ic job log showing the mkdir and copy of mem002 files into ROTDIR:

2024-06-03 13:50:27,622 - INFO     - file_utils  : Created /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:27,628 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_ctrl.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:27,734 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile1.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:27,879 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile2.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:27,983 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile3.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,174 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile4.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,317 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile5.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,463 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/gfs_data.tile6.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,544 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile1.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,583 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile2.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,614 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile3.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,694 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile4.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,721 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile5.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,932 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/atmos/sfc_data.tile6.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/atmos/input
2024-06-03 13:50:28,941 - INFO     - file_utils  : Created /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/wave/restart
2024-06-03 13:50:29,101 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/wave/20210323.120000.restart.glo_500 to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/12/mem002/model_data/wave/restart
2024-06-03 13:50:29,206 - INFO     - file_utils  : Created /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/06/mem002/model_data/ocean/restart
2024-06-03 13:50:29,478 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/ocean/20210323.120000.MOM.res.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/06/mem002/model_data/ocean/restart
2024-06-03 13:50:29,485 - INFO     - file_utils  : Created /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/06/mem002/model_data/ice/restart
2024-06-03 13:50:29,526 - INFO     - file_utils  : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/prototype_ICs/gefs_test/2021032312/mem002/ice/20210323.120000.cice_model.res.nc to /scratch1/NCEPDEV/stmp4/Kate.Friedman/comrot/teststage_C48_S2SWA_gefs_c/gefs.20210323/06/mem002/model_data/ice/restart
2024-06-03 13:50:29,621 - INFO     - stage       :   END: pygfs.task.stage.execute_stage
2024-06-03 13:50:29,622 - DEBUG    - stage       :  returning: None
2024-06-03 13:50:29,622 - INFO     - root        :   END: __main__.main
2024-06-03 13:50:29,622 - DEBUG    - root        :  returning: None
+ JGLOBAL_STAGE_IC[67]: [[ 0 -ne 0 ]]
+ JGLOBAL_STAGE_IC[75]: cd /scratch1/NCEPDEV/stmp2/Kate.Friedman/RUNDIRS/teststage_C48_S2SWA_gefs_c
+ JGLOBAL_STAGE_IC[76]: [[ NO = \N\O ]]
+ JGLOBAL_STAGE_IC[76]: rm -rf /scratch1/NCEPDEV/stmp2/Kate.Friedman/RUNDIRS/teststage_C48_S2SWA_gefs_c/stage_ic.2811565
+ JGLOBAL_STAGE_IC[78]: exit 0
+ JGLOBAL_STAGE_IC[1]: postamble JGLOBAL_STAGE_IC 1717422601 0
+ preamble.sh[70]: set +x
End JGLOBAL_STAGE_IC at 13:50:30 with error code 0 (time elapsed: 00:00:29)
+ stage_ic.sh[20]: status=0
+ stage_ic.sh[23]: exit 0
+ stage_ic.sh[1]: postamble stage_ic.sh 1717422598 0
+ preamble.sh[70]: set +x
End stage_ic.sh at 13:50:30 with error code 0 (time elapsed: 00:00:32)
_______________________________________________________________
Start Epilog on node h22c03 for job 61225255 :: Mon Jun  3 13:50:30 UTC 2024
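
The "Created" and "Copied" lines in the log above follow a mkdir-then-copy pattern. The sketch below illustrates that pattern using only the standard library; it is not the code in ush/python/pygfs/task/stage.py or in file_utils. The function name `sync_stage_set` and the dictionary layout (a `mkdir` list and a `copy` list of source/destination pairs) are assumptions chosen to mirror the log messages.

```python
import os
import shutil
import tempfile


def sync_stage_set(stage_set: dict) -> list:
    """Create each directory under 'mkdir', then copy each [src, dest] pair under 'copy'.

    Hypothetical helper; returns the log-style messages it would emit.
    """
    actions = []
    for path in stage_set.get("mkdir", []):
        os.makedirs(path, exist_ok=True)
        actions.append(f"Created {path}")
    for src, dest in stage_set.get("copy", []):
        shutil.copy(src, dest)
        actions.append(f"Copied {src} to {dest}")
    return actions


# Demo with throwaway paths; a real run would point at ICSDIR and ROTDIR.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "ics", "mem002", "atmos")
    os.makedirs(src_dir)
    src_file = os.path.join(src_dir, "gfs_ctrl.nc")
    with open(src_file, "w") as fh:
        fh.write("placeholder")

    dest_dir = os.path.join(tmp, "rotdir", "mem002", "model_data", "atmos", "input")
    demo_actions = sync_stage_set({"mkdir": [dest_dir], "copy": [[src_file, dest_dir]]})
    staged = os.listdir(dest_dir)

print(staged)  # ['gfs_ctrl.nc']
```

Creating the directories before copying keeps the copy step simple, because every destination is guaranteed to exist by the time the files are copied.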

KateFriedman-NOAA and others added 30 commits May 14, 2024 14:57
- Change the extension of the exglobal_stage ex-script
from "sh" to "py".

Refs NOAA-EMC#2475
- Update script extension for ex-script from "sh" to "py".
- Pull COM* variable declares up from ex-script.

Refs NOAA-EMC#2475
Remove the functions and calls to set up
symlinks to ICs in ROTDIR.

Refs NOAA-EMC#2475
- Add initial new yaml files for staging information
- Add new stage.py to python tasks.
- Add first draft pythonization of stage ex-script.

Much more work is still to be done.

Refs NOAA-EMC#2475
Revert changes to workflow/setup_expt.py; will do in later task

Refs NOAA-EMC#2475

* upstream/develop:
  Sea-ice analysis insertion (NOAA-EMC#2584)
  Refactored archiving (NOAA-EMC#2491)
  Add remove RUNDIRS step in CI before creating experements (NOAA-EMC#2607)
Also fix to set target and remove source

Refs NOAA-EMC#2475
Add target, remove source, and update file info

Refs NOAA-EMC#2475
Update to use mkdir and copy instead of target and required

Refs NOAA-EMC#2475
If RUN=gefs add keys_gefs to keys.

Refs NOAA-EMC#2475
Set to .false. by default; needed for staging job

Refs NOAA-EMC#2475
- remove master yaml, no longer using
- update fv3_cold, ice, ocean, and wave yamls

Refs NOAA-EMC#2475
- Add keys for GEFS
- Cleanup

Refs NOAA-EMC#2475
- General cleanup
- Rework determine_stage function
- Rework execute_stage function

Refs NOAA-EMC#2475
- Delete da.yaml.j2; will remake in follow-up work
- Update fv3_warm.yaml.j2 to not use src/head variables

Refs NOAA-EMC#2475
Add extra whitespaces in exglobal_stage_ic.py to address E231 error.

Refs NOAA-EMC#2475

for set_yaml in stage_sets:
    stage_set = parse_j2yaml(os.path.join(stage_parm, set_yaml), stage_dict)
Contributor

If all variables need to be defined in the jinja file, then you might consider adding allow_missing=False to the parse_j2yaml call. This will raise an exception if a variable referenced in the template is not defined.
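
To illustrate the difference between lenient and strict rendering that this suggestion relies on, here is a small sketch that uses the standard library's `string.Template` as a stand-in. It does not call `parse_j2yaml` or Jinja, so it says nothing about how `allow_missing` is implemented; the template line and the variable names are hypothetical.

```python
from string import Template

# Hypothetical template line and variables, for illustration only.
template = Template("copy: [$ICSDIR/gfs_ctrl.nc, $COM_ATMOS_INPUT]")
variables = {"ICSDIR": "/path/to/ics"}  # COM_ATMOS_INPUT is deliberately missing

# Lenient rendering leaves the unresolved placeholder in the output,
# so the problem only surfaces later, when the path is used.
lenient = template.safe_substitute(variables)

# Strict rendering raises as soon as a referenced variable is undefined.
try:
    template.substitute(variables)
    strict_error = None
except KeyError as err:
    strict_error = err.args[0]

print(lenient)       # copy: [/path/to/ics/gfs_ctrl.nc, $COM_ATMOS_INPUT]
print(strict_error)  # COM_ATMOS_INPUT
```

Failing at render time names the missing variable directly, which is usually easier to debug than a copy that fails on a path still containing a placeholder.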

@KateFriedman-NOAA
Member Author

Got some good feedback from reviewers (offline and online). Will ponder feedback and work on updates. Will leave this draft PR open while working on it. Thank you all so far!

@KateFriedman-NOAA
Member Author

Will be making some iterative commits. Please do not rereview until rerequested.

KateFriedman-NOAA and others added 11 commits June 14, 2024 15:24
- set job for gdas_half cycledef only
- update fcst jobs dependencies

Refs NOAA-EMC#2475
- Delete the fill_ROTDIR functions used in half cycle setup;
functions are replaced by half cycle stage_ic jobs
- Update --icsdir setup flag to allow user provided path for all RUNs

Refs NOAA-EMC#2475
* origin/develop:
  Add observation preparation job for aerosols DA to workflow (NOAA-EMC#2624)
  Remove ocean daily files (NOAA-EMC#2689)
  Update Jenkinsfile
  Add Hercules-EMC to the Jenkins configurable parameter list (NOAA-EMC#2685)
  Update gdas.cd and gsi_utils hashes (NOAA-EMC#2641)
  Add ability to use GEFS replay ICs (NOAA-EMC#2559)
  Replace `sleep` with `wait_for_file` (NOAA-EMC#2586)
  Add COM template for JEDI obs (NOAA-EMC#2678)
  Link both global-nest fix files and non-nest ones at the same time (NOAA-EMC#2632)
  Update ufs-weather-model  (NOAA-EMC#2663)
  Add ability to process ocean/ice products specific to GEFS (NOAA-EMC#2561)
  Update cleanup job to use COMIN/COMOUT (NOAA-EMC#2649)
  Add overwrite to creat experiment in BASH CI (NOAA-EMC#2676)
  Add handling to select CRTM cloud optical table based on cloud scheme and update calcanal_gfs.py  (NOAA-EMC#2645)

Refs NOAA-EMC#2475
Makes yamls more generic and uses paths set previously

Refs NOAA-EMC#2475
- Add new keys
- Remove determine function
- Update execute function to use single stage.yaml.j2

Refs NOAA-EMC#2475
- Add RDATE and DTG_PREFIX
- Remove USE_OCN_PERTURB_FILES
- Always declare COM_ATMOS_ANALYSIS

Refs NOAA-EMC#2475
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert forecast-only stage_ic job to python
2 participants