zeros in ocean post grib2 files on hera #2615
@JessicaMeixner-NOAA we need to check the regular-grid ocean nc files (which are used as input for converting to grib2), but they were erased in the g-w runs. For example, the following doesn't exist anymore:
@jiandewang I'll rewind and re-run one of them and save the rundir. I'll post back here when I have that.
Here's the saved output @jiandewang: TMP: /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.4064953
@JessicaMeixner-NOAA quick check for these three files: so the problem happened on the tripolar-to-regular step. Let me go through the log file to see if there is any clue.
@JessicaMeixner-NOAA can you re-run it but set debug to true?
@jiandewang here's the output with debug=true:
If needed, I can dig deeper into the interpolation code.
@GwenChen-NOAA do you have an idea as to what is going on? We'd appreciate your help in tracking down the issue.
I am trying to understand the run sequence for this post job: the fcst step generates the ocean native nc, which is then copied as ocean.nc; key variables are further cut out and saved as ocean_subset.nc. Which one is used as input for post, ocean.nc or ocean_subset.nc?
ls -l /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181/ocean*nc
-rw-r--r-- 1 Jessica.Meixner climate 1328960900 May 22 10:46 ocean.0p25.nc
ocean.1p00.nc is generated 1 minute before ocean_subset.nc. I looked at line 74 of /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181/ocean.post.log
@JessicaMeixner-NOAA, can you provide the sea-ice PR number that was just merged? It will be helpful to look at the code changes.
@jiandewang, the ocean.nc files are used to generate the grib2 files. The ocean_subset.nc files are moved to the /products directory as the netcdf products to be distributed through NOMADS.
@jiandewang I think ocean.nc is used to create ocean_subset.nc, but I could be wrong... let me look into that more. @GwenChen-NOAA the PR is #2584. I did just confirm that output on hera from before this PR was merged also had zero grib files, so the sea-ice analysis PR is not the cause of this problem. I'm not sure how long this issue has been in the develop branch, or whether it's just a hera issue or something else.
@JessicaMeixner-NOAA, can you run it on WCOSS2? I know the downstream package can only run on WCOSS2.
@GwenChen-NOAA The ocean post products should be able to be generated on RDHPCS, not just WCOSS2. I don't have a workflow set up there right now, so it would be great if you could try it to see if it works. I did find an old run, from when I was trying to update the ufs-weather-model to a more recent version, and it has non-zero fields: /scratch1/NCEPDEV/climate/Jessica.Meixner/testgw2505/test02/COMROOT/test02/gfs.20191203/00/products/ocean/grib2/1p00/gfs.ocean.t00z.1p00.f072.grib2 (for example, has non-zero fields). The g-w commit was an update from an April 17th commit. We could also look into whether the modules for hera were updated within the ufs-weather-model between these updates, as I do think this job uses the ufs-weather-model.
Okay, I did confirm that the ufs-weather-model modules have not changed on hera, so it's not just that.
@EricSinsky-NOAA I see that you've been running some ocean/ice post recently. Thought I'd ping you here to see if you've noticed ocean grib files that were zeros or constant in any of your testing.
@JessicaMeixner-NOAA I just ran the C48_S2SWA_gefs CI test case today using the most recent hash (7d2c539). I also see all zeroes in the gridded (5 degree) ocean data. The data is all zeroes in the gridded NetCDF data as well (not just the gridded grib2 data).
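A quick way to triage output like this without grib tooling is to check whether a field is constant outside its fill points. A minimal sketch, not part of the workflow (the helper name and fill-value handling are assumptions):

```python
import numpy as np

def field_is_degenerate(field, fill_value=None, tol=0.0):
    """Return True if every non-fill point in the field is (near-)constant.

    Flags all-zero gridded ocean output like that reported above;
    `fill_value` marks land/missing points and is excluded first.
    """
    data = np.asarray(field, dtype=float)
    if fill_value is not None:
        data = data[data != fill_value]
    if data.size == 0:
        return True  # nothing but fill points
    return bool(np.ptp(data) <= tol)  # max - min within tolerance
```

For example, an all-zero gridded field returns True, while a field with real ocean variability returns False.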
@EricSinsky-NOAA thanks for the info! what machine was that on? |
@JessicaMeixner-NOAA This test was on Cactus. |
Thanks @EricSinsky-NOAA, seems like this is not just a hera issue then. I'm re-running my case on hera where i went back and found that I had output I expected. I'm then going to merge in develop and see how that goes as well. Hopefully will have an update on that this afternoon. |
Okay, my re-run of something where I thought I had previously had grib2 output that was non-zero, did not give me non-zeros this time.... I believe that should rule out the model version, but not sure what to look at now... |
@GwenChen-NOAA when you tested this: #2611 did you get non-zero grib2 output files? |
@JessicaMeixner-NOAA, my test used an old version of the ocean.0p25.nc file (i.e., latlon netcdf file output from ocnicepost) and worked fine. I saw the ocean.0p25.nc file under /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181 also contains all zero. I found a recent closed issue (#2483) that updated fix files for CICE and MOM6/post. Perhaps @DeniseWorthen can provide some clues here. |
Issue #2483 only added/corrected the 5-degree fix file. It did not alter the 0.25-degree or 1.0-degree fix files.
Thanks @aerorahul for that information!
I just ran the C48_S2SW CI test case on Cactus using the 5/13/2024 commit hash (6ca106e). The gridded ocean data still consists of all zeroes as of the 5/13/2024 g-w version. I will keep going back to earlier commit hashes to get a better idea of when and why this issue started.
I updated to the latest version of the ufs-weather-model on hera, ran another test, and still got all zeros in the gribs. @EricSinsky-NOAA we know at least the HR3 tag 6f9afff from Feb 21st has non-zero gribs on wcoss2. On hera, the furthest back we can go with g-w would be the rocky8 transition commit.
@jiandewang After replacing the fix files with /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/fixed-file-wcoss2 and rerunning, I am still getting all zeroes.
My test run of C48 on wcoss2 did not do well: /lfs/h2/emc/couple/noscrub/jessica.meixner/testoceanpost/hr3/test01/COMROOT/c48t01/gfs.20210323/12/products/ocean/grib2/5p00
Thank you, @JessicaMeixner-NOAA. It sounds like this might be an issue with the build of ocnicepost.x on WCOSS2 and Hera. @jiandewang when you ran your HR3 test and got reasonable interpolated ocean output, did you rebuild ocnicepost.x (as well as the other executables related to HR3) during your test?
No, I just used my original *.x from several months ago.
I did a new build, but I do have an old build too... I'll try the 0.25 case with the new build, and I'll also try using my old build on a C48 case and see what happens.
Update:
Therefore, I think there are likely issues with all of the 5-deg cases, so we should not be using those to judge whether things are working or not.
@JessicaMeixner-NOAA Glad to see you are getting non-zeroes for C768mx025. Were the C768mx025 test cases also based on the HR3 tag (not just the C48mx500 test case)? And did you run the C768mx025 test case with both your old build and your new build? Also, I ran an old version of ocnicepost offline and got non-zeroes in the interpolated NetCDF output. In this test, however, the resolution of the NetCDF input (MOM6) data was mx025.
@EricSinsky-NOAA It is nice to see some non-zero values, for sure!! For the tests I ran with the HR3 tag, I used both the old build and the new build, and both had non-zeros.
This is my understanding of what we know so far:
@EricSinsky-NOAA I'd say that we get zeros with the newest hashes. Where the mx025 issues come in between now and the HR3 tag is an open question, I think; since most of our previous testing was based on mx500, I'm not sure we have a lot of information about the in-between parts. I'm going to run a few tests on WCOSS2 to see if we can narrow down issues there.
Thank you @EricSinsky-NOAA for the summary and @JessicaMeixner-NOAA for the additional information. A few questions:
I'd say we need to find a baseline that works first; I think we have that for
For the HR3 tag on WCOSS2 the mom6 fix files are: I'm currently trying to test the commit before the fix-file change on wcoss2 with mx025 to see if that works. I did find an experiment on hera where a case using the old fix files and mx025 still gave me zeros...
So some random thoughts before the weekend:
@JessicaMeixner-NOAA The diffs between WCOSS2 and Hera are because the comparisons were between two different versions of the fix files. The fix files being compared from WCOSS2 are the 20231219 version, while the fix files being compared from Hera are the 20240416 version. Both fix-file versions exist on both WCOSS2 and Hera. When fix files of the same version are compared between WCOSS2 and Hera, the file sizes are identical.
@EricSinsky-NOAA thanks for confirming that!
Some further testing results: (2) in the HR3 run on wcoss2, which gave us correct results, the ocean master files are on 40 levels. However, in Jessica's HERA run (/scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181) and Eric's run, ocean.nc is on 75 levels because it is set up as DA; see https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L454-L459
More testing results: @EricSinsky-NOAA you may repeat your run but use my modified input file at /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/ceanice_products.3448181-JM/NCO2/ocean.nc-JM-75L-E34, or you can simply repeat your C48mx500 run but set https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L456C9-L456C31 as -e34
@jiandewang Thank you very much for finding the issue! I just ran the C48_S2SWA_gefs CI test case (MOM6 set to mx500) using the most recent hash. I have set EDIT: My test was on WCOSS2.
The exception value will need to be resolved with @guillaumevernieres and others, as DA might need the missing value to be set to 0. @jiandewang what module issues did you have on hera? I was wondering on Friday whether module mismatch could be a possible cause.
@JessicaMeixner-NOAA I followed Walter's method (the g-w I used is the cycled one you asked me to run). No errors popped out after I did source ush/........., but when I ran ocnicepost.x it crashed writing the 3D mask file.
A quick and dirty solution: apply this command in the script after the DA ocean files are generated:
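The command itself is not shown above, but the shape of the fix follows from the discussion: DA history files encode missing points as 0.0, which ocnicepost cannot tell apart from a physical 0.0, so dry points need to be re-stamped with a large exception value before post. A hedged sketch in Python (the function name, mask convention, and -1e34 default are illustrative, not the actual command):

```python
import numpy as np

def restamp_missing(field, wet_mask, exception_value=-1.0e34):
    """Replace values at dry (land) points with an explicit exception value.

    `wet_mask` is assumed 1 over ocean and 0 over land; a wet point
    whose value happens to be 0.0 (a valid temperature) is preserved.
    """
    out = np.array(field, dtype=float, copy=True)
    out[wet_mask == 0] = exception_value
    return out
```

With a large negative exception value, the interpolation step can distinguish land from a physically zero field, which is exactly what the 0.0 fill value prevented.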
Apologies for being late to the party. Am I understanding correctly that the missing value is defined as 0.0 in the history file? A missing value of 0.0 makes no sense to me, since it is also a valid value. How do you distinguish points where Temp=0 because it really is 0.0C from points where it is 0 because it is a land point?
@DeniseWorthen see https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L456C9-L456C31
@jiandewang Thanks, but that doesn't really answer my question. How is a missing value of 0.0 distinguished from a physical value of 0.0?
@DeniseWorthen, you just don't construct your mask based on the fill value.
@guillaumevernieres Thanks. So where does your mask come from? edit: I mean, which file? Are you retrieving it from the model output or are you using something else?
We use the MOM6 grid-generation functionality, but that is overkill for this issue. The mask could simply be constructed from the layer thicknesses.
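That suggestion can be sketched in a few lines, assuming a layer-thickness array h(nz, ny, nx) is available from the history file (the names and threshold here are assumptions, not the actual implementation): a cell is wet wherever its thickness exceeds a small minimum, with no reliance on tracer fill values at all.

```python
import numpy as np

def mask_from_thickness(h, h_min=1.0e-6):
    """Build a 3-D wet/dry mask from layer thicknesses h(nz, ny, nx).

    Wet (1) wherever the layer thickness exceeds h_min, dry (0)
    elsewhere; vanished layers and land columns both come out dry.
    """
    return (np.asarray(h, dtype=float) > h_min).astype(np.int8)
```

This sidesteps the ambiguity discussed above: a 0.0 temperature at a wet point stays wet, because the mask never looks at the tracer values.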
A PR has been created so that GFS/GEFS versus GDAS/ENKF have different exception values and numbers of layers for MOM6. This should resolve the problem, although in the future it might still be good to explore updating how the mask is defined in the ocean post.
What is wrong?
When running with the sea-ice PR that was just merged, so essentially develop as of today, it was noticed by @SulagnaRay-NOAA that all of the ocean grib2 files are constant values (mostly zeros). The native model output is not zeros and the ice gribs also appear to be okay.
Investigation into what is going on, and why, is ongoing.
What should have happened?
We should have grib2 output files that match the native model output (and have non-zero/constant values).
What machines are impacted?
Hera
Steps to reproduce
This was discovered running a C384 test case of C384mx025_3DVarAOWCDA. However, I suspect other test cases would expose this issue as well.
Some example output can be found here:
/scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/cold03/COMROOT/cold03/gfs.20210703/06/products/ocean/grib2/0p25
Log files can be found here: /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/cold03/COMROOT/cold03/logs/2021070306
Additional information
@GwenChen-NOAA @jiandewang @SulagnaRay-NOAA @LydiaStefanova-NOAA @guillaumevernieres @CatherineThomas-NOAA FYI - any additional information or help is appreciated!
Do you have a proposed solution?
Not yet...