Different loss values for seemingly same forecast #32

Open
tvonich opened this issue Dec 4, 2023 · 6 comments

Comments


tvonich commented Dec 4, 2023

I ran a "1 Eval Step" forecast with GraphCast small using dataset_source-era5_date-2022-01-01_res-1.0_levels-13_steps-01.nc, steps-12.nc, and steps-40.nc.

The loss is different for each case even though we are only performing a 6 hr forecast in each case. Why might this be? As I understand it, the prediction should be the same in all these cases and the target should also be the same (ERA5 Reanalysis 6hr into the future).

Loss Values:
0.9296875 for 6hr forecast 1-step data 01 Jan 2022
0.69140625 for 6hr forecast 12-step data 01 Jan 2022
0.66015625 for 6hr forecast 40-step data 01 Jan 2022

tvonich changed the title from "Different loss values seemingly same forecast" to "Different loss values for seemingly same forecast" on Dec 4, 2023
Member

tewalds commented Dec 20, 2023

One confusing thing about extract_inputs_targets_forcings is that it extracts right-aligned instead of left-aligned, so your datasets may start from the same date, but your actual inputs are not from the same date.
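To make the alignment point concrete, here is a toy sketch in plain Python. It is not the library's implementation: the helper names, the two-input-frame layout, and the 2022-01-01 00:00 start time are all assumptions made for illustration. It only shows how picking frames from the end of a file (right-aligned) versus the start (left-aligned) changes which dates a 1-step evaluation sees.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=6)  # forecast step discussed in this thread
NUM_INPUTS = 2             # assumption: two input frames precede the targets


def make_times(start, num_target_steps):
    """Timestamps of a toy file: input frames followed by target frames."""
    return [start + i * STEP for i in range(NUM_INPUTS + num_target_steps)]


def split_right_aligned(times, eval_steps):
    """Use the LAST frames, so the targets end at the file's final timestamp."""
    window = times[-(NUM_INPUTS + eval_steps):]
    return window[:NUM_INPUTS], window[NUM_INPUTS:]


def split_left_aligned(times, eval_steps):
    """Use the FIRST frames, so the inputs begin at the file's first timestamp."""
    window = times[:NUM_INPUTS + eval_steps]
    return window[:NUM_INPUTS], window[NUM_INPUTS:]


start = datetime(2022, 1, 1)  # assumed first timestamp of every toy file
for file_steps in (1, 12, 40):
    times = make_times(start, file_steps)
    _, right_targets = split_right_aligned(times, eval_steps=1)
    _, left_targets = split_left_aligned(times, eval_steps=1)
    print(f"steps-{file_steps:02d}: right-aligned target {right_targets[0]}, "
          f"left-aligned target {left_targets[0]}")
```

In this toy setup the left-aligned target is 2022-01-01 12:00 for all three files, while the right-aligned target moves to the end of each file (2022-01-11 06:00 for the 40-step one), so a "1 Eval Step" loss would be computed against a different date in each case.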

Author

tvonich commented Jan 13, 2024

> One confusing thing about the extract_inputs_targets_forcings is it does it right-aligned instead of left-aligned, so your dataset may start from the same dates, but your actual inputs are not the same date.

Hey Timo,
That would explain it. Why is it structured this way? How does the training method deal with it?

Member

tewalds commented Jan 14, 2024

I'm not sure why this was chosen as the default, but sometimes you want to know how your error changes as you predict the same time from different points in the past. There should probably be an option for choosing left vs right alignment. @alvarosg may have more context here.
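The use case mentioned here, predicting the same time from different points in the past, can be sketched as follows. This is an illustrative helper, not part of the library: `init_times_for`, the 6-hour step, and the shared target time are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=6)  # assumed model time step


def init_times_for(valid_time, lead_steps):
    """Map each lead (in steps) to the init time whose forecast lands on valid_time."""
    return {n: valid_time - n * STEP for n in lead_steps}


valid = datetime(2022, 1, 11, 6)  # hypothetical shared target time
for n, init in init_times_for(valid, [1, 4, 12, 40]).items():
    print(f"{n:>2} steps ({n * 6:>3} h lead): initialise at {init}")
```

With the target time held fixed, every lead time is scored against the same frame, so any change in loss reflects forecast length rather than a change in the weather being verified.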

Author

tvonich commented Jan 14, 2024

Thanks for the quick response. This work will hopefully get me going on the 1st chapter of my dissertation.

The differences I'm getting are fairly subtle. For example, I just ran a 12-hour forecast with the step-04 netCDF file and did the same with the step-40 netCDF file. The 6 hr and 12 hr losses are in the jpeg. If alignment were the issue, I'd think the differences would be really large in this case. Would you tend to agree, or am I thinking about this the wrong way?

[image: 6 hr and 12 hr loss values for the step-04 and step-40 runs]

Member

tewalds commented Jan 14, 2024

The step-04.nc should be a subset of step-40.nc, I think with the same initial time. That should be pretty easy to verify by loading both and looking at the data. Then just make sure you're extracting the data correctly for your use case. Feel free to send a PR with a left/right alignment option.
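A minimal sketch of the suggested check, assuming the time coordinates of both files have already been loaded into plain lists of timestamps (how you load them depends on your tooling and is not shown). The function name and the synthetic timestamps below are hypothetical.

```python
from datetime import datetime, timedelta


def is_time_prefix(short_times, long_times):
    """True if the shorter file's timestamps are the leading timestamps of the longer one."""
    n = len(short_times)
    return n <= len(long_times) and list(long_times[:n]) == list(short_times)


# Synthetic stand-ins for the two files' time axes (assumed 6-hourly from 2022-01-01).
start = datetime(2022, 1, 1)
step04_times = [start + timedelta(hours=6 * i) for i in range(2 + 4)]
step40_times = [start + timedelta(hours=6 * i) for i in range(2 + 40)]

print("same initial time:", step04_times[0] == step40_times[0])
print("step-04 is a prefix of step-40:", is_time_prefix(step04_times, step40_times))
```

If the real files pass this check, the remaining question is which frames the extraction step actually picks from each file, which is where the alignment matters.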

Author

tvonich commented Jan 14, 2024

Ok. Yep. I see how it picks out the inputs and targets now. I'll make a few changes and try to submit a pull request this week. Thanks!
