Improve efficiency of summary metadata aggregation #539
Comments
Alternative path to quicker and more memory efficient aggregation of metadata from |
Today, from webviz side, we probably want to let the |
Co-authored-by: Havard Bjerke <[email protected]>
Currently the summary metadata is handled by a separate method in `fmu-ensemble`, which is about as heavy as `get_smry()` itself, since it loads all the summary data.

In `ecl2df` it is now proposed to add this metadata directly to the pandas `DataFrame.attrs` of the dataset. `DataFrame.attrs` is documented as experimental, so it is a bit risky to build on, but it opens up some good opportunities. An alternative would be to return a tuple of the `df` and a metadata `dict`, instead of just the `df`, when a flag is set.

In any case, `fmu-ensemble` is planning to base this part of its code directly on `ecl2df`, so we could then get the same feature there.

On the webviz side, the `DataFrame.to_parquet()` call we use for portables doesn't support custom metadata directly, but according to this article, combining a `df` and a JSON-like metadata `dict` in a single parquet file doesn't seem too hard. A way to store `df`s together with metadata in our portables is in any case something I am sure we can benefit from (unless reading the parquet file back into pandas becomes a lot slower).

I think this can give a major reduction in build time, and possibly also in memory usage during build, for apps using data from `UNSMRY`.
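
To make the two API options above concrete, here is a minimal sketch. The function name `load_summary`, the `include_metadata` flag, the `"meta"` key and the metadata fields are all hypothetical and only illustrate the idea; this is not the actual `ecl2df` or `fmu-ensemble` interface.

```python
import pandas as pd


def load_summary(include_metadata: bool = False):
    """Hypothetical loader illustrating both ways of handing back metadata."""
    # Stand-in for summary data that would normally be read from UNSMRY files.
    df = pd.DataFrame(
        {
            "DATE": pd.to_datetime(["2020-01-01", "2020-02-01"]),
            "FOPT": [0.0, 1500.0],
            "FOPR": [0.0, 50.0],
        }
    )

    # Illustrative per-vector metadata (field names are assumptions).
    metadata = {
        "FOPT": {"unit": "SM3", "is_total": True, "is_rate": False},
        "FOPR": {"unit": "SM3/DAY", "is_total": False, "is_rate": True},
    }

    # Option A: attach the metadata to the (experimental) DataFrame.attrs.
    df.attrs["meta"] = metadata

    # Option B: return the metadata explicitly when the caller asks for it.
    if include_metadata:
        return df, metadata
    return df
```

With option A the caller reads `df.attrs["meta"]`; with option B the metadata travels as a separate return value and does not depend on `attrs` surviving later DataFrame operations.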