Improve efficiency of summary metadata aggregation #539

Closed
asnyv opened this issue Jan 13, 2021 · 2 comments
Labels: Data input (This issue related to extracting/manipulating or organizing input data to Webviz), enhancement 🚀 (New feature or request)
Comments

asnyv (Collaborator) commented Jan 13, 2021

Currently, the summary metadata is handled by a separate method in fmu-ensemble, which is about as heavy as get_smry() itself, since it loads all the summary data.

In ecl2df it is now proposed to add this metadata directly to the pandas DataFrame.attrs for the dataset. DataFrame.attrs is documented as experimental, so relying on it is a bit risky, but it opens up some good opportunities. An alternative is to return a tuple of the df and a metadata dict, instead of just the df, when a flag is set.
In any case, fmu-ensemble is planning to base this part of its code directly on ecl2df, so we would get the same feature there.
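
To make the two options concrete, here is a minimal sketch. The function name `load_summary`, the `with_metadata` flag and the metadata layout (`unit`, `is_total`) are hypothetical and chosen for illustration only; they are not the ecl2df or fmu-ensemble API.

```python
import pandas as pd


def load_summary(with_metadata: bool = False):
    """Hypothetical loader, NOT the ecl2df API. Returns dummy summary data."""
    df = pd.DataFrame(
        {
            "DATE": pd.to_datetime(["2020-01-01", "2020-02-01"]),
            "FOPT": [0.0, 1500.0],
        }
    )
    # Assumed metadata layout: one dict per summary vector.
    meta = {"FOPT": {"unit": "SM3", "is_total": True}}

    # Option 1: attach the metadata to the (experimental) DataFrame.attrs.
    df.attrs["meta"] = meta

    # Option 2: hand the metadata back explicitly when the flag is set.
    return (df, meta) if with_metadata else df


df = load_summary()
df_explicit, meta = load_summary(with_metadata=True)
```

Since attrs is experimental, it should not be assumed to survive every pandas operation, which is one argument for the explicit tuple variant.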

On the webviz side, the pandas DataFrame.to_parquet() call used for portables doesn't support metadata directly, but according to this article, combining a df and a JSON-like metadata dict in a single parquet file doesn't seem too hard. A feature for combining dfs with metadata in our portables is in any case something I am sure we can benefit from (unless reading the parquet back into pandas becomes a lot slower).

I think this could be a major gain in build time, and possibly also in memory usage during build, for apps using data from UNSMRY.

asnyv added the enhancement 🚀 and Data input labels Jan 13, 2021
asnyv added this to Backlog 📝 in Webviz via automation Jan 13, 2021
asnyv (Collaborator, Author) commented Jan 13, 2021

An alternative path to quicker and more memory-efficient aggregation of metadata from SMSPEC is to solve equinor/ecl/issues/796 and use that in fmu-ensemble / ecl2df, with an implementation of the aggregation close to the current one (i.e. a separate function for metadata, as in fmu-ensemble today).

anders-kiaer (Collaborator) commented Jun 28, 2021

Today, from the webviz side, we probably want to let the .arrow dump solve efficient metadata aggregation "automatically" (simply by being a good format for arbitrary reads). 🚀

Webviz automation moved this from Backlog 📝 to Done 🏁 Jun 28, 2021
VincentNevermore pushed a commit to VincentNevermore/webviz-subsurface that referenced this issue Jul 19, 2022
Projects: Archived in project Webviz (Done 🏁)