Implement tool for saved Keras model file inspection, diff, and patching. #19705

pmasousa · 2024-05-10T14:28:02Z

Hello!
I saw this feature on "🚀 Contributing to Keras 🚀" and I want to know if I can start developing it.
The tool can:

Take a fname.keras file and display the manifest of its contents (including weights file structure)
Take a fname.weights.h5 and display the manifest of its content.
Diff two weights files, highlighting what's in one and not in the other.
Patch a file, by replacing a given weight with a different value provided by the user.
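As an illustration of the diff feature listed above, here is a minimal sketch. It assumes each weights file has already been reduced to a flat manifest, a plain dict mapping a weight path to a `(shape, dtype)` tuple. The function name `diff_manifests`, the manifest layout, and the example paths are hypothetical and not part of Keras.

```python
def diff_manifests(left, right):
    """Compare two {weight_path: (shape, dtype)} manifests.

    Returns the paths present on one side only, plus the paths present
    on both sides whose (shape, dtype) entries differ.
    """
    left_keys, right_keys = set(left), set(right)
    return {
        "only_in_left": sorted(left_keys - right_keys),
        "only_in_right": sorted(right_keys - left_keys),
        "mismatched": sorted(
            key for key in left_keys & right_keys if left[key] != right[key]
        ),
    }


# Hypothetical manifests, for illustration only.
manifest_a = {
    "layers/dense/vars/0": ((4, 8), "float32"),
    "layers/dense/vars/1": ((8,), "float32"),
}
manifest_b = {
    "layers/dense/vars/0": ((4, 16), "float32"),
    "layers/dense_1/vars/0": ((16, 2), "float32"),
}
report = diff_manifests(manifest_a, manifest_b)
```

Keeping the comparison on plain dicts separates it from file I/O, so the same function could serve both `.keras` and `.weights.h5` inputs once a manifest has been extracted.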

fchollet · 2024-05-12T00:48:00Z

Sure, you are welcome to work on that. Do you have any experience with web development? I'm thinking this tool may benefit from an interactive js/html interface to be used in a notebook.

pmasousa · 2024-05-12T17:00:46Z

I'm in my third year of a computer science and engineering degree, where we have used JS and HTML in several projects, and I also have some web development experience from past and current side projects. I will be doing this with @pedro-curto, who is in the same year and at the same university as I am.
I also agree that this tool would be better with an interface so I'll be glad to do it if you agree.

fchollet · 2024-05-12T17:33:34Z

Sounds great.

Here's an example of a draft I wrote a long time ago, displaying a kind of summary of a file's content:

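import json
import zipfile

# Assumed import locations for the Keras-internal helpers used below.
from keras.src.saving.saving_lib import (
    _CONFIG_FILENAME,
    _METADATA_FILENAME,
    _VARS_FNAME,
    H5IOStore,
)
from keras.src.saving.serialization_lib import deserialize_keras_object


def inspect_file(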
    filepath, reference_model=None, custom_objects=None, print_fn=print
):
    filepath = str(filepath)
    if filepath.endswith(".keras"):

        with zipfile.ZipFile(filepath, "r") as zf:
            print_fn(f"Keras model file '{filepath}'")

            with zf.open(_CONFIG_FILENAME, "r") as f:
                config = json.loads(f.read())
                print_fn(
                    f"Model: {config['class_name']} name='{config['config']['name']}'"
                )
            if reference_model is None:
                reference_model = deserialize_keras_object(
                    config, custom_objects=custom_objects
                )

            with zf.open(_METADATA_FILENAME, "r") as f:
                metadata = json.loads(f.read())
                print_fn(f"Saved with Keras {metadata['keras_version']}")
                print_fn(f"Date saved: {metadata['date_saved']}")

            # Reuse the already-open archive instead of opening the zip again.
            weights_store = H5IOStore(
                _VARS_FNAME + ".h5", archive=zf, mode="r"
            )
            print_fn("Weights file:")
            inspect_nested_dict(weights_store.h5_file, print_fn, prefix="    ")

    elif filepath.endswith(".weights.h5"):
        print_fn(f"Keras weights file '{filepath}'")
        # A standalone weights file is opened directly, with no zip archive.
        weights_store = H5IOStore(filepath, mode="r")
        inspect_nested_dict(weights_store.h5_file, print_fn)

    else:
        raise ValueError(
            "Invalid filename: expected a `.keras` or `.weights.h5` extension. "
            f"Received: filepath={filepath}"
        )


def inspect_nested_dict(store, print_fn=print, prefix=""):
    for key in store.keys():
        value = store[key]

        if hasattr(value, "keys"):
            skip = False
            if (
                list(value.keys()) == ["vars"]
                and len(value["vars"].keys()) == 0
            ):
                skip = True
            if key == "vars" and len(value.keys()) == 0:
                skip = True
            if not skip:
                print_fn(f"{prefix}{key}")
                inspect_nested_dict(value, print_fn, prefix=prefix + "    ")
                if key == "vars":
                    for k in value.keys():
                        w = value[k]
                        print_fn(f"{prefix}    {k}: {w.shape} {w.dtype}")

(It relies on objects from keras/src/saving/saving_lib.py, like H5IOStore).

I think we want the following features:

Show the contents of a file, down to visualizing weight variables as a series of color grids. The interface should make this easy: at first you only see the list of top-level layers, but you can click on any of them to expand their contents, etc. Finally you can click on a weight tensor to visualize it. HTML+JS is a great fit for this, compared to the command line.
Show the diff compared to a reference_model. Highlight any incompatibilities or differences between the saved file contents and the structure of the reference model.
Offer a way to rename a weight or layer, or delete one, or add one -- saving a new edited file as a result. This could be done with an interactive interface.
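The click-to-expand behavior described in the first feature can be approximated without any JavaScript by using nested HTML `<details>` elements. The sketch below is only one possible approach. It assumes the file contents have already been read into a nested dict whose leaves are short description strings. The function name `render_tree` and the example tree are hypothetical.

```python
import html


def render_tree(tree):
    """Render a nested dict as nested, collapsible <details> elements.

    Dict values become expandable nodes; any other value is rendered
    as a leaf line of the form "key: value".
    """
    parts = []
    for key, value in tree.items():
        label = html.escape(str(key))
        if isinstance(value, dict):
            parts.append(
                f"<details><summary>{label}</summary>"
                f"{render_tree(value)}</details>"
            )
        else:
            parts.append(f"<div>{label}: {html.escape(str(value))}</div>")
    return "".join(parts)


# Hypothetical file structure, for illustration only.
example_tree = {"dense": {"vars": {"0": "(4, 8) float32"}}}
markup = render_tree(example_tree)
```

In a notebook, the resulting string could be shown with `IPython.display.HTML(markup)`. Every node starts collapsed, so only the top-level layers are visible at first. Visualizing a tensor on click would still need JavaScript or a separate rendering step.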

What do you think?

pmasousa · 2024-05-12T21:49:00Z

That sounds fantastic!
The template already helps a lot. May I reach out to you if I have any further questions?
Can you add @pedro-curto as a participant to this issue?

fchollet · 2024-05-12T23:21:07Z

Sure, you can just ask questions in this thread.

pmasousa · 2024-05-20T22:50:55Z

Hi @fchollet,
We would like some feedback on our current data visualization and structure.
Here is a Colab notebook with what we have so far for the first feature, along with some test scripts to visualize the output.
Is this what you had in mind?
Additionally, we're struggling to find a robust way to handle aspect ratios for the graphs so that they work well with all types of dimensions. Do you have any suggestions or best practices for this?

Thank you very much for your assistance.
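One possible way to approach the aspect-ratio question above, offered as a sketch and not as an answer from the thread: flatten each tensor to 2D by keeping its last axis as columns, then fall back to a near-square grid when the result is too elongated. The function name `grid_shape` and the `max_aspect` threshold are hypothetical choices.

```python
import math


def grid_shape(shape, max_aspect=4.0):
    """Pick a (rows, cols) display grid for a tensor of the given shape.

    The last axis is kept as the column axis when the resulting grid is
    not too elongated; otherwise the values are rewrapped into a
    near-square grid, which may need padding cells at the end.
    """
    size = math.prod(shape)  # math.prod(()) is 1, which covers scalars
    if size == 0:
        return (0, 0)
    cols = shape[-1] if shape else 1
    rows = size // cols
    if max(rows, cols) / min(rows, cols) <= max_aspect:
        return (rows, cols)
    cols = math.ceil(math.sqrt(size))
    rows = math.ceil(size / cols)
    return (rows, cols)


bias_grid = grid_shape((1000,))     # a long vector is rewrapped
kernel_grid = grid_shape((64, 128))  # a moderate matrix is kept as-is
```

When the grid is rewrapped, `rows * cols` can exceed the number of values, so the trailing cells would have to be padded (for example with NaN) before plotting.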

github-actions bot assigned sachinprasadhs · May 10, 2024

sachinprasadhs added labels type:feature, keras-team-review-pending · May 10, 2024

sachinprasadhs added label stat:contributions welcome and removed label keras-team-review-pending · May 13, 2024
