Support for many new model output types #93

abigailgold · 2024-05-02T12:20:10Z

Including multi-label (output) models.
Note that not all model output types are supported in all modules and methods.


andersonm-ibm

Some readability suggestions

andersonm-ibm · 2024-06-10T13:06:53Z

apt/utils/models/keras_model.py

@@ -65,7 +63,7 @@ def predict(self, x: Dataset, **kwargs) -> OUTPUT_DATA_ARRAY_TYPE:
 :return: Predictions from the model as numpy array (class probabilities, if supported).
 """
 predictions = self._art_model.predict(x.get_samples(), **kwargs)
- check_correct_model_output(predictions, self.output_type)
+ # check_correct_model_output(predictions, self.output_type)


Why is this commented out?

andersonm-ibm · 2024-06-10T13:17:58Z

apt/utils/models/model.py

@@ -12,9 +13,16 @@


 class ModelOutputType(Enum):


Why not inherit from Flag instead?
Then you could use far fewer members for the variations (e.g. CLASSIFIER, MULTI, BINARY, LOGITS), and the checks in the rest of the code would be much easier.
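A minimal sketch of what a Flag-based enum could look like. The four member names are taken from the suggestion above; the comments on what each flag means, the combined value, and the helper body are assumptions for illustration, not the toolkit's actual definitions.

```python
from enum import Flag, auto


class ModelOutputType(Flag):
    """Hypothetical Flag-based variant; member names follow the review suggestion."""
    CLASSIFIER = auto()
    MULTI = auto()    # assumed meaning: multi-label (multi-output) model
    BINARY = auto()   # assumed meaning: binary labels
    LOGITS = auto()   # assumed meaning: raw logits rather than probabilities


# Variations are composed with | instead of being spelled out as separate members.
multi_label_binary_logits = (
    ModelOutputType.CLASSIFIER
    | ModelOutputType.MULTI
    | ModelOutputType.BINARY
    | ModelOutputType.LOGITS
)


def is_multi_label(output_type: ModelOutputType) -> bool:
    # With flags, the check is a plain membership test.
    return ModelOutputType.MULTI in output_type
```

With this shape, a check such as `ModelOutputType.BINARY in self.output_type` can replace a chain of equality comparisons against individual members.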

andersonm-ibm · 2024-06-10T13:20:35Z

apt/utils/models/model.py



-def get_nb_classes(y: OUTPUT_DATA_ARRAY_TYPE) -> int:
+def is_multi_label(output_type: ModelOutputType) -> bool:


This whole section would be redundant if ModelOutputType inherited from Flag.

andersonm-ibm · 2024-06-10T13:29:16Z

apt/utils/models/model.py

 :return: the score as float (for classifiers, between 0 and 1)
 """
- raise NotImplementedError
+ predictions = kwargs['predictions'] if 'predictions' in kwargs else None


Here and in the following lines you can use kwargs.get('predictions') and its variations.
It returns the value for the key if the key is in the dictionary, else the default. If no default is given, it returns None.
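A small illustration of the equivalence. The two function names are hypothetical and exist only to compare the patterns side by side.

```python
def read_predictions_verbose(**kwargs):
    # Pattern from the diff: explicit membership test with a None fallback.
    return kwargs['predictions'] if 'predictions' in kwargs else None


def read_predictions(**kwargs):
    # Equivalent and shorter: dict.get returns None when the key is missing.
    return kwargs.get('predictions')


def read_predictions_with_default(**kwargs):
    # A second argument to dict.get supplies a default other than None.
    return kwargs.get('predictions', [])
```

Both of the first two functions return the same value whether or not `predictions` is passed.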

andersonm-ibm · 2024-06-10T13:41:03Z

apt/utils/models/model.py

+ if scoring_method == ScoringMethod.ACCURACY:
+ if not is_multi_label(self.output_type) and not is_binary(self.output_type) and nb_classes is not None:
+ y = check_and_transform_label_format(y, nb_classes=nb_classes)
+ if (self.output_type == ModelOutputType.CLASSIFIER_SINGLE_OUTPUT_CLASS_PROBABILITIES


Again, a Flag-based ModelOutputType would make this check easier.

andersonm-ibm · 2024-06-10T13:54:10Z

apt/utils/models/model.py


 if y_train_pred is not None and len(y_train_pred.shape) == 1:
- self._nb_classes = get_nb_classes(y_train_pred)
+ # self._nb_classes = get_nb_classes(y_train_pred, self.output_type)


These commented-out lines can be removed, right?

andersonm-ibm · 2024-06-10T14:07:59Z

apt/utils/models/model.py

 def score(self, test_data: Dataset, **kwargs):
 """
 Score the model using test data.

 :param test_data: Test data.
- :type train_data: `Dataset`
+ :type test_data: `Dataset`
+ :param predictions: Model predictions to score. If provided, these will be used instead of calling the model's


Why are these parameters described only in the method documentation instead of being defined in the API itself (the method signature)?
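One way this could look is a keyword-only parameter in the signature, so a misspelled argument fails loudly instead of being silently swallowed by **kwargs. The class below is a hypothetical stand-in with a simplified (x, y) interface and a plain accuracy score; it is not the toolkit's wrapper or its Dataset type.

```python
from typing import Callable, Optional, Sequence


class SketchModel:
    """Hypothetical stand-in for a model wrapper, for illustration only."""

    def __init__(self, predict_fn: Callable[[Sequence[int]], Sequence[int]]):
        self._predict_fn = predict_fn

    def score(self, x: Sequence[int], y: Sequence[int], *,
              predictions: Optional[Sequence[int]] = None) -> float:
        """
        Return accuracy on (x, y).

        :param predictions: Precomputed predictions. If provided, the model is not called.
        """
        if predictions is None:
            predictions = self._predict_fn(x)
        correct = sum(int(p == t) for p, t in zip(predictions, y))
        return correct / len(y)
```

Calling `score(x, y, preds=...)` on this sketch raises a TypeError, whereas a **kwargs-based signature would ignore the unknown name.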

andersonm-ibm · 2024-06-10T14:18:44Z

tests/test_pytorch.py

+
+def test_pytorch_predictions_multi_label_cat():
+ # This kind of model requires special training and will not be supported using the 'fit' method.
+ class multi_label_cat_model(nn.Module):


Class name should be MultiLabelCatModel

andersonm-ibm · 2024-06-10T14:19:54Z

tests/test_pytorch.py

+
+
+def test_pytorch_predictions_multi_label_binary():
+ class multi_label_binary_model(nn.Module):


Again, class names should normally use the CapWords convention, unless they are primarily used as a callable.


abigailgold added 10 commits February 12, 2024 09:45

Initial version of general model wrappers and methods supporting multi-label classifiers

f197199

Signed-off-by: abigailt <[email protected]>

Initial support+test for pytorch multi-label binary classifier

79534b6

Signed-off-by: abigailt <[email protected]>

New model output types + single implementation of score method that supports multiple output types. Existing tests pass. Still need more tests for new types.

5e19d4a

Signed-off-by: abigailt <[email protected]>

Working example of anonymization with pytorch multi-output binary model

076503b

Signed-off-by: abigailt <[email protected]>

Support for multi-label binary models in minimizer. First test with pytorch model passing.

7e34f0d

Signed-off-by: abigailt <[email protected]>

Support for multi-label logits/probabilities

8b8b461

Signed-off-by: abigailt <[email protected]>

Test for sklearn (currently not passing due to ART dependency)

b3f8762

Signed-off-by: abigailt <[email protected]>

Add tests that check the transformed is identical when no generalizations

aa65f0f

Signed-off-by: abigailt <[email protected]>

Add tests for single label binary pytorch models

a8ec87f

Signed-off-by: abigailt <[email protected]>

Tests and support for additional model output types

0f5a1bc

Signed-off-by: abigailt <[email protected]>

abigailgold requested a review from andersonm-ibm May 2, 2024 12:20

abigailgold added 2 commits May 2, 2024 17:04

Formatting

a481687

Signed-off-by: abigailt <[email protected]>

Remove check of correct shape of predictions which becomes too complicated with the new output types supported.

846de0f

Signed-off-by: abigailt <[email protected]>

andersonm-ibm requested changes Jun 10, 2024


Addressing review comments

2895b40

Signed-off-by: abigailt <[email protected]>
