AIF 360 for images #502

Open
AmnaSaghi opened this issue Nov 13, 2023 · 1 comment

Comments

@AmnaSaghi

How can I use this tool for an image dataset that has class imbalance?

@ManasBhole

Step-by-Step Approach to Using AIF360 for Image Datasets with Class Imbalance

  1. Preparation and Preprocessing
    Image Data Conversion: Image datasets need to be converted into a numerical format that can be processed by AI algorithms. This usually involves resizing images to a uniform size, normalizing pixel values, and possibly extracting features that represent the images in a lower-dimensional space if direct pixel analysis is impractical.
    Dataset Representation: For AIF360, data needs to be represented in a structured format, like a table, where each row corresponds to an image (or features extracted from an image) and columns represent the features plus a target variable indicating the class and potentially one or more columns indicating sensitive attributes (e.g., attributes that are imbalanced).

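For instance, the conversion step could look like the minimal sketch below. It assumes the images are files on disk and simply flattens normalized grayscale pixels into a feature vector; in practice you would more likely use embeddings from a pretrained network. The helper name images_to_table and the column names are placeholders.

import numpy as np
import pandas as pd
from PIL import Image

def images_to_table(image_paths, labels, sensitive_values, size=(64, 64)):
    # Resize each image, convert to grayscale, normalize pixel values to [0, 1],
    # and flatten it into a fixed-length feature vector.
    rows = []
    for path in image_paths:
        img = Image.open(path).convert('L').resize(size)
        rows.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)

    df = pd.DataFrame(rows, columns=[f'pix_{i}' for i in range(size[0] * size[1])])
    df['label'] = labels                          # target class per image
    df['sensitive_attribute'] = sensitive_values  # e.g. gender or group membership
    return df
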
  2. Bias Analysis
    Quantifying Bias: Utilize AIF360’s metrics to quantify the extent of bias in the dataset. For class imbalance, metrics that compare the rate of positive outcomes across different groups (e.g., the classes of interest) are particularly relevant.

from aif360.datasets import StandardDataset
from aif360.metrics import BinaryLabelDatasetMetric

def to_aif360_format(X, y, protected_attribute_name, privileged_classes):
    # Wrap the tabular representation of the image data (features, label,
    # sensitive attribute) in an AIF360 StandardDataset.
    return StandardDataset(...)

dataset = to_aif360_format(X_train, y_train,
                           protected_attribute_name='sensitive_attribute',
                           privileged_classes=[1])

# Compare the rate of favorable outcomes between the privileged and
# unprivileged groups in the training data.
metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=[{'sensitive_attribute': 0}],
                                  privileged_groups=[{'sensitive_attribute': 1}])
print("Statistical Parity Difference:", metric.statistical_parity_difference())
print("Disparate Impact:", metric.disparate_impact())
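One possible way to fill in the to_aif360_format helper is sketched below. It assumes X is a pandas DataFrame that already contains the sensitive-attribute column (as in the table built in step 1) and that label value 1 is the favorable class; the 'label' column name is a placeholder.

from aif360.datasets import StandardDataset

def to_aif360_format(X, y, protected_attribute_name, privileged_classes):
    # Combine features and labels into one DataFrame, then wrap it in a
    # StandardDataset. StandardDataset expects one list of privileged values
    # per protected attribute, hence the extra list around privileged_classes.
    df = X.copy()
    df['label'] = y
    return StandardDataset(df,
                           label_name='label',
                           favorable_classes=[1],
                           protected_attribute_names=[protected_attribute_name],
                           privileged_classes=[privileged_classes])
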
  3. Bias Mitigation Techniques
    Selecting a Technique: Choose a bias mitigation technique suitable for class imbalance. For instance, Reweighing is a preprocessing technique that can adjust the weights of classes to make the dataset more balanced.
    Application to Image Data: When applying bias mitigation, it's crucial to ensure that the technique is compatible with the way image data is represented. For preprocessing techniques like reweighing, this typically involves applying the technique to the structured representation of the image dataset.
from aif360.algorithms.preprocessing import Reweighing

# Reweighing computes a weight for each (group, label) combination so that
# the weighted training data is balanced across privileged and unprivileged groups.
rw = Reweighing(unprivileged_groups=[{'sensitive_attribute': 0}],
                privileged_groups=[{'sensitive_attribute': 1}])
dataset_transf_train = rw.fit_transform(dataset)
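Note that Reweighing does not change the features themselves; it attaches a per-sample weight in dataset_transf_train.instance_weights. How those weights are consumed depends on the learner. As one hedged example with a scikit-learn classifier (any model that accepts sample weights would work similarly):

from sklearn.linear_model import LogisticRegression

# Train on the same features, letting the per-instance weights produced by
# Reweighing compensate for the imbalance.
clf = LogisticRegression(max_iter=1000)
clf.fit(dataset_transf_train.features,
        dataset_transf_train.labels.ravel(),
        sample_weight=dataset_transf_train.instance_weights)
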
  4. Model Training and Evaluation
    Training: Train your model on the bias-mitigated dataset, i.e., use the transformed dataset, in which the effects of class imbalance have been reduced, as the input for model training.
    Fairness Evaluation: After training, evaluate your model not only on its predictive performance but also on fairness metrics to assess whether the bias mitigation has been effective (a sketch follows below).

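As a rough sketch of the evaluation step, continuing the example above and assuming a held-out test set dataset_test in the same StandardDataset format (dataset_test is an assumption, as is reusing the clf trained in the previous sketch), the fairness check could use AIF360's ClassificationMetric:

from aif360.metrics import ClassificationMetric

# Copy the test set and overwrite its labels with the model's predictions.
dataset_pred = dataset_test.copy(deepcopy=True)
dataset_pred.labels = clf.predict(dataset_test.features).reshape(-1, 1)

metric = ClassificationMetric(dataset_test, dataset_pred,
                              unprivileged_groups=[{'sensitive_attribute': 0}],
                              privileged_groups=[{'sensitive_attribute': 1}])
print("Disparate Impact:", metric.disparate_impact())
print("Equal Opportunity Difference:", metric.equal_opportunity_difference())
print("Average Odds Difference:", metric.average_odds_difference())
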
  5. Iterative Refinement
    Iterative Process: Bias mitigation is often an iterative process. Based on the outcomes of the fairness evaluation, you might need to adjust your mitigation strategy, try different bias mitigation techniques, or further preprocess your data.

Example Scenario

Imagine you have an image dataset for facial recognition with an imbalance in gender representation. The steps would involve:

Preprocessing the images to extract features or convert them into a uniform format.
Analyzing the dataset using AIF360 to identify and quantify bias based on gender.
Applying a bias mitigation technique, such as reweighing, to adjust the representation of genders in the training data.
Training a facial recognition model on this adjusted dataset.
Evaluating the model's performance and fairness to ensure that predictive performance does not disproportionately benefit one gender over the other.
