Error calling Module: dataset.SearchDataset #44

Open
TEnsorTHiru opened this issue Oct 26, 2022 · 2 comments
Comments

@TEnsorTHiru commented Oct 26, 2022

I get this error: `Error calling module: dataset.SearchDataset`. I have attached the stack trace for the error, along with the search YAML and the dataset Python script.

Stack Trace

Traceback (most recent call last):
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 529, in _locate
    module = import_module(mod)
  File "/opt/conda/envs/auto/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1011, in _gcd_import
  File "<frozen importlib._bootstrap>", line 950, in _sanity_check
ValueError: Empty module name

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/utils.py", line 61, in call
    type_or_callable = _locate(cls)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 532, in _locate
    raise ImportError(f"Error loading module '{path}'") from e
ImportError: Error loading module 'dataset.SearchDataset'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/auto/bin/autoalbument-search", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/autoalbument/cli/search.py", line 55, in main
    searcher.search()
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/autoalbument/faster_autoaugment/search.py", line 65, in search
    self.trainer.fit(self.model, datamodule=self.datamodule)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 488, in fit
    self.data_connector.prepare_data(model)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 59, in prepare_data
    self.trainer.datamodule.prepare_data()
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py", line 40, in wrapped_fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/autoalbument/faster_autoaugment/datamodule.py", line 23, in prepare_data
    self._instantiate_dataset()
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/autoalbument/faster_autoaugment/datamodule.py", line 73, in _instantiate_dataset
    dataset = instantiate(data_cfg.dataset, transform=transform)
  File "/opt/conda/envs/auto/lib/python3.8/site-packages/hydra/utils.py", line 70, in call
    raise HydraException(f"Error calling '{cls}' : {e}") from e
hydra.errors.HydraException: Error calling 'dataset.SearchDataset' : Error loading module 'dataset.SearchDataset'
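
For context on what this traceback means: Hydra resolves a `_target_` string such as `dataset.SearchDataset` by importing a module prefix of the dotted path and then walking the remaining parts as attributes. A rough sketch of that resolution logic (simplified; the real `hydra._internal.utils._locate` differs in its details):

from importlib import import_module

def locate(path: str):
    """Resolve a dotted path such as 'dataset.SearchDataset' to a Python object."""
    parts = path.split(".")
    # Try progressively shorter module prefixes; once one imports successfully,
    # walk the remaining parts as attributes.
    for split in range(len(parts), 0, -1):
        try:
            obj = import_module(".".join(parts[:split]))
        except ImportError:
            continue
        for attr in parts[split:]:
            obj = getattr(obj, attr)
        return obj
    # If even the first part ('dataset') cannot be imported -- for example,
    # because dataset.py is not on sys.path -- resolution fails with an
    # ImportError like the one in the traceback above.
    raise ImportError(f"Error loading module '{path}'")

An ImportError at the very first prefix (the top-level `dataset` module) is consistent with `dataset.py` not being importable from the directory in which `autoalbument-search` runs.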

Search.yaml

# @package _global_

_version: 2  # An internal value that indicates a version of the config schema. This value is used by
# `autoalbument-search` and `autoalbument-migrate` to upgrade the config to the latest version if necessary.
# Please do not change it manually.


seed: 42 # Random seed. If the value is not null, it will be passed to `seed_everything` -
# https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.utilities.seed.html?highlight=seed_everything


task: classification # Deep learning task. Should either be `classification` or `semantic_segmentation`.


policy_model:
# Configuration for Policy model which is used to augment input images.

  task_factor: 0.1
# Multiplier for classification loss of a model. Faster AutoAugment uses classification loss to prevent augmentations
# from transforming images of a particular class to another class. The authors of Faster AutoAugment use 0.1 as the
# default value.

  gp_factor: 10
# Multiplier for the gradient penalty for WGAN-GP training. 10 is the default value that was proposed in
# `Improved Training of Wasserstein GANs`.

  temperature: 0.05
# Temperature for Relaxed Bernoulli distribution. The probability of applying a certain augmentation is sampled from
# Relaxed Bernoulli distribution (because Bernoulli distribution is not differentiable). With lower values of
# `temperature` Relaxed Bernoulli distribution behaves like Bernoulli distribution. In the paper, the authors
# of Faster AutoAugment used 0.05 as a default value for `temperature`.

  num_sub_policies: 40
# Number of augmentation sub-policies. When an image passes through an augmentation pipeline, Faster AutoAugment
# randomly chooses one sub-policy and uses augmentations from that sub-policy to transform an input image. A larger
# number of sub-policies leads to a more diverse set of augmentations and better performance of a model trained on
# augmented images. However, an increase in the number of sub-policies leads to the exponential growth of a search
# space of augmentations, so you need more training data for Policy Model to find good augmentation policies.

  num_chunks: 4
# Number of chunks in a batch. Faster AutoAugment splits each batch of images into `num_chunks` chunks. Then it
# applies the same sub-policy with the same parameters to each image in a chunk. This parameter controls the tradeoff
# between the speed of augmentation search and diversity of augmentations. Larger `num_chunks` values will lead to
# faster searching but a less diverse set of augmentations. Note that this parameter is used only in the searching
# phase. When you train a model with found sub-policies, Albumentations will apply a distinct set of transformations
# to each image separately.

  operation_count: 4
# Number of consecutive augmentations in each sub-policy. Faster AutoAugment will sequentially apply `operation_count`
# augmentations from a sub-policy to an image. Larger values of `operation_count` lead to better performance of
# a model trained on augmented images. Simultaneously, larger values of `operation_count` affect the speed of search
# and increase the searching time.

#   operations:
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.ShiftRGB
#     shift_r: true
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.ShiftRGB
#     shift_g: true
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.ShiftRGB
#     shift_b: true
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.RandomBrightness
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.RandomContrast
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.Solarize
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.HorizontalFlip
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.VerticalFlip
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.Rotate
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.ShiftX
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.ShiftY
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.Scale
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.CutoutFixedNumberOfHoles
#   - _target_: autoalbument.faster_autoaugment.models.policy_operations.CutoutFixedSize
#   # A list of augmentation operations that will be applied to input data.


classification_model:
# Settings for Classification Model that is used for two purposes:
# 1. As a model that performs classification of input images.
# 2. As a Discriminator for Policy Model.

  _target_: autoalbument.faster_autoaugment.models.ClassificationModel
# Python class for instantiating Classification Model. You can read more about overriding this value
# to use a custom model at https://albumentations.ai/docs/autoalbument/custom_model/

  num_classes: 27
# Number of classes in the dataset. The dataset implementation should return an integer in the range
# [0, num_classes - 1] as a class label of an image.

  architecture: tf_efficientnetv2_s
# Architecture of Classification Model. The default implementation of Classification model in AutoAlbument uses
# models from https://github.com/rwightman/pytorch-image-models/. Please refer to its documentation to get a list of
# available models - https://rwightman.github.io/pytorch-image-models/#list-models-with-pretrained-weights.

  pretrained: true
# Boolean flag that indicates whether the selected model architecture should load pretrained weights or use randomly
# initialized weights.


data:
  #dataset:
  #  _target_: dataset.SearchDataset
  dataset_file: /home/jupyter/thiru/auto/dataset.py
  # Class for instantiating a PyTorch dataset.

  input_dtype: uint8
# The data type of input images. Two values are supported:
# - uint8. In that case, all input images should be NumPy arrays with the np.uint8 data type and values in the range
#   [0, 255].
# - float32. In that case, all input images should be NumPy arrays with the np.float32 data type and values in the
#   range [0.0, 1.0].

  preprocessing: null
# A list of preprocessing augmentations that will be applied to each image before applying augmentations from
# a policy. A preprocessing augmentation should be defined as `key`: `value`, where `key` is the name of augmentation
# from Albumentations, and `value` is a dictionary with augmentation parameters. The found policy will also apply
# those preprocessing augmentations before applying the main augmentations.
#
# Here is an example of an augmentation pipeline that first pads an image to the size 512x512 pixels, then resizes
# the resulting image to the size 256x256 pixels and finally crops a random patch with the size 224x224 pixels.
#
#  preprocessing:
#    - PadIfNeeded:
#        min_height: 512
#        min_width: 512
#    - Resize:
#        height: 256
#        width: 256
#    - RandomCrop:
#        height: 224
#        width: 224
#

  normalization:
    mean: [0.485, 0.456, 0.406]
    std: [0.229, 0.224, 0.225]
# Normalization values for images. For each image, the search pipeline will subtract `mean` and divide by `std`.
# Normalization is applied after transforms defined in `preprocessing`. Note that regardless of `input_dtype`,
# the normalization function will always receive a `float32` input with values in the range [0.0, 1.0], so you should
# define `mean` and `std` values accordingly. ImageNet normalization is used by default.
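#
# For illustration only (not a config option): with `input_dtype: uint8`, a pixel value v is first scaled
# to v / 255.0, and each channel c is then transformed as (v / 255.0 - mean[c]) / std[c]. With the ImageNet
# values above, a mid-gray pixel (128, 128, 128) maps to approximately (0.07, 0.21, 0.43).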


  dataloader:
    _target_: torch.utils.data.DataLoader
    batch_size: 16
    shuffle: true
    num_workers: 8
    pin_memory: true
    drop_last: true
# Parameters for the PyTorch DataLoader. Please refer to the PyTorch documentation for the description of parameters -
# https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader.


searcher:
  _target_: autoalbument.faster_autoaugment.search.FasterAutoAugmentSearcher
# Class for Searcher that is used to discover augmentation policies. You can create your own Searcher to alter
# the behavior of AutoAlbument.


trainer:
  _target_: pytorch_lightning.Trainer
# Configuration for PyTorch Lightning Trainer. You can read more about Trainer and its arguments at
# https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html.

  gpus: 1
# Number of GPUs to train on. Set to `0` or `None` to use CPU for training.
# More detailed description - https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#gpus

  benchmark: true
# If true, enables cudnn.benchmark.
# More detailed description - https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#benchmark

  max_epochs: 20
# Number of epochs to search for augmentation parameters.
# More detailed description - https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#max-epochs

  resume_from_checkpoint: null
# Path to a checkpoint to resume training from. More detailed description -
# https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#resume-from-checkpoint


optim:
  main:
  # Optimizer configuration for the main (either Classification or Semantic Segmentation) Model
    _target_: torch.optim.Adam
    lr: 1e-3
    betas: [0, 0.999]

  policy:
  # Optimizer configuration for Policy Model
    _target_: torch.optim.Adam
    lr: 1e-3
    betas: [0, 0.999]


callbacks:
# A list of PyTorch Lightning callbacks. Documentation on callbacks is available at
# https://pytorch-lightning.readthedocs.io/en/stable/extensions/callbacks.html

- _target_: autoalbument.callbacks.MonitorAverageParameterChange
# Prints the "Average Parameter Change" metric at the end of each epoch.
# Read more about this metric at https://albumentations.ai/docs/autoalbument/metrics/#average-parameter-change

- _target_: autoalbument.callbacks.SavePolicy
# Saves augmentation policies at the end of each epoch. You can load saved policies with Albumentations to create
# an augmentation pipeline.

- _target_: pytorch_lightning.callbacks.ModelCheckpoint
  save_last: true
  dirpath: checkpoints
# Saves a checkpoint at the end of each epoch. The checkpoint will contain all the necessary data to resume training.
# More information about this checkpoint -
# https://pytorch-lightning.readthedocs.io/en/latest/extensions/generated/pytorch_lightning.callbacks.ModelCheckpoint.html


logger:
# Configuration for a PyTorch Lightning logger.
# You can read more about loggers at https://pytorch-lightning.readthedocs.io/en/stable/extensions/logging.html
# By default, TensorBoardLogger is used.

  _target_: pytorch_lightning.loggers.TensorBoardLogger
  save_dir: ${config_dir:}/outputs/${now:%Y-%m-%d}/${now:%H-%M-%S}/tensorboard_logs


hydra:
  run:
    dir: ${config_dir:}/outputs/${now:%Y-%m-%d}/${now:%H-%M-%S}
  # Path to the directory that will contain all outputs produced by the search algorithm. `${config_dir:}` contains
  # path to the directory with the `search.yaml` config file. Please refer to the Hydra documentation for more
  # information - https://hydra.cc/docs/configure_hydra/workdir.


Dataset.py

import torch.utils.data
import pandas as pd
import cv2

class SearchDataset(torch.utils.data.Dataset):

    def __init__(self, transform=None):
        self.csv = pd.read_csv("folds.csv")
        self.transform = transform
        # folds.csv is expected to provide image paths ("paths") and integer class labels ("cind")

    def __len__(self):
        return len(self.csv)

    def __getitem__(self, index):
        # Implement logic to get an image and its label using the received index.
        #
        # `image` should be a NumPy array with the shape [height, width, num_channels].
        # If an image contains three color channels, it should use an RGB color scheme.
        #
        # `label` should be an integer in the range [0, model.num_classes - 1] where `model.num_classes`
        # is a value set in the `search.yaml` file.

        image = cv2.imread(self.csv.loc[index, "paths"])
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        label = self.csv.loc[index, "cind"]

        if self.transform is not None:
            transformed = self.transform(image=image)
            image = transformed["image"]

        return image, label
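
As a quick standalone sanity check (not part of AutoAlbument; it assumes `folds.csv` with `paths` and `cind` columns is present in the working directory), the dataset can be exercised like this:

import albumentations as A
from dataset import SearchDataset  # assumes the file above is saved as dataset.py

# A stand-in transform; during the search AutoAlbument passes in its own transform.
transform = A.Compose([A.Resize(height=224, width=224)])

ds = SearchDataset(transform=transform)
image, label = ds[0]
# Expect an RGB uint8 array of shape (224, 224, 3) and a label in [0, 26]
# (num_classes is set to 27 in search.yaml).
print(image.shape, image.dtype, label)
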
@danielmatwicki commented

Hi, did you figure out how to fix the error? I'm facing the same problem.

@saigontrade88 commented

There might be a syntax error somewhere in your code. You need to set the HYDRA_FULL_ERROR environment variable to 1 to see the full stack trace:

export HYDRA_FULL_ERROR=1
