Notifying a trait with a DataFrame instance throws Value Error #756

Open
jp-schneider opened this issue Jul 7, 2022 · 3 comments

@jp-schneider

jp-schneider commented Jul 7, 2022

Hey there,

First of all, thank you for this amazing library!

I noticed that linking multiple objects with the link function fails when the value of a trait is a DataFrame.
It looks to me like the comparison logic overridden by pandas is causing the problem.
Here is an example:

import traitlets
from traitlets import link, directional_link
import pandas as pd

class SomeClass(traitlets.HasTraits):
    df = traitlets.Instance(klass=pd.DataFrame, allow_none=True)

foo = SomeClass()
baz = SomeClass()
bar = SomeClass()


# Will not work
link((foo, "df"), (baz, "df"))
foo.df = pd.DataFrame() # Throws ValueError

Stacktrace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
traitlets_dataframe.ipynb Cell 4' in <cell line: 5>()
      2 link((foo, "df"), (baz, "df"))
      4 # Throws ValueError
----> 5 foo.df = pd.DataFrame()

File .\lib\site-packages\traitlets\traitlets.py:712, in TraitType.__set__(self, obj, value)
    710     raise TraitError('The "%s" trait is read-only.' % self.name)
    711 else:
--> 712     self.set(obj, value)

File .\lib\site-packages\traitlets\traitlets.py:701, in TraitType.set(self, obj, value)
    697     silent = False
    698 if silent is not True:
    699     # we explicitly compare silent to True just in case the equality
    700     # comparison above returns something other than True/False
--> 701     obj._notify_trait(self.name, old_value, new_value)

File .\lib\site-packages\traitlets\traitlets.py:1371, in HasTraits._notify_trait(self, name, old_value, new_value)
   1370 def _notify_trait(self, name, old_value, new_value):
-> 1371     self.notify_change(
   1372         Bunch(
   1373             name=name,
   1374             old=old_value,
   1375             new=new_value,
   1376             owner=self,
   1377             type="change",
   1378         )
   1379     )

File .\lib\site-packages\traitlets\traitlets.py:1383, in HasTraits.notify_change(self, change)
   1381 def notify_change(self, change):
   1382     """Notify observers of a change event"""
-> 1383     return self._notify_observers(change)

File .\lib\site-packages\traitlets\traitlets.py:1428, in HasTraits._notify_observers(self, event)
   1425 elif isinstance(c, EventHandler) and c.name is not None:
   1426     c = getattr(self, c.name)
-> 1428 c(event)

File .\lib\site-packages\traitlets\traitlets.py:366, in link._update_target(self, change)
    364 with self._busy_updating():
    365     setattr(self.target[0], self.target[1], self._transform(change.new))
--> 366     if getattr(self.source[0], self.source[1]) != change.new:
    367         raise TraitError(
    368             "Broken link {}: the source value changed while updating "
    369             "the target.".format(self)
    370         )

File .\lib\site-packages\pandas\core\generic.py:1527, in NDFrame.__nonzero__(self)
   1525 @final
   1526 def __nonzero__(self):
-> 1527     raise ValueError(
   1528         f"The truth value of a {type(self).__name__} is ambiguous. "
   1529         "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1530     )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

As you can see, the problem is at traitlets.py:366, because getattr(self.source[0], self.source[1]) != change.new does not return a bool in the case of a DataFrame. pandas compares element-wise and returns another DataFrame, whose truth value is ambiguous.

Would it be possible to make this function compatible with pandas, or possibly define a custom function for comparison?

Thank you in advance!
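To make the failure mode concrete without needing pandas, here is a minimal sketch. `ElementwiseFrame` is a hypothetical stand-in that only mimics the two DataFrame behaviours involved (an element-wise `!=` and an ambiguous truth value), and `values_differ` is an illustrative helper, not part of traitlets. It shows one possible shape for a custom comparison: try `!=` first and fall back to identity when the result cannot be reduced to a bool.

```python
class ElementwiseFrame:
    """Hypothetical stand-in mimicking how a DataFrame compares."""

    def __init__(self, values):
        self.values = list(values)

    def __ne__(self, other):
        # Like pandas: return per-element results instead of a single bool.
        if not isinstance(other, ElementwiseFrame):
            return NotImplemented
        return ElementwiseFrame(a != b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # Like pandas: refuse to collapse to one truth value.
        raise ValueError("The truth value of an ElementwiseFrame is ambiguous.")


def values_differ(old, new):
    """Return True if ``new`` should count as different from ``old``.

    Uses ``!=`` when its result is usable as a bool, otherwise falls back
    to an identity check (an assumption about the desired semantics).
    """
    try:
        return bool(old != new)
    except ValueError:
        return old is not new


frame = ElementwiseFrame([1, 2, 3])
other = ElementwiseFrame([1, 2, 3])

try:
    if frame != other:  # same pattern as the check in link._update_target
        pass
except ValueError as exc:
    print(f"plain != failed: {exc}")

print(values_differ(frame, frame))  # → False (same object)
print(values_differ(frame, other))  # → True (distinct objects, identity fallback)
print(values_differ(1, 2))          # → True (ordinary comparison)
```

Note the trade-off in this sketch: two distinct frames with equal contents count as different, which is stricter than a value comparison but never ambiguous.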

@janhurst

janhurst commented Jul 27, 2022

I've run into the same issue when trying to observe an Instance(klass=pd.DataFrame).

I am not sure this is really a traitlets problem; it looks more like "bad behaviour" of the DataFrame __ne__ function. My interpretation is that traitlets is asking whether the objects are equal, which gets a little messy for non-primitive types. For an Instance, and perhaps some other cases, maybe the check should be more like is not?

I've worked around it for my case by overriding the __ne__ ... that is something like:

import pandas

class MyDataFrame(pandas.DataFrame):
    def __ne__(self, other):
        # Identity-based inequality instead of pandas' element-wise result
        return self is not other

(although this is causing other headaches... )

@rmorshea
Contributor

rmorshea commented Jul 27, 2022

I'd recommend overriding this value-based comparison behavior of link in _update_source and _update_target with an identity-based comparison (i.e. use the is operator). This way you don't have to subclass DataFrame.

If you'd like to make a contribution, it would be better if link had a _should_update method that _update_source and _update_target could call to check if a value has changed. This way you could override this one method in order to implement your desired comparison behavior.
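A rough sketch of what that suggestion could look like as a subclass, assuming the traitlets 5.x internals visible in the traceback above (`updating`, `_busy_updating`, `_transform`) plus `_transform_inv`. These are private attributes and may change between releases. `IdentityLink` and `_should_update` are hypothetical names, not existing traitlets API.

```python
from traitlets import TraitError, link


class IdentityLink(link):
    """``link`` variant whose consistency check uses ``is`` instead of ``!=``."""

    def _should_update(self, current, new):
        # Hypothetical hook, not part of traitlets: True means the value on
        # the originating side was replaced while the link was propagating.
        return current is not new

    def _update_target(self, change):
        if self.updating:
            return
        with self._busy_updating():
            setattr(self.target[0], self.target[1], self._transform(change.new))
            if self._should_update(getattr(self.source[0], self.source[1]), change.new):
                raise TraitError(
                    "Broken link {}: the source value changed while updating "
                    "the target.".format(self)
                )

    def _update_source(self, change):
        if self.updating:
            return
        with self._busy_updating():
            setattr(self.source[0], self.source[1], self._transform_inv(change.new))
            if self._should_update(getattr(self.target[0], self.target[1]), change.new):
                raise TraitError(
                    "Broken link {}: the target value changed while updating "
                    "the source.".format(self)
                )
```

In the example from the original report, replacing `link((foo, "df"), (baz, "df"))` with `IdentityLink((foo, "df"), (baz, "df"))` should avoid the ambiguous comparison, since the check no longer calls `DataFrame.__ne__`. Overriding only `_should_update` in a further subclass would then be enough to plug in a different comparison.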

@Paul-Aime
Copy link

Paul-Aime commented Feb 7, 2024

EDIT (WARNING): This does not work when there are multiple observers on the trait, since the raised error still interrupts the loop over callbacks ...


I found a workaround by creating a dataframe trait type that ignores that particular error:

import pandas as pd
import traitlets


class TraitletsPandasDataFrame(traitlets.Instance):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, klass=pd.DataFrame, **kwargs)

    def __set__(self, obj: traitlets.HasTraits, value) -> None:
        # Ignore error raised by old and new dataframes comparison
        # see https://github.com/ipython/traitlets/issues/756
        try:
            super().__set__(obj, value)
        except ValueError as e:
            if not (
                len(e.args) > 0
                and isinstance(e.args[0], str)
                and e.args[0].startswith("The truth value of a DataFrame is ambiguous.")
            ):
                # Re-raise any other ValueError with its original traceback
                raise


class SomeClass(traitlets.HasTraits):
    df = TraitletsPandasDataFrame()
