BUG: first_valid_index errors on dataframe with only None/NaN values #4912

Open
noloerino opened this issue Aug 31, 2022 · 0 comments
Labels
bug 🦗 Something isn't working P2 Minor bugs or low-priority feature requests pandas concordance 🐼 Functionality that does not match pandas

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Monterey 12.5.1
  • Modin version (modin.__version__): 5ff947b9 (latest master on my machine)
  • Python version: 3.10
  • Code we can use to reproduce:
import modin.pandas as pd
import numpy as np
df = pd.DataFrame({"a": [np.nan] * 100, "b": [np.nan] * 100})
df.first_valid_index()

Exception: IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices. It comes from the index operation on this line in the query compiler.

Describe the problem

Per pandas docs, first_valid_index should have the following behavior:

If all elements are non-NA/null, returns None. Also returns None for empty Series/DataFrame.
We currently do not check for this case, which leads to an exception when None is returned (I'm not sure why the pandas error message for the IndexError lists None as a valid index).
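
For reference, here is a minimal sketch of the behavior I'd expect, run against plain pandas rather than Modin. The `partly_valid` frame is made up for illustration. When a frame has no valid (non-NA) values, or is empty, both methods should return None instead of raising:

```python
import numpy as np
import pandas as pd  # plain pandas, used here as the reference behavior

# Same shape as the reproduction above: every value is NaN.
all_nan = pd.DataFrame({"a": [np.nan] * 100, "b": [np.nan] * 100})
empty = pd.DataFrame()
# Illustrative frame: row 0 is all NaN, rows 1 and 2 each hold one valid value.
partly_valid = pd.DataFrame({"a": [np.nan, 1.0, np.nan], "b": [np.nan, np.nan, 2.0]})

first_all_nan = all_nan.first_valid_index()   # None: no valid row exists
last_all_nan = all_nan.last_valid_index()     # None
first_empty = empty.first_valid_index()       # None
first_partly = partly_valid.first_valid_index()  # label of the first row with any valid value
last_partly = partly_valid.last_valid_index()    # label of the last row with any valid value
```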

This error currently does not affect empty dataframes, as those will default to pandas for this method.

This bug affects last_valid_index and possibly other functions as well; I'll investigate further. I plan to fix this (and other similar issues) along with #4909, since the changes to Map will affect Reduce and TreeReduce operators as well.
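
A rough sketch of the kind of guard I have in mind, not the actual query compiler code. `label_or_none` is a hypothetical helper name, and treating NaN as "nothing found" is an assumption about what a reduction over partial results might produce:

```python
import pandas as pd


def label_or_none(index, position):
    """Hypothetical helper: map a positional result back to an index label.

    Returns None when no valid row was found, instead of indexing with None.
    """
    # Assumption: "no valid value" arrives as None, or possibly as NaN after
    # a reduction, so check for a missing value before indexing.
    if position is None or pd.isna(position):
        return None
    return index[int(position)]


index = pd.Index(["x", "y", "z"])
found = label_or_none(index, 1)       # a valid position maps to its label
missing = label_or_none(index, None)  # no valid position: return None, do not index
```

The same guard would apply to last_valid_index, since both map a position back onto the index in the same way.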

@noloerino noloerino added bug 🦗 Something isn't working pandas concordance 🐼 Functionality that does not match pandas P2 Minor bugs or low-priority feature requests labels Aug 31, 2022
@noloerino noloerino self-assigned this Aug 31, 2022