Different behavior between modin and pandas for isin operation #4618
Comments
@Garra1980 thank you for reporting this issue. I can reproduce it at version 86d3610. The root cause is that when Modin defaults to pandas for the dataframe indexing operation, it converts the indexer to pandas with the wrong dtype. Here's Modin getting the wrong dtype for the indexer when converting to pandas:
import modin.pandas as pd
df = pd.DataFrame(columns=['col1', 'col2'])
modin_indexer = df['col1'].isin(['1','2'])
# Modin dtype is bool
print(modin_indexer.dtype)
# _to_pandas() dtype is object
print(modin_indexer._to_pandas().dtype)
And here is the difference in behavior for the two indexers:
import pandas
pdf = pandas.DataFrame(columns=['col1', 'col2'])
bool_indexer = pandas.Series([], dtype=bool, name='col1')
object_indexer = pandas.Series([], dtype="object", name='col1')
# prints Index(['col1', 'col2'], dtype='object')
print(pdf[bool_indexer].columns)
# prints Index([], dtype='object')
print(pdf[object_indexer].columns)
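To illustrate why the dtype matters, here is a minimal sketch in plain pandas (not Modin internals, and not a proposed fix for Modin itself). It assumes the indexer has already lost its bool dtype, and shows that casting it back to bool makes pandas treat the key as a boolean mask rather than as a list of column labels. The name `mask` is introduced only for this illustration.

```python
import pandas

# Empty frame and an empty object-dtype indexer, mirroring the snippet above.
pdf = pandas.DataFrame(columns=['col1', 'col2'])
object_indexer = pandas.Series([], dtype="object", name='col1')

# Casting back to bool restores boolean-mask semantics, so the columns are kept.
mask = object_indexer.astype(bool)
result = pdf[mask]
print(list(result.columns))
```

With the bool mask, the result keeps both `col1` and `col2`. With the object-dtype indexer, the key is interpreted as an empty list of column labels.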
I will mark this issue as a duplicate of #4605.
Duplicate of #4605
System information
Modin version (modin.__version__):
Another example of a difference between Modin and pure pandas, for the following snippet:
import modin.pandas as pd  # swap for `import pandas as pd` to compare
df = pd.DataFrame(columns=['col1', 'col2'])
df = df[df['col1'].isin(['1','2'])]
print(df)
Describe the problem
modin.pandas will print:
Empty DataFrame
Columns: []
Index: []
default pandas will print:
Empty DataFrame
Columns: [col1, col2]
Index: []
Not sure pandas is super correct here though
Source code / logs