feat(api): move from .case() to .cases() #9096

Open: wants to merge 2 commits into base: main

NickCrews
Contributor

@NickCrews NickCrews commented May 1, 2024

Redo of #7914 (with substantive changes) and #9039 (merely switching the base repo to the correct one, my fork)

Summary of changes:

  • Instead of removing the old .case() APIs, I deprecate them here. A few tests cover them to help ensure we don't break anyone.
  • Added many more test cases, including for dtypes, dshapes, and expected errors on bad construction.
  • Added a test that ibis.null().cases((None, "fill"), else_="not hit") always results in "not hit". Maybe not the best ergonomics, but at least it is consistent and written down. Perhaps we can revisit later. See one of the TODOs below.
  • Fixed a bug where the datashape was determined only from base or cases. It really needs to depend on ALL inputs.
  • Added some tests and implementation for dealing with empty branches: ibis.cases() (results in NULL) and ibis.cases(else_=5) (results in 5). I considered disallowing these, but I don't think there is anything semantically wrong with supporting them.
  • Moved a few tests from the pandas and dask backends to backends/test/test_generic.py so they run on all backends.
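
The datashape fix in the list above can be pictured with a small plain-Python model. This is only a sketch, not the ibis implementation: `Shape` and `result_shape` are hypothetical names, and the rule assumed here is that the result is columnar as soon as any input (base, case values, results, or `else_`) is columnar.

```python
from enum import IntEnum


class Shape(IntEnum):
    """Hypothetical stand-in for an expression's datashape."""

    SCALAR = 0
    COLUMNAR = 1


def result_shape(*input_shapes: Shape) -> Shape:
    """Return the widest shape among ALL inputs.

    Assumption: with no inputs at all (the empty-branch case),
    the result is treated as a scalar.
    """
    return max(input_shapes, default=Shape.SCALAR)


# base and case values are scalars, but one result is a column:
print(result_shape(Shape.SCALAR, Shape.SCALAR, Shape.COLUMNAR).name)  # COLUMNAR
# no inputs at all:
print(result_shape().name)  # SCALAR
```

Looking only at the first one or two inputs would report SCALAR for the first call, which is the kind of bug the bullet describes.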

TODOs that I found that should come in followups:

  • NULL replacement isn't consistent yet. For example, val.substitute({None: 4}) currently does a fillna(). But val.cases((None, 4), else_=val) ALWAYS hits the else_ case, because x = NULL never evaluates to True. The EXCEPTION is clickhouse, which appears to special-case this. See the added test_switch_cases_null test. This also isn't consistent, in that the special-casing only applies to a Python None. If you pass ibis.null(), or something only known at runtime like ibis.literal(5).nullif(5), it will always hit the else_ case. Due to these limitations, I vote for making matching against NULLs out of scope for .cases and .substitute. A user who wants this should do a .fillna() first.
  • The batting table has a column RBI of type int64. On sqlite, .to_pandas() returns it as a column of type object. I have this marked as broken here, but it would be good to fix separately.
  • Literal('foo', type=bool) should error, but doesn't.
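
The NULL behaviour in the first TODO follows from SQL's three-valued logic. The model below is a sketch in plain Python, not ibis code: `NULL`, `sql_eq`, `cases_model`, and `fill_null` are hypothetical helpers, with Python `None` standing in for SQL NULL. It does not model the clickhouse special case.

```python
NULL = None  # stand-in for SQL NULL in this model


def sql_eq(left, right):
    """SQL-style equality: comparing anything with NULL yields NULL (unknown)."""
    if left is NULL or right is NULL:
        return NULL
    return left == right


def cases_model(value, *branches, else_=NULL):
    """Return the result of the first branch whose match value equals `value`.

    A branch is taken only when the comparison is strictly True, so a
    comparison against NULL always falls through to `else_`.
    """
    for match, result in branches:
        if sql_eq(value, match) is True:
            return result
    return else_


def fill_null(value, replacement):
    """Model of filling NULLs before matching."""
    return replacement if value is NULL else value


# Matching NULL against NULL never succeeds:
print(cases_model(NULL, (NULL, "fill"), else_="not hit"))  # not hit
# Filling first, then matching on the sentinel, does:
print(cases_model(fill_null(NULL, -1), (-1, "fill"), else_="not hit"))  # fill
# Empty branches: no else_ gives NULL, otherwise else_ itself.
print(cases_model(3))  # None
print(cases_model(3, else_=5))  # 5
```

The second call mirrors the suggested workaround: replace NULLs with a sentinel first, then match on that sentinel.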

@NickCrews
Contributor Author

NickCrews commented May 1, 2024

EDIT: duh, it's because they don't guarantee row order. Updated the assertions to be order-independent.

Any idea as to why the datafusion, exasol, and risingwave column tests are failing? I still have trouble getting those backends running on my M1 so I can't debug locally very well.
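
An order-independent assertion of the kind mentioned in the EDIT above might look like the following. This is a generic sketch, not the helper used in the PR's tests: `assert_same_rows` is a hypothetical name, and rows are assumed to be hashable once converted to tuples.

```python
from collections import Counter


def assert_same_rows(actual, expected):
    """Assert two row collections are equal as multisets, ignoring order."""
    actual_counts = Counter(map(tuple, actual))
    expected_counts = Counter(map(tuple, expected))
    assert actual_counts == expected_counts, (
        f"rows differ: {actual_counts} != {expected_counts}"
    )


# Same rows in a different order pass:
assert_same_rows([(1, "a"), (2, "b")], [(2, "b"), (1, "a")])
```

Comparing counts rather than sets keeps duplicate rows significant, so a backend that drops or repeats a row still fails the check.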

.else_(nulls)
.end()
)
return self.cases(*enumerate(labels), else_=nulls)
Contributor Author

now that this is such a simple implementation I would consider deprecating and then removing the whole .label() API.
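
To see why the one-line version works, here is what `*enumerate(labels)` expands to, together with a plain-Python model of the lookup. This is a sketch under assumptions: `label_model` is a hypothetical stand-in, not the ibis method, and the label values are made up for illustration.

```python
labels = ["low", "mid", "high"]  # illustrative values

# enumerate() yields (index, label) pairs; unpacking with * passes each
# pair as one positional (match, result) branch.
branches = tuple(enumerate(labels))
print(branches)  # ((0, 'low'), (1, 'mid'), (2, 'high'))


def label_model(bucket, labels, nulls=None):
    """Map a bucket index to its label, falling back to `nulls`."""
    for index, label in enumerate(labels):
        # A missing bucket never matches any index.
        if bucket is not None and bucket == index:
            return label
    return nulls


print(label_model(1, labels))  # mid
print(label_model(None, labels, nulls="missing"))  # missing
```

In this model an out-of-range bucket also falls through to `nulls`, just as it would hit the `else_` branch.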

@NickCrews NickCrews requested a review from cpcloud May 1, 2024 18:05
@NickCrews NickCrews force-pushed the case-to-cases branch 6 times, most recently from d3ac95a to 513bb94 Compare May 7, 2024 17:52
This is setup for ibis-project#9039,
where I change the API of Value.cases(),
so I want to make sure that
this functionality doesn't change, but
the user gets a deprecation warning.
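
The pattern this commit message describes, keeping behaviour unchanged while warning the caller, can be sketched generically. Everything here is hypothetical: `deprecated`, `new_lookup`, and `old_lookup` are illustrative names, not ibis APIs, and the warning category ibis actually uses may differ.

```python
import functools
import warnings


def deprecated(instead):
    """Hypothetical decorator: keep behaviour, but warn on every call."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {instead} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def new_lookup(value, *branches, else_=None):
    """New-style entry point: branches are passed as positional pairs."""
    for match, result in branches:
        if value == match:
            return result
    return else_


@deprecated(instead="new_lookup")
def old_lookup(value, branches, else_=None):
    """Old-style entry point, kept working by delegating to the new one."""
    return new_lookup(value, *branches, else_=else_)
```

A test for such a shim checks two things: the old call still returns the same result, and a warning of the expected category is raised.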
@NickCrews
Contributor Author

@cpcloud gentle nudge for a review here :)

@NickCrews
Contributor Author

@cpcloud anything I can do to help move this forward/easier to review?
