Emphasize that filter(x, A, B) is not strictly the same as filter(filter(x, A), B) #6968

Open
MichaelChirico opened this issue Nov 20, 2023 · 4 comments

@MichaelChirico

MichaelChirico commented Nov 20, 2023

Have had to re-confirm this for myself a few times:

filter(mtcars, cyl < max(cyl), hp < max(hp)) |> dim()
# [1] 18 11

# vs
filter(filter(mtcars, cyl < max(cyl)), hp < max(hp)) |> dim()
# [1] 17 11

This sentence in ?dplyr::filter hints at what's going on:

If multiple expressions are included, they are combined with the & operator.

But this behavior is a bit more subtle / worth calling out IMO. This came up again recently here:

r-lib/lintr#2305 (comment)

FWIW it's also really not clear from reading the filter.data.frame implementation without being well-versed in {dplyr} internals.
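
To spell out what the quoted sentence about & implies, here is a minimal sketch (assuming {dplyr} is installed and using the built-in mtcars data). As far as the documented behavior goes, each comma-separated condition is evaluated against the same, unfiltered data and the results are then combined with &, so max(hp) is taken over all rows in both forms:

```r
library(dplyr)

# Comma-separated conditions: both are evaluated on the full mtcars,
# and the resulting logical vectors are combined with &.
combined <- filter(mtcars, cyl < max(cyl), hp < max(hp))

# Writing the & out explicitly should select the same rows.
explicit <- filter(mtcars, cyl < max(cyl) & hp < max(hp))

nrow(combined)  # 18, matching the dim() output reported in this issue
nrow(explicit)
```

The nested form filter(filter(x, A), B) differs only because B is then evaluated on the already-filtered rows.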

@markolipka

Seems to me that in the second case there will be a different value for 'max(hp)' because of the pre-filtering of the dataset. Therefore the B's in the issue title are not the same in the compared cases.
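
The shift in max(hp) can be checked directly. This is a small base-R sketch on the built-in mtcars data (the variable name pre is only illustrative):

```r
# Threshold seen by the single-call form: max(hp) over all 32 rows.
max(mtcars$hp)  # 335

# Threshold seen by the inner call of the nested form:
# max(hp) over only the rows that survive cyl < max(cyl).
pre <- mtcars[mtcars$cyl < max(mtcars$cyl), ]
max(pre$hp)     # 175

# Applying hp < max(hp) to the pre-filtered rows drops one more car.
nrow(pre)                           # 18
nrow(pre[pre$hp < max(pre$hp), ])   # 17
```

So the second condition compares against 335 in the single call but against 175 in the nested call, which accounts for the 18 versus 17 rows.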

@MichaelChirico

Yes, that's exactly the point.

I'm not saying the behavior is wrong, I'm saying it may not be obvious how filter() will process ... from the current documentation.

@markolipka

If you compare the two cases with identical values, the results are identical:

> filter(mtcars, cyl < 8, hp < 335) |> dim()
[1] 18 11
> # vs
> filter(filter(mtcars, cyl < 8), hp < 335) |> dim()
[1] 18 11

I think there is nothing wrong here...

@MichaelChirico

Not going to waste more of my time engaging when it's clear you haven't read the issue or my response carefully.

@tidyverse tidyverse locked as too heated and limited conversation to collaborators Nov 22, 2023