feat(polars): add more accurate type mapping for timestamps #8954
@@ -518,6 +518,11 @@ def test_timestamp_unit():
    assert dt.Timestamp(scale=scale).unit == TimestampUnit.NANOSECOND


def test_timestamp_unit_raise():
I added this because it seemed like this branch was untested:
ibis/ibis/expr/datatypes/core.py
Lines 619 to 620 in 9355281
    else:
        raise ValueError(f"Invalid scale {self.scale}")
But it turns out this test triggers a raise in a different place, which is confusing.
The Timestamp op has a restriction that only the literal ints from 0 through 10 can be passed as scale, so while it isn't technically unit-tested, it's not possible to set a value outside of that range without raising a SignatureValidationError.
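The interaction described here can be sketched without Ibis. The class below is a hypothetical stand-in, not the real Timestamp datatype: the name SketchTimestamp, the TypeError, and the allowed set of scales (0 through 9 here) are all assumptions for illustration. It only shows why an else-branch raise in unit becomes unreachable once the constructor rejects invalid scales.

```python
from typing import Optional

# Assumption: an illustrative set of valid scales, not taken from Ibis.
ALLOWED_SCALES = frozenset(range(10))


class SketchTimestamp:
    """Hypothetical stand-in for a timestamp dtype; not the Ibis class."""

    def __init__(self, scale: Optional[int] = None):
        # Validation happens up front, so invalid scales never reach `unit`.
        if scale is not None and scale not in ALLOWED_SCALES:
            raise TypeError(f"invalid scale at construction: {scale!r}")
        self.scale = scale

    @property
    def unit(self) -> str:
        if self.scale is None or self.scale == 0:
            return "s"
        elif 1 <= self.scale <= 3:
            return "ms"
        elif 4 <= self.scale <= 6:
            return "us"
        elif 7 <= self.scale <= 9:
            return "ns"
        else:
            # Unreachable through __init__: the constructor already raised.
            raise ValueError(f"Invalid scale {self.scale}")
```

A test that passes an out-of-range scale therefore fails at construction, before unit is ever evaluated, which matches the "raise in a different place" observed above.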
That makes sense, but then why have a code path that raises if it would never be triggered?
I would wager that the raise predates the enforcement of values at the op level.
8bd9479 to 400000e
    ],
)
def test_from_ibis_type_seconds(ibis_dtype, polars_type):
    # we accept seconds as an ibis type, default to ns in polars
@jcrist in the original issue, you suggested maybe defaulting to "ms" when we go from "s" in Ibis to polars. I thought it made sense to keep the behavior we had before for that particular case, which is "ns". Any objections?
It still seems a bit odd to me that when asking for seconds I would get back milliseconds. I say we raise an UnsupportedOperationError in the seconds case, the same as for any other missing functionality.
This will break the current behavior where we ask for seconds and return ns. Just making sure we are ok with breaking that.
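The two options discussed in this thread can be sketched as a small mapping function. This is a hypothetical illustration, not the code in this PR: to_polars_time_unit, the strict flag, and UnsupportedUnitError (standing in for an UnsupportedOperationError-style exception) are assumed names. The only fact taken from polars is that its datetime time units are "ms", "us", and "ns".

```python
class UnsupportedUnitError(Exception):
    """Hypothetical stand-in for an UnsupportedOperationError-style exception."""


# Polars datetimes support exactly these time units.
POLARS_TIME_UNITS = frozenset({"ms", "us", "ns"})


def to_polars_time_unit(ibis_unit: str, *, strict: bool = True) -> str:
    """Map an Ibis timestamp unit string to a polars time unit.

    strict=True raises for seconds (the proposal in this thread);
    strict=False keeps the previous fallback of mapping seconds to "ns".
    """
    if ibis_unit in POLARS_TIME_UNITS:
        # Units that polars supports map to themselves.
        return ibis_unit
    if ibis_unit == "s":
        if strict:
            raise UnsupportedUnitError(
                "polars has no second-resolution datetime type"
            )
        return "ns"  # previous behavior: seconds silently become ns
    raise ValueError(f"unknown timestamp unit: {ibis_unit!r}")
```

With strict=True, callers that previously asked for seconds and received ns would start seeing an exception, which is the behavior change being confirmed here.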
400000e to ad7e1cc
After looking more into this PR, I'm not sure we are correctly casting the values when mapping the timestamps, based on this code: ibis/ibis/backends/polars/compiler.py Lines 157 to 171 in b48f451
That being said, the potential casting issue is separate from the mapping of the types, so I propose opening a separate issue to investigate it and, for now, getting this PR in.
I did a little more digging: on main (as well as on this branch) the casting is broken, see #9091. I'm not sure what the next steps are for this PR.
Polars datetime types have a fixed time unit (us, ns, or ms). Ibis timestamp types do too (but we also have seconds as an option). In the case of polars we're currently mapping all timestamps to ns; this PR makes things a bit more specific and uses proper units.
I'm still not sure how to handle the case of Ibis seconds to maybe ms in polars. I need a bit of help on how the conversion of units happens, more specifically where the value part is handled: we would need to do ibis_seconds * 1000 = polars_ms. At the moment I'm raising, similarly to what we do for durations, but I need to tackle this differently; see all the current failures in CI.
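The value conversion mentioned above (ibis_seconds * 1000 = polars_ms) amounts to rescaling an integer count between units. The sketch below is a minimal, backend-independent illustration; rescale_timestamp and UNITS_PER_SECOND are hypothetical names and do not describe where Ibis or polars actually performs this step.

```python
# Number of ticks of each unit in one second.
UNITS_PER_SECOND = {
    "s": 1,
    "ms": 1_000,
    "us": 1_000_000,
    "ns": 1_000_000_000,
}


def rescale_timestamp(value: int, from_unit: str, to_unit: str) -> int:
    """Rescale an integer epoch offset from one time unit to another."""
    src = UNITS_PER_SECOND[from_unit]
    dst = UNITS_PER_SECOND[to_unit]
    if dst >= src:
        # Going to a finer unit is exact: multiply by the ratio.
        return value * (dst // src)
    # Going to a coarser unit loses precision; floor division rounds down.
    return value // (src // dst)


seconds = 5
millis = rescale_timestamp(seconds, "s", "ms")  # 5 * 1000
```

Converting to a finer unit is lossless, while converting to a coarser one drops the sub-unit remainder, so the direction of the mapping matters when choosing a default target unit.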