-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add ExpressionEvaluator
#363
base: main
Are you sure you want to change the base?
Conversation
@Fokko @liurenjie1024 @sdd |
ExpressionEvaluator
ExpressionEvaluator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with just a small structural suggestion. Thanks @marvinlanhenke!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might want to consolidate some of the test setup for the recently-added visitors into a single place at some point as there is a lot of commonality and and it will keep the test section cleaner and easier to navigate, but I think we can address this in a future clean-up rather than in this PR. Looks great! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marvinlanhenke Thanks for this pr, and all the tests! Generally it looks good to me, but I have concern with reusing the Ord
trait of PrimitiveLiteral
. I think it's incorrect since it should only implement PartialOrd
. Also I think we should only implement PartialOrd
for Datum
and use it in this filter.
|
||
/// Checks if the [`PrimitiveLiteral`] is null. | ||
fn is_null(literal: &PrimitiveLiteral) -> bool { | ||
if let PrimitiveLiteral::Boolean(false) = literal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused here, should not is_null
just check Option<Datum>
is none?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when the DataFile
contains a partition struct like:
let partition = Struct::from_iter([None]);
the Struct
fields will contain a Literal::Primitive(PrimitiveLiteral::Boolean(false))
impl FromIterator<Option<Literal>> for Struct {
fn from_iter<I: IntoIterator<Item = Option<Literal>>>(iter: I) -> Self {
let mut fields = Vec::new();
let mut null_bitmap = BitVec::new();
for value in iter.into_iter() {
match value {
Some(value) => {
fields.push(value);
null_bitmap.push(false)
}
None => {
fields.push(Literal::Primitive(PrimitiveLiteral::Boolean(false)));
null_bitmap.push(true)
}
}
}
Struct {
fields,
null_bitmap,
}
}
}
thats why I check the Result<Datum>
returned by the Accessor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is incorrect, PrimitiveLiteral::Boolean
here is just a place holder, which could also be other things. Otherwise how do you distinguish it from actual value of PrimitiveLiteral::Boolean(false)
. And the Accessor
should return Option<Literal>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise how do you distinguish it from actual value of
PrimitiveLiteral::Boolean(false)
. And theAccessor
should returnOption<Literal>
.
Yeah, this makes sense to me.
Regarding the Accessor
will you open another issue or @sdd something you might want to take a look at?
Since I'm not to familiar with the Accessor
and to verify my understanding; in order to return Option here we should check the null_bitmap
of the Struct
at the position and if the null_bitmap is true we can return None
from the Accessor
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked here: #379
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @marvinlanhenke - I'll fix that Accessor
bug, no problem.
return Ok(false); | ||
} | ||
|
||
Ok(datum.literal() < literal.literal()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is incorrect since PrimitiveLiteral
should be partial order rather full order. My suggestion is to implement PartitionOrd
for Datum
and remove Ord
for PrimitiveLiteral
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about removing Ord
from PrimitiveLiteral
since its also required by Literal
. However, I think I get your point, so impl PartialOrd
for Datum
and then comparing those intstead of the primitive literals makes sense.
Perhaps like this:
impl PartialOrd for Datum {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
if self.r#type != other.r#type {
return None;
}
self.literal.partial_cmp(&other.literal)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why Literal
should be Ord
, what's the order of a struct compared with map? For Datum
, I think we can start with this approach, but with further refinement to compare compatible types, for example float vs double, int vs long, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's best to make the changes as outlined in #378 to unblock this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
Which issue does this PR close?
Closes #358
Rationale for this change
DataFile
s in TableScanWhat changes are included in this PR?
Are these changes tested?
Yes, unit tests for expression evaluator are included.