-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move many udf implementations from invoke
to invoke_batch
#13491
Conversation
…invoke is passed the return type created for the udf instance
19632b5
to
6b3db8c
Compare
Do not yet deprecate `invoke_batch`, add docs to invoke_with_args
# Conflicts: # datafusion/expr/src/udf.rs
# Conflicts: # datafusion/expr/src/udf.rs # datafusion/functions/src/datetime/to_local_time.rs # datafusion/functions/src/utils.rs # datafusion/physical-expr/src/scalar_function.rs
@@ -1954,32 +1954,6 @@ The following intervals are supported: | |||
- years | |||
- century | |||
|
|||
#### Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
./dev/update_function_docs.sh did this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is because you removed the docs (perhaps accidentally) above -- i left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joseph-isaacs -- this looks like a great step forward
I had some suggestions / additional cleanups
THe only thing I think that needs to be fixed before merging this is reverting the loss of the documentation for date_bin
datafusion/expr/src/udf.rs
Outdated
@@ -546,7 +546,7 @@ pub trait ScalarUDFImpl: Debug + Send + Sync { | |||
/// to arrays, which will likely be simpler code, but be slower. | |||
fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue> { | |||
#[allow(deprecated)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cam probably remove this #allow as well
@@ -141,8 +141,8 @@ fn run_with_string_type<M: Measurement>( | |||
), | |||
|b| { | |||
b.iter(|| { | |||
#[allow(deprecated)] // TODO use invoke_batch | |||
black_box(ltrim.invoke(&args)) | |||
#[allow(deprecated)] // TODO use invoke_with_args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise here I think we can remove the #allow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment below
datafusion/functions/src/utils.rs
Outdated
@@ -149,7 +149,8 @@ pub mod test { | |||
let return_type = return_type.unwrap(); | |||
assert_eq!(return_type, $EXPECTED_DATA_TYPE); | |||
|
|||
let result = func.invoke_with_args(datafusion_expr::ScalarFunctionArgs{args: $ARGS, number_rows: cardinality, return_type: &return_type}); | |||
#[allow(deprecated)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like it would be good to keep calling using invoke_with_args
datafusion/functions/src/utils.rs
Outdated
@@ -170,7 +171,8 @@ pub mod test { | |||
} | |||
else { | |||
// invoke is expected error - cannot use .expect_err() due to Debug not being implemented | |||
match func.invoke_with_args(datafusion_expr::ScalarFunctionArgs{args: $ARGS, number_rows: cardinality, return_type: &return_type.unwrap()}) { | |||
#[allow(deprecated)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise here -- invoke_with_args
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of tests use this macro and pass in &[], so it quite a big change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense in a follow up
@@ -1954,32 +1954,6 @@ The following intervals are supported: | |||
- years | |||
- century | |||
|
|||
#### Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is because you removed the docs (perhaps accidentally) above -- i left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @joseph-isaacs -- this is looking close to me, though it looks like a few bugs got introduced. I flagged some of them.
I can try to help tomorrow, but I am out of time today
datafusion/expr/src/udf.rs
Outdated
@@ -330,7 +328,7 @@ where | |||
|
|||
pub struct ScalarFunctionArgs<'a> { | |||
// The evaluated arguments to the function | |||
pub args: &'a [ColumnarValue], | |||
pub args: Vec<ColumnarValue>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the change proposed in ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
@@ -544,7 +548,7 @@ mod tests { | |||
ColumnarValue::Array(timestamps), | |||
ColumnarValue::Scalar(ScalarValue::TimestampNanosecond(Some(1), None)), | |||
], | |||
batch_size, | |||
1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems incorrect -- the batch size was 6 values, not 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look for all of these, but its easy to miss one, all the others that pass an literal value take only scalars.
Here is what I suggest we do with this PR:
Then merge this one in. If you have more time @joseph-isaacs it would be even better to change this PR so all the functions used |
I would have liked to move over to invoke_with_args, however its quite a bit more of an undertaking. Possibly in the future I can try |
Makes sense |
I think there was a transient failure |
I retriggered the tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @joseph-isaacs
Thanks @joseph-isaacs @alamb |
🙏 thank you -- we are making progresss |
Which issue does this PR close?
As part of #13238 invoke was replaced with invoke_batch.
Rationale for this change
This will allow the removal of invoke. And maybe if we are quick we can rename
invoke_with_args
back toinvoke
.What changes are included in this PR?
This PR moves function definitions over to use invoke_batch, so invoke can be removed.
Are these changes tested?
Are there any user-facing changes?