From 3d517f79183ef0176a6bcce7eab14d6ac757487d Mon Sep 17 00:00:00 2001
From: Trent Hauck
Date: Thu, 16 May 2024 21:19:42 -0700
Subject: [PATCH] docs: better info about handling datafusion udfs

---
 docs/source/contributor-guide/adding_a_new_expression.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/source/contributor-guide/adding_a_new_expression.md b/docs/source/contributor-guide/adding_a_new_expression.md
index 1112568069..67f05c0df6 100644
--- a/docs/source/contributor-guide/adding_a_new_expression.md
+++ b/docs/source/contributor-guide/adding_a_new_expression.md
@@ -32,7 +32,9 @@ You may have a specific expression in mind that you'd like to add, but if not, y
 Once you have the expression you'd like to add, you should take inventory of the following:
 
 1. What is the Spark expressions behavior across different Spark versions? These make good test cases, and will inform you of any compatibility issues such as an API change that will have to be addressed.
-2. Check if the expression is already implemented in DataFusion and if the is compatible with the Spark expression. If it is, you can potentially reuse the existing implementation. If it's not, consider an initial version in DataFusion Comet and potentially backport it into DataFusion if the expression would be supported by it.
+2. Check if the expression is already implemented in DataFusion and if the is compatible with the Spark expression.
+   1. If it is, you can potentially reuse the existing implementation though you'll need to add tests to verify compatibility.
+   2. If it's not, consider an initial version in DataFusion Comet and potentially backport it into DataFusion if the expression would be supported by it.
 3. Test cases for the expression. As mentioned, you can refer to Spark's test cases for a good idea of what to test.
 
Once you know what you want to add, you'll need to update the query planner to recognize the new expression in Scala and potentially add a new expression implementation `core/` in Rust.