Commit

Merge branch 'main' into chore/update_approx_size
zhang2014 authored Sep 22, 2023
2 parents ba0c142 + 14819b3 commit 7eed602
Showing 159 changed files with 3,317 additions and 1,695 deletions.
11 changes: 5 additions & 6 deletions Cargo.lock

1 change: 0 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -108,7 +108,6 @@ members = [
sled = { git = "https://github.com/datafuse-extras/sled", tag = "v0.34.7-datafuse.1", default-features = false }
opendal = { version = "0.39", features = [
"layers-minitrace",
"layers-metrics",
"services-ipfs",
"services-moka",
"services-redis",
2 changes: 1 addition & 1 deletion docs/doc/13-sql-reference/99-ansi-sql.md
@@ -95,7 +95,7 @@ Databend aims to conform to the SQL standard, with particular support for ISO/IE
| E121-17 | WITH HOLD cursors | <span class="text-red">No</span> | |
| **E131** | **Null value support (nulls in lieu of values)** | <span class="text-blue">Yes</span> | |
| **E141** | **Basic integrity constraints** | <span class="text-red">No</span> | |
| E141-01 | NOT NULL constraints | <span class="text-blue">Yes</span> | Default in Databend: All columns are non-nullable (NOT NULL). |
| E141-01 | NOT NULL constraints | <span class="text-blue">Yes</span> | Default in Databend: All columns are nullable. |
| E141-02 | UNIQUE constraint of NOT NULL columns | <span class="text-red">No</span> | |
| E141-03 | PRIMARY KEY constraints | <span class="text-red">No</span> | |
| E141-04 | Basic FOREIGN KEY constraint with the NO ACTION default for both referential delete action and referential update action | <span class="text-red">No</span> | |
6 changes: 1 addition & 5 deletions docs/doc/14-sql-commands/00-ddl/50-udf/_category_.json
@@ -1,7 +1,3 @@
{
"label": "User-Defined Function",
"link": {
"type": "generated-index",
"slug": "/sql-commands/ddl/udf"
}
"label": "User-Defined Function"
}
27 changes: 19 additions & 8 deletions docs/doc/14-sql-commands/00-ddl/50-udf/ddl-alter-function.md
@@ -3,22 +3,33 @@ title: ALTER FUNCTION
description:
Modifies the properties for an existing user-defined function.
---
import FunctionDescription from '@site/src/components/FunctionDescription';

<FunctionDescription description="Introduced or updated: v1.2.116"/>

Alters a user-defined function.

## Syntax

```sql
CREATE FUNCTION <name> AS ([ argname ]) -> '<function_definition>'
-- Alter UDF created with lambda expression
ALTER FUNCTION [IF NOT EXISTS] <function_name>
AS (<input_param_names>) -> <lambda_expression>
[DESC='<description>']

-- Alter UDF created with UDF server
ALTER FUNCTION [IF NOT EXISTS] <function_name>
AS (<input_param_types>) RETURNS <return_type> LANGUAGE <language_name>
HANDLER = '<handler_name>' ADDRESS = '<udf_server_address>'
[DESC='<description>']
```

## Examples

```sql
CREATE FUNCTION a_plus_3 AS (a) -> a+3+3;
ALTER FUNCTION a_plus_3 AS (a) -> a+3;

SELECT a_plus_3(2);
+---------+
| (2 + 3) |
+---------+
| 5 |
+---------+
```

```sql
CREATE FUNCTION gcd (INT, INT) RETURNS INT LANGUAGE python HANDLER = 'gcd' ADDRESS = 'http://0.0.0.0:8815';
ALTER FUNCTION gcd (INT, INT) RETURNS INT LANGUAGE python HANDLER = 'gcd_new' ADDRESS = 'http://0.0.0.0:8815';
```
118 changes: 114 additions & 4 deletions docs/doc/14-sql-commands/00-ddl/50-udf/ddl-create-function.md
@@ -3,20 +3,44 @@ title: CREATE FUNCTION
description:
Create a new user-defined scalar function.
---
import FunctionDescription from '@site/src/components/FunctionDescription';

<FunctionDescription description="Introduced or updated: v1.2.116"/>

## CREATE FUNCTION

Creates a new UDF (user-defined function), the UDF can contain an SQL expression.
Creates a user-defined function.

## Syntax

```sql
CREATE FUNCTION [ IF NOT EXISTS ] <name> AS ([ argname ]) -> '<function_definition>'
-- Create with lambda expression
CREATE FUNCTION [IF NOT EXISTS] <function_name>
AS (<input_param_names>) -> <lambda_expression>
[DESC='<description>']


-- Create with UDF server
CREATE FUNCTION [IF NOT EXISTS] <function_name>
AS (<input_param_types>) RETURNS <return_type> LANGUAGE <language_name>
HANDLER = '<handler_name>' ADDRESS = '<udf_server_address>'
[DESC='<description>']
```

| Parameter | Description |
|-----------------------|---------------------------------------------------------------------------------------------------|
| `<function_name>` | The name of the function. |
| `<lambda_expression>` | The lambda expression or code snippet defining the function's behavior. |
| `DESC='<description>'` | Description of the UDF.|
| `<input_param_names>` | A comma-separated list of input parameter names. |
| `<input_param_types>` | A comma-separated list of input parameter types. |
| `<return_type>` | The return type of the function. |
| `LANGUAGE` | Specifies the language used to write the function. Available values: `python`. |
| `HANDLER = '<handler_name>'` | Specifies the name of the function's handler. |
| `ADDRESS = '<udf_server_address>'` | Specifies the address of the UDF server. |

## Examples

### Creating UDF with Lambda Expression

```sql
CREATE FUNCTION a_plus_3 AS (a) -> a+3;

@@ -53,3 +77,89 @@ DROP FUNCTION get_v2;

DROP TABLE json_table;
```

### Creating UDF with UDF Server (Python)

This example demonstrates how to enable and configure a UDF server in Python:

1. Enable UDF server support by adding the following parameters to the [query] section in the [databend-query.toml](https://github.com/datafuselabs/databend/blob/main/scripts/distribution/configs/databend-query.toml) configuration file.

```toml title='databend-query.toml'
[query]
...
enable_udf_server = true
# List the allowed UDF server addresses, separating multiple addresses with commas.
# For example, ['http://0.0.0.0:8815', 'http://example.com']
udf_server_allow_list = ['http://0.0.0.0:8815']
...
```

2. Define your function. This code defines and runs a UDF server in Python, which exposes a custom function *gcd* for calculating the greatest common divisor of two integers and allows remote execution of this function:

:::note
The SDK package is not yet available. Prior to its release, please download the 'udf.py' file from https://github.com/datafuselabs/databend/blob/main/tests/udf-server/udf.py and ensure it is saved in the same directory as this Python script. This step is essential for the code to function correctly.
:::

```python title='udf_server.py'
from udf import *

@udf(
    input_types=["INT", "INT"],
    result_type="INT",
    skip_null=True,
)
def gcd(x: int, y: int) -> int:
    while y != 0:
        (x, y) = (y, x % y)
    return x

if __name__ == '__main__':
    # create a UDF server listening at '0.0.0.0:8815'
    server = UdfServer("0.0.0.0:8815")
    # add defined functions
    server.add_function(gcd)
    # start the UDF server
    server.serve()
```
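
Before starting the server, the function body can be sanity-checked in plain Python. The sketch below copies the loop from `gcd` without the `@udf` decorator, so it does not need `udf.py`, and compares the result with the standard library's `math.gcd`. The name `gcd_plain` and the sample inputs are assumptions made for illustration only.

```python
import math


def gcd_plain(x: int, y: int) -> int:
    """Euclid's algorithm, using the same loop as the UDF body."""
    while y != 0:
        (x, y) = (y, x % y)
    return x


# Compare against the standard library on a few hypothetical sample pairs.
for a, b in [(48, 18), (17, 5), (0, 9), (12, 0)]:
    assert gcd_plain(a, b) == math.gcd(a, b)

print(gcd_plain(48, 18))  # prints 6
```

Checking the logic this way keeps failures in the function body separate from failures in the server setup.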

`@udf` is a decorator used for defining UDFs in Databend, supporting the following parameters:

| Parameter | Description |
|--------------|-----------------------------------------------------------------------------------------------------|
| input_types | A list of strings or Arrow data types that specify the input data types. |
| result_type | A string or an Arrow data type that specifies the return value type. |
| name | An optional string specifying the function name. If not provided, the original name will be used. |
| io_threads | Number of I/O threads used per data chunk for I/O bound functions. |
| skip_null | A boolean value specifying whether to skip NULL values. If set to True, NULL values will not be passed to the function, and the corresponding return value is set to NULL. Default is False. |
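
The `skip_null` behavior described in the table can be pictured with a small pure-Python sketch. This is not the code from `udf.py`: `skip_null_sketch` is a hypothetical stand-in that works on single calls and assumes NULL arrives as Python `None`, whereas the real server may process values in batches.

```python
from typing import Callable, Optional


def skip_null_sketch(func: Callable[..., int]) -> Callable[..., Optional[int]]:
    """Hypothetical stand-in for skip_null=True (not the real decorator)."""

    def wrapper(*args):
        # If any argument is NULL (None), do not call func; return NULL.
        if any(arg is None for arg in args):
            return None
        return func(*args)

    return wrapper


@skip_null_sketch
def gcd(x, y):
    while y != 0:
        (x, y) = (y, x % y)
    return x


print(gcd(12, 18))    # prints 6
print(gcd(None, 18))  # prints None
```

With this behavior the function body never has to handle `None` itself, which is why `gcd` can use `%` without a guard.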

This table maps Databend data types to their Python equivalents:

| Databend Type | Python Type |
|-----------------------|-----------------------|
| BOOLEAN | bool |
| TINYINT (UNSIGNED) | int |
| SMALLINT (UNSIGNED) | int |
| INT (UNSIGNED) | int |
| BIGINT (UNSIGNED) | int |
| FLOAT | float |
| DOUBLE | float |
| DECIMAL | decimal.Decimal |
| DATE | datetime.date |
| TIMESTAMP | datetime.datetime |
| VARCHAR | str |
| VARIANT | any |
| MAP(K,V) | dict |
| ARRAY(T) | list[T] |
| TUPLE(T...) | tuple(T...) |
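
As a rough illustration of the mapping, the sketch below pairs a few Databend type names from the table with sample Python values and checks each value's Python type. The dictionaries `EXAMPLES` and `EXPECTED` and the sample values are assumptions for illustration; they are not part of `udf.py`.

```python
import datetime
import decimal

# Sample Python values for some of the Databend types listed in the table.
EXAMPLES = {
    "BOOLEAN": True,
    "INT": 42,
    "DOUBLE": 3.14,
    "DECIMAL": decimal.Decimal("12.34"),
    "DATE": datetime.date(2023, 9, 22),
    "TIMESTAMP": datetime.datetime(2023, 9, 22, 12, 0, 0),
    "VARCHAR": "hello",
    "MAP(K,V)": {"k": 1},
    "ARRAY(T)": [1, 2, 3],
    "TUPLE(T...)": (1, "a"),
}

# Python type expected for each Databend type, following the table.
EXPECTED = {
    "BOOLEAN": bool,
    "INT": int,
    "DOUBLE": float,
    "DECIMAL": decimal.Decimal,
    "DATE": datetime.date,
    "TIMESTAMP": datetime.datetime,
    "VARCHAR": str,
    "MAP(K,V)": dict,
    "ARRAY(T)": list,
    "TUPLE(T...)": tuple,
}

for name, value in EXAMPLES.items():
    assert isinstance(value, EXPECTED[name]), name
    print(f"{name:12} -> {type(value).__name__}")
```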

3. Run the Python file to start the UDF server:

```shell
python3 udf_server.py
```

4. Register the function *gcd* with the [CREATE FUNCTION](ddl-create-function.md) command in Databend:

```sql
CREATE FUNCTION gcd (INT, INT) RETURNS INT LANGUAGE python HANDLER = 'gcd' ADDRESS = 'http://0.0.0.0:8815';
```
6 changes: 3 additions & 3 deletions docs/doc/14-sql-commands/00-ddl/50-udf/ddl-drop-function.md
@@ -4,12 +4,12 @@ description:
Drop an existing user-defined function.
---

Drop an existing user-defined function.
Drops a user-defined function.

## Syntax

```sql
DROP FUNCTION [IF EXISTS] <name>
DROP FUNCTION [IF EXISTS] <function_name>
```

## Examples
@@ -19,4 +19,4 @@ DROP FUNCTION a_plus_3;

SELECT a_plus_3(2);
ERROR 1105 (HY000): Code: 2602, Text = Unknown Function a_plus_3 (while in analyze select projection).
```
```