`abs`

abs(<value1> : number) => number

Positive Modulus of pair of numbers.

abs(-20) -> 20
abs(10) -> 10

`acos`

acos(<value1> : number) => double

Calculates a consine inverse value

acos(1) -> 0.0

`add`

add(<value1> : any, <value2> : any) => any

Adds a pair of strings or numbers. Adds a date to a number of days. Adds a duration to a timestamp. Appends one array of similar type to another. Same as the + operator

add(10, 20) -> 30
10 + 20 -> 30
add('ice', 'cream') -> 'icecream'
'ice' + 'cream' + ' cone' -> 'icecream cone'
add(toDate('2012-12-12'), 3) -> 2012-12-15 (date value)
toDate('2012-12-12') + 3 -> 2012-12-15 (date value)
[10, 20] + [30, 40] => [10, 20, 30, 40]
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) => 2019-02-04 07:19:18.871

`addDays`

addDays(<date/timestamp> : datetime, <days to add> : integral) => datetime

Add days to a date or timestamp. Same as the + operator for date

addDays(toDate('2016-08-08'), 1) -> 2016-08-09

`addMonths`

addMonths(<date/timestamp> : datetime, <months to add> : integral) => datetime

Add months to a date or timestamp

addMonths(toDate('2016-08-31'), 1) -> 2016-09-30
addMonths(toTimestamp('2016-09-30 10:10:10'), -1) -> 2016-08-31 10:10:10

`and`

and(<value1> : boolean, <value2> : boolean) => boolean

Logical AND operator. Same as &&

and(true, false) -> false
true && false -> false

`asin`

asin(<value1> : number) => double

Calculates an inverse sine value

asin(0) -> 0.0

`atan`

atan(<value1> : number) => double

Calculates a inverse tangent value

atan(0) -> 0.0

`atan2`

atan2(<value1> : number, <value2> : number) => double

Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates

atan2(0, 0) -> 0.0

`avg`

avg(<value1> : number) => number

Gets the average of values of a column

avg(sales) -> 7523420.234

`avgIf`

avgIf(<value1> : boolean, <value2> : number) => number

Based on a criteria gets the average of values of a column

avgIf(region == 'West', sales) -> 7523420.234

`byName`

byName(<column name> : string) => any

Selects a column value by name in the stream. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions

toString(byName('parent')) -> appa
toLong(byName('income')) -> 9000000000009
toBoolean(byName('foster')) -> false
toLong(byName($debtCol)) -> 123456890
birthDate -> 12/31/2050
toString(byName('Bogus Column')) -> NULL

`byPosition`

byPosition(<position> : integer) => any

Selects a column value by its relative position(1 based) in the stream. If the position is out of bounds it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...)Computed inputs are not supported but you can use parameter substitutions

toString(byPosition(1)) -> amma
toDecimal(byPosition(2), 10, 2) -> 199990.99
toBoolean(byName(4)) -> false
toString(byName($colName)) -> family
toString(byPosition(1234)) -> NULL

`case`

case(<condition> : boolean, <true_expression> : any, <false_expression> : any, ...) => any

Based on alternating conditions applies one value or the other. If the number of inputs are even, the other is NULL for last condition

case(custType == 'Premium', 10, 4.5)
case(custType == 'Premium', price*0.95, custType == 'Elite', price*0.9, price*2)
case(dayOfWeek(saleDate) == 1, 'Sunday', dayOfWeek(saleDate) == 6, 'Saturday')

`cbrt`

cbrt(<value1> : number) => double

Calculate the cube root of a number

cbrt(8) -> 2.0

`ceil`

ceil(<value1> : number) => number

Returns the smallest integer not smaller than the number

ceil(-0.1) -> 0

`concat`

concat(<this> : string, <that> : string, ...) => string

Concatenates a variable number of strings together. Same as the + operator with strings

concat('Awesome', 'Cool', 'Product') -> 'AwesomeCoolProduct'
'Awesome' + 'Cool' + 'Product' -> 'AwesomeCoolProduct'
concat(addrLine1, ' ', addrLine2, ' ', city, ' ', state, ' ', zip)
addrLine1 + ' ' + addrLine2 + ' ' + city + ' ' + state + ' ' + zip

`concatWS`

concatWS(<separator> : string, <this> : string, <that> : string, ...) => string

Concatenates a variable number of strings together with a separator. The first parameter is the separator

concatWS(' ', 'Awesome', 'Cool', 'Product') -> 'Awesome Cool Product'
concatWS(' ' , addrLine1, addrLine2, city, state, zip) ->
concatWS(',' , toString(order_total), toString(order_discount))

`cos`

cos(<value1> : number) => double

Calculates a cosine value

cos(10) -> -0.83907152907

`cosh`

cosh(<value1> : number) => double

Calculates a hyperbolic cosine of a value

cosh(0) -> 1.0

`count`

count([<value1> : any]) => long

Gets the aggregate count of values. If the optional column(s) is specified, it ignores NULL values in the count

count(custId) -> 100
count(custId, custName) -> 50
count() -> 125
count(iif(isNull(custId), 1, NULL)) -> 5

`countDistinct`

countDistinct(<value1> : any, [<value2> : any], ...) => long

Gets the aggregate count of distinct values of a set of columns

countDistinct(custId, custName) -> 60

`countIf`

countIf(<value1> : boolean, [<value2> : any]) => long

Based on a criteria gets the aggregate count of values. If the optional column is specified, it ignores NULL values in the count

countIf(state == 'CA' && commission < 10000, name) -> 100

`covariancePopulation`

covariancePopulation(<value1> : number, <value2> : number) => double

Gets the population covariance between two columns

covariancePopulation(sales, profit) -> 122.12

`covariancePopulationIf`

covariancePopulationIf(<value1> : boolean, <value2> : number, <value3> : number) => double

Based on a criteria, gets the population covariance of two columns

covariancePopulationIf(region == 'West', sales) -> 122.12

`covarianceSample`

covarianceSample(<value1> : number, <value2> : number) => double

Gets the sample covariance of two columns

covarianceSample(sales, profit) -> 122.12

`covarianceSampleIf`

covarianceSampleIf(<value1> : boolean, <value2> : number, <value3> : number) => double

Based on a criteria, gets the sample covariance of two columns

covarianceSampleIf(region == 'West', sales, profit) -> 122.12

`crc32`

crc32(<value1> : any, ...) => long

Calculates the CRC32 hash of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row

crc32(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 3630253689

`cumeDist`

cumeDist() => integer

The CumeDist function computes the position of a value relative to all values in the partition. The result is the number of rows preceding or equal to the current row in the ordering of the partition divided by the total number of rows in the window partition. Any tie values in the ordering will evaluate to the same position.

cumeDist() -> 1

`currentDate`

currentDate([<value1> : string]) => date

Gets the current date when this job starts to run. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.

currentDate() -> 12-12-2030
currentDate('PST') -> 12-31-2050

`currentTimestamp`

currentTimestamp() => timestamp

Gets the current timestamp when the job starts to run with local time zone

currentTimestamp() -> 12-12-2030T12:12:12

`currentUTC`

currentUTC([<value1> : string]) => timestamp

Gets the current the timestamp as UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone

currentUTC() -> 12-12-2030T19:18:12
currentUTC('Asia/Seoul') -> 12-13-2030T11:18:12

`dayOfMonth`

dayOfMonth(<value1> : datetime) => integer

Gets the day of the month given a date

dayOfMonth(toDate('2018-06-08')) -> 08

`dayOfWeek`

dayOfWeek(<value1> : datetime) => integer

Gets the day of the week given a date. 1 - Sunday, 2 - Monday ..., 7 - Saturday

dayOfWeek(toDate('2018-06-08')) -> 7

`dayOfYear`

dayOfYear(<value1> : datetime) => integer

Gets the day of the year given a date

dayOfYear(toDate('2016-04-09')) -> 100

`degrees`

degrees(<value1> : number) => double

Converts radians to degrees

degrees(3.141592653589793) -> 180

`denseRank`

denseRank(<value1> : any, ...) => integer

Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will not produce gaps in the sequence. Dense Rank works even when data is not sorted and looks for change in values

denseRank(salesQtr, salesAmt) -> 1

`divide`

divide(<value1> : any, <value2> : any) => any

Divides pair of numbers. Same as the / operator

divide(20, 10) -> 2
20 / 10 -> 2

`endsWith`

endsWith(<string> : string, <substring to check> : string) => boolean

Checks if the string ends with the supplied string

endsWith('great', 'eat') -> true

`equals`

equals(<value1> : any, <value2> : any) => boolean

Comparison equals operator. Same as == operator

equals(12, 24) -> false
12==24 -> false
'bad'=='bad' -> true
'good'== NULL -> false
NULL===NULL -> false

`equalsIgnoreCase`

equalsIgnoreCase(<value1> : string, <value2> : string) => boolean

Comparison equals operator ignoring case. Same as <=> operator

'abc'<==>'abc' -> true
equalsIgnoreCase('abc', 'Abc') -> true

`factorial`

factorial(<value1> : number) => long

Calculate the factorial of a number

factorial(5) -> 120

`false`

false() => boolean

Always returns a false value. Use the function syntax(false()) if there is a column named 'false'

isDiscounted == false()
isDiscounted() == false

`first`

first(<value1> : any, [<value2> : boolean]) => any

Gets the first value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false

first(sales) -> 12233.23
first(sales, false) -> NULL

`floor`

floor(<value1> : number) => number

Returns the largest integer not greater than the number

floor(-0.1) -> -1

`fromUTC`

fromUTC(<value1> : timestamp, [<value2> : string]) => timestamp

Converts to the timestamp from UTC. You can optionally pass the timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone

fromUTC(currentTimeStamp()) -> 12-12-2030T19:18:12
fromUTC(currentTimeStamp(), 'Asia/Seoul') -> 12-13-2030T11:18:12

`greater`

greater(<value1> : any, <value2> : any) => boolean

Comparison greater operator. Same as > operator

greater(12, 24) -> false
'abcd' > 'abc' -> true
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') => false

`greaterOrEqual`

greaterOrEqual(<value1> : any, <value2> : any) => boolean

Comparison greater than or equal operator. Same as >= operator

greaterOrEqual(12, 12) -> false
'abcd' >= 'abc' -> true

`greatest`

greatest(<value1> : any, ...) => any

Returns the greatest value among the list of values as input. Returns null if all inputs are null

greatest(10, 30, 15, 20) -> 30
greatest(toDate('12/12/2010'), toDate('12/12/2011'), toDate('12/12/2000')) -> '12/12/2011'
greatest(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS'), toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) => '2019-02-05 08:21:34.890'

`hour`

hour(<value1> : timestamp, [<value2> : string]) => integer

Gets the hour value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.

hour(toTimestamp('2009-07-30T12:58:59')) -> 12
hour(toTimestamp('2009-07-30T12:58:59'), 'PST') -> 12

`iif`

iif(<condition> : boolean, <true_expression> : any, [<false_expression> : any]) => any

Based on a condition applies one value or the other. If other is unspecified it is considered NULL. Both the values must be compatible(numeric, string...)

iif(custType == 'Premium', 10, 4.5)
iif(amount > 100, 'High')
iif(dayOfWeek(saleDate) == 6, 'Weekend', 'Weekday')

`in`

in(<array of items> : array, <item to find> : any) => boolean

Checks if an item is in the array

in([10, 20, 30], 10) -> true
in(['good', 'kid'], 'bad') -> false

`initCap`

initCap(<value1> : string) => string

Converts the first letter of every word to uppercase. Words are identified as separated by whitespace

initCap('cool iceCREAM') -> 'Cool IceCREAM'

`instr`

instr(<string> : string, <substring to find> : string) => integer

Finds the position(1 based) of the substring within a string. 0 is returned if not found

instr('great', 'eat') -> 3
instr('microsoft', 'o') -> 7
instr('good', 'bad') -> 0

`isDelete`

isDelete([<value1> : integer]) => boolean

Checks if the row is marked for delete. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isDelete() -> true
isDelete(1) -> false

`isError`

isError([<value1> : integer]) => boolean

Checks if the row is marked as error. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isError() -> true
isError(1) -> false

`isIgnore`

isIgnore([<value1> : integer]) => boolean

Checks if the row is marked to be ignored. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isIgnore() -> true
isIgnore(1) -> false

`isInsert`

isInsert([<value1> : integer]) => boolean

Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isInsert() -> true
isInsert(1) -> false

`isMatch`

isMatch([<value1> : integer]) => boolean

Checks if the row is matched at lookup. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isMatch() -> true
isMatch(1) -> false

`isNull`

isNull(<value1> : any) => boolean

Checks if the value is NULL

isNull(NULL()) -> true
isNull('') -> false'

`isUpdate`

isUpdate([<value1> : integer]) => boolean

Checks if the row is marked for update. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1

isUpdate() -> true
isUpdate(1) -> false

`kurtosis`

kurtosis(<value1> : number) => double

Gets the kurtosis of a column

kurtosis(sales) -> 122.12

`kurtosisIf`

kurtosisIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the kurtosis of a column

kurtosisIf(region == 'West', sales) -> 122.12

`lag`

lag(<value> : any, [<number of rows to look before> : number], [<default value> : any]) => any

Gets the value of the first parameter evaluated n rows before the current row. The second parameter is the number of rows to look back and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified

lag(amount, 2) -> 60
lag(amount, 2000, 100) -> 100

`last`

last(<value1> : any, [<value2> : boolean]) => any

Gets the last value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false

last(sales) -> 523.12
last(sales, false) -> NULL

`lastDayOfMonth`

lastDayOfMonth(<value1> : datetime) => date

Gets the last date of the month given a date

lastDayOfMonth(toDate('2009-01-12')) -> 2009-01-31

`lead`

lead(<value> : any, [<number of rows to look after> : number], [<default value> : any]) => any

Gets the value of the first parameter evaluated n rows after the current row. The second parameter is the number of rows to look forward and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified

lead(amount, 2) -> 60
lead(amount, 2000, 100) -> 100

`least`

least(<value1> : any, ...) => any

Comparison lesser than or equal operator. Same as <= operator

least(10, 30, 15, 20) -> 10
least(toDate('12/12/2010'), toDate('12/12/2011'), toDate('12/12/2000')) -> '12/12/2000'

`left`

left(<string to subset> : string, <number of characters> : integral) => string

Extracts a substring start at index 1 with number of characters. Same as SUBSTRING(str, 1, n)

left('bojjus', 2) -> 'bo'
left('bojjus', 20) -> 'bojjus'

`length`

length(<value1> : string) => integer

Returns the length of the string

length('kiddo') -> 5

`lesser`

lesser(<value1> : any, <value2> : any) => boolean

Comparison less operator. Same as < operator

lesser(12 < 24) -> true
'abcd' < 'abc' -> false
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') => true

`lesserOrEqual`

lesserOrEqual(<value1> : any, <value2> : any) => boolean

Comparison lesser than or equal operator. Same as <= operator

lesserOrEqual(12, 12) -> true
'abcd' <= 'abc' -> false

`levenshtein`

levenshtein(<from string> : string, <to string> : string) => integer

Gets the levenshtein distance between two strings

levenshtein('boys', 'girls') -> 4

`like`

like(<string> : string, <pattern match> : string) => boolean

The pattern is a string that is matched literally. The exceptions are the following special symbols: _ matches any one character in the input (similar to . in posix regular expressions) % matches zero or more characters in the input (similar to .* in posix regular expressions). The escape character is ''. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.

like('icecream', 'ice%') -> true

`locate`

locate(<substring to find> : string, <string> : string, [<from index - 1-based> : integral]) => integer

Finds the position(1 based) of the substring within a string starting a certain position. If the position is omitted it is considered from the beginning of the string. 0 is returned if not found

locate('eat', 'great') -> 3
locate('o', 'microsoft', 6) -> 7
locate('bad', 'good') -> 0

`log`

log(<value1> : number, [<value2> : number]) => double

Calculates log value. An optional base can be supplied else a euler number if used

log(100, 10) -> 2

`log10`

log10(<value1> : number) => double

Calculates log value based on 10 base

log10(100) -> 2

`lower`

lower(<value1> : string) => string

Lowercases a string

lower('GunChus') -> 'gunchus'

`lpad`

lpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string

Left pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is considered a no-op

lpad('great', 10, '-') -> '-----great'
lpad('great', 4, '-') -> 'great'
lpad('great', 8, '<>') -> '<><great'

`ltrim`

ltrim(<string to trim> : string, <trim characters> : string) => string

Left trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter

ltrim('!--!wor!ld!', '-!') -> 'wor!ld!'

`max`

max(<value1> : any) => any

Gets the maximum value of a column

MAX(sales) -> 12312131.12

`maxIf`

maxIf(<value1> : boolean, <value2> : any) => any

Based on a criteria, gets the maximum value of a column

maxIf(region == 'West', sales) -> 99999.56

`md5`

md5(<value1> : any, ...) => string

Calculates the MD5 digest of set of column of varying primitive datatypes and returns a 32 character hex string. It can be used to calculate a fingerprint for a row

md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'c1527622a922c83665e49835e46350fe'

`mean`

mean(<value1> : number) => number

Gets the mean of values of a column. Same as AVG

mean(sales) -> 7523420.234

`meanIf`

meanIf(<value1> : boolean, <value2> : number) => number

Based on a criteria gets the mean of values of a column. Same as avgIf

meanIf(region == 'West', sales) -> 7523420.234

`min`

min(<value1> : any) => any

Gets the minimum value of a column

min(sales) -> 00.01
min(orderDate) -> 12/12/2000

`minIf`

minIf(<value1> : boolean, <value2> : any) => any

Based on a criteria, gets the minimum value of a column

minIf(region == 'West', sales) -> 00.01

`minus`

minus(<value1> : any, <value2> : any) => any

Subtracts numbers. Subtract from a date number of days. Substract duration from a timestamp. Same as the - operator

minus(20, 10) -> 10
20 - 10 -> 10
minus(toDate('2012-12-15'), 3) -> 2012-12-12 (date value)
toDate('2012-12-15') - 3 -> 2012-12-13 (date value)
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) => 2019-02-04 07:19:18.871

`minute`

minute(<value1> : timestamp, [<value2> : string]) => integer

Gets the minute value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.

minute(toTimestamp('2009-07-30T12:58:59')) -> 58
minute(toTimestamp('2009-07-30T12:58:59', 'PST')) -> 58

`mod`

mod(<value1> : any, <value2> : any) => any

Modulus of pair of numbers. Same as the % operator

mod(20, 8) -> 4
20 % 8 -> 4

`month`

month(<value1> : datetime) => integer

Gets the month value of a date or timestamp

month(toDate('2012-8-8')) -> 8

`monthsBetween`

monthsBetween(<from date/timestamp> : datetime, <to date/timestamp> : datetime, [<time zone> : boolean], [<value4> : string]) => double

Gets the number of months between two datesYou can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.

monthsBetween(toDate('1997-02-28 10:30:00'), toDate('1996-10-30')) -> 3.94959677

`multiply`

multiply(<value1> : any, <value2> : any) => any

Multiplies pair of numbers. Same as the * operator

multiply(20, 10) -> 200
20 * 10 -> 200

`nTile`

nTile([<value1> : integer]) => integer

The NTile function divides the rows for each window partition into n buckets ranging from 1 to at most n. Bucket values will differ by at most 1. If the number of rows in the partition does not divide evenly into the number of buckets, then the remainder values are distributed one per bucket, starting with the first bucket. The NTile function is useful for the calculation of tertiles, quartiles, deciles, and other common summary statistics. The function calculates two variables during initialization: The size of a regular bucket will have one extra row added to it. Both variables are based on the size of the current partition. During the calculation process the function keeps track of the current row number, the current bucket number, and the row number at which the bucket will change (bucketThreshold). When the current row number reaches bucket threshold, the bucket value is increased by one and the threshold is increased by the bucket size (plus one extra if the current bucket is padded).

nTile() -> 1
nTile(numOfBuckets) -> 1

`negate`

negate(<value1> : number) => number

Negates a number. Turns positive numbers to negative and vice versa

negate(13) -> -13

`nextSequence`

nextSequence() => long

Returns the next unique sequence. The number is consecutive only within a partition and is prefixed by the partitionId

nextSequence() -> 12313112

`normalize`

normalize(<String to normalize> : string) => string

Normalize the string value to separate accented unicode characters

normalize('boys') -> 'boys'

`not`

not(<value1> : boolean) => boolean

Logical negation operator

not(true) -> false
not(premium)

`notEquals`

notEquals(<value1> : any, <value2> : any) => boolean

Comparison not equals operator. Same as != operator

12!=24 -> true
'abc'!='abc' -> false

`null`

null() => null

Returns a NULL value. Use the function syntax(null()) if there is a column named 'null'. Any operation that uses will result in a NULL

custId = NULL (for derived field)
custId == NULL -> NULL
'nothing' + NULL -> NULL
10 * NULL -> NULL'
NULL == '' -> NULL'

`or`

or(<value1> : boolean, <value2> : boolean) => boolean

Logical OR operator. Same as ||

or(true, false) -> true
true || false -> true

`pMod`

pMod(<value1> : any, <value2> : any) => any

Positive Modulus of pair of numbers.

pmod(-20, 8) -> 4

`power`

power(<value1> : number, <value2> : number) => double

Raises one number to the power of another

power(10, 2) -> 100

`rank`

rank(<value1> : any, ...) => integer

Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence. Rank works even when data is not sorted and looks for change in values

rank(salesQtr, salesAmt) -> 1

`regexExtract`

regexExtract(<string> : string, <regex to find> : string, [<match group 1-based index> : integral]) => string

Extract a matching substring for a given regex pattern. The last parameter identifies the match group and is defaulted to 1 if omitted. Use <regex>(back quote) to match a string without escaping

regexExtract('Cost is between 600 and 800 dollars', '(\\d+) and (\\d+)', 2) -> '800'
regexExtract('Cost is between 600 and 800 dollars', `(\d+) and (\d+)`, 2) -> '800'

`regexMatch`

regexMatch(<string> : string, <regex to match> : string) => boolean

Checks if the string matches the given regex pattern. Use <regex>(back quote) to match a string without escaping

regexMatch('200.50', '(\\d+).(\\d+)') -> true
regexMatch('200.50', `(\d+).(\d+)`) -> true

`regexReplace`

regexReplace(<string> : string, <regex to find> : string, <substring to replace> : string) => string

Replace all occurrences of a regex pattern with another substring in the given string Use <regex>(back quote) to match a string without escaping

regexReplace('100 and 200', '(\\d+)', 'bojjus') -> 'bojjus and bojjus'
regexReplace('100 and 200', `(\d+)`, 'gunchus') -> 'gunchus and gunchus'

`regexSplit`

regexSplit(<string to split> : string, <regex expression> : string) => array

Splits a string based on a delimiter based on regex and returns an array of strings

regexSplit('oneAtwoBthreeC', '[CAB]') -> ['one', 'two', 'three']
regexSplit('oneAtwoBthreeC', '[CAB]')[1] -> 'one'
regexSplit('oneAtwoBthreeC', '[CAB]')[0] -> NULL
regexSplit('oneAtwoBthreeC', '[CAB]')[20] -> NULL

`replace`

replace(<string> : string, <substring to find> : string, <substring to replace> : string) => string

Replace all occurrences of a substring with another substring in the given string

replace('doggie dog', 'dog', 'cat') -> 'catgie cat'
replace('doggie dog', 'dog', '') -> 'gie'

`reverse`

reverse(<value1> : string) => string

Reverses a string

reverse('gunchus') -> 'suhcnug'

`right`

right(<string to subset> : string, <number of characters> : integral) => string

Extracts a substring with number of characters from the right. Same as SUBSTRING(str, LENGTH(str) - n, n)

right('bojjus', 2) -> 'us'
right('bojjus', 20) -> 'bojjus'

`rlike`

rlike(<string> : string, <pattern match> : string) => boolean

Checks if the string matches the given regex pattern

rlike('200.50', '(\d+).(\d+)') -> true

`round`

round(<number> : number, [<scale to round> : number], [<rounding option> : integral]) => double

Rounds a number given an optional scale and an optional rounding mode. If the scale is omitted, it is defaulted to 0. If the mode is omitted, it is defaulted to ROUND_HALF_UP(5). The values for rounding include 1 - ROUND_UP 2 - ROUND_DOWN 3 - ROUND_CEILING 4 - ROUND_FLOOR 5 - ROUND_HALF_UP 6 - ROUND_HALF_DOWN 7 - ROUND_HALF_EVEN 8 - ROUND_UNNECESSARY

round(100.123) -> 100.0
round(2.5, 0) -> 3.0
round(5.3999999999999995, 2, 7) -> 5.40

`rowNumber`

rowNumber() => integer

Assigns a sequential row numbering for rows in a window starting with 1

rowNumber() -> 1

`rpad`

rpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string

Right pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is considered a no-op

rpad('great', 10, '-') -> 'great-----'
rpad('great', 4, '-') -> 'great'
rpad('great', 8, '<>') -> 'great<><'

`rtrim`

rtrim(<string to trim> : string, <trim characters> : string) => string

Right trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter

rtrim('!--!wor!ld!', '-!') -> '!--!wor!ld'

`second`

second(<value1> : timestamp, [<value2> : string]) => integer

Gets the second value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.

second(toTimestamp('2009-07-30T12:58:59')) -> 59

`sha1`

sha1(<value1> : any, ...) => string

Calculates the SHA-1 digest of set of column of varying primitive datatypes and returns a 40 character hex string. It can be used to calculate a fingerprint for a row

sha1(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '63849fd2abb65fbc626c60b1f827bd05573f0cea'

`sha2`

sha2(<value1> : integer, <value2> : any, ...) => string

Calculates the SHA-2 digest of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row

sha2(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'd3b2bff62c3a00e9370b1ac85e428e661a7df73959fa1a96ae136599e9ee20fd'

`sin`

sin(<value1> : number) => double

Calculates a sine value

sin(2) -> 0.90929742682

`sinh`

sinh(<value1> : number) => double

Calculates a hyperbolic sine value

sinh(0) -> 0.0

`skewness`

skewness(<value1> : number) => double

Gets the skewness of a column

skewness(sales) -> 122.12

`skewnessIf`

skewnessIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the skewness of a column

skewnessIf(region == 'West', sales) -> 122.12

`slice`

slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array

Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string

slice([10, 20, 30, 40], 1, 2) -> [10, 20]
slice([10, 20, 30, 40], 2) -> [20, 30, 40]
slice([10, 20, 30, 40], 2)[1] -> 20
slice([10, 20, 30, 40], 2)[0] -> NULL
slice([10, 20, 30, 40], 2)[20] -> NULL
slice([10, 20, 30, 40], 8) -> []

`soundex`

soundex(<value1> : string) => string

Gets the soundex code for the string

soundex('genius') -> 'G520'

`split`

split(<string to split> : string, <split characters> : string) => array

Splits a string based on a delimiter and returns an array of strings

split('100,200,300', ',') -> ['100', '200', '300']
split('100,200,300', '|') -> ['100,200,300']
split('100, 200, 300', ', ') -> ['100', '200', '300']
split('100, 200, 300', ', ')[1] -> '100'
split('100, 200, 300', ', ')[0] -> NULL
split('100, 200, 300', ', ')[20] -> NULL
split('100200300', ',') -> ['100200300']

`sqrt`

sqrt(<value1> : number) => double

Calculates the square root of a number

sqrt(9) -> 3

`startsWith`

startsWith(<string> : string, <substring to check> : string) => boolean

Checks if the string starts with the supplied string

startsWith('great', 'gr') -> true

`stddev`

stddev(<value1> : number) => double

Gets the standard deviation of a column

stdDev(sales) -> 122.12

`stddevIf`

stddevIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the standard deviation of a column

stddevIf(region == 'West', sales) -> 122.12

`stddevPopulation`

stddevPopulation(<value1> : number) => double

Gets the population standard deviation of a column

stddevPopulation(sales) -> 122.12

`stddevPopulationIf`

stddevPopulationIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the population standard deviation of a column

stddevPopulationIf(region == 'West', sales) -> 122.12

`stddevSample`

stddevSample(<value1> : number) => double

Gets the sample standard deviation of a column

stddevSample(sales) -> 122.12

`stddevSampleIf`

stddevSampleIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the sample standard deviation of a column

stddevSampleIf(region == 'West', sales) -> 122.12

`subDays`

subDays(<date/timestamp> : datetime, <days to subtract> : integral) => datetime

Subtract months from a date. Same as the - operator for date

subDays(toDate('2016-08-08'), 1) -> 2016-08-09

`subMonths`

subMonths(<date/timestamp> : datetime, <months to subtract> : integral) => datetime

Subtract months from a date or timestamp

subMonths(toDate('2016-09-30'), 1) -> 2016-08-31

`substring`

substring(<string to subset> : string, <from 1-based index> : integral, [<number of characters> : integral]) => string

Extracts a substring of a certain length from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string

substring('Cat in the hat', 5, 2) -> 'in'
substring('Cat in the hat', 5, 100) -> 'in the hat'
substring('Cat in the hat', 5) -> 'in the hat'
substring('Cat in the hat', 100, 100) -> ''

`sum`

sum(<value1> : number) => number

Gets the aggregate sum of a numeric column

sum(col) -> value

`sumDistinct`

sumDistinct(<value1> : number) => number

Gets the aggregate sum of distinct values of a numeric column

sumDistinct(col) -> value

`sumDistinctIf`

sumDistinctIf(<value1> : boolean, <value2> : number) => number

Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column

sumDistinctIf(state == 'CA' && commission < 10000, sales) -> value
sumDistinctIf(true, sales) -> SUM(sales)

`sumIf`

sumIf(<value1> : boolean, <value2> : number) => number

Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column

sumIf(state == 'CA' && commission < 10000, sales) -> value
sumIf(true, sales) -> SUM(sales)

`tan`

tan(<value1> : number) => double

Calculates a tangent value

tan(0) -> 0.0

`tanh`

tanh(<value1> : number) => double

Calculates a hyperbolic tangent value

tanh(0) -> 0.0

`toBoolean`

toBoolean(<value1> : string) => boolean

Converts a value of ('t', 'true', 'y', 'yes', '1') to true and ('f', 'false', 'n', 'no', '0') to false and NULL for any other value

toBoolean('true') -> true
toBoolean('n') -> false
toBoolean('truthy') -> NULL

`toDate`

toDate(<string> : any, [<date format> : string]) => date

Converts a string to a date given an optional date format. Refer to Java SimpleDateFormat for all possible formats. If the date format is omitted, combinations of the following are accepted. [ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]d, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ]

toDate('2012-8-8') -> 2012-8-8
toDate('12/12/2012', 'MM/dd/yyyy') -> 2012-12-12

`toDecimal`

toDecimal(<value> : any, [<precision> : integral], [<scale> : integral], [<format> : string], [<locale> : string]) => decimal(10,0)

Converts any numeric or string to a decimal value. If precision and scale are not specified, it is defaulted to (10,2).An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN

toDecimal(123.45) -> 123.45
toDecimal('123.45', 8, 4) -> 123.4500
toDecimal('$123.45', 8, 4,'$###.00') -> 123.4500
toDecimal('Ç123,45', 10, 2, 'Ç###,##', 'de') -> 123.45

`toDouble`

toDouble(<value> : any, [<format> : string], [<locale> : string]) => double

Converts any numeric or string to a double value. An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN

toDouble(123.45) -> 123.45
toDouble('123.45') -> 123.45
toDouble('$123.45', '$###.00') -> 123.45
toDouble('Ç123,45', 'Ç###,##', 'de') -> 123.45

`toFloat`

toFloat(<value> : any, [<format> : string], [<locale> : string]) => float

Converts any numeric or string to a float value. An optional Java decimal format can be used for the conversion. Truncates any double

toFloat(123.45) -> 123.45
toFloat('123.45') -> 123.45
toFloat('$123.45', '$###.00') -> 123.45

`toInteger`

toInteger(<value> : any, [<format> : string], [<locale> : string]) => integer

Converts any numeric or string to an integer value. An optional Java decimal format can be used for the conversion. Truncates any long, float, double

toInteger(123) -> 123
toInteger('123') -> 123
toInteger('$123', '$###') -> 123

`toLong`

toLong(<value> : any, [<format> : string], [<locale> : string]) => long

Converts any numeric or string to a long value. An optional Java decimal format can be used for the conversion. Truncates any float, double

toLong(123) -> 123
toLong('123') -> 123
toLong('$123', '$###') -> 123

`toShort`

toShort(<value> : any, [<format> : string], [<locale> : string]) => short

Converts any numeric or string to a short value. An optional Java decimal format can be used for the conversion. Truncates any integer, long, float, double

toShort(123) -> 123
toShort('123') -> 123
toShort('$123', '$###') -> 123

`toString`

toString(<value> : any, [<number format/date format> : string]) => string

Converts a primitive datatype to a string. For numbers and date a format can be specified. If unspecified the system default is picked.Java decimal format is used for numbers. Refer to Java SimpleDateFormat for all possible date formats; the default format is yyyy-MM-dd

toString(10) -> '10'
toString('engineer') -> 'engineer'
toString(123456.789, '##,###.##') -> '123,456.79'
toString(123.78, '000000.000') -> '000123.780'
toString(12345, '##0.#####E0') -> '12.345E3'
toString(toDate('2018-12-31')) -> '2018-12-31'
toString(toDate('2018-12-31'), 'MM/dd/yy') -> '12/31/18'
toString(4 == 20) -> 'false'

`toTimestamp`

toTimestamp(<string> : any, [<timestamp format> : string], [<time zone> : string]) => timestamp

Converts a string to a timestamp given an optional timestamp format. Refer to Java SimpleDateFormat for all possible formats. If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. Timestamp supports upto millisecond accuracy with value of 999

toTimestamp('2016-12-31 00:12:00') -> 2012-8-8T00:12:00
toTimestamp('2016/12/31T00:12:00', 'MM/dd/yyyyThh:mm:ss') -> 2012-12-12T00:12:00
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') -> 2019-02-03 05:19:28.871

`toUTC`

toUTC(<value1> : timestamp, [<value2> : string]) => timestamp

Converts the timestamp to UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone

toUTC(currentTimeStamp()) -> 12-12-2030T19:18:12
toUTC(currentTimeStamp(), 'Asia/Seoul') -> 12-13-2030T11:18:12

`translate`

translate(<string to translate> : string, <lookup characters> : string, <replace characters> : string) => string

Replace one set of characters by another set of characters in the string. Characters have 1 to 1 replacement

translate('(Hello)', '()', '[]') -> '[Hello]'
translate('(Hello)', '()', '[') -> '[Hello'

`trim`

trim(<string to trim> : string, [<trim characters> : string]) => string

Trims a string of leading and trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter

trim('!--!wor!ld!', '-!') -> 'wor!ld'

`true`

true() => boolean

Always returns a true value. Use the function syntax(true()) if there is a column named 'true'

isDiscounted == true()
isDiscounted() == true

`typeMatch`

typeMatch(<type> : string, <base type> : string) => boolean

Matches the type of the column. Can only be used in pattern expressions.number matches short, integer, long, double, float or decimal, integral matches short, integer, long, fractional matches double, float, decimal and datetime matches date or timestamp type

typeMatch(type, 'number') -> true
typeMatch('date', 'number') -> false

`upper`

upper(<value1> : string) => string

Uppercases a string

upper('bojjus') -> 'BOJJUS'

`variance`

variance(<value1> : number) => double

Gets the variance of a column

variance(sales) -> 122.12

`varianceIf`

varianceIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the variance of a column

varianceIf(region == 'West', sales) -> 122.12

`variancePopulation`

variancePopulation(<value1> : number) => double

Gets the population variance of a column

variancePopulation(sales) -> 122.12

`variancePopulationIf`

variancePopulationIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the population variance of a column

variancePopulationIf(region == 'West', sales) -> 122.12

`varianceSample`

varianceSample(<value1> : number) => double

Gets the unbiased variance of a column

varianceSample(sales) -> 122.12

`varianceSampleIf`

varianceSampleIf(<value1> : boolean, <value2> : number) => double

Based on a criteria, gets the unbiased variance of a column

varianceSampleIf(region == 'West', sales) -> 122.12

`weekOfYear`

weekOfYear(<value1> : datetime) => integer

Gets the week of the year given a date

weekOfYear(toDate('2008-02-20')) -> 8

`xor`

xor(<value1> : boolean, <value2> : boolean) => boolean

Logical XOR operator. Same as ^ operator

xor(true, false) -> true
xor(true, true) -> false
true ^ false -> true

`year`

year(<value1> : datetime) => integer

Gets the year value of a date

year(toDate('2012-8-8')) -> 2012

Files

adf-data-flow-expression-functions.md

Latest commit

History

adf-data-flow-expression-functions.md

File metadata and controls

abs

acos

add

addDays

addMonths

and

asin

atan

atan2

avg

avgIf

byName

byPosition

case

cbrt

ceil

concat

concatWS

cos

cosh

count

countDistinct

countIf

covariancePopulation

covariancePopulationIf

covarianceSample

covarianceSampleIf

crc32

cumeDist

currentDate

currentTimestamp

currentUTC

dayOfMonth

dayOfWeek

dayOfYear

degrees

denseRank

divide

endsWith

equals

equalsIgnoreCase

factorial

false

first

floor

fromUTC

greater

greaterOrEqual

greatest

hour

iif

in

initCap

instr

isDelete

isError

isIgnore

isInsert

isMatch

isNull

isUpdate

kurtosis

kurtosisIf

lag

last

lastDayOfMonth

lead

least

left

length

lesser

lesserOrEqual

levenshtein

like