Pop-out for expression functions
abs(<value1> : number) => number
Positive Modulus of pair of numbers.
abs(-20) -> 20
abs(10) -> 10
acos(<value1> : number) => double
Calculates a consine inverse value
acos(1) -> 0.0
add(<value1> : any, <value2> : any) => any
Adds a pair of strings or numbers. Adds a date to a number of days. Adds a duration to a timestamp. Appends one array of similar type to another. Same as the + operator
add(10, 20) -> 30
10 + 20 -> 30
add('ice', 'cream') -> 'icecream'
'ice' + 'cream' + ' cone' -> 'icecream cone'
add(toDate('2012-12-12'), 3) -> 2012-12-15 (date value)
toDate('2012-12-12') + 3 -> 2012-12-15 (date value)
[10, 20] + [30, 40] => [10, 20, 30, 40]
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) => 2019-02-04 07:19:18.871
addDays(<date/timestamp> : datetime, <days to add> : integral) => datetime
Add days to a date or timestamp. Same as the + operator for date
addDays(toDate('2016-08-08'), 1) -> 2016-08-09
addMonths(<date/timestamp> : datetime, <months to add> : integral) => datetime
Add months to a date or timestamp
addMonths(toDate('2016-08-31'), 1) -> 2016-09-30
addMonths(toTimestamp('2016-09-30 10:10:10'), -1) -> 2016-08-31 10:10:10
and(<value1> : boolean, <value2> : boolean) => boolean
Logical AND operator. Same as &&
and(true, false) -> false
true && false -> false
asin(<value1> : number) => double
Calculates an inverse sine value
asin(0) -> 0.0
atan(<value1> : number) => double
Calculates a inverse tangent value
atan(0) -> 0.0
atan2(<value1> : number, <value2> : number) => double
Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates
atan2(0, 0) -> 0.0
avg(<value1> : number) => number
Gets the average of values of a column
avg(sales) -> 7523420.234
avgIf(<value1> : boolean, <value2> : number) => number
Based on a criteria gets the average of values of a column
avgIf(region == 'West', sales) -> 7523420.234
byName(<column name> : string) => any
Selects a column value by name in the stream. If there are multiple matches, the first match is returned. If no match it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. Computed inputs are not supported but you can use parameter substitutions
toString(byName('parent')) -> appa
toLong(byName('income')) -> 9000000000009
toBoolean(byName('foster')) -> false
toLong(byName($debtCol)) -> 123456890
birthDate -> 12/31/2050
toString(byName('Bogus Column')) -> NULL
byPosition(<position> : integer) => any
Selects a column value by its relative position(1 based) in the stream. If the position is out of bounds it returns a NULL value. The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...)Computed inputs are not supported but you can use parameter substitutions
toString(byPosition(1)) -> amma
toDecimal(byPosition(2), 10, 2) -> 199990.99
toBoolean(byName(4)) -> false
toString(byName($colName)) -> family
toString(byPosition(1234)) -> NULL
case(<condition> : boolean, <true_expression> : any, <false_expression> : any, ...) => any
Based on alternating conditions applies one value or the other. If the number of inputs are even, the other is NULL for last condition
case(custType == 'Premium', 10, 4.5)
case(custType == 'Premium', price*0.95, custType == 'Elite', price*0.9, price*2)
case(dayOfWeek(saleDate) == 1, 'Sunday', dayOfWeek(saleDate) == 6, 'Saturday')
cbrt(<value1> : number) => double
Calculate the cube root of a number
cbrt(8) -> 2.0
ceil(<value1> : number) => number
Returns the smallest integer not smaller than the number
ceil(-0.1) -> 0
concat(<this> : string, <that> : string, ...) => string
Concatenates a variable number of strings together. Same as the + operator with strings
concat('Awesome', 'Cool', 'Product') -> 'AwesomeCoolProduct'
'Awesome' + 'Cool' + 'Product' -> 'AwesomeCoolProduct'
concat(addrLine1, ' ', addrLine2, ' ', city, ' ', state, ' ', zip)
addrLine1 + ' ' + addrLine2 + ' ' + city + ' ' + state + ' ' + zip
concatWS(<separator> : string, <this> : string, <that> : string, ...) => string
Concatenates a variable number of strings together with a separator. The first parameter is the separator
concatWS(' ', 'Awesome', 'Cool', 'Product') -> 'Awesome Cool Product'
concatWS(' ' , addrLine1, addrLine2, city, state, zip) ->
concatWS(',' , toString(order_total), toString(order_discount))
cos(<value1> : number) => double
Calculates a cosine value
cos(10) -> -0.83907152907
cosh(<value1> : number) => double
Calculates a hyperbolic cosine of a value
cosh(0) -> 1.0
count([<value1> : any]) => long
Gets the aggregate count of values. If the optional column(s) is specified, it ignores NULL values in the count
count(custId) -> 100
count(custId, custName) -> 50
count() -> 125
count(iif(isNull(custId), 1, NULL)) -> 5
countDistinct(<value1> : any, [<value2> : any], ...) => long
Gets the aggregate count of distinct values of a set of columns
countDistinct(custId, custName) -> 60
countIf(<value1> : boolean, [<value2> : any]) => long
Based on a criteria gets the aggregate count of values. If the optional column is specified, it ignores NULL values in the count
countIf(state == 'CA' && commission < 10000, name) -> 100
covariancePopulation(<value1> : number, <value2> : number) => double
Gets the population covariance between two columns
covariancePopulation(sales, profit) -> 122.12
covariancePopulationIf(<value1> : boolean, <value2> : number, <value3> : number) => double
Based on a criteria, gets the population covariance of two columns
covariancePopulationIf(region == 'West', sales) -> 122.12
covarianceSample(<value1> : number, <value2> : number) => double
Gets the sample covariance of two columns
covarianceSample(sales, profit) -> 122.12
covarianceSampleIf(<value1> : boolean, <value2> : number, <value3> : number) => double
Based on a criteria, gets the sample covariance of two columns
covarianceSampleIf(region == 'West', sales, profit) -> 122.12
crc32(<value1> : any, ...) => long
Calculates the CRC32 hash of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row
crc32(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 3630253689
cumeDist() => integer
The CumeDist function computes the position of a value relative to all values in the partition. The result is the number of rows preceding or equal to the current row in the ordering of the partition divided by the total number of rows in the window partition. Any tie values in the ordering will evaluate to the same position.
cumeDist() -> 1
currentDate([<value1> : string]) => date
Gets the current date when this job starts to run. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.
currentDate() -> 12-12-2030
currentDate('PST') -> 12-31-2050
currentTimestamp() => timestamp
Gets the current timestamp when the job starts to run with local time zone
currentTimestamp() -> 12-12-2030T12:12:12
currentUTC([<value1> : string]) => timestamp
Gets the current the timestamp as UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone
currentUTC() -> 12-12-2030T19:18:12
currentUTC('Asia/Seoul') -> 12-13-2030T11:18:12
dayOfMonth(<value1> : datetime) => integer
Gets the day of the month given a date
dayOfMonth(toDate('2018-06-08')) -> 08
dayOfWeek(<value1> : datetime) => integer
Gets the day of the week given a date. 1 - Sunday, 2 - Monday ..., 7 - Saturday
dayOfWeek(toDate('2018-06-08')) -> 7
dayOfYear(<value1> : datetime) => integer
Gets the day of the year given a date
dayOfYear(toDate('2016-04-09')) -> 100
degrees(<value1> : number) => double
Converts radians to degrees
degrees(3.141592653589793) -> 180
denseRank(<value1> : any, ...) => integer
Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will not produce gaps in the sequence. Dense Rank works even when data is not sorted and looks for change in values
denseRank(salesQtr, salesAmt) -> 1
divide(<value1> : any, <value2> : any) => any
Divides pair of numbers. Same as the / operator
divide(20, 10) -> 2
20 / 10 -> 2
endsWith(<string> : string, <substring to check> : string) => boolean
Checks if the string ends with the supplied string
endsWith('great', 'eat') -> true
equals(<value1> : any, <value2> : any) => boolean
Comparison equals operator. Same as == operator
equals(12, 24) -> false
12==24 -> false
'bad'=='bad' -> true
'good'== NULL -> false
NULL===NULL -> false
equalsIgnoreCase(<value1> : string, <value2> : string) => boolean
Comparison equals operator ignoring case. Same as <=> operator
'abc'<==>'abc' -> true
equalsIgnoreCase('abc', 'Abc') -> true
factorial(<value1> : number) => long
Calculate the factorial of a number
factorial(5) -> 120
false() => boolean
Always returns a false value. Use the function syntax(false()) if there is a column named 'false'
isDiscounted == false()
isDiscounted() == false
first(<value1> : any, [<value2> : boolean]) => any
Gets the first value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false
first(sales) -> 12233.23
first(sales, false) -> NULL
floor(<value1> : number) => number
Returns the largest integer not greater than the number
floor(-0.1) -> -1
fromUTC(<value1> : timestamp, [<value2> : string]) => timestamp
Converts to the timestamp from UTC. You can optionally pass the timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone
fromUTC(currentTimeStamp()) -> 12-12-2030T19:18:12
fromUTC(currentTimeStamp(), 'Asia/Seoul') -> 12-13-2030T11:18:12
greater(<value1> : any, <value2> : any) => boolean
Comparison greater operator. Same as > operator
greater(12, 24) -> false
'abcd' > 'abc' -> true
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') => false
greaterOrEqual(<value1> : any, <value2> : any) => boolean
Comparison greater than or equal operator. Same as >= operator
greaterOrEqual(12, 12) -> false
'abcd' >= 'abc' -> true
greatest(<value1> : any, ...) => any
Returns the greatest value among the list of values as input. Returns null if all inputs are null
greatest(10, 30, 15, 20) -> 30
greatest(toDate('12/12/2010'), toDate('12/12/2011'), toDate('12/12/2000')) -> '12/12/2011'
greatest(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS'), toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) => '2019-02-05 08:21:34.890'
hour(<value1> : timestamp, [<value2> : string]) => integer
Gets the hour value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.
hour(toTimestamp('2009-07-30T12:58:59')) -> 12
hour(toTimestamp('2009-07-30T12:58:59'), 'PST') -> 12
iif(<condition> : boolean, <true_expression> : any, [<false_expression> : any]) => any
Based on a condition applies one value or the other. If other is unspecified it is considered NULL. Both the values must be compatible(numeric, string...)
iif(custType == 'Premium', 10, 4.5)
iif(amount > 100, 'High')
iif(dayOfWeek(saleDate) == 6, 'Weekend', 'Weekday')
in(<array of items> : array, <item to find> : any) => boolean
Checks if an item is in the array
in([10, 20, 30], 10) -> true
in(['good', 'kid'], 'bad') -> false
initCap(<value1> : string) => string
Converts the first letter of every word to uppercase. Words are identified as separated by whitespace
initCap('cool iceCREAM') -> 'Cool IceCREAM'
instr(<string> : string, <substring to find> : string) => integer
Finds the position(1 based) of the substring within a string. 0 is returned if not found
instr('great', 'eat') -> 3
instr('microsoft', 'o') -> 7
instr('good', 'bad') -> 0
isDelete([<value1> : integer]) => boolean
Checks if the row is marked for delete. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isDelete() -> true
isDelete(1) -> false
isError([<value1> : integer]) => boolean
Checks if the row is marked as error. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isError() -> true
isError(1) -> false
isIgnore([<value1> : integer]) => boolean
Checks if the row is marked to be ignored. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isIgnore() -> true
isIgnore(1) -> false
isInsert([<value1> : integer]) => boolean
Checks if the row is marked for insert. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isInsert() -> true
isInsert(1) -> false
isMatch([<value1> : integer]) => boolean
Checks if the row is matched at lookup. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isMatch() -> true
isMatch(1) -> false
isNull(<value1> : any) => boolean
Checks if the value is NULL
isNull(NULL()) -> true
isNull('') -> false'
isUpdate([<value1> : integer]) => boolean
Checks if the row is marked for update. For transformations taking more than one input stream you can pass the (1-based) index of the stream. Default value for the stream index is 1
isUpdate() -> true
isUpdate(1) -> false
kurtosis(<value1> : number) => double
Gets the kurtosis of a column
kurtosis(sales) -> 122.12
kurtosisIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the kurtosis of a column
kurtosisIf(region == 'West', sales) -> 122.12
lag(<value> : any, [<number of rows to look before> : number], [<default value> : any]) => any
Gets the value of the first parameter evaluated n rows before the current row. The second parameter is the number of rows to look back and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified
lag(amount, 2) -> 60
lag(amount, 2000, 100) -> 100
last(<value1> : any, [<value2> : boolean]) => any
Gets the last value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false
last(sales) -> 523.12
last(sales, false) -> NULL
lastDayOfMonth(<value1> : datetime) => date
Gets the last date of the month given a date
lastDayOfMonth(toDate('2009-01-12')) -> 2009-01-31
lead(<value> : any, [<number of rows to look after> : number], [<default value> : any]) => any
Gets the value of the first parameter evaluated n rows after the current row. The second parameter is the number of rows to look forward and the default value is 1. If there are not as many rows a value of null is returned unless a default value is specified
lead(amount, 2) -> 60
lead(amount, 2000, 100) -> 100
least(<value1> : any, ...) => any
Comparison lesser than or equal operator. Same as <= operator
least(10, 30, 15, 20) -> 10
least(toDate('12/12/2010'), toDate('12/12/2011'), toDate('12/12/2000')) -> '12/12/2000'
left(<string to subset> : string, <number of characters> : integral) => string
Extracts a substring start at index 1 with number of characters. Same as SUBSTRING(str, 1, n)
left('bojjus', 2) -> 'bo'
left('bojjus', 20) -> 'bojjus'
length(<value1> : string) => integer
Returns the length of the string
length('kiddo') -> 5
lesser(<value1> : any, <value2> : any) => boolean
Comparison less operator. Same as < operator
lesser(12 < 24) -> true
'abcd' < 'abc' -> false
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') => true
lesserOrEqual(<value1> : any, <value2> : any) => boolean
Comparison lesser than or equal operator. Same as <= operator
lesserOrEqual(12, 12) -> true
'abcd' <= 'abc' -> false
levenshtein(<from string> : string, <to string> : string) => integer
Gets the levenshtein distance between two strings
levenshtein('boys', 'girls') -> 4
like(<string> : string, <pattern match> : string) => boolean
The pattern is a string that is matched literally. The exceptions are the following special symbols: _ matches any one character in the input (similar to . in posix regular expressions)
% matches zero or more characters in the input (similar to .* in posix regular expressions).
The escape character is ''. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.
like('icecream', 'ice%') -> true
locate(<substring to find> : string, <string> : string, [<from index - 1-based> : integral]) => integer
Finds the position(1 based) of the substring within a string starting a certain position. If the position is omitted it is considered from the beginning of the string. 0 is returned if not found
locate('eat', 'great') -> 3
locate('o', 'microsoft', 6) -> 7
locate('bad', 'good') -> 0
log(<value1> : number, [<value2> : number]) => double
Calculates log value. An optional base can be supplied else a euler number if used
log(100, 10) -> 2
log10(<value1> : number) => double
Calculates log value based on 10 base
log10(100) -> 2
lower(<value1> : string) => string
Lowercases a string
lower('GunChus') -> 'gunchus'
lpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string
Left pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is considered a no-op
lpad('great', 10, '-') -> '-----great'
lpad('great', 4, '-') -> 'great'
lpad('great', 8, '<>') -> '<><great'
ltrim(<string to trim> : string, <trim characters> : string) => string
Left trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter
ltrim('!--!wor!ld!', '-!') -> 'wor!ld!'
max(<value1> : any) => any
Gets the maximum value of a column
MAX(sales) -> 12312131.12
maxIf(<value1> : boolean, <value2> : any) => any
Based on a criteria, gets the maximum value of a column
maxIf(region == 'West', sales) -> 99999.56
md5(<value1> : any, ...) => string
Calculates the MD5 digest of set of column of varying primitive datatypes and returns a 32 character hex string. It can be used to calculate a fingerprint for a row
md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'c1527622a922c83665e49835e46350fe'
mean(<value1> : number) => number
Gets the mean of values of a column. Same as AVG
mean(sales) -> 7523420.234
meanIf(<value1> : boolean, <value2> : number) => number
Based on a criteria gets the mean of values of a column. Same as avgIf
meanIf(region == 'West', sales) -> 7523420.234
min(<value1> : any) => any
Gets the minimum value of a column
min(sales) -> 00.01
min(orderDate) -> 12/12/2000
minIf(<value1> : boolean, <value2> : any) => any
Based on a criteria, gets the minimum value of a column
minIf(region == 'West', sales) -> 00.01
minus(<value1> : any, <value2> : any) => any
Subtracts numbers. Subtract from a date number of days. Substract duration from a timestamp. Same as the - operator
minus(20, 10) -> 10
20 - 10 -> 10
minus(toDate('2012-12-15'), 3) -> 2012-12-12 (date value)
toDate('2012-12-15') - 3 -> 2012-12-13 (date value)
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) => 2019-02-04 07:19:18.871
minute(<value1> : timestamp, [<value2> : string]) => integer
Gets the minute value of a timestamp. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.
minute(toTimestamp('2009-07-30T12:58:59')) -> 58
minute(toTimestamp('2009-07-30T12:58:59', 'PST')) -> 58
mod(<value1> : any, <value2> : any) => any
Modulus of pair of numbers. Same as the % operator
mod(20, 8) -> 4
20 % 8 -> 4
month(<value1> : datetime) => integer
Gets the month value of a date or timestamp
month(toDate('2012-8-8')) -> 8
monthsBetween(<from date/timestamp> : datetime, <to date/timestamp> : datetime, [<time zone> : boolean], [<value4> : string]) => double
Gets the number of months between two datesYou can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.
monthsBetween(toDate('1997-02-28 10:30:00'), toDate('1996-10-30')) -> 3.94959677
multiply(<value1> : any, <value2> : any) => any
Multiplies pair of numbers. Same as the * operator
multiply(20, 10) -> 200
20 * 10 -> 200
nTile([<value1> : integer]) => integer
The NTile function divides the rows for each window partition into n
buckets ranging from 1 to at most n
. Bucket values will differ by at most 1. If the number of rows in the partition does not divide evenly into the number of buckets, then the remainder values are distributed one per bucket, starting with the first bucket. The NTile function is useful for the calculation of tertiles, quartiles, deciles, and other common summary statistics. The function calculates two variables during initialization: The size of a regular bucket will have one extra row added to it. Both variables are based on the size of the current partition. During the calculation process the function keeps track of the current row number, the current bucket number, and the row number at which the bucket will change (bucketThreshold). When the current row number reaches bucket threshold, the bucket value is increased by one and the threshold is increased by the bucket size (plus one extra if the current bucket is padded).
nTile() -> 1
nTile(numOfBuckets) -> 1
negate(<value1> : number) => number
Negates a number. Turns positive numbers to negative and vice versa
negate(13) -> -13
nextSequence() => long
Returns the next unique sequence. The number is consecutive only within a partition and is prefixed by the partitionId
nextSequence() -> 12313112
normalize(<String to normalize> : string) => string
Normalize the string value to separate accented unicode characters
normalize('boys') -> 'boys'
not(<value1> : boolean) => boolean
Logical negation operator
not(true) -> false
not(premium)
notEquals(<value1> : any, <value2> : any) => boolean
Comparison not equals operator. Same as != operator
12!=24 -> true
'abc'!='abc' -> false
null() => null
Returns a NULL value. Use the function syntax(null()) if there is a column named 'null'. Any operation that uses will result in a NULL
custId = NULL (for derived field)
custId == NULL -> NULL
'nothing' + NULL -> NULL
10 * NULL -> NULL'
NULL == '' -> NULL'
or(<value1> : boolean, <value2> : boolean) => boolean
Logical OR operator. Same as ||
or(true, false) -> true
true || false -> true
pMod(<value1> : any, <value2> : any) => any
Positive Modulus of pair of numbers.
pmod(-20, 8) -> 4
power(<value1> : number, <value2> : number) => double
Raises one number to the power of another
power(10, 2) -> 100
rank(<value1> : any, ...) => integer
Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence. Rank works even when data is not sorted and looks for change in values
rank(salesQtr, salesAmt) -> 1
regexExtract(<string> : string, <regex to find> : string, [<match group 1-based index> : integral]) => string
Extract a matching substring for a given regex pattern. The last parameter identifies the match group and is defaulted to 1 if omitted. Use <regex>
(back quote) to match a string without escaping
regexExtract('Cost is between 600 and 800 dollars', '(\\d+) and (\\d+)', 2) -> '800'
regexExtract('Cost is between 600 and 800 dollars', `(\d+) and (\d+)`, 2) -> '800'
regexMatch(<string> : string, <regex to match> : string) => boolean
Checks if the string matches the given regex pattern. Use <regex>
(back quote) to match a string without escaping
regexMatch('200.50', '(\\d+).(\\d+)') -> true
regexMatch('200.50', `(\d+).(\d+)`) -> true
regexReplace(<string> : string, <regex to find> : string, <substring to replace> : string) => string
Replace all occurrences of a regex pattern with another substring in the given string Use <regex>
(back quote) to match a string without escaping
regexReplace('100 and 200', '(\\d+)', 'bojjus') -> 'bojjus and bojjus'
regexReplace('100 and 200', `(\d+)`, 'gunchus') -> 'gunchus and gunchus'
regexSplit(<string to split> : string, <regex expression> : string) => array
Splits a string based on a delimiter based on regex and returns an array of strings
regexSplit('oneAtwoBthreeC', '[CAB]') -> ['one', 'two', 'three']
regexSplit('oneAtwoBthreeC', '[CAB]')[1] -> 'one'
regexSplit('oneAtwoBthreeC', '[CAB]')[0] -> NULL
regexSplit('oneAtwoBthreeC', '[CAB]')[20] -> NULL
replace(<string> : string, <substring to find> : string, <substring to replace> : string) => string
Replace all occurrences of a substring with another substring in the given string
replace('doggie dog', 'dog', 'cat') -> 'catgie cat'
replace('doggie dog', 'dog', '') -> 'gie'
reverse(<value1> : string) => string
Reverses a string
reverse('gunchus') -> 'suhcnug'
right(<string to subset> : string, <number of characters> : integral) => string
Extracts a substring with number of characters from the right. Same as SUBSTRING(str, LENGTH(str) - n, n)
right('bojjus', 2) -> 'us'
right('bojjus', 20) -> 'bojjus'
rlike(<string> : string, <pattern match> : string) => boolean
Checks if the string matches the given regex pattern
rlike('200.50', '(\d+).(\d+)') -> true
round(<number> : number, [<scale to round> : number], [<rounding option> : integral]) => double
Rounds a number given an optional scale and an optional rounding mode. If the scale is omitted, it is defaulted to 0. If the mode is omitted, it is defaulted to ROUND_HALF_UP(5). The values for rounding include
1 - ROUND_UP
2 - ROUND_DOWN
3 - ROUND_CEILING
4 - ROUND_FLOOR
5 - ROUND_HALF_UP
6 - ROUND_HALF_DOWN
7 - ROUND_HALF_EVEN
8 - ROUND_UNNECESSARY
round(100.123) -> 100.0
round(2.5, 0) -> 3.0
round(5.3999999999999995, 2, 7) -> 5.40
rowNumber() => integer
Assigns a sequential row numbering for rows in a window starting with 1
rowNumber() -> 1
rpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string
Right pads the string by the supplied padding until it is of a certain length. If the string is equal to or greater than the length, then it is considered a no-op
rpad('great', 10, '-') -> 'great-----'
rpad('great', 4, '-') -> 'great'
rpad('great', 8, '<>') -> 'great<><'
rtrim(<string to trim> : string, <trim characters> : string) => string
Right trims a string of leading characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter
rtrim('!--!wor!ld!', '-!') -> '!--!wor!ld'
second(<value1> : timestamp, [<value2> : string]) => integer
Gets the second value of a date. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. The local timezone is used as the default.
second(toTimestamp('2009-07-30T12:58:59')) -> 59
sha1(<value1> : any, ...) => string
Calculates the SHA-1 digest of set of column of varying primitive datatypes and returns a 40 character hex string. It can be used to calculate a fingerprint for a row
sha1(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '63849fd2abb65fbc626c60b1f827bd05573f0cea'
sha2(<value1> : integer, <value2> : any, ...) => string
Calculates the SHA-2 digest of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. It can be used to calculate a fingerprint for a row
sha2(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'd3b2bff62c3a00e9370b1ac85e428e661a7df73959fa1a96ae136599e9ee20fd'
sin(<value1> : number) => double
Calculates a sine value
sin(2) -> 0.90929742682
sinh(<value1> : number) => double
Calculates a hyperbolic sine value
sinh(0) -> 0.0
skewness(<value1> : number) => double
Gets the skewness of a column
skewness(sales) -> 122.12
skewnessIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the skewness of a column
skewnessIf(region == 'West', sales) -> 122.12
slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array
Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string
slice([10, 20, 30, 40], 1, 2) -> [10, 20]
slice([10, 20, 30, 40], 2) -> [20, 30, 40]
slice([10, 20, 30, 40], 2)[1] -> 20
slice([10, 20, 30, 40], 2)[0] -> NULL
slice([10, 20, 30, 40], 2)[20] -> NULL
slice([10, 20, 30, 40], 8) -> []
soundex(<value1> : string) => string
Gets the soundex code for the string
soundex('genius') -> 'G520'
split(<string to split> : string, <split characters> : string) => array
Splits a string based on a delimiter and returns an array of strings
split('100,200,300', ',') -> ['100', '200', '300']
split('100,200,300', '|') -> ['100,200,300']
split('100, 200, 300', ', ') -> ['100', '200', '300']
split('100, 200, 300', ', ')[1] -> '100'
split('100, 200, 300', ', ')[0] -> NULL
split('100, 200, 300', ', ')[20] -> NULL
split('100200300', ',') -> ['100200300']
sqrt(<value1> : number) => double
Calculates the square root of a number
sqrt(9) -> 3
startsWith(<string> : string, <substring to check> : string) => boolean
Checks if the string starts with the supplied string
startsWith('great', 'gr') -> true
stddev(<value1> : number) => double
Gets the standard deviation of a column
stdDev(sales) -> 122.12
stddevIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the standard deviation of a column
stddevIf(region == 'West', sales) -> 122.12
stddevPopulation(<value1> : number) => double
Gets the population standard deviation of a column
stddevPopulation(sales) -> 122.12
stddevPopulationIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the population standard deviation of a column
stddevPopulationIf(region == 'West', sales) -> 122.12
stddevSample(<value1> : number) => double
Gets the sample standard deviation of a column
stddevSample(sales) -> 122.12
stddevSampleIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the sample standard deviation of a column
stddevSampleIf(region == 'West', sales) -> 122.12
subDays(<date/timestamp> : datetime, <days to subtract> : integral) => datetime
Subtract months from a date. Same as the - operator for date
subDays(toDate('2016-08-08'), 1) -> 2016-08-09
subMonths(<date/timestamp> : datetime, <months to subtract> : integral) => datetime
Subtract months from a date or timestamp
subMonths(toDate('2016-09-30'), 1) -> 2016-08-31
substring(<string to subset> : string, <from 1-based index> : integral, [<number of characters> : integral]) => string
Extracts a substring of a certain length from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string
substring('Cat in the hat', 5, 2) -> 'in'
substring('Cat in the hat', 5, 100) -> 'in the hat'
substring('Cat in the hat', 5) -> 'in the hat'
substring('Cat in the hat', 100, 100) -> ''
sum(<value1> : number) => number
Gets the aggregate sum of a numeric column
sum(col) -> value
sumDistinct(<value1> : number) => number
Gets the aggregate sum of distinct values of a numeric column
sumDistinct(col) -> value
sumDistinctIf(<value1> : boolean, <value2> : number) => number
Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column
sumDistinctIf(state == 'CA' && commission < 10000, sales) -> value
sumDistinctIf(true, sales) -> SUM(sales)
sumIf(<value1> : boolean, <value2> : number) => number
Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column
sumIf(state == 'CA' && commission < 10000, sales) -> value
sumIf(true, sales) -> SUM(sales)
tan(<value1> : number) => double
Calculates a tangent value
tan(0) -> 0.0
tanh(<value1> : number) => double
Calculates a hyperbolic tangent value
tanh(0) -> 0.0
toBoolean(<value1> : string) => boolean
Converts a value of ('t', 'true', 'y', 'yes', '1') to true and ('f', 'false', 'n', 'no', '0') to false and NULL for any other value
toBoolean('true') -> true
toBoolean('n') -> false
toBoolean('truthy') -> NULL
toDate(<string> : any, [<date format> : string]) => date
Converts a string to a date given an optional date format. Refer to Java SimpleDateFormat for all possible formats. If the date format is omitted, combinations of the following are accepted. [ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]d, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ]
toDate('2012-8-8') -> 2012-8-8
toDate('12/12/2012', 'MM/dd/yyyy') -> 2012-12-12
toDecimal(<value> : any, [<precision> : integral], [<scale> : integral], [<format> : string], [<locale> : string]) => decimal(10,0)
Converts any numeric or string to a decimal value. If precision and scale are not specified, it is defaulted to (10,2).An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN
toDecimal(123.45) -> 123.45
toDecimal('123.45', 8, 4) -> 123.4500
toDecimal('$123.45', 8, 4,'$###.00') -> 123.4500
toDecimal('Ç123,45', 10, 2, 'Ç###,##', 'de') -> 123.45
toDouble(<value> : any, [<format> : string], [<locale> : string]) => double
Converts any numeric or string to a double value. An optional Java decimal format can be used for the conversion. An optional locale format in the form of BCP47 language like en-US, de, zh-CN
toDouble(123.45) -> 123.45
toDouble('123.45') -> 123.45
toDouble('$123.45', '$###.00') -> 123.45
toDouble('Ç123,45', 'Ç###,##', 'de') -> 123.45
toFloat(<value> : any, [<format> : string], [<locale> : string]) => float
Converts any numeric or string to a float value. An optional Java decimal format can be used for the conversion. Truncates any double
toFloat(123.45) -> 123.45
toFloat('123.45') -> 123.45
toFloat('$123.45', '$###.00') -> 123.45
toInteger(<value> : any, [<format> : string], [<locale> : string]) => integer
Converts any numeric or string to an integer value. An optional Java decimal format can be used for the conversion. Truncates any long, float, double
toInteger(123) -> 123
toInteger('123') -> 123
toInteger('$123', '$###') -> 123
toLong(<value> : any, [<format> : string], [<locale> : string]) => long
Converts any numeric or string to a long value. An optional Java decimal format can be used for the conversion. Truncates any float, double
toLong(123) -> 123
toLong('123') -> 123
toLong('$123', '$###') -> 123
toShort(<value> : any, [<format> : string], [<locale> : string]) => short
Converts any numeric or string to a short value. An optional Java decimal format can be used for the conversion. Truncates any integer, long, float, double
toShort(123) -> 123
toShort('123') -> 123
toShort('$123', '$###') -> 123
toString(<value> : any, [<number format/date format> : string]) => string
Converts a primitive datatype to a string. For numbers and date a format can be specified. If unspecified the system default is picked.Java decimal format is used for numbers. Refer to Java SimpleDateFormat for all possible date formats; the default format is yyyy-MM-dd
toString(10) -> '10'
toString('engineer') -> 'engineer'
toString(123456.789, '##,###.##') -> '123,456.79'
toString(123.78, '000000.000') -> '000123.780'
toString(12345, '##0.#####E0') -> '12.345E3'
toString(toDate('2018-12-31')) -> '2018-12-31'
toString(toDate('2018-12-31'), 'MM/dd/yy') -> '12/31/18'
toString(4 == 20) -> 'false'
toTimestamp(<string> : any, [<timestamp format> : string], [<time zone> : string]) => timestamp
Converts a string to a timestamp given an optional timestamp format. Refer to Java SimpleDateFormat for all possible formats. If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. Timestamp supports upto millisecond accuracy with value of 999
toTimestamp('2016-12-31 00:12:00') -> 2012-8-8T00:12:00
toTimestamp('2016/12/31T00:12:00', 'MM/dd/yyyyThh:mm:ss') -> 2012-12-12T00:12:00
toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') -> 2019-02-03 05:19:28.871
toUTC(<value1> : timestamp, [<value2> : string]) => timestamp
Converts the timestamp to UTC. You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. It is defaulted to the current timezone
toUTC(currentTimeStamp()) -> 12-12-2030T19:18:12
toUTC(currentTimeStamp(), 'Asia/Seoul') -> 12-13-2030T11:18:12
translate(<string to translate> : string, <lookup characters> : string, <replace characters> : string) => string
Replace one set of characters by another set of characters in the string. Characters have 1 to 1 replacement
translate('(Hello)', '()', '[]') -> '[Hello]'
translate('(Hello)', '()', '[') -> '[Hello'
trim(<string to trim> : string, [<trim characters> : string]) => string
Trims a string of leading and trailing characters. If second parameter is unspecified, it trims whitespace. Else it trims any character specified in the second parameter
trim('!--!wor!ld!', '-!') -> 'wor!ld'
true() => boolean
Always returns a true value. Use the function syntax(true()) if there is a column named 'true'
isDiscounted == true()
isDiscounted() == true
typeMatch(<type> : string, <base type> : string) => boolean
Matches the type of the column. Can only be used in pattern expressions.number matches short, integer, long, double, float or decimal, integral matches short, integer, long, fractional matches double, float, decimal and datetime matches date or timestamp type
typeMatch(type, 'number') -> true
typeMatch('date', 'number') -> false
upper(<value1> : string) => string
Uppercases a string
upper('bojjus') -> 'BOJJUS'
variance(<value1> : number) => double
Gets the variance of a column
variance(sales) -> 122.12
varianceIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the variance of a column
varianceIf(region == 'West', sales) -> 122.12
variancePopulation(<value1> : number) => double
Gets the population variance of a column
variancePopulation(sales) -> 122.12
variancePopulationIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the population variance of a column
variancePopulationIf(region == 'West', sales) -> 122.12
varianceSample(<value1> : number) => double
Gets the unbiased variance of a column
varianceSample(sales) -> 122.12
varianceSampleIf(<value1> : boolean, <value2> : number) => double
Based on a criteria, gets the unbiased variance of a column
varianceSampleIf(region == 'West', sales) -> 122.12
weekOfYear(<value1> : datetime) => integer
Gets the week of the year given a date
weekOfYear(toDate('2008-02-20')) -> 8
xor(<value1> : boolean, <value2> : boolean) => boolean
Logical XOR operator. Same as ^ operator
xor(true, false) -> true
xor(true, true) -> false
true ^ false -> true
year(<value1> : datetime) => integer
Gets the year value of a date
year(toDate('2012-8-8')) -> 2012