Category: LINUX

2019-08-14 17:28:47

!

! expr - Logical not.

%

expr1 % expr2 - Returns the remainder after expr1/expr2.

Examples:

> SELECT 2 % 1.8;
 0.2
> SELECT MOD(2, 1.8);
 0.2

&

expr1 & expr2 - Returns the result of bitwise AND of expr1 and expr2.

Examples:

> SELECT 3 & 5;
 1

*

expr1 * expr2 - Returns expr1*expr2.

Examples:

> SELECT 2 * 3;
 6

+

expr1 + expr2 - Returns expr1+expr2.

Examples:

> SELECT 1 + 2;
 3

-

expr1 - expr2 - Returns expr1-expr2.

Examples:

> SELECT 2 - 1;
 1

/

expr1 / expr2 - Returns expr1/expr2. The result is always a floating-point number.

Examples:

> SELECT 3 / 2;
 1.5
> SELECT 2L / 2L;
 1.0

<

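expr1 < expr2 - Returns true if expr1 is less than expr2.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must be orderable. For example, map type is not orderable, so it is not supported.

Examples: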

> SELECT 1 < 2;
 true
> SELECT 1.1 < '1';
 false
> SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52');
 false
> SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52');
 true
> SELECT 1 < NULL;
 NULL

<=

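expr1 <= expr2 - Returns true if expr1 is less than or equal to expr2.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must be orderable. For example, map type is not orderable, so it is not supported.

Examples: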

> SELECT 2 <= 2;
 true
> SELECT 1.0 <= '1';
 true
> SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52');
 true
> SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52');
 true
> SELECT 1 <= NULL;
 NULL

<=>

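expr1 <=> expr2 - Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both operands are null and false if only one of them is null.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must support comparison. For example, map type cannot be compared, so it is not supported.

Examples: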

> SELECT 2 <=> 2;
 true
> SELECT 1 <=> '1';
 true
> SELECT true <=> NULL;
 false
> SELECT NULL <=> NULL;
 true
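
The difference between `=` and `<=>` comes down to how NULL is handled. The following is a minimal plain-Python sketch of that rule, not Spark code: NULL is modelled as `None`, the helper names `sql_eq` and `null_safe_eq` are made up for illustration, and the implicit type casting that Spark performs (for example `1 = '1'`) is left out.

```python
from typing import Optional


def sql_eq(a, b) -> Optional[bool]:
    """Model of `=`: any NULL (None) operand makes the result NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_eq(a, b) -> bool:
    """Model of `<=>`: never returns NULL."""
    if a is None or b is None:
        # true only when both sides are NULL
        return a is None and b is None
    return a == b


if __name__ == "__main__":
    for left, right in [(2, 2), (True, None), (None, None)]:
        print(left, right, sql_eq(left, right), null_safe_eq(left, right))
```

This mirrors the examples above: `true <=> NULL` is false and `NULL <=> NULL` is true, while the same comparisons with `=` yield NULL.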

=

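expr1 = expr2 - Returns true if expr1 equals expr2, or false otherwise.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must support comparison. For example, map type cannot be compared, so it is not supported.

Examples: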

> SELECT 2 = 2;
 true
> SELECT 1 = '1';
 true
> SELECT true = NULL;
 NULL
> SELECT NULL = NULL;
 NULL

==

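expr1 == expr2 - Returns true if expr1 equals expr2, or false otherwise.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must support comparison. For example, map type cannot be compared, so it is not supported.

Examples: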

> SELECT 2 == 2;
 true
> SELECT 1 == '1';
 true
> SELECT true == NULL;
 NULL
> SELECT NULL == NULL;
 NULL

>

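expr1 > expr2 - Returns true if expr1 is greater than expr2.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must be orderable. For example, map type is not orderable, so it is not supported.

Examples: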

> SELECT 2 > 1;
 true
> SELECT 2 > '1.1';
 true
> SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52');
 false
> SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52');
 false
> SELECT 1 > NULL;
 NULL

>=

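expr1 >= expr2 - Returns true if expr1 is greater than or equal to expr2.

Arguments:

  • expr1, expr2 - the two expressions must be the same type or castable to a common type, and that type must be orderable. For example, map type is not orderable, so it is not supported.

Examples: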

> SELECT 2 >= 1;
 true
> SELECT 2.0 >= '2.1';
 false
> SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52');
 true
> SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52');
 false
> SELECT 1 >= NULL;
 NULL

^

expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1 and expr2.

Examples:

> SELECT 3 ^ 5;
 6

abs

abs(expr) - Returns the absolute value of the numeric value.

Examples:

> SELECT abs(-1);
 1

acos

acos(expr) - Returns the inverse cosine (arccosine) of expr if -1 <= expr <= 1, or NaN otherwise.

Examples:

> SELECT acos(1);
 0.0
> SELECT acos(2);
 NaN

add_months

add_months(start_date, num_months) - Returns the date that is num_months after start_date.

Examples:

> SELECT add_months('2016-08-31', 1);
 2016-09-30

Since: 1.5.0

and

expr1 and expr2 - Logical AND.

approx_count_distinct

approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed.

approx_percentile

approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col at the given percentage. The value of percentage must be between 0.0 and 1.0.

Examples:

> SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100);
 [10.0,10.0,10.0]
> SELECT approx_percentile(10.0, 0.5, 100);
 10.0

array

array(expr, ...) - Returns an array with the given elements.

Examples:

> SELECT array(1, 2, 3);
 [1,2,3]

array_contains

array_contains(array, value) - Returns true if the array contains the value.

Examples:

> SELECT array_contains(array(1, 2, 3), 2);
 true

ascii

ascii(str) - Returns the ASCII numeric value of the first character of str.

Examples:

> SELECT ascii('222');
 50
> SELECT ascii(2);
 50

asin

asin(expr) - Returns the inverse sine (arcsine) of expr if -1 <= expr <= 1, or NaN otherwise.

Examples:

> SELECT asin(0);
 0.0
> SELECT asin(2);
 NaN

assert_true

assert_true(expr) - Throws an exception if expr does not evaluate to true.

Examples:

> SELECT assert_true(0 < 1);
 NULL

atan

atan(expr) - Returns the inverse tangent (arctangent) of expr.

Examples:

> SELECT atan(0);
 0.0

atan2

atan2(expr1, expr2) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (expr1, expr2).

Examples:

> SELECT atan2(0, 0);
 0.0

avg

avg(expr) - Returns the mean of the values of expr.

base64

base64(bin) - Converts the argument from binary data bin to a base64 string.

Examples:

> SELECT base64('Spark SQL');
 U3BhcmsgU1FM

bigint

bigint(expr) - Casts the value expr to the target data type bigint.

bin

bin(expr) - Returns the binary string representation of the long value expr.

Examples:

> SELECT bin(13);
 1101
> SELECT bin(-13);
 1111111111111111111111111111111111111111111111111111111111110011
> SELECT bin(13.3);
 1101
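
The three results above follow from treating the argument as a 64-bit long. A minimal plain-Python sketch of that behaviour (the helper name `sql_bin` is made up for illustration; this is a model of the documented output, not Spark's implementation):

```python
def sql_bin(expr) -> str:
    """Binary string of expr viewed as a 64-bit two's-complement long."""
    value = int(expr)  # drops the fractional part, as in bin(13.3)
    # Masking to 64 bits turns a negative number into its two's-complement form.
    return format(value & 0xFFFFFFFFFFFFFFFF, "b")


if __name__ == "__main__":
    print(sql_bin(13))
    print(sql_bin(-13))
    print(sql_bin(13.3))
```

A negative input therefore always produces a 64-character string, while a positive input has no leading zeros.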

binary

binary(expr) - Casts the value expr to the target data type binary.

bit_length

bit_length(expr) - Returns the bit length of string data or number of bits of binary data.

Examples:

> SELECT bit_length('Spark SQL');
 72

boolean

boolean(expr) - Casts the value expr to the target data type boolean.

bround

bround(expr, d) - Returns expr rounded to d decimal places using HALF_EVEN rounding mode.

Examples:

> SELECT bround(2.5, 0);
 2.0
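
HALF_EVEN ("banker's rounding") sends a tie to the nearest even digit, which is why 2.5 becomes 2 rather than 3. The sketch below reproduces the rule with Python's `decimal` module. It is an illustration only: the helper name is reused for readability, inputs are passed as strings to avoid binary floating-point noise, and the HALF_UP helper is included purely for contrast.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def bround(value: str, d: int = 0) -> Decimal:
    """Round value to d decimal places; ties go to the even neighbour."""
    quantum = Decimal(1).scaleb(-d)  # d=0 -> 1, d=1 -> 0.1, d=2 -> 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


def round_half_up(value: str, d: int = 0) -> Decimal:
    """Same, but ties always round away from zero (for comparison)."""
    quantum = Decimal(1).scaleb(-d)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for v in ["1.5", "2.5", "3.5"]:
        print(v, bround(v), round_half_up(v))
```

Over many values, ties rounded this way are split evenly between rounding up and rounding down, which avoids the upward bias of HALF_UP.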

cast

cast(expr AS type) - Casts the value expr to the target data type type.

Examples:

> SELECT cast('10' as int);
 10

cbrt

cbrt(expr) - Returns the cube root of expr.

Examples:

> SELECT cbrt(27.0);
 3.0

ceil

ceil(expr) - Returns the smallest integer not smaller than expr.

Examples:

> SELECT ceil(-0.1);
 0
> SELECT ceil(5);
 5

ceiling

ceiling(expr) - Returns the smallest integer not smaller than expr.

Examples:

> SELECT ceiling(-0.1);
 0
> SELECT ceiling(5);
 5

char

char(expr) - Returns the ASCII character whose binary value is expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).

Examples:

> SELECT char(65);
 A

char_length

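char_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Examples: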

> SELECT char_length('Spark SQL ');
 10
> SELECT CHAR_LENGTH('Spark SQL ');
 10
> SELECT CHARACTER_LENGTH('Spark SQL ');
 10

character_length

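character_length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Examples: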

> SELECT character_length('Spark SQL ');
 10
> SELECT CHAR_LENGTH('Spark SQL ');
 10
> SELECT CHARACTER_LENGTH('Spark SQL ');
 10

chr

chr(expr) - Returns the ASCII character whose binary value is expr. If expr is larger than 256, the result is equivalent to chr(expr % 256).

Examples:

> SELECT chr(65);
 A

coalesce

coalesce(expr1, expr2, ...) - Returns the first non-null argument if one exists. Otherwise, returns null.

Examples:

> SELECT coalesce(NULL, 1, NULL);
 1

collect_list

collect_list(expr) - Collects and returns a list of non-unique elements.

collect_set

collect_set(expr) - Collects and returns a set of unique elements.

concat

concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.

Examples:

> SELECT concat('Spark', 'SQL');
 SparkSQL

concat_ws

concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings, separated by sep.

Examples:

> SELECT concat_ws(' ', 'Spark', 'SQL');
  Spark SQL

conv

conv(num, from_base, to_base) - Converts num from base from_base to base to_base.

Examples:

> SELECT conv('100', 2, 10);
 4
> SELECT conv(-10, 16, -10);
 -16
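
The two examples above can be reproduced with a short plain-Python sketch. It makes several assumptions that are not stated in the text: both bases lie between 2 and 36 in absolute value, a negative to_base asks for a signed result, and a non-negative to_base reinterprets a negative number as an unsigned 64-bit value. It is a rough model, not Spark's implementation.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def conv(num, from_base: int, to_base: int) -> str:
    """Re-express num (written in from_base) in base |to_base|."""
    value = int(str(num), from_base)  # int() accepts a leading '-'
    signed = to_base < 0
    base = abs(to_base)
    if value < 0 and not signed:
        # Assumption: unsigned output wraps negatives into 64 bits.
        value += 1 << 64
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = ""
    while True:
        value, remainder = divmod(value, base)
        out = DIGITS[remainder] + out
        if value == 0:
            break
    return sign + out


if __name__ == "__main__":
    print(conv("100", 2, 10))
    print(conv(-10, 16, -10))
```

In the second example the input "-10" is read as hexadecimal (minus sixteen) and printed as a signed decimal, which gives -16.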

corr

corr(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs.

cos

cos(expr) - 返回 expr 的余弦。

例子:

> SELECT cos(0);
 1.0

cosh

cosh(expr) - 返回 expr 的双曲余弦。

例子:

> SELECT cosh(0);
 1.0

cot

cot(expr) - 返回 expr 的余切值。

例子:

> SELECT cot(1);
 0.6420926159343306

count

count(*) - Returns the total number of retrieved rows, including rows containing null.

count(expr) - Returns the number of rows for which the supplied expression is non-null.

count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.

count_min_sketch

count_min_sketch(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given eps,
confidence and seed. The result is an array of bytes, which can be deserialized to a
CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for
cardinality estimation using sub-linear space.

covar_pop

covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.

covar_samp

covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.

crc32

crc32(expr) - Returns a cyclic redundancy check value of the expr as a bigint.

Examples:

> SELECT crc32('Spark');
 1557323817

cube

cume_dist

cume_dist() - Computes the position of a value relative to all values in the partition.

current_database

current_database() - Returns the current database.

Examples:

> SELECT current_database();
 default

current_date

current_date() - Returns the current date at the start of query evaluation.

Since: 1.5.0

current_timestamp

current_timestamp() - Returns the current timestamp at the start of query evaluation.

Since: 1.5.0

date

date(expr) - Casts the value expr to the target data type date.

date_add

date_add(start_date, num_days) - Returns the date that is num_days after start_date.

Examples:

> SELECT date_add('2016-07-30', 1);
 2016-07-31

Since: 1.5.0

date_format

date_format(timestamp, fmt) - Converts timestamp to a value of string in the format specified by the date format fmt.

Examples:

> SELECT date_format('2016-04-08', 'y');
 2016

Since: 1.5.0

date_sub

date_sub(start_date, num_days) - Returns the date that is num_days before start_date.

Examples:

> SELECT date_sub('2016-07-30', 1);
 2016-07-29

Since: 1.5.0

date_trunc

date_trunc(fmt, ts) - Returns timestamp ts truncated to the unit specified by the format model fmt.
fmt should be one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"]

Examples:

> SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359');
 2015-01-01 00:00:00
> SELECT date_trunc('MM', '2015-03-05T09:32:05.359');
 2015-03-01 00:00:00
> SELECT date_trunc('DD', '2015-03-05T09:32:05.359');
 2015-03-05 00:00:00
> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359');
 2015-03-05 09:00:00

Since: 2.3.0

datediff

datediff(endDate, startDate) - Returns the number of days from startDate to endDate.

Examples:

> SELECT datediff('2009-07-31', '2009-07-30');
 1
 
> SELECT datediff('2009-07-30', '2009-07-31');
 -1

Since: 1.5.0

day

day(date) - Returns the day of month of the date/timestamp.

Examples:

> SELECT day('2009-07-30');
 30

Since: 1.5.0

dayofmonth

dayofmonth(date) - Returns the day of month of the date/timestamp.

Examples:

> SELECT dayofmonth('2009-07-30');
 30

Since: 1.5.0

dayofweek

dayofweek(date) - Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

Examples:

> SELECT dayofweek('2009-07-30');
 5

Since: 2.3.0
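
The numbering used by dayofweek (1 = Sunday through 7 = Saturday) differs from the ISO convention (1 = Monday through 7 = Sunday). A small plain-Python sketch of the mapping, using only the standard library; the function name mirrors the SQL one for readability and is not a Spark API:

```python
from datetime import date


def dayofweek(d: date) -> int:
    """Day of week with 1 = Sunday, 2 = Monday, ..., 7 = Saturday."""
    # isoweekday() gives Monday=1 ... Sunday=7; shift so Sunday becomes 1.
    return d.isoweekday() % 7 + 1


if __name__ == "__main__":
    print(dayofweek(date(2009, 7, 30)))  # a Thursday
```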

dayofyear

dayofyear(date) - Returns the day of year of the date/timestamp.

Examples:

> SELECT dayofyear('2016-04-09');
 100

Since: 1.5.0

decimal

decimal(expr) - Casts the value expr to the target data type decimal.

decode

decode(bin, charset) - Decodes the first argument using the second argument character set.

Examples:

> SELECT decode(encode('abc', 'utf-8'), 'utf-8');
 abc

degrees

degrees(expr) - Converts radians to degrees.

Arguments:

  • expr - angle in radians

Examples:

> SELECT degrees(3.141592653589793);
 180.0

dense_rank

dense_rank() - Computes the rank of a value in a group of values. The result is one plus the
previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps
in the ranking sequence.

double

double(expr) - Casts the value expr to the target data type double.

e

e() - Returns Euler's number, e.

Examples:

> SELECT e();
 2.718281828459045

elt

elt(n, input1, input2, ...) - Returns the n-th input, e.g., returns input2 when n is 2.

Examples:

> SELECT elt(1, 'scala', 'java');
 scala

encode

encode(str, charset) - Encodes the first argument using the second argument character set.

Examples:

> SELECT encode('abc', 'utf-8');
 abc

exp

exp(expr) - Returns e to the power of expr.

Examples:

> SELECT exp(0);
 1.0

explode

explode(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

Examples:

> SELECT explode(array(10, 20));
 10
 20

explode_outer

explode_outer(expr) - Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

Examples:

> SELECT explode_outer(array(10, 20));
 10
 20

expm1

expm1(expr) - Returns exp(expr) - 1.

Examples:

> SELECT expm1(0);
 0.0

factorial

factorial(expr) - Returns the factorial of expr. expr is [0..20]. Otherwise, null.

Examples:

> SELECT factorial(5);
 120
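
The [0..20] limit is consistent with the result being stored in a signed 64-bit integer: 20! still fits, 21! does not. A minimal plain-Python sketch of the documented behaviour, with NULL modelled as `None`; the 64-bit explanation is an inference from the arithmetic, not something the text states:

```python
import math
from typing import Optional

INT64_MAX = 2**63 - 1


def factorial(n: int) -> Optional[int]:
    """Factorial of n for 0 <= n <= 20, otherwise None (standing in for NULL)."""
    if 0 <= n <= 20:
        return math.factorial(n)
    return None


if __name__ == "__main__":
    print(factorial(5))
    print(factorial(21))
    # 20! fits in a signed 64-bit integer, 21! does not.
    print(math.factorial(20) <= INT64_MAX < math.factorial(21))
```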

find_in_set

find_in_set(str, str_array) - Returns the index (1-based) of the given string (str) in the comma-delimited list (str_array).
Returns 0, if the string was not found or if the given string (str) contains a comma.

Examples:

> SELECT find_in_set('ab','abc,b,ab,c,def');
 3
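
A minimal plain-Python sketch of the rule described above (1-based index, 0 when the string is missing or contains a comma). The function name mirrors the SQL one for readability only:

```python
def find_in_set(s: str, str_array: str) -> int:
    """1-based position of s in a comma-delimited list, or 0 if absent."""
    if "," in s:
        return 0  # a search string containing a comma can never match
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0


if __name__ == "__main__":
    print(find_in_set("ab", "abc,b,ab,c,def"))
```

Note that the match is on whole list items: "ab" does not match the first item "abc".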

first

first(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows.
If isIgnoreNull is true, returns only non-null values.

first_value

first_value(expr[, isIgnoreNull]) - Returns the first value of expr for a group of rows.
If isIgnoreNull is true, returns only non-null values.

float

float(expr) - Casts the value expr to the target data type float.

floor

floor(expr) - Returns the largest integer not greater than expr.

Examples:

> SELECT floor(-0.1);
 -1
> SELECT floor(5);
 5

format_number

format_number(expr1, expr2) - Formats the number expr1 like '#,###,###.##', rounded to expr2
decimal places. If expr2 is 0, the result has no decimal point or fractional part.
This is supposed to function like MySQL's FORMAT.

Examples:

> SELECT format_number(12332.123456, 4);
 12,332.1235

format_string

format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

Examples:

> SELECT format_string("Hello World %d %s", 100, "days");
 Hello World 100 days

from_json

from_json(jsonStr, schema[, options]) - Returns a struct value with the given jsonStr and schema.

Examples:

> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
 {"a":1, "b":0.8}
> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));
 {"time":"2015-08-26 00:00:00.0"}

Since: 2.2.0

from_unixtime

from_unixtime(unix_time, format) - Returns unix_time in the specified format.

Examples:

> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
 1970-01-01 00:00:00

Since: 1.5.0

from_utc_timestamp

from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Examples:

> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
 2016-08-31 09:00:00

Since: 1.5.0

get_json_object

get_json_object(json_txt, path) - Extracts a json object from path.

Examples:

> SELECT get_json_object('{"a":"b"}', '$.a');
 b

greatest

greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.

Examples:

> SELECT greatest(10, 9, 2, 4, 3);
 10

grouping

grouping_id

hash

hash(expr1, expr2, ...) - Returns a hash value of the arguments.

Examples:

> SELECT hash('Spark', array(123), 2);
 -1321691492

hex

hex(expr) - Converts expr to hexadecimal.

Examples:

> SELECT hex(17);
 11
> SELECT hex('Spark SQL');
 537061726B2053514C

hour

hour(timestamp) - Returns the hour component of the string/timestamp.

Examples:

> SELECT hour('2009-07-30 12:58:59');
 12

Since: 1.5.0

hypot

hypot(expr1, expr2) - Returns sqrt(expr1**2 + expr2**2).

Examples:

> SELECT hypot(3, 4);
 5.0

if

if(expr1, expr2, expr3) - If expr1 evaluates to true, then returns expr2; otherwise returns expr3.

Examples:

> SELECT if(1 < 2, 'a', 'b');
 a

ifnull

ifnull(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.

Examples:

> SELECT ifnull(NULL, array('2'));
 ["2"]

in

expr1 in(expr2, expr3, ...) - Returns true if expr1 equals any of expr2, expr3, ....

Arguments:

  • expr1, expr2, expr3, ... - the arguments must be the same type.

Examples:

> SELECT 1 in(1, 2, 3);
 true
> SELECT 1 in(2, 3, 4);
 false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3));
 false
> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3));
 true

initcap

initcap(str) - Returns str with the first letter of each word in uppercase.
All other letters are in lowercase. Words are delimited by white space.

Examples:

> SELECT initcap('sPark sql');
 Spark Sql

inline

inline(expr) - Explodes an array of structs into a table.

Examples:

> SELECT inline(array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b

inline_outer

inline_outer(expr) - Explodes an array of structs into a table.

Examples:

> SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')));
 1  a
 2  b

input_file_block_length

input_file_block_length() - Returns the length of the block being read, or -1 if not available.

input_file_block_start

input_file_block_start() - Returns the start offset of the block being read, or -1 if not available.

input_file_name

input_file_name() - Returns the name of the file being read, or empty string if not available.

instr

instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.

Examples:

> SELECT instr('SparkSQL', 'SQL');
 6

int

int(expr) - Casts the value expr to the target data type int.

isnan

isnan(expr) - Returns true if expr is NaN, or false otherwise.

Examples:

> SELECT isnan(cast('NaN' as double));
 true

isnotnull

isnotnull(expr) - Returns true if expr is not null, or false otherwise.

Examples:

> SELECT isnotnull(1);
 true

isnull

isnull(expr) - Returns true if expr is null, or false otherwise.

Examples:

> SELECT isnull(1);
 false

java_method

java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.

Examples:

> SELECT java_method('java.util.UUID', 'randomUUID');
 c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
 a5cf6c42-0c85-418f-af6c-3e4e5b1328f2

json_tuple

json_tuple(jsonStr, p1, p2, ..., pn) - Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.

Examples:

> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
 1  2

kurtosis

kurtosis(expr) - Returns the kurtosis value calculated from values of a group.

lag

lag(input[, offset[, default]]) - Returns the value of input at the offset-th row
before the current row in the window. The default value of offset is 1 and the default
value of default is null. If the value of input at the offset-th row is null,
null is returned. If there is no such offset row (e.g., when the offset is 1, the first
row of the window does not have any previous row), default is returned.

last

last(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows.
If isIgnoreNull is true, returns only non-null values.

last_day

last_day(date) - Returns the last day of the month which the date belongs to.

Examples:

> SELECT last_day('2009-01-12');
 2009-01-31

Since: 1.5.0

last_value

last_value(expr[, isIgnoreNull]) - Returns the last value of expr for a group of rows.
If isIgnoreNull is true, returns only non-null values.

lcase

lcase(str) - Returns str with all characters changed to lowercase.

Examples:

> SELECT lcase('SparkSql');
 sparksql

lead

lead(input[, offset[, default]]) - Returns the value of input at the offset-th row
after the current row in the window. The default value of offset is 1 and the default
value of default is null. If the value of input at the offset-th row is null,
null is returned. If there is no such offset row (e.g., when the offset is 1, the last
row of the window does not have any subsequent row), default is returned.

least

least(expr, ...) - Returns the least value of all parameters, skipping null values.

Examples:

> SELECT least(10, 9, 2, 4, 3);
 2

left

left(str, len) - Returns the leftmost len characters from the string str (len can be string type). If len is less than or equal to 0, the result is an empty string.

Examples:

> SELECT left('Spark SQL', 3);
 Spa

length

length(expr) - Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Examples:

> SELECT length('Spark SQL ');
 10
> SELECT CHAR_LENGTH('Spark SQL ');
 10
> SELECT CHARACTER_LENGTH('Spark SQL ');
 10

levenshtein

levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.

Examples:

> SELECT levenshtein('kitten', 'sitting');
 3
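
The Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. A minimal plain-Python sketch using the standard dynamic-programming recurrence (this illustrates the definition and is not Spark's implementation):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

For 'kitten' and 'sitting' the three edits are k→s, e→i and an appended g.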

like

str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise.

Arguments:

  • str - a string expression
  • pattern - a string expression. The pattern is a string which is matched literally, with
    exception to the following special symbols:

    _ matches any one character in the input (similar to . in posix regular expressions)

    % matches zero or more characters in the input (similar to .* in posix regular
    expressions)

    The escape character is '\'. If an escape character precedes a special symbol or another
    escape character, the following character is matched literally. It is invalid to escape
    any other character.

    Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
    to match "\abc", the pattern should be "\\abc".

    When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back
    to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
    enabled, the pattern to match "\abc" should be "\abc".

Examples:

> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%'
true

Note:

Use RLIKE to match with standard regular expressions.
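
The pattern rules listed under Arguments can be made concrete by translating a LIKE pattern into a regular expression. The plain-Python sketch below does that under stated assumptions: `like_to_regex` and `like` are hypothetical helper names, the pattern is taken as it looks after SQL string-literal parsing, and NULL handling is omitted.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a LIKE pattern into an equivalent regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            # The escape character may only precede _, % or another backslash.
            if i + 1 >= len(pattern) or pattern[i + 1] not in "_%\\":
                raise ValueError("invalid escape in LIKE pattern")
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if c == "_":
            out.append(".")   # exactly one character
        elif c == "%":
            out.append(".*")  # zero or more characters
        else:
            out.append(re.escape(c))
        i += 1
    return "".join(out)


def like(s: str, pattern: str) -> bool:
    """True if the whole string s matches the LIKE pattern."""
    return re.fullmatch(like_to_regex(pattern), s, flags=re.DOTALL) is not None


if __name__ == "__main__":
    print(like(r"%SystemDrive%\Users\John", r"\%SystemDrive\%\\Users%"))
```

The match must cover the entire string, which is why `re.fullmatch` is used rather than `re.search`.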

ln

ln(expr) - Returns the natural logarithm (base e) of expr.

Examples:

> SELECT ln(1);
 0.0

locate

locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos.
The given pos and return value are 1-based.

Examples:

> SELECT locate('bar', 'foobarbar');
 4
> SELECT locate('bar', 'foobarbar', 5);
 7
> SELECT POSITION('bar' IN 'foobarbar');
 4

log

log(base, expr) - Returns the logarithm of expr with base.

Examples:

> SELECT log(10, 100);
 2.0

log10

log10(expr) - Returns the logarithm of expr with base 10.

Examples:

> SELECT log10(10);
 1.0

log1p

log1p(expr) - Returns log(1 + expr).

Examples:

> SELECT log1p(0);
 0.0

log2

log2(expr) - Returns the logarithm of expr with base 2.

Examples:

> SELECT log2(2);
 1.0

lower

lower(str) - Returns str with all characters changed to lowercase.

Examples:

> SELECT lower('SparkSql');
 sparksql

lpad

lpad(str, len, pad) - Returns str, left-padded with pad to a length of len.
If str is longer than len, the return value is shortened to len characters.

Examples:

> SELECT lpad('hi', 5, '??');
 ???hi
> SELECT lpad('hi', 1, '??');
 h
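
Both cases (padding and truncation) can be seen in a minimal plain-Python sketch. It assumes pad is non-empty and len is non-negative; behaviour outside those bounds is not modelled here.

```python
def lpad(s: str, length: int, pad: str) -> str:
    """Left-pad s with repeated pad up to length, or cut s down to length."""
    if length <= len(s):
        return s[:length]  # longer input is shortened from the right
    missing = length - len(s)
    repeats = missing // len(pad) + 1
    # Repeat the pad enough times, then keep exactly the missing characters.
    return (pad * repeats)[:missing] + s


if __name__ == "__main__":
    print(lpad("hi", 5, "??"))
    print(lpad("hi", 1, "??"))
```

When the pad string does not divide the gap evenly, as with '??' filling three positions, the last repetition is cut short.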

ltrim

ltrim(str) - Removes the leading space characters from str.

ltrim(trimStr, str) - Removes the leading characters of str that appear in trimStr.

Arguments:

  • str - a string expression
  • trimStr - the trim string characters to trim, the default value is a single space

Examples:

> SELECT ltrim('    SparkSQL   ');
 SparkSQL
> SELECT ltrim('Sp', 'SSparkSQLS');
 arkSQLS

map

map(key0, value0, key1, value1, ...) - Creates a map with the given key/value pairs.

Examples:

> SELECT map(1.0, '2', 3.0, '4');
 {1.0:"2",3.0:"4"}

map_keys

map_keys(map) - Returns an unordered array containing the keys of the map.

Examples:

> SELECT map_keys(map(1, 'a', 2, 'b'));
 [1,2]

map_values

map_values(map) - Returns an unordered array containing the values of the map.

Examples:

> SELECT map_values(map(1, 'a', 2, 'b'));
 ["a","b"]

max

max(expr) - Returns the maximum value of expr.

md5

md5(expr) - Returns an MD5 128-bit checksum as a hex string of expr.

Examples:

> SELECT md5('Spark');
 8cde774d6f7333752ed72cacddb05126

mean

mean(expr) - Returns the mean calculated from values of a group.

min

min(expr) - Returns the minimum value of expr.

minute

minute(timestamp) - Returns the minute component of the string/timestamp.

Examples:

> SELECT minute('2009-07-30 12:58:59');
 58

Since: 1.5.0

mod

expr1 mod expr2 - Returns the remainder after expr1/expr2.

Examples:

> SELECT 2 mod 1.8;
 0.2
> SELECT MOD(2, 1.8);
 0.2

monotonically_increasing_id

monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed
to be monotonically increasing and unique, but not consecutive. The current implementation
puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number
within each partition. The assumption is that the data frame has less than 1 billion
partitions, and each partition has less than 8 billion records.
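
The bit layout described above can be illustrated with plain integer arithmetic. This is a sketch of the described layout only; `make_id` and `split_id` are hypothetical helper names, not Spark functions.

```python
RECORD_BITS = 33
RECORD_MASK = (1 << RECORD_BITS) - 1


def make_id(partition_id: int, record_number: int) -> int:
    """Pack a partition ID (upper bits) and a record number (lower 33 bits)."""
    return (partition_id << RECORD_BITS) | record_number


def split_id(value: int):
    """Recover (partition_id, record_number) from a packed ID."""
    return value >> RECORD_BITS, value & RECORD_MASK


if __name__ == "__main__":
    print(make_id(0, 5))
    print(make_id(1, 0))
    print(split_id(make_id(3, 7)))
```

Because the partition ID occupies the high bits, every ID in partition 1 is larger than every ID in partition 0, which is why the values are increasing and unique but not consecutive. The limits quoted above are round-downs of 2^31 (about 2.1 billion) and 2^33 (about 8.6 billion).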

month

month(date) - Returns the month component of the date/timestamp.

Examples:

> SELECT month('2016-07-30');
 7

Since: 1.5.0

months_between

months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.

Examples:

> SELECT months_between('1997-02-28 10:30:00', '1996-10-30');
 3.94959677

Since: 1.5.0

named_struct

named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.

Examples:

> SELECT named_struct("a", 1, "b", 2, "c", 3);
 {"a":1,"b":2,"c":3}

nanvl

nanvl(expr1, expr2) - Returns expr1 if it's not NaN, or expr2 otherwise.

Examples:

> SELECT nanvl(cast('NaN' as double), 123);
 123.0

negative

negative(expr) - Returns the negated value of expr.

Examples:

> SELECT negative(1);
 -1

next_day

next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.

Examples:

> SELECT next_day('2015-01-14', 'TU');
 2015-01-20

Since: 1.5.0
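
"Later than start_date" is strict: if start_date already falls on the requested weekday, the result is a full week later. A minimal plain-Python sketch of that rule using the standard library; the `DAYS` table of two-letter abbreviations is an assumption made for this illustration, and other spellings are not handled.

```python
from datetime import date, timedelta

# Assumed two-letter abbreviations, mapped to Python's Monday=0 numbering.
DAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


def next_day(start: date, day_of_week: str) -> date:
    """First date strictly after start that falls on day_of_week."""
    target = DAYS[day_of_week.upper()]
    # Days to add: always between 1 and 7, never 0.
    delta = (target - start.weekday() - 1) % 7 + 1
    return start + timedelta(days=delta)


if __name__ == "__main__":
    print(next_day(date(2015, 1, 14), "TU"))
```

2015-01-14 is a Wednesday, so the next Tuesday is six days later, matching the example above.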

not

not expr - Logical not.

now

now() - Returns the current timestamp at the start of query evaluation.

Since: 1.5.0

ntile

ntile(n) - Divides the rows for each window partition into n buckets ranging
from 1 to at most n.

nullif

nullif(expr1, expr2) - Returns null if expr1 equals to expr2, or expr1 otherwise.

Examples:

> SELECT nullif(2, 2);
 NULL

nvl

nvl(expr1, expr2) - Returns expr2 if expr1 is null, or expr1 otherwise.

Examples:

> SELECT nvl(NULL, array('2'));
 ["2"]

nvl2

nvl2(expr1, expr2, expr3) - Returns expr2 if expr1 is not null, or expr3 otherwise.

Examples:

> SELECT nvl2(NULL, 2, 1);
 1

octet_length

octet_length(expr) - Returns the byte length of string data or number of bytes of binary data.

Examples:

> SELECT octet_length('Spark SQL');
 9

or

expr1 or expr2 - Logical OR.

parse_url

parse_url(url, partToExtract[, key]) - Extracts a part from a URL.

Examples:

> SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST')
 spark.apache.org
> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY')
 query=1
> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query')
 1

percent_rank

percent_rank() - Computes the percentage ranking of a value in a group of values.

percentile

percentile(col, percentage [, frequency]) - Returns the exact percentile value of numeric column
col at the given percentage. The value of percentage must be between 0.0 and 1.0. The
value of frequency should be a positive integer.

percentile(col, array(percentage1 [, percentage2]...) [, frequency]) - Returns the exact
percentile value array of numeric column col at the given percentage(s). Each value
of the percentage array must be between 0.0 and 1.0. The value of frequency should be
a positive integer.

percentile_approx

percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
column col at the given percentage. The value of percentage must be between 0.0
and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal which
controls approximation accuracy at the cost of memory. Higher value of accuracy yields
better accuracy, 1.0/accuracy is the relative error of the approximation.
When percentage is an array, each value of the percentage array must be between 0.0 and 1.0.
In this case, returns the approximate percentile array of column col at the given
percentage array.

Examples:

> SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100);
 [10.0,10.0,10.0]
> SELECT percentile_approx(10.0, 0.5, 100);
 10.0

pi

pi() - Returns pi.

Examples:

> SELECT pi();
 3.141592653589793

pmod

pmod(expr1, expr2) - Returns the positive value of expr1 mod expr2.

Examples:

> SELECT pmod(10, 3);
 1
> SELECT pmod(-10, 3);
 2
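
pmod differs from a plain remainder only for negative results. The plain-Python sketch below assumes that the ordinary `%` operator keeps the sign of the dividend, as Java's does, so `-10 % 3` would be -1; that assumption is modelled by the hypothetical helper `java_rem`, because Python's own `%` behaves differently. Division by zero is not handled.

```python
def java_rem(a: int, b: int) -> int:
    """Remainder of truncated division: the sign follows the dividend a."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q


def pmod(a: int, b: int) -> int:
    """Positive modulus: shift a negative remainder up by b."""
    r = java_rem(a, b)
    return java_rem(r + b, b) if r < 0 else r


if __name__ == "__main__":
    print(java_rem(-10, 3), pmod(-10, 3))
    print(java_rem(10, 3), pmod(10, 3))
```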

posexplode

posexplode(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

Examples:

> SELECT posexplode(array(10,20));
 0  10
 1  20

posexplode_outer

posexplode_outer(expr) - Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

Examples:

> SELECT posexplode_outer(array(10,20));
 0  10
 1  20

position

position(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos.
The given pos and return value are 1-based.

Examples:

> SELECT position('bar', 'foobarbar');
 4
> SELECT position('bar', 'foobarbar', 5);
 7
> SELECT POSITION('bar' IN 'foobarbar');
 4

positive

positive(expr) - Returns the value of expr.

pow

pow(expr1, expr2) - Raises expr1 to the power of expr2.

Examples:

> SELECT pow(2, 3);
 8.0

power

power(expr1, expr2) - Raises expr1 to the power of expr2.

Examples:

> SELECT power(2, 3);
 8.0

printf

printf(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

Examples:

> SELECT printf("Hello World %d %s", 100, "days");
 Hello World 100 days

quarter

quarter(date) - Returns the quarter of the year for date, in the range 1 to 4.

Examples:

> SELECT quarter('2016-08-31');
 3

Since: 1.5.0

radians

radians(expr) - Converts degrees to radians.

Arguments:

  • expr - angle in degrees

Examples: