Built-in Operators
Complex Type Constructors
map (key1, value1, key2, value2, ...) Creates a map with the given key/value pairs.
named_struct (name1, val1, name2, val2, ...) Creates a struct with the given field names and values.
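How do map and named_struct differ, and how do you choose between them?
Difference:
Both map and struct store key-value pairs. A map can hold any number of entries per row and its keys are not fixed in advance (although all keys share one type and all values share one type). A struct has a fixed set of named fields, each with its own type.
Choosing:
If every row of a column has the same number and types of key-value pairs, use a struct. For example, a column that stores a user's profile with user_id, age, and gender. The field list (user_id, age, gender) acts like a template, or a class. A value such as named_struct('user_id', '001', 'age', 18, 'gender', 'M') is an object of that class; to read its user_id attribute, use .user_id.
If the key-value pairs differ from row to row, use a map. For example, suppose only the key and value types of the column are known in advance. One row holds {"name":"jack","age":18} and another holds {"name":"mart","age":28,"gender":"M"}; such a column can only be a map, not a struct. Note that all values in one map share a single declared type, so mixed values like these would have to be stored under a common type such as string.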
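A minimal HiveQL sketch of the two constructors and their access syntax. The literal values and the aliases user_info, scores, and t are hypothetical and chosen only for illustration; the statement assumes a Hive version that allows SELECT without a FROM clause.

```sql
-- Build one struct and one map from literals, then read a field and a key.
SELECT
  t.user_info.user_id AS user_id,      -- struct field access uses dot notation
  t.scores['math']    AS math_score,   -- map lookup uses bracket notation
  size(t.scores)      AS score_count   -- a map may hold any number of entries
FROM (
  SELECT
    named_struct('user_id', '001', 'age', 18, 'gender', 'M') AS user_info,
    map('math', 90, 'english', 85)                           AS scores
) t;
```

The struct mixes a string field with an int field, while the map keeps every value as an int; that is the practical difference between the two types.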
Built-in Functions (UDFs)
Mathematical Functions
...
Collection Functions
int size(Map<K,V>) Returns the number of elements in the map.
int size(Array<T>) Returns the number of elements in the array. select size(null); -- returns -1
boolean array_contains(Array<T>, value) Returns TRUE if the array contains value.
array<T> sort_array(Array<T>) Sorts the input array in ascending order according to the natural ordering of the array elements and returns it.
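The collection functions above can be tried on literal values. This is a small sketch with made-up inputs, assuming a Hive version that allows SELECT without a FROM clause.

```sql
-- Exercise each collection function on small literal inputs.
SELECT
  size(map('a', 1, 'b', 2))          AS map_size,    -- number of map entries
  size(array(3, 1, 2))               AS array_size,  -- number of array elements
  array_contains(array(3, 1, 2), 2)  AS has_two,     -- membership test
  sort_array(array(3, 1, 2))         AS sorted;      -- ascending natural order
```

With these inputs the four columns should come back as 2, 3, true, and [1,2,3].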
Type Conversion Functions
-- cast(coalesce(id,-1) as BIGINT) as id
-- decode(gender, 'M', 1, 'F', 0, -1) as gender
Date Functions
Time Conversion
Normalize a timestamp to seconds (a value above 1000000000000 is treated as milliseconds):
timestamp = timestamp > 1000000000000L ? timestamp / 1000 : timestamp;
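The same normalization can be sketched in HiveQL. The column name raw_ts and the sample value are hypothetical. DIV is Hive's integer-division operator, used here so that the result stays an integer rather than becoming a double.

```sql
-- Treat values above 1000000000000 as milliseconds and convert them to seconds.
SELECT
  t.raw_ts,
  CASE
    WHEN t.raw_ts > 1000000000000 THEN t.raw_ts DIV 1000
    ELSE t.raw_ts
  END AS ts_seconds
FROM (SELECT 1644675749144 AS raw_ts) t;
```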
from_unixtime(bigint unixtime[, string format]) Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of "uuuu-MM-dd HH:mm:ss".
from_unixtime(cast(1644675749144/1000 as bigint))
FROM_UNIXTIME(SUBSTR(str_create_time,1,10))
bigint unix_timestamp() Gets current Unix timestamp in seconds.
bigint unix_timestamp(string date) Converts time string in format uuuu-MM-dd HH:mm:ss to Unix timestamp (in seconds)
bigint unix_timestamp(string date, string pattern) Converts a time string in the given pattern to Unix timestamp (in seconds).
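A short sketch of converting in both directions with the functions listed above. The input values are illustrative only. The epoch value 1644675749 corresponds to 2022-02-12 14:22:29 UTC, but the strings Hive returns depend on the session time zone, so no output is shown.

```sql
-- Epoch seconds to string, and string back to epoch seconds.
SELECT
  from_unixtime(cast(1644675749144 / 1000 AS bigint))               AS ts_default_format,
  from_unixtime(cast(1644675749144 / 1000 AS bigint), 'yyyy-MM-dd') AS ts_date_only,
  unix_timestamp('2022-02-12 14:22:29')                             AS epoch_from_default,
  unix_timestamp('2022-02-12', 'yyyy-MM-dd')                        AS epoch_from_pattern;
```

The cast is needed because / yields a double in Hive, while from_unixtime expects a bigint.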
Extracting the date / year / quarter / month / day / hour / minute / second / week number from a time value
date to_date(string timestamp) Returns the date part of a timestamp string (pre-Hive 2.1.0): to_date("1970-01-01 00:00:00") = "1970-01-01".
int year(string date) Returns the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970.
int quarter(date/timestamp/string) Returns the quarter of the year for a date, timestamp, or string in the range 1 to 4 . Example: quarter('2015-04-08') = 2.
int month(string date) Returns the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11.
int day(string date) dayofmonth(date) Returns the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1.
int hour(string date) Returns the hour of the timestamp: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12.
-- Match rows in the current hour (alternative approach?): substr('yyyy-mm-dd hh:mi:ss' (replace with an actual datetime string), 12, 2) = '${hh24}'
int minute(string date) Returns the minute of the timestamp.
int second(string date) Returns the second of the timestamp.
int weeko