Hive: Common Functions

Built-in Operators

Complex Type Constructors

map    (key1, value1, key2, value2, ...)    Creates a map with the given key/value pairs.
named_struct    (name1, val1, name2, val2, ...)    Creates a struct with the given field names and values.
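How do map and named_struct differ, and which one should you choose?
Differences:
Both map and struct hold key-value pairs. In a map, the number of entries and the key names can vary from row to row, although all keys share one type and all values share one type. In a struct, the set of field names and the type of each field are fixed by the column definition.
Choosing:
If every row of a column has the same fields with the same types, use a struct. For example, a column holding a user's profile with user_id, age, and gender has type struct<user_id:string, age:int, gender:string>, which acts like a template or class. A value such as named_struct("user_id", "001", "age", 18, "gender", "M") is an instance of that class, and the user_id field is read with .user_id.
If the set of keys differs from row to row, use a map. For example, suppose all that is known is that every key is a string and every value is a string. One row is {"name":"jack","age":"18"} and another is {"name":"mart","age":"28","gender":"M"}; such a column has to be a map and cannot be a struct.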
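The two constructors can be compared side by side with a small sketch. It assumes a Hive version that allows SELECT without a FROM clause (0.13 or later); the aliases user_info, attrs, and t are hypothetical, and the map values are written as strings because all values in one map must share a type.

```sql
-- struct: fixed field names, each field keeps its own type
SELECT t.user_info.user_id AS user_id
FROM (
  SELECT named_struct('user_id', '001', 'age', 18, 'gender', 'M') AS user_info
) t;

-- map: the set of keys may differ per row; look a key up with [key]
SELECT t.attrs['gender'] AS gender
FROM (
  SELECT map('name', 'mart', 'age', '28', 'gender', 'M') AS attrs
) t;
```

A lookup of a key that is absent from a given map returns NULL, which is why a map suits columns whose keys are not known in advance.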
 

Built-in UDFs

Mathematical Functions

...

Collection Functions

int size(Map<K, V>)  Returns the number of elements in the map.
int size(Array<T>)  Returns the number of elements in the array. select size(null); -- returns -1

boolean    array_contains(Array<T>, value)    Returns TRUE if the array contains the value.

array<t> sort_array(Array<T>) Sorts the input array in ascending order according to the natural ordering of the array elements and returns it.
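A short sketch of the collection functions listed above, using literal arrays and maps so that no table is needed (this assumes SELECT without FROM is available):

```sql
SELECT size(map('a', 1, 'b', 2));          -- 2
SELECT size(array(3, 1, 2));               -- 3
SELECT array_contains(array(3, 1, 2), 2);  -- true
SELECT sort_array(array(3, 1, 2));         -- [1,2,3]
```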

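Type Conversion Functions

-- cast(coalesce(id,-1) as BIGINT) as id
-- decode(gender, 'M', 1, 'F', 0, -1) as gender  (Oracle-style decode; Hive's built-in decode(binary, charset) only converts character sets, so in Hive this can be written as: case gender when 'M' then 1 when 'F' then 0 else -1 end)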
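The two snippets above can be tried on a one-row inline table. This is a sketch: the alias t and its sample values are hypothetical, and the gender mapping is written as a CASE expression, which works regardless of whether an Oracle-style decode is available in the engine.

```sql
SELECT
  cast(coalesce(id, -1) AS BIGINT) AS id,                          -- NULL id becomes -1
  CASE gender WHEN 'M' THEN 1 WHEN 'F' THEN 0 ELSE -1 END AS gender -- 'M' becomes 1
FROM (
  SELECT cast(NULL AS BIGINT) AS id, 'M' AS gender
) t;
```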

Date Functions

Time conversion

Normalize a timestamp to seconds (a value greater than 1000000000000 is treated as milliseconds):
timestamp = timestamp > 1000000000000L ? timestamp / 1000 : timestamp;

from_unixtime(bigint unixtime[, string format])    Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of "uuuu-MM-dd HH:mm:ss".
from_unixtime(cast(1644675749144/1000 as bigint))
FROM_UNIXTIME(CAST(SUBSTR(str_create_time,1,10) AS BIGINT))
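The millisecond check and the conversion can also be combined in a single HiveQL expression. The sketch below assumes a BIGINT column named ts (a hypothetical name) and uses integer division with DIV so that the result stays a BIGINT; the output string depends on the session time zone, so no result is shown.

```sql
SELECT
  from_unixtime(if(ts > 1000000000000, ts DIV 1000, ts))               AS ts_string,
  from_unixtime(if(ts > 1000000000000, ts DIV 1000, ts), 'yyyy-MM-dd') AS ts_date
FROM (
  SELECT 1644675749144 AS ts  -- sample value in milliseconds
) t;
```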

bigint  unix_timestamp()    Gets current Unix timestamp in seconds. 
bigint  unix_timestamp(string date) Converts time string in format uuuu-MM-dd HH:mm:ss to Unix timestamp (in seconds)
bigint  unix_timestamp(string date, string pattern) Convert time string with given pattern
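A brief sketch of the three unix_timestamp forms. The returned numbers depend on the session time zone, so they are not shown; the last statement chains unix_timestamp with from_unixtime to reformat a yyyyMMdd string as yyyy-MM-dd.

```sql
SELECT unix_timestamp();                             -- current time, in seconds
SELECT unix_timestamp('2009-07-30 12:58:59');        -- default pattern
SELECT unix_timestamp('2009-07-30', 'yyyy-MM-dd');   -- explicit pattern
SELECT from_unixtime(unix_timestamp('20090730', 'yyyyMMdd'), 'yyyy-MM-dd');
```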

Extracting the date / year / quarter / month / day / hour / minute / second / week number from a time

date    to_date(string timestamp)    Returns the date part of a timestamp string (returns a string before Hive 2.1.0 and a date from 2.1.0 on): to_date("1970-01-01 00:00:00") = "1970-01-01".
int    year(string date)    Returns the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970.
int    quarter(date/timestamp/string)    Returns the quarter of the year for a date, timestamp, or string in the range 1 to 4 . Example: quarter('2015-04-08') = 2.
int    month(string date)    Returns the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11.
int    day(string date) dayofmonth(date)    Returns the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1.
int    hour(string date)    Returns the hour of the timestamp: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12.
-- Possible alternative for matching the current hour: substr('yyyy-mm-dd hh:mi:ss' (replace with an actual timestamp string), 12, 2) = '${hh24}'
int    minute(string date)    Returns the minute of the timestamp.
int    second(string date)    Returns the second of the timestamp.
int    weeko
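The extraction functions above can be checked together on one sample timestamp string. In this sketch the alias ts and the sample value are hypothetical, and quarter() is assumed to be available (it was added in Hive 1.3.0).

```sql
SELECT
  to_date(ts)       AS d,       -- 2015-04-08
  year(ts)          AS y,       -- 2015
  quarter(ts)       AS q,       -- 2
  month(ts)         AS m,       -- 4
  day(ts)           AS dd,      -- 8
  hour(ts)          AS hh,      -- 12
  minute(ts)        AS mi,      -- 58
  second(ts)        AS ss,      -- 59
  substr(ts, 12, 2) AS hh_str   -- '12' (string form of the hour)
FROM (
  SELECT '2015-04-08 12:58:59' AS ts
) t;
```

Note that hour() returns an int while the substr form returns a two-character string, so a comparison against a zero-padded parameter such as '${hh24}' fits the substr form.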


Reposted from blog.csdn.net/pipisorry/article/details/129172093