Custom UDF Functions in Hive

1. Defining the function

Define your own UDF by following the example in the official documentation: https://cwiki.apache.org/confluence/display/Hive/HivePlugins

package com.test.hive;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

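// Extending UDF lets Hive locate evaluate() by reflection.
public class HelloWorld extends UDF {
    public Text evaluate(Text input) {
        // A NULL column value arrives as null; pass it through unchanged.
        if (input == null) { return null; }
        return new Text("HelloWorld:" + input);
    }
}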
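
The null handling in evaluate() can be checked without Hadoop on the classpath. The following is only a sketch under that assumption: the class name HelloWorldSketch is hypothetical, and String stands in for org.apache.hadoop.io.Text, so it mirrors the logic of the UDF rather than the UDF itself.

```java
// Dependency-free sketch: String stands in for org.apache.hadoop.io.Text.
// HelloWorldSketch is a hypothetical name used only for this illustration.
class HelloWorldSketch {
    // Mirrors the UDF's evaluate(): null in, null out; otherwise prefix the value.
    static String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return "HelloWorld:" + input;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("hive")); // prints "HelloWorld:hive"
        System.out.println(evaluate(null));   // prints "null"
    }
}
```

Returning null for a null input keeps the function consistent with how most SQL functions treat NULL values.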

2. Temporary function

Package the function into a jar, copy it to the Linux machine, start hive, and run

add jar <path to the jar>;

to add the jar to Hive.
Then run

create temporary function helloWorld as 'com.test.hive.HelloWorld';

to create the temporary function. It can only be used in the session that created it; other sessions cannot use it.

3. Permanent function

Upload the jar to HDFS, then create the function from it:

CREATE FUNCTION say_hello2 AS 'com.ruozedata.udf.HelloUDF' USING JAR 'hdfs://hadoop000:8020/lib/hive-1.0.jar';

Hive's show functions command does not list the custom function, but it can be found in the metadata (the metastore).

Creating a permanent function by modifying the source, so that show functions lists it

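FunctionRegistry.java holds all of the built-in functions. You can place the custom function in the Hive source tree, add one line to this file that registers it, then rebuild, upload, and redeploy Hive. After that, show functions lists the custom function as well.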
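
Why a single added line is enough can be illustrated with a toy model. This is only a simplified stand-in and not Hive's actual FunctionRegistry: the class ToyFunctionRegistry and its methods register and showFunctions are hypothetical names. It assumes the built-ins are registered once when the class loads, so anything registered in the same place is listed alongside them.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

// Toy stand-in for a built-in function registry. NOT Hive's FunctionRegistry;
// all names here are hypothetical and exist only to illustrate the idea.
class ToyFunctionRegistry {
    // Sorted map, so the listing comes back in alphabetical order.
    private static final Map<String, Function<String, String>> BUILTINS = new TreeMap<>();

    static {
        // An existing "built-in" registered when the class is loaded.
        register("upper", s -> s.toUpperCase());
        // The one added line: the custom function joins the built-ins.
        register("helloworld", s -> "HelloWorld:" + s);
    }

    // Stores the function under a lower-cased name.
    static void register(String name, Function<String, String> fn) {
        BUILTINS.put(name.toLowerCase(), fn);
    }

    // Plays the role of "show functions": returns every registered name.
    static Set<String> showFunctions() {
        return BUILTINS.keySet();
    }

    public static void main(String[] args) {
        System.out.println(showFunctions()); // prints "[helloworld, upper]"
    }
}
```

The trade-off is the one the text describes: the function becomes visible everywhere, but every change to it requires rebuilding and redeploying Hive.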
