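Using Thrift to Operate HBase from Python
Thrift supports clients in many languages, but I could not find a command-line (CLI) client for it on Linux. If the server has a Python environment, you can connect with Python for quick testing.
Confirm that the HBase and Thrift services are installed and running
Note: I am using the HBase service from the CDH distribution here. If you installed HBase standalone, see the appendix at the end of this article.
For installing and starting HBase and Thrift, see:
Hadoop basics: Hadoop in practice (7): Hadoop management tools: installing Hadoop with Cloudera Manager: offline installation of Cloudera Manager and CDH 5.8
Hadoop components: column-oriented open-source databases (3): introduction to and installation of HBase's Thrift interface
Run the following command as root (under a personal account you may not see programs started by root):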
jps
The output is:
root@master:/# jps
3332 Jps
3254 ThriftServer
2685 HMaster
HMaster in the list means the HBase service is running; ThriftServer means the Thrift service is running.
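The same check can be scripted. The sketch below is a hypothetical helper (the name missing_services is my own, not part of HBase or the JDK) that scans jps output text for the two process names shown above; it assumes each relevant line has the form "<pid> <name>".

```python
def missing_services(jps_output, required=("HMaster", "ThriftServer")):
    """Return the required process names that do not appear in jps output."""
    # jps prints one "<pid> <name>" pair per line; other lines are ignored
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return [name for name in required if name not in running]


sample = """3332 Jps
3254 ThriftServer
2685 HMaster"""
print(missing_services(sample))  # [] because both services are listed
```

An empty list means both services were found; any name in the list still needs to be started.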
Connecting to HBase with Python 2
Check the environment
Check the Python version and whether pip is installed
[zzq@host252 ~]$ python --version
Python 2.7.11
pip
[zzq@host252 ~]$ pip
Usage:
pip <command> [options]
Commands:
install Install packages.
download Download packages.
uninstall Uninstall packages.
freeze Output installed packages in requirements format.
list List installed packages.
show Show information about installed packages.
check Verify installed packages have compatible dependencies.
config Manage local and global configuration.
search Search PyPI for packages.
wheel Build wheels from your requirements.
hash Compute hashes of package archives.
completion A helper command used for command completion.
help Show help for commands.
General Options:
-h, --help Show help.
--isolated Run pip in an isolated mode, ignoring environment variables and user configuration.
-v, --verbose Give more output. Option is additive, and can be used up to 3 times.
-V, --version Show version and exit.
-q, --quiet Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
--log <path> Path to a verbose appending log.
--proxy <proxy> Specify a proxy in the form [user:passwd@]proxy.server:port.
--retries <retries> Maximum number of retries each connection should attempt (default 5 times).
--timeout <sec> Set the socket timeout (default 15 seconds).
--exists-action <action> Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort).
--trusted-host <hostname> Mark this host as trusted, even though it does not have valid or any HTTPS.
--cert <path> Path to alternate CA bundle.
--client-cert <path> Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
--cache-dir <dir> Store the cache data in <dir>.
--no-cache-dir Disable the cache.
--disable-pip-version-check
Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
--no-color Suppress colored output
[zzq@host252 ~]$
Possible problem: bash: pip: command not found
Solution: link the pip from your Python installation into a system-wide path.
First find where pip is installed:
find / -name pip
Then create a symbolic link:
ln -sv /usr/local/python/bin/pip /usr/bin/pip
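Create a virtual environment
To avoid affecting the system Python environment, it is best to run inside a new virtual environment (you can also skip this and work directly in the system Python).
The virtualenv script only runs on Python 2.7 and later.
Install it with: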
pip install virtualenv
or
pip2 install virtualenv -i https://pypi.douban.com/simple
After installation, verify it by running:
[zzq@host252 ~]$ virtualenv
You must provide a DEST_DIR
Usage: virtualenv [OPTIONS] DEST_DIR
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-v, --verbose Increase verbosity.
-q, --quiet Decrease verbosity.
-p PYTHON_EXE, --python=PYTHON_EXE
The Python interpreter to use, e.g.,
--python=python3.5 will use the python3.5 interpreter
to create the new environment. The default is the
interpreter that virtualenv was installed with
(/usr/bin/python3.6)
--clear Clear out the non-root install and start from scratch.
--no-site-packages DEPRECATED. Retained only for backward compatibility.
Not having access to global site-packages is now the
default behavior.
--system-site-packages
Give the virtual environment access to the global
site-packages.
--always-copy Always copy files rather than symlinking.
--relocatable Make an EXISTING virtualenv environment relocatable.
This fixes up scripts and makes all .pth files
relative.
--no-setuptools Do not install setuptools in the new virtualenv.
--no-pip Do not install pip in the new virtualenv.
--no-wheel Do not install wheel in the new virtualenv.
--extra-search-dir=DIR
Directory to look for setuptools/pip distributions in.
This option can be used multiple times.
--download Download preinstalled packages from PyPI.
--no-download, --never-download
Do not download preinstalled packages from PyPI.
--prompt=PROMPT Provides an alternative prompt prefix for this
environment.
--setuptools DEPRECATED. Retained only for backward compatibility.
This option has no effect.
--distribute DEPRECATED. Retained only for backward compatibility.
This option has no effect.
--unzip-setuptools DEPRECATED. Retained only for backward compatibility.
This option has no effect.
Create a directory for the virtual environment:
mkdir my-python2hbase-env
cd my-python2hbase-env
Create the environment:
virtualenv project-env
Check the current directory:
pwd
The output is:
/home/zzq/my-python2hbase-env
Activate the virtual environment:
source /home/zzq/my-python2hbase-env/project-env/bin/activate
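To confirm from inside Python that the environment is really active, a small check like the following can help. This is a sketch, not part of the original article; the helper name in_virtualenv is my own, and it relies only on the standard sys attributes that virtualenv and venv are known to set.

```python
import sys


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(in_virtualenv())
```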
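Install the dependencies
Two packages are needed: Thrift and hbase-thrift.
There are several Python packages for connecting to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)
We use HBase-Thrift here.
Install the Thrift package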
pip install thrift
A successful installation prints:
(project-env) [root@host3 my-python2hbase-env]# pip install thrift
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting thrift
Downloading https://files.pythonhosted.org/packages/c6/b4/510617906f8e0c5660e7d96fbc5585113f83ad547a3989b80297ac72a74c/thrift-0.11.0.tar.gz (52kB)
|████████████████████████████████| 61kB 46kB/s
Collecting six>=1.7.2
Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Building wheels for collected packages: thrift
Building wheel for thrift (setup.py) ... done
Created wheel for thrift: filename=thrift-0.11.0-cp27-cp27mu-linux_x86_64.whl size=264173 sha256=8392860fa66ddd575b004c4d1ef13f1a462c01a779ddfa1929db42bcebe26a34
Stored in directory: /root/.cache/pip/wheels/be/36/81/0f93ba89a1cb7887c91937948519840a72c0ffdd57cac0ae8f
Successfully built thrift
Installing collected packages: six, thrift
Successfully installed six-1.12.0 thrift-0.11.0
(project-env) [root@host3 my-python2hbase-env]#
Install the hbase-thrift package
pip install hbase-thrift
A successful installation prints:
(project-env) [root@host3 my-python2hbase-env]# pip install hbase-thrift
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting hbase-thrift
Downloading https://files.pythonhosted.org/packages/89/f7/dbb6c764bb909ed361c255828701228d8c9867d541cfef84127e6f3704cc/hbase-thrift-0.20.4.tar.gz
Requirement already satisfied: Thrift in ./project-env/lib/python2.7/site-packages (from hbase-thrift) (0.11.0)
Requirement already satisfied: six>=1.7.2 in ./project-env/lib/python2.7/site-packages (from Thrift->hbase-thrift) (1.12.0)
Building wheels for collected packages: hbase-thrift
Building wheel for hbase-thrift (setup.py) ... done
Created wheel for hbase-thrift: filename=hbase_thrift-0.20.4-cp27-none-any.whl size=19705 sha256=c3334f4d28c385ec7b29fda6db64c128c76e08e4bc2cfe9e1d20ff8dbd813629
Stored in directory: /root/.cache/pip/wheels/fe/51/f2/afb7b010cd97910aa0b651d492735a38ed69a93a817444904e
Successfully built hbase-thrift
Installing collected packages: hbase-thrift
Successfully installed hbase-thrift-0.20.4
You have mail in /var/spool/mail/root
(project-env) [root@host3 my-python2hbase-env]#
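Python code for connecting via thrift (thrift1)
HBase currently has two Thrift interfaces (call them thrift and thrift2), and they are not compatible with each other.
Let's look at the code for connecting via thrift first.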
vi query.py
Note: replace localhost and port 9090 (the default Thrift port) with the values for your own environment.
Enter the following:
# Python 2 client for the thrift1 interface
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *
# Buffered socket transport to the Thrift server
transport = TSocket.TSocket('localhost', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
# List all table names, then release the connection
print client.getTableNames()
transport.close()
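Before debugging Thrift-level errors, it can help to confirm that the Thrift port is reachable at all. The following is a minimal sketch using only the standard socket module; the helper name port_is_open is my own, and the host and port simply mirror the values used in query.py.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# 'localhost' and 9090 mirror the values used in query.py
print(port_is_open("localhost", 9090))
```

A False result points at the network or at a Thrift server that is not running, rather than at the client code.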
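Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'
Cause
The Thrift interface used by the client does not match the one served by the HBase Thrift server.
The Thrift server was started as thrift2, while the client is calling the thrift (thrift1) interface.
Solution
Since the root cause is the interface mismatch between client and server, there are two fixes: start the server with the thrift1 interface (shown here), or switch the client to thrift2 (covered in the thrift2 section).
1. Start the thrift1 version of the Thrift server on the server side
Stop the thrift2 server, then start the HBase Thrift server in thrift1 mode.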
hbase-daemon.sh stop thrift2
# start command
hbase-daemon.sh start thrift
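If you want to use the handy happybase module to connect to HBase, you must use thrift (thrift1), because happybase does not yet support thrift2.
Python code for connecting via thrift2
Connecting via thrift2 from Python takes a little more work.
Generate the client bindings (mind the thrift vs. thrift2 version)
The Thrift compiler must be installed in order to generate HBase's cross-language API.
The Thrift IDL file used to generate the bindings is located as follows.
For a native HBase installation the path is: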
$HBASE_HOME/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
For a CDH installation of HBase the path is:
/opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift
If you really cannot find it, search the whole filesystem:
sudo find / -name "Hbase.thrift"
Generate the Python bindings with:
thrift --gen py /opt/cloudera/parcels/CDH/lib/hue/apps/hbase/thrift/Hbase.thrift
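If you get the error -bash: thrift: command not found
you need to install thrift; see "Hadoop components: column-oriented open-source databases (3): introduction to and installation of HBase's Thrift interface".
The command creates a gen-py folder in the current directory.
Because CDH's HBase only ships the thrift1 IDL file, we need to find the thrift2 IDL file elsewhere.
The HDP distribution of HBase ships both versions; the path and commands may look like this:
# path of the HDP hbase.thrift files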
cd /usr/hdp/3.0.0.0-1634/hbase/include/thrift/
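# generate Python
# both thrift1 and thrift2 files exist in this directory; pick one
thrift -gen py hbase1.thrift
# or
thrift -gen py hbase2.thrift
If you are not using the HDP distribution, go to the HBase source project on GitHub, which contains the IDL files for both thrift1 and thrift2: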
thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
or
thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift
You can even download pre-generated files and use them directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python
Here we simply download the pre-generated thrift2 files:
Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j
The thrift1 version produces this set of Python files:
[zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
total 440
-rw-rw-r--. 1 zzq zzq 326 Nov 28 19:38 constants.py
-rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
-rwxr-xr-x. 1 zzq zzq 14386 Nov 28 19:38 Hbase-remote
-rw-rw-r--. 1 zzq zzq 43 Nov 28 19:38 __init__.py
-rw-rw-r--. 1 zzq zzq 38776 Nov 28 19:38 ttypes.py
[zzq@host252 thrift-0.10.0]$
The thrift2 version produces the following files:
$ ls gen-py
gen-py/hbase/__init__.py
gen-py/hbase/constants.py
gen-py/hbase/THBaseService.py
gen-py/hbase/ttypes.py
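Because the two sets of generated files look similar, it is easy to lose track of which interface a gen-py directory belongs to. The sketch below guesses it from the file names in the two listings above. The helper name detect_thrift_interface is hypothetical, and the default package name hbase is an assumption taken from the thrift2 listing.

```python
import os


def detect_thrift_interface(gen_py_dir, package="hbase"):
    """Guess which HBase Thrift interface a gen-py directory was generated from."""
    pkg_dir = os.path.join(gen_py_dir, package)
    # thrift2 bindings contain THBaseService.py; thrift1 bindings contain Hbase.py
    if os.path.isfile(os.path.join(pkg_dir, "THBaseService.py")):
        return "thrift2"
    if os.path.isfile(os.path.join(pkg_dir, "Hbase.py")):
        return "thrift1"
    return None


print(detect_thrift_interface("gen-py"))
```

The function returns None when neither file is present, for example when the path or package name is wrong.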
Because thrift2 has no getTableNames() method, we first need to create a test table by hand.
hbase shell
hbase(main):001:0> create "example", NAME => "family"
0 row(s) in 1.6480 seconds
=> Hbase::Table - example
hbase(main):002:0>
Suppose our gen-py path is:
/home/zzq/thrift2/gen-py
Then create the test script test.py:
vim test.py
Enter the following:
import sys
import os
import time
from thrift.transport import TTransport
from thrift.transport import TSocket
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
# Add path for local "gen-py/hbase" for the pre-generated module
sys.path.append("/home/zzq/thrift2/gen-py")
from hbase import THBaseService
from hbase.ttypes import *
print "Thrift2 Demo"
print "This demo assumes you have a table called \"example\" with a column family called \"family\""
host = "192.168.30.250"
port = 9090
framed = False
socket = TSocket.TSocket(host, port)
if framed:
    transport = TTransport.TFramedTransport(socket)
else:
    transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = THBaseService.Client(protocol)
transport.open()
table = "example"
put = TPut(row="row1", columnValues=[TColumnValue(family="family",qualifier="qualifier1",value="value1")])
print "Putting:", put
client.put(table, put)
get = TGet(row="row1")
print "Getting:", get
result = client.get(table, get)
print "Result:", result
transport.close()
Run it with:
python test.py
The output is:
[zzq@host252 ~]$ vi test.py
[zzq@host252 ~]$ python test.py
Thrift2 Demo
This demo assumes you have a table called "example" with a column family called "family"
Putting: TPut(durability=None, timestamp=None, cellVisibility=None, attributes=None, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=None, value='value1', type=None)], row='row1')
Getting: TGet(storeOffset=None, existence_only=None, authorizations=None, filterString=None, timestamp=None, maxVersions=None, timeRange=None, filterBytes=None, targetReplicaId=None, consistency=None, attributes=None, storeLimit=None, cacheBlocks=None, columns=None, row='row1')
Result: TResult(partial=False, stale=False, columnValues=[TColumnValue(qualifier='qualifier1', family='family', tags=None, timestamp=1575022330934, value='value1', type=None)], row='row1')
[zzq@host252 ~]$
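In test.py the transport is closed only if every call before transport.close() succeeds. One way to guarantee cleanup is a small context manager, sketched here. The name opened is my own; it assumes only that the transport object has open() and close() methods, as the Thrift transports used above do.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a transport and always close it, even if a call inside raises."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Intended usage with the objects from test.py:
#     with opened(transport):
#         client.put(table, put)
```

With this wrapper a failed put or get still releases the connection to the Thrift server.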
Connecting to HBase with Python 3
Check the environment
Check the Python version and whether pip is installed
(project-env) [zzq@host252 ~]$ python --version
Python 3.6.5
(project-env) [zzq@host252 ~]$ pip3
Usage:
pip3 <command> [options]
Commands:
install Install packages.
download Download packages.
uninstall Uninstall packages.
freeze Output installed packages in requirements format.
list List installed packages.
show Show information about installed packages.
check Verify installed packages have compatible dependencies.
config Manage local and global configuration.
search Search PyPI for packages.
wheel Build wheels from your requirements.
hash Compute hashes of package archives.
completion A helper command used for command completion.
debug Show information useful for debugging.
help Show help for commands.
General Options:
-h, --help Show help.
--isolated Run pip in an isolated mode, ignoring environment variables and user configuration.
-v, --verbose Give more output. Option is additive, and can be used up to 3 times.
-V, --version Show version and exit.
-q, --quiet Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).
--log <path> Path to a verbose appending log.
--proxy <proxy> Specify a proxy in the form [user:passwd@]proxy.server:port.
--retries <retries> Maximum number of retries each connection should attempt (default 5 times).
--timeout <sec> Set the socket timeout (default 15 seconds).
--exists-action <action> Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort.
--trusted-host <hostname> Mark this host or host:port pair as trusted, even though it does not have valid or any HTTPS.
--cert <path> Path to alternate CA bundle.
--client-cert <path> Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.
--cache-dir <dir> Store the cache data in <dir>.
--no-cache-dir Disable the cache.
--disable-pip-version-check
Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.
--no-color Suppress colored output
(project-env) [zzq@host252 ~]$
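Install the dependencies
Two packages are needed: Thrift and hbase-thrift.
There are several Python packages for connecting to HBase:
HBase-Thrift
happybase
hbase-python (PyPI repository)
hbase-python (GitHub)
We use HBase-Thrift here.
Install the Thrift package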
pip3 install thrift
Install the hbase-thrift package
pip3 install hbase-thrift
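Python 3 code for connecting via thrift1
HBase currently has two Thrift interfaces (call them thrift1 and thrift2), and they are not compatible with each other.
Let's look at the code for connecting via thrift1 first.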
vi query.py
Note: replace the host and port 9090 (the default Thrift port) with the values for your own environment.
Enter the following:
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
from hbase.ttypes import *
transport = TSocket.TSocket('192.168.30.250', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
print(client.getTableNames())
Run it:
python query.py
It fails with the following error:
(project-env) [zzq@host252 ~]$ python query.py
Traceback (most recent call last):
File "query.py", line 6, in <module>
from hbase import Hbase
File "/home/zzq/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/Hbase.py", line 2066
except IOError, io:
^
SyntaxError: invalid syntax
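Connecting via thrift requires importing the hbase package, which comes from the third-party hbase-thrift package. That package was written for Python 2, so loading it under Python 3 hits compatibility problems (here the Python 2 only syntax except IOError, io:).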
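The incompatibility can be reproduced without HBase. The sketch below compiles the Python 2 form of the except clause from the traceback and its Python 3 equivalent; the helper name compiles and the two source strings are illustrative only.

```python
PY2_ONLY = "try:\n    pass\nexcept IOError, io:\n    pass\n"
PY3_OK = "try:\n    pass\nexcept IOError as io:\n    pass\n"


def compiles(source):
    """Return True if the running interpreter can compile the source text."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(PY2_ONLY))  # False on Python 3, True on Python 2
print(compiles(PY3_OK))    # True on Python 3
```

This is why the installed Hbase.py has to be replaced with a Python 3 compatible version rather than patched at import time.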
Others have published Python 3 compatible versions of these files. Download them and replace Hbase.py and ttypes.py under /usr/local/lib/python3.6/site-packages/hbase/.
In a virtual environment, locate the path like this:
(project-env) [zzq@host252 ~]$ which python
~/my-python2hbase-env/project-env/bin/python
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/
bin include lib lib64
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/
python3.6
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/
abc.py codecs.py copy.py encodings __future__.py hmac.py keyword.py no-global-site-packages.txt os.py reprlib.py site-packages sre_parse.py tempfile.py warnings.py
base64.py collections copyreg.py enum.py genericpath.py importlib lib-dynload ntpath.py posixpath.py re.py site.py stat.py tokenize.py weakref.py
bisect.py _collections_abc.py distutils fnmatch.py hashlib.py imp.py linecache.py operator.py __pycache__ rlcompleter.py sre_compile.py struct.py token.py _weakrefset.py
_bootlocale.py config-3.6m-x86_64-linux-gnu _dummy_thread.py functools.py heapq.py io.py locale.py orig-prefix.txt random.py shutil.py sre_constants.py tarfile.py types.py
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase
hbase/ hbase_thrift-0.20.4.dist-info/
(project-env) [zzq@host252 ~]$ ls ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
constants.py Hbase.py __init__.py __pycache__ ttypes.py
Download from:
Link: https://pan.baidu.com/s/1-yKP1ghu2IAswnXzWpGNbw
Extraction code: d132
Replace the files with:
(project-env) [zzq@host252 ~]$ unzip hbase3.6.zip
Archive: hbase3.6.zip
inflating: hbase3.6/Hbase.py
inflating: hbase3.6/readme
inflating: hbase3.6/ttypes.py
(project-env) [zzq@host252 ~]$ cp hbase3.6/Hbase.py ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
(project-env) [zzq@host252 ~]$ cp hbase3.6/ttypes.py ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
(project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
total 276
-rw-rw-r--. 1 zzq zzq 150 Nov 28 12:13 constants.py
-rw-rw-r--. 1 zzq zzq 240677 Dec 2 16:44 Hbase.py
-rw-rw-r--. 1 zzq zzq 43 Nov 28 12:13 __init__.py
drwxrwxr-x. 2 zzq zzq 4096 Nov 28 12:13 __pycache__
-rw-rw-r--. 1 zzq zzq 25228 Dec 2 16:44 ttypes.py
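Possible problem: thrift.Thrift.TApplicationException: Invalid method name: 'getTableNames'
Cause
The Thrift interface used by the client does not match the one served by the HBase Thrift server.
The Thrift server was started as thrift2, while the client is calling the thrift (thrift1) interface.
Solution
Since the root cause is the interface mismatch between client and server, there are two fixes: start the server with the thrift1 interface (shown here), or switch the client to thrift2.
1. Start the thrift1 version of the Thrift server on the server side
Stop the thrift2 server, then start the HBase Thrift server in thrift1 mode.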
hbase-daemon.sh stop thrift2
# start command
hbase-daemon.sh start thrift
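To connect to a thrift2 server instead, see the next section.
Python 3 code for connecting via thrift2
Generate the client bindings (mind the thrift1 vs. thrift2 version)
The process is much the same as for Python 2. Note that the bindings must be generated with thrift 0.10.0 or later in order to support Python 3.5 and above.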
thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
or
thrift --gen py ../../../../../hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift
Again, we can download pre-generated files and use them directly:
https://github.com/apache/hbase/tree/master/hbase-examples/src/main/python
Here we simply download the pre-generated thrift2 files:
Link: https://pan.baidu.com/s/1s3iysNJHW7s8lW6ni4qxrw
Extraction code: is1j
The thrift1 version produces this set of Python files:
[zzq@host252 thrift-0.10.0]$ ll gen-py/hbased/
total 440
-rw-rw-r--. 1 zzq zzq 326 Nov 28 19:38 constants.py
-rw-rw-r--. 1 zzq zzq 384499 Nov 28 19:38 Hbase.py
-rwxr-xr-x. 1 zzq zzq 14386 Nov 28 19:38 Hbase-remote
-rw-rw-r--. 1 zzq zzq 43 Nov 28 19:38 __init__.py
-rw-rw-r--. 1 zzq zzq 38776 Nov 28 19:38 ttypes.py
[zzq@host252 thrift-0.10.0]$
The thrift2 version produces the following files:
$ ls gen-py
gen-py/hbase/__init__.py
gen-py/hbase/constants.py
gen-py/hbase/THBaseService.py
gen-py/hbase/ttypes.py
Because thrift2 has no getTableNames() method, we first need to create a test table by hand.
hbase shell
hbase(main):001:0> create "example", NAME => "family"
0 row(s) in 1.6480 seconds
=> Hbase::Table - example
hbase(main):002:0>
Suppose our gen-py path is:
/home/zzq/thrift2/gen-py
Then create the test script test.py:
vim test.py
Enter the following:
import sys
import os
import time
from thrift.transport import TTransport
from thrift.transport import TSocket
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
# Add path for local "gen-py/hbase" for the pre-generated module
sys.path.append("/home/zzq/thrift2/gen-py")
from hbase import THBaseService
from hbase.ttypes import *
print("Thrift2 Demo")
print("This demo assumes you have a table called \"example\" with a column family called \"family\"")
host = "192.168.30.250"
port = 9090
framed = False
socket = TSocket.TSocket(host, port)
if framed:
    transport = TTransport.TFramedTransport(socket)
else:
    transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = THBaseService.Client(protocol)
transport.open()
table = "example"
tableName = str.encode(table)
rowKey =str.encode('row2')
put = TPut()
put.row = rowKey
columnValues=[TColumnValue(family=str.encode("family"),qualifier=str.encode("qualifier2"),value=str.encode("value2"))]
put.columnValues = columnValues
result = client.put(tableName, put)
rowKey =str.encode('row2')
get = TGet()
get.row = rowKey
result = client.get(tableName, get)
print(result.row)
print(result.columnValues)
for i in result.columnValues:
    print(i.value)
transport.close()
Run it with:
python test.py
The output is:
(project-env) [zzq@host252 ~]$ python test.py
Thrift2 Demo
This demo assumes you have a table called "example" with a column family called "family"
b'row2'
[TColumnValue(family=b'family', qualifier=b'qualifier2', value=b'value2', timestamp=1575285482023, tags=None, type=None)]
b'value2'
(project-env) [zzq@host252 ~]$
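As the output shows, under Python 3 the table name, row key, column names and values all travel as bytes, which is why test.py wraps every string in str.encode. The helpers below are a sketch for keeping that conversion in one place. The names to_bytes and columns_to_dict are hypothetical, UTF-8 is an assumed encoding, and columns_to_dict relies only on the family, qualifier and value attributes visible in the printed TColumnValue.

```python
def to_bytes(value, encoding="utf-8"):
    """Encode text to bytes; pass bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def columns_to_dict(column_values, encoding="utf-8"):
    """Map 'family:qualifier' to the decoded value for each returned cell."""
    cells = {}
    for col in column_values:
        key = col.family.decode(encoding) + ":" + col.qualifier.decode(encoding)
        cells[key] = col.value.decode(encoding)
    return cells


# Intended usage with the objects from test.py:
#     result = client.get(to_bytes("example"), get)
#     print(columns_to_dict(result.columnValues))
```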
Possible error: ImportError: cannot import name 'THBaseService'
(project-env) [zzq@host252 ~]$ python3.6 test.py
Traceback (most recent call last):
File "test.py", line 12, in <module>
from hbase import THBaseService
ImportError: cannot import name 'THBaseService'
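Cause: Python loaded the hbase package under project-env/lib/python3.6/site-packages/hbase/ first, because sys.path.append places gen-py after site-packages on the module search path.
The hbase package in the gen-py directory was therefore never picked up.
Solution 1
Rename the directory
Rename the generated gen-py directory to genpy (and update the path in test.py to match); otherwise the import can run into problems under Python 3.
Solution 2: overwrite the files in the installed package with the new ones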
(project-env) [zzq@host252 ~]$ cp thrift2/gen-py/hbase/* ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
(project-env) [zzq@host252 ~]$ ll ~/my-python2hbase-env/project-env/lib/python3.6/site-packages/hbase/
total 1200
-rw-rw-r--. 1 zzq zzq 366 Dec 2 16:59 constants.py
-rw-rw-r--. 1 zzq zzq 240677 Dec 2 16:44 Hbase.py
-rw-rw-r--. 1 zzq zzq 51 Dec 2 16:59 __init__.py
-rw-rw-r--. 1 zzq zzq 199 Dec 2 16:59 __init__.pyc
drwxrwxr-x. 2 zzq zzq 4096 Dec 2 16:49 __pycache__
-rw-rw-r--. 1 zzq zzq 369677 Dec 2 16:59 THBaseService.py
-rw-rw-r--. 1 zzq zzq 359818 Dec 2 16:59 THBaseService.pyc
-rw-rw-r--. 1 zzq zzq 14357 Dec 2 16:59 THBaseService-remote
-rw-rw-r--. 1 zzq zzq 119702 Dec 2 16:59 ttypes.py
-rw-rw-r--. 1 zzq zzq 97317 Dec 2 16:59 ttypes.pyc
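A third option, which is not from the original article and which I have not verified against this exact setup, is to leave site-packages untouched and put the gen-py directory at the front of the module search path, so that its hbase package is found first. The helper name prefer_path is hypothetical; the path is the example path used earlier.

```python
import sys


def prefer_path(path):
    """Put path at the front of sys.path so its packages win name clashes."""
    # Remove any existing entry first so the path is not listed twice
    while path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)


# Example path from this article; adjust to your own gen-py location
prefer_path("/home/zzq/thrift2/gen-py")
```

This must run before the first import of hbase in the process, since an already imported package is not looked up again.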
For more usage, see:
https://blog.csdn.net/qq_21153619/article/details/86502624
https://blog.csdn.net/m0_37634723/article/details/79191420
https://blog.csdn.net/zjerryj/article/details/80045657
https://blog.csdn.net/luanpeng825485697/article/details/81048468
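Appendix: installing standalone HBase and starting Thrift
Install the JDK
Configure JAVA_HOME, which HBase depends on
Reference article
Download HBase
Download page: http://hbase.apache.org/downloads.html
Install HBase locally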
root@master:/usr/local/setup_tools# tar -zxvf hbase-2.0.0-bin.tar.gz
root@master:/usr/local/setup_tools# mv hbase-2.0.0 /usr/local/
root@master:/usr/local/setup_tools# cd /usr/local
root@master:/usr/local# ls | grep hbase
hbase-2.0.0
root@master:/usr/local/hbase-2.0.0# vi /etc/profile
export HBASE_HOME=/usr/local/hbase-2.0.0
export PATH=.:$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin:$HADOOP_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin:$FLUME_HOME/bin:$ZOOKEEPER_HOME/bin:$KAFKA_HOME/bin:$IDEA_HOME/bin:$eclipse_HOME:$MAVEN_HOME/bin:$ALLUXIO_HOME/bin:$HBASE_HOME/bin
root@master:/usr/local/hbase-2.0.0# source /etc/profile
Configuration
Edit hbase-site.xml to set the root directory where data is stored.
root@master:/usr/local/hbase-2.0.0/conf# vi hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///usr/local/hbase-2.0.0/data</value>
</property>
</configuration>
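To double-check what the file actually sets, the value can be read back with the standard library XML parser. This is a sketch; the helper name read_hbase_property is my own, and the sample text simply repeats the configuration above.

```python
import xml.etree.ElementTree as ET


def read_hbase_property(xml_text, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///usr/local/hbase-2.0.0/data</value>
  </property>
</configuration>"""
print(read_hbase_property(SAMPLE, "hbase.rootdir"))
```

To check a real installation, pass in the contents of conf/hbase-site.xml instead of SAMPLE.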
Start HBase
root@master:/usr/local/hbase-2.0.0# cd bin
root@master:/usr/local/hbase-2.0.0/bin# ls
considerAsDead.sh hbase hbase-config.cmd hbase-jruby master-backup.sh replication start-hbase.sh zookeepers.sh
draining_servers.rb hbase-cleanup.sh hbase-config.sh hirb.rb region_mover.rb rolling-restart.sh stop-hbase.cmd
get-active-master.rb hbase.cmd hbase-daemon.sh local-master-backup.sh regionservers.sh shutdown_regionserver.rb stop-hbase.sh
graceful_stop.sh hbase-common.sh hbase-daemons.sh local-regionservers.sh region_status.rb start-hbase.cmd test
root@master:/usr/local/hbase-2.0.0/bin# start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /usr/local/hbase-2.0.0/logs/hbase-root-master-master.out
root@master:/usr/local/hbase-2.0.0/bin# jps
2757 Jps
2685 HMaster
Use the hbase shell
root@master:/usr/local/hbase-2.0.0/bin# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0044 seconds
hbase(main):001:0>
hbase(main):003:0> version
2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0054 seconds
hbase(main):004:0>
Start the HBase Thrift service
root@master:/usr/local/hbase-2.0.0/bin# hbase-daemon.sh start thrift
running thrift, logging to /usr/local/hbase-2.0.0/logs/hbase-root-thrift-master.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/alluxio-1.7.0-hadoop-2.6/client/alluxio-1.7.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
root@master:/usr/local/hbase-2.0.0/bin# jps
3332 Jps
3254 ThriftServer
2685 HMaster