Common Phoenix Usage

1. Load Data into Phoenix

Use the MapReduce-based CSV bulk loader for larger data sets (see http://phoenix.apache.org/bulk_dataload.html):

hadoop jar phoenix-<version>-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table EXAMPLE --input /data/example.csv

Use psql.py to load a .csv file (see http://phoenix.apache.org/bulk_dataload.html):

bin/psql.py -t EXAMPLE localhost data.csv

Map an existing HBase table to a Phoenix table, then use the UPSERT SELECT command to populate a new table:

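# In the HBase shell: define table 't1' with column family 'cf1'
create 't1', {NAME => 'cf1', VERSIONS => 5}

-- In Phoenix (sqlline.py): create a view over 't1'.
-- Phoenix upper-cases unquoted identifiers, so quote names that are lower-case in HBase.
-- The column names must match the case of the qualifiers stored in HBase.
CREATE VIEW "t1" ( pk VARCHAR PRIMARY KEY, "cf1"."column1" VARCHAR, "cf1"."column2" INTEGER );

-- Alternatively, create the table in Phoenix directly
CREATE TABLE t1 ( pk VARCHAR PRIMARY KEY, column1 VARCHAR, column2 INTEGER );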
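
The statements above only define the mapping; they do not copy any rows. As a minimal sketch of the UPSERT SELECT step, the following Java class runs the copy over JDBC. The class name, the method names, and the target table are hypothetical, and the sketch assumes the target table already exists with columns that line up with the source (SELECT * copies them positionally).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: copy rows from a mapped source into another Phoenix table. */
class PhoenixUpsertSelect {

    /**
     * Builds the UPSERT SELECT statement. Table names cannot be bound as
     * JDBC parameters, so pass only trusted names, written exactly as they
     * would appear in Phoenix SQL (double-quoted if case-sensitive).
     */
    static String upsertSelectSql(String targetTable, String source) {
        return "UPSERT INTO " + targetTable + " SELECT * FROM " + source;
    }

    /** Runs the copy on a fresh connection and returns the driver's update count. */
    static int copyRows(String jdbcUrl, String targetTable, String source) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            int rows = stmt.executeUpdate(upsertSelectSql(targetTable, source));
            conn.commit(); // Phoenix connections do not autocommit by default
            return rows;
        }
    }
}
```

With the Phoenix client jar on the classpath, a call might look like PhoenixUpsertSelect.copyRows("jdbc:phoenix:localhost", "T1_COPY", "t1"), where T1_COPY is an assumed target table.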

Populate the table with the UPSERT VALUES command:

upsert into test_table values (2,'World!');
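
The same statement can be issued from application code. The sketch below is one way to do it over JDBC, assuming test_table has an integer primary key followed by a VARCHAR column (the schema is not shown in this article, so that is a guess). The class and method names are illustrative. Phoenix connections do not autocommit by default, so the sketch commits explicitly.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Sketch: UPSERT VALUES through the Phoenix JDBC driver. */
class PhoenixUpsertValues {

    // Assumed schema: test_table (integer primary key, varchar column)
    static final String UPSERT_SQL = "UPSERT INTO test_table VALUES (?, ?)";

    /** Builds the JDBC URL from a ZooKeeper quorum string such as "localhost". */
    static String jdbcUrl(String zookeeperQuorum) {
        return "jdbc:phoenix:" + zookeeperQuorum;
    }

    /** Upserts a single row and commits it. */
    static void upsertRow(String zookeeperQuorum, int key, String text) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl(zookeeperQuorum));
             PreparedStatement stmt = conn.prepareStatement(UPSERT_SQL)) {
            stmt.setInt(1, key);
            stmt.setString(2, text);
            stmt.executeUpdate();
            conn.commit(); // without this the row is not written
        }
    }
}
```

Calling PhoenixUpsertValues.upsertRow("localhost", 2, "World!") would then be the JDBC counterpart of the statement above.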

2. Using the SQuirreL client tool

Remove any prior phoenix-[oldversion]-client.jar from the lib directory of SQuirreL and copy phoenix-[newversion]-client.jar into it. The new version must be compatible with the Phoenix server jar used by your HBase installation.
Start SQuirreL and add a new driver (Drivers -> New Driver).
In the Add Driver dialog box, set Name to Phoenix and set Example URL to jdbc:phoenix:localhost.
Type org.apache.phoenix.jdbc.PhoenixDriver into the Class Name text box and click OK to close the dialog.
Switch to the Aliases tab and create a new alias (Aliases -> New Aliases).
In the dialog box, set Name to any name, Driver to Phoenix, User Name to anything, and Password to anything.
Construct the URL as jdbc:phoenix:<ZooKeeper quorum server>. For example, to connect to a local HBase, use jdbc:phoenix:localhost.
Press Test, which should succeed if everything is set up correctly, then press OK to close.
Double-click the newly created Phoenix alias and click Connect. You are now ready to run SQL queries against Phoenix.

3. Performance optimization

Salt the table to pre-split the data into multiple regions:

CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SALT_BUCKETS=16

Pre-split the table by row key:

CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SPLIT ON ('CS','EU','NA')

Use multiple column families (here A and B):

CREATE TABLE TEST (MYKEY VARCHAR NOT NULL PRIMARY KEY, A.COL1 VARCHAR, A.COL2 VARCHAR, B.COL3 VARCHAR)

Use compression. On-disk compression improves performance on large tables:

CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) COMPRESSION='GZ'

Other options:

Create indexes. See http://phoenix.apache.org/faq.html#/How_do_I_create_Secondary_Index_on_a_table
Optimize cluster parameters. See http://hbase.apache.org/book/performance.html
Optimize Phoenix parameters. See http://phoenix.apache.org/tuning.html

4. Should I pool Phoenix JDBC Connections?

No, it is not necessary to pool Phoenix JDBC Connections.

Phoenix’s Connection objects are different from most other JDBC Connections due to the underlying HBase connection. The Phoenix Connection object is designed to be a thin object that is inexpensive to create. If Phoenix Connections are reused, it is possible that the underlying HBase connection is not always left in a healthy state by the previous user. It is better to create new Phoenix Connections to ensure that you avoid any potential issues.
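
In code, this advice amounts to opening a connection for each unit of work and closing it when done. The following is a minimal sketch of that pattern, not a prescribed implementation: the class and method names are made up for illustration, and the table name is supplied by the caller.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: one short-lived Phoenix connection per unit of work, no pool. */
class PhoenixShortLivedConnection {

    /** Builds the query; pass only trusted table names. */
    static String countSql(String tableName) {
        return "SELECT COUNT(*) FROM " + tableName;
    }

    /** Opens a new connection, counts the rows, and closes everything again. */
    static long countRows(String jdbcUrl, String tableName) throws SQLException {
        // try-with-resources closes the result set, statement, and connection
        // in reverse order, so nothing is carried over to the next caller.
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(countSql(tableName))) {
            rs.next(); // COUNT(*) always yields exactly one row
            return rs.getLong(1);
        }
    }
}
```

Each call such as PhoenixShortLivedConnection.countRows("jdbc:phoenix:localhost", "TEST") gets its own Connection object, which is the usage the paragraph above recommends.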

References:

http://phoenix.apache.org/installation.html

http://phoenix.apache.org/Phoenix-in-15-minutes-or-less.html

http://phoenix.apache.org/language/index.html

http://phoenix.apache.org/faq.html
