HBase Solr secondary index: some fields lost when updating data

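Problem:

The first put into HBase syncs three fields to the index. A second update to the same HBase row that changes only one field causes the other two fields to disappear from the Solr document.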

 

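Cause:

When the HBase Indexer was created, the configuration file specified read-row="never". With this setting the indexer never re-reads the row from HBase, so the Solr document is rebuilt from only the cells contained in the update.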

 

$ cat morphline-hbase-mapper.xml 
<?xml version="1.0"?>
<!-- table: name of the HBase table to index -->
<!-- mapper: the class that loads and applies the specified Morphline configuration file; always MorphlineResultToSolrMapper -->
<indexer table="tableName" mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper" read-row="never" >
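<!-- The name attribute of param marks this entry as the morphlineFile setting -->
<!-- value gives the path to morphlines.conf; an absolute or relative path refers to a local file. If morphlines.conf is managed by Cloudera Manager, just use the value morphlines.conf -->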
    <param name="morphlineFile" value="morphlines.conf"/>
<!-- value="TableMap": TableMap is a user-defined morphline ID that is referenced later. The other attributes, such as mapper and the param names, can keep their defaults -->
    <param name="morphlineId" value="TableMap"/>
</indexer>

After changing it to read-row="dynamic" and testing again, the fields are no longer lost.
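
For reference, a minimal sketch of the same indexer definition with only the read-row attribute changed. The table name, morphline file, and morphline ID are the placeholder values from the configuration above; since the wiki lists "dynamic" as the default, omitting the attribute entirely should have the same effect.

```xml
<?xml version="1.0"?>
<!-- Same indexer as above; only read-row differs. -->
<!-- read-row="dynamic": on a partial row update, the indexer re-reads the
     missing data from HBase before building the Solr document. -->
<indexer table="tableName" mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper" read-row="dynamic">
    <param name="morphlineFile" value="morphlines.conf"/>
    <param name="morphlineId" value="TableMap"/>
</indexer>
```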

read-row documentation: https://github.com/NGDATA/hbase-indexer/wiki/Indexer-configuration#read-row

read-row

The read-row attribute has two possible values: dynamic, or never.

This attribute is only important when using row-based indexing. It specifies whether or not the indexer should re-read data from HBase in order to perform indexing.

When set to "dynamic", the indexer will read the necessary data from a row if a partial update to the row is performed in HBase. In dynamic mode, the row will not be re-read if all data needed to perform indexing is included in the row update.

If this attribute is set to never, a row will never be re-read by the indexer.

The default setting is "dynamic".

However, this setting may lead to the following issue, so test thoroughly before using it.

Resolving Solr/HBase data inconsistency caused by HBase Indexer:

https://blog.csdn.net/d6619309/article/details/51579594


Reposted from blog.csdn.net/zhangshenghang/article/details/103581766