Section 26: Bulk create, update, and delete with the bulk API

Course outline

 

1. bulk syntax

 

POST /_bulk
{ "delete": { "_index": "test_index", "_type": "test_type", "_id": "3" }}
{ "create": { "_index": "test_index", "_type": "test_type", "_id": "12" }}
{ "test_field": "test12" }
{ "index": { "_index": "test_index", "_type": "test_type", "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_index": "test_index", "_type": "test_type", "_id": "1", "_retry_on_conflict": 3 }}
{ "doc": { "test_field2": "bulk test1" }}

 

Each operation takes two JSON strings (delete needs only the first), in the following format:

 

{"action": {"metadata"}}

{"data"}

 

For example, to create a document inside a bulk request, the operation looks like this:

 

{"index": {"_index": "test_index", "_type": "test_type", "_id": "1"}}
{"test_field1": "test1", "test_field2": "test2"}

 

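Which types of operations can be executed?

(1) delete: deletes a document; only one JSON string is needed

(2) create: equivalent to PUT /index/type/id/_create; forces creation and fails if the document already exists

(3) index: a regular PUT; it either creates the document or fully replaces an existing one

(4) update: performs a partial update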
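The four operation types above can be sketched as a small Python helper that assembles a bulk request body. This is only an illustration: the function name `build_bulk_body` and the tuple layout are assumptions made for this example, not part of Elasticsearch or any client library.

```python
import json


def build_bulk_body(operations):
    """Turn (action, metadata, source) tuples into a bulk request body.

    `source` is None for delete, which has no second line.
    (Hypothetical helper, written for illustration only.)
    """
    lines = []
    for action, metadata, source in operations:
        if action not in ("delete", "create", "index", "update"):
            raise ValueError(f"unsupported bulk action: {action}")
        # First line: {"<action>": {<metadata>}}
        lines.append(json.dumps({action: metadata}))
        if action == "delete":
            continue  # delete needs only the action line
        if source is None:
            raise ValueError(f"{action} requires a second JSON line")
        # Second line: the document, or {"doc": {...}} for update
        lines.append(json.dumps(source))
    # One JSON string per line; the body ends with a newline
    return "\n".join(lines) + "\n"


target = {"_index": "test_index", "_type": "test_type"}
ops = [
    ("delete", {**target, "_id": "3"}, None),
    ("create", {**target, "_id": "12"}, {"test_field": "test12"}),
    ("index", {**target, "_id": "2"}, {"test_field": "replaced test2"}),
    ("update", {**target, "_id": "1"}, {"doc": {"test_field2": "bulk test1"}}),
]
body = build_bulk_body(ops)
print(body, end="")
```

The four operations produce seven lines: one for the delete and two for each of the others.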

 

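The bulk API is strict about JSON syntax: each JSON string must sit on a single line with no line breaks inside it, and consecutive JSON strings must be separated by a newline. If a JSON string is broken across several lines, the request is rejected with a parse error such as: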

 

{
  "error": {
    "root_cause": [
      {
        "type": "json_e_o_f_exception",
        "reason": "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@5a5932cd; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@5a5932cd; line: 1, column: 3]"
      }
    ],
    "type": "json_e_o_f_exception",
    "reason": "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@5a5932cd; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@5a5932cd; line: 1, column: 3]"
  },
  "status": 500
}
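One way to catch this kind of mistake before sending a request is a local pre-check that every line parses as a complete JSON object on its own. The sketch below is a client-side approximation, not how Elasticsearch itself validates the body; `first_bad_line` is a hypothetical name, and skipping empty lines is an assumption made to tolerate the trailing newline.

```python
import json


def first_bad_line(body):
    """Return the 1-based number of the first line that is not a complete
    JSON object, or None if every non-empty line parses on its own."""
    for number, line in enumerate(body.split("\n"), start=1):
        if not line.strip():
            continue  # assumption: ignore empty lines such as the trailing one
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            return number
        if not isinstance(parsed, dict):
            return number
    return None


# Each JSON string on one line: passes the check
good = '{"index": {"_id": "1"}}\n{"test_field1": "test1"}\n'
# Pretty-printed action line: the first line is just "{", which is incomplete
bad = '{\n  "index": {"_id": "1"}\n}\n{"test_field1": "test1"}\n'

print(first_bad_line(good))
print(first_bad_line(bad))
```

Serializing each line with `json.dumps` without an `indent` argument avoids the problem in the first place, since its output never contains a raw line break.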

 

{
  "took": 41,
  "errors": true,
  "items": [
    {
      "delete": {
        "found": true,
        "_index": "test_index",
        "_type": "test_type",
        "_id": "10",
        "_version": 3,
        "result": "deleted",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    },
    {
      "create": {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "3",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    },
    {
      "create": {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "2",
        "status": 409,
        "error": {
          "type": "version_conflict_engine_exception",
          "reason": "[test_type][2]: version conflict, document already exists (current version [1])",
          "index_uuid": "6m0G7yx7R1KECWWGnfH1sw",
          "shard": "2",
          "index": "test_index"
        }
      }
    },
    {
      "index": {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "4",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "2",
        "_version": 2,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": false,
        "status": 200
      }
    },
    {
      "update": {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "1",
        "_version": 3,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    }
  ]
}

 

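Within a bulk request, the failure of any single operation does not affect the others; the response reports the error details for each item that failed.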
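Because failures are reported per item rather than for the whole request, client code usually has to walk the `items` array itself. The following is a minimal sketch that works on a response already parsed into a Python dict; the function name `failed_items` and the shape of its return value are assumptions for illustration, and the sample response is abridged from the one shown above.

```python
def failed_items(response):
    """Collect the failed operations from a parsed bulk response.

    (Hypothetical helper, written for illustration only.)
    """
    if not response.get("errors"):
        return []  # top-level flag says every item succeeded
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a single-key object: {"<action>": {<result>}}
        (action, result), = item.items()
        if "error" in result:
            failures.append({
                "position": position,
                "action": action,
                "_id": result.get("_id"),
                "status": result.get("status"),
                "type": result["error"].get("type"),
            })
    return failures


response = {
    "took": 41,
    "errors": True,
    "items": [
        {"delete": {"_id": "10", "result": "deleted", "status": 200}},
        {"create": {"_id": "2", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"index": {"_id": "4", "result": "created", "status": 201}},
    ],
}
for failure in failed_items(response):
    print(failure)
```

Here only the second item is reported: the create on document 2 hit a version conflict, while the delete and the index around it succeeded.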

 

POST /test_index/_bulk
{ "delete": { "_type": "test_type", "_id": "3" }}
{ "create": { "_type": "test_type", "_id": "12" }}
{ "test_field": "test12" }
{ "index": { "_type": "test_type" }}
{ "test_field": "auto-generate id test" }
{ "index": { "_type": "test_type", "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_type": "test_type", "_id": "1", "_retry_on_conflict": 3 }}
{ "doc": { "test_field2": "bulk test1" }}

 

POST /test_index/test_type/_bulk
{ "delete": { "_id": "3" }}
{ "create": { "_id": "12" }}
{ "test_field": "test12" }
{ "index": { }}
{ "test_field": "auto-generate id test" }
{ "index": { "_id": "2" }}
{ "test_field": "replaced test2" }
{ "update": { "_id": "1", "_retry_on_conflict": 3 }}
{ "doc": { "test_field2": "bulk test1" }}
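The two requests above move the index, and then the type, out of each metadata line and into the URL. As a rough sketch of that idea, the helper below checks whether every operation shares the same `_index` (and then `_type`) and, if so, hoists it into the request path. The name `hoist_common_target` and the tuple layout are assumptions made for this example.

```python
def hoist_common_target(operations):
    """Move a shared _index (and then _type) from the metadata into the URL path.

    operations: list of (action, metadata, source) tuples.
    Returns (path, operations_with_trimmed_metadata).
    (Hypothetical helper, written for illustration only.)
    """
    path = ""
    for key in ("_index", "_type"):
        values = {meta.get(key) for _, meta, _ in operations}
        if len(values) != 1 or None in values:
            break  # not shared by every operation: keep it in the metadata
        path += "/" + values.pop()
        operations = [
            (action, {k: v for k, v in meta.items() if k != key}, source)
            for action, meta, source in operations
        ]
    return path + "/_bulk", operations


ops = [
    ("delete", {"_index": "test_index", "_type": "test_type", "_id": "3"}, None),
    ("index", {"_index": "test_index", "_type": "test_type", "_id": "2"},
     {"test_field": "replaced test2"}),
]
path, slim = hoist_common_target(ops)
print(path)
print(slim[0][1])
```

If the operations target different indices, nothing is hoisted and the path stays `/_bulk`, matching the first request form in this section.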

 

2. Optimal bulk size

 

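A bulk request is loaded into memory in full, so if it is too large, performance actually gets worse. You therefore need to experiment repeatedly to find the best bulk size. A common starting point is 1,000 to 5,000 documents per request, increased gradually from there. Measured by size instead, 5 to 15 MB per request is a good range.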
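To experiment with different bulk sizes, it helps to split a stream of operations into request bodies capped by both document count and byte size. The generator below is a minimal sketch of that approach; `batch_operations` and its parameters are hypothetical names, and the default limits are simply the low end of the ranges mentioned above.

```python
import json


def batch_operations(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk request bodies limited by operation count and byte size.

    operations: iterable of lists of JSON-serializable lines
    (one line for delete, two lines for the other actions).
    (Hypothetical helper, written for illustration only.)
    """
    lines, docs, size = [], 0, 0
    for op in operations:
        encoded = [json.dumps(part) + "\n" for part in op]
        op_size = sum(len(line.encode("utf-8")) for line in encoded)
        # Flush the current batch if adding this operation would exceed a limit
        if docs and (docs + 1 > max_docs or size + op_size > max_bytes):
            yield "".join(lines)
            lines, docs, size = [], 0, 0
        lines.extend(encoded)
        docs += 1
        size += op_size
    if lines:
        yield "".join(lines)


ops = [[{"index": {"_id": str(i)}}, {"test_field": f"test{i}"}] for i in range(2500)]
batches = list(batch_operations(ops, max_docs=1000))
print([batch.count("\n") // 2 for batch in batches])  # [1000, 1000, 500]
```

A single operation larger than `max_bytes` is still emitted, in a batch of its own, rather than being dropped. Tuning then comes down to rerunning the same load with different `max_docs` and `max_bytes` values and comparing throughput.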

 

 

 

 

 


Reposted from blog.csdn.net/qq_35524586/article/details/86593482