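Getting Started with Elasticsearch
1. What is Elasticsearch
An open-source search engine built on Apache Lucene.
Written in Java, with a simple, easy-to-use RESTful API.
Scales out horizontally with ease and can handle petabytes of structured or unstructured data.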

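2. Use cases
Analytics engine for massive data sets
Site search engine
Data warehouse
Examples:
The Guardian (UK): real-time analysis of public response to articles
Wikipedia, GitHub: real-time site search
Baidu: real-time log monitoring platform
Alibaba, Google, Xiaomi, and JD.com also use it
3. Prerequisites
Skills: building projects with Maven and basic Spring Boot usage
Environment: an IDE, Java JDK 1.8, Node.js (6.0 or later)
4. Learning roadmap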

Single-node installation
Only Linux is covered here; for Windows or macOS, look up the equivalent steps.
1. Download
https://www.elastic.co/kr/downloads/elasticsearch
2. Extract the archive into /opt (this walkthrough uses version 6.6.2):
tar -zxvf elasticsearch-6.6.2.tar.gz -C /opt/
Start: [root@localhost bin]# ./elasticsearch

Bug:
Caused by: java.lang.RuntimeException: can not run elasticsearch as root

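Explanation:
By default, Elasticsearch refuses to start as the root user.
Fix: create a dedicated non-root user and group (named blank here) and hand the installation over to it.
chown -R blank:blank elasticsearch-6.6.2/    change the owner and group
chmod 770 elasticsearch-6.6.2/    change the permissions
su blank    switch to that user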

2.1 Create the ES data and log directories
[root@localhost opt]# mkdir elastic-data    create the parent directory
[root@localhost opt]# chown -R blank:blank elastic-data/    change the owner and group

Create the data directory and the log directory
[blank@localhost opt]$ mkdir -p elastic-data/data
[blank@localhost opt]$ mkdir -p elastic-data/logs

2.2 Edit elasticsearch.yml to set the data directory and the log directory

vim /opt/elasticsearch-6.6.2/config/elasticsearch.yml

path.data: /opt/elastic-data/data

# Path to log files:

path.logs: /opt/elastic-data/logs

# ----------------------------------- Memory -----------------------------------

# Lock the memory on startup:

#bootstrap.memory_lock: true
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

# ---------------------------------- Network -----------------------------------

# Set the bind address to a specific IP (IPv4 or IPv6):

network.host: 0.0.0.0

# Set a custom port for HTTP:

http.port: 9200

3. vim config/elasticsearch.yml
# Add the following
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.247.150"]
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

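4. Edit /etc/security/limits.conf (vim /etc/security/limits.conf) and append the following:
* soft nofile 65536
* hard nofile 65536
Log in again after changing this file; the new limits only apply to new sessions.

5. Edit /etc/sysctl.conf (vim /etc/sysctl.conf) and append the following:
vm.max_map_count=655360
Save the file, then run:
sysctl -p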

6. Edit the configuration file under limits.d:
vi /etc/security/limits.d/90-nproc.conf

```
*          soft    nproc     4096
```

root soft nproc 4096

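7. Add a user: useradd testuser    creates the user testuser
passwd testuser    sets a password for the new user testuser
chown -R testuser:testuser elasticsearch-node1/
Switch to testuser.
8. In the elasticsearch bin directory, run ./elasticsearch
Successful startup screen

Run ./elasticsearch -d to start it in the background.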
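Once the node is running, one quick way to confirm that it answers is to request the HTTP root endpoint on port 9200. The following is a minimal sketch that uses only the JDK. The class name EsPing and the host localhost are assumptions for illustration, and it presumes that no authentication is configured.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Elasticsearch: fetches the root endpoint of a node.
class EsPing {

    // Builds the base URL of a node, e.g. http://localhost:9200/
    static String baseUrl(String host, int port) {
        return "http://" + host + ":" + port + "/";
    }

    // Sends GET / and returns the response body as text.
    static String fetchRoot(String host, int port) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl(host, port)).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Assumed host and port; adjust to the network.host and http.port you configured.
        try {
            System.out.println(fetchRoot("localhost", 9200));
        } catch (IOException e) {
            System.out.println("Node not reachable: " + e.getMessage());
        }
    }
}
```

If the node is up, the response is a small JSON document describing the node; if it is not, the sketch prints the connection error instead of failing.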

Check the process: [root@localhost ~]# ps -ef|grep elastic
Kill it (2373 is the PID reported by ps): [root@localhost ~]# kill -9 2373

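Plugin installation
elasticsearch-head
elasticsearch-head is a web interface for operating and managing an Elasticsearch cluster with point-and-click ease. It lets users create, read, update, and delete data, and it talks to the cluster through the four REST methods.
es-head covers three main areas:

Showing the cluster topology and performing index-level and node-level operations
A search interface that queries the cluster and returns results as raw JSON or in tabular form
Quick access to the cluster status
It also has an input window that allows arbitrary calls to the RESTful API. This interface includes several options that can be combined to produce interesting results:
Request method (GET, PUT, POST, DELETE), JSON query data, node, and path
JSON validation
Repeating a request on a timer
Transforming results with JavaScript expressions
Collecting results over time (using the timer) or comparing results
Charting the transformed results as a simple bar graph (including time series)
Download: https://github.com/mobz/elasticsearch-head
Download elasticsearch-head-master.zip
Extract: unzip elasticsearch-head-master.zip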

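Download: https://nodejs.org/en/download/
Download Node.js: node-v8.12.0-linux-x64.tar.xz
Extract: tar -xvf node-v8.12.0-linux-x64.tar.xz
If your tar cannot read .xz archives directly, first run xz -d xxx.tar.xz to turn xxx.tar.xz into xxx.tar, then unpack it with tar xvf xxx.tar

Configure the Node.js environment variables (set NODE_HOME to the directory where Node.js was actually unpacked)
vim /etc/profile
#set for nodejs
export NODE_HOME=/usr/local/node/0.10.24
export PATH=$NODE_HOME/bin:$PATH
source /etc/profile
Check the version: node -v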

1. Install cnpm
npm install -g cnpm --registry=https://registry.npm.taobao.org
If the machine cannot reach external hosts, configure a DNS server:
vim /etc/resolv.conf
nameserver 114.114.114.114

7. Edit elasticsearch.yml (under /opt/elasticsearch-6.6.2/config) to allow cross-origin access

http.cors.enabled: true

http.cors.allow-origin: "*"

8. Start Elasticsearch: ./elasticsearch

9. cd elasticsearch-head-master
cnpm install
cnpm run start
10. Open 192.168.247.150:9100 in a browser

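Cluster configuration (not covered in detail)
Create one master node and the slave nodes on the cluster machines.
1. Master: add the following to elasticsearch.yml
http.cors.enabled: true
http.cors.allow-origin: "*"
# Change the following settings
## Cluster identifier; all nodes in the same cluster use the same value
cluster.name: elasticsearch
## Node name
node.name: "master"
node.master: true
## Data directory
path.data: data/elasticsearch1/data
## Log directory
path.logs: data/elasticsearch1/logs
## IP address the node binds to; it is also announced to the other nodes in the cluster
network.host: 192.168.247.150
## Network interface to bind to for incoming requests; an IP address or a host name
#network.bind_host: 192.168.247.150
## Publish address, announced to the other nodes and used for node-to-node communication. Determined automatically if not set. Must be an existing IP address
#network.publish_host: 192.168.247.150
## HTTP port for external clients; the default is 9200
http.port: 9200
## Initial list of hosts that a node contacts at startup to discover the other nodes in the cluster
discovery.zen.ping.unicast.hosts: ["192.168.247.150","192.168.247.150","192.168.247.151"]
## Minimum number of master-eligible nodes that a node must see before it can operate in the cluster. The officially recommended value is (N/2)+1,
## where N is the number of master-eligible nodes (3 in our case, which gives 2).
## With only 2 nodes, however, a value of 2 is a problem: once one node goes down, the other can no longer see 2 nodes. Keep this in mind.
## The value 1 used here is below that majority, so it does not protect against split brain.
discovery.zen.minimum_master_nodes: 1
## bootstrap.memory_lock locks the process memory to avoid swapping and improve performance; it is left disabled here.
## CentOS 6 does not support seccomp, so startup reports an error unless the system call filter is set to false
bootstrap.memory_lock: false
bootstrap.system_call_filter: false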
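The (N/2)+1 rule mentioned in the comments above uses integer division. As a worked example, the following sketch computes the value for a few cluster sizes. The class and method names are hypothetical and exist only for illustration; Elasticsearch itself just reads the number from discovery.zen.minimum_master_nodes.

```java
// Hypothetical helper for illustration; Elasticsearch only reads the configured number.
class Quorum {

    // Majority of master-eligible nodes: (N / 2) + 1, using integer division.
    static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " master-eligible node(s) -> minimum_master_nodes = "
                    + minimumMasterNodes(n));
        }
    }
}
```

For 2 master-eligible nodes the result is 2, which is exactly the case the comments warn about: losing either node leaves the survivor unable to see a majority.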
2. Slave configuration:
cluster.name: elasticsearch
node.name: "es-node3"
path.data: data/elasticsearch3/data
path.logs: data/elasticsearch3/logs
network.host: 192.168.247.150
#network.bind_host: 192.168.247.150
#network.publish_host: 192.168.247.150
# Port for node-to-node communication; accepts a single value or a range. If a range is given, the node binds to the first available port in the range
#transport.tcp.port: 9301
http.port: 9201
discovery.zen.ping.unicast.hosts: ["192.168.247.150","192.168.247.150","192.168.247.151"]
discovery.zen.minimum_master_nodes: 1
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
3. Configuration on 192.168.247.151
cluster.name: elasticsearch
node.name: "slave2"
path.data: data/elasticsearch3/data
path.logs: data/elasticsearch3/logs
network.host: 192.168.247.151
#network.bind_host: 192.168.247.150
##network.publish_host: 192.168.247.150
## Port for node-to-node communication; accepts a single value or a range. If a range is given, the node binds to the first available port in the range
##transport.tcp.port: 9301
#http.port: 9201
discovery.zen.ping.unicast.hosts: ["192.168.247.150","192.168.247.150","192.168.247.151"]
discovery.zen.minimum_master_nodes: 1
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

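Start the slave nodes first.
After a successful start, the cluster looks as shown in the figure.

Basic concepts
Index: a collection of documents that share the same properties
Type: an index can define one or more types (indices created in 6.x allow only a single type)
Document: the basic unit of data that can be indexed
Shard: every index is split into multiple shards, and each shard is a Lucene index
Replica: a copy of a shard serves as the backup of that shard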
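As a small worked example of how shards and replicas add up: an index in ES 6.x gets 5 primary shards and 1 replica per primary unless configured otherwise, so the cluster holds 10 shard copies for it. The sketch below just does that arithmetic; the class and method names are hypothetical.

```java
// Hypothetical helper for illustration only.
class ShardMath {

    // Total shard copies = primaries + (primaries * replicas per primary).
    static int totalShardCopies(int primaryShards, int replicasPerPrimary) {
        if (primaryShards < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("primaries must be >= 1 and replicas >= 0");
        }
        return primaryShards * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // 6.x defaults: 5 primary shards, 1 replica each
        System.out.println("5 primaries, 1 replica each: " + totalShardCopies(5, 1) + " shard copies");
    }
}
```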
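Creating an index
Integrating Elasticsearch with Spring Boot
pom.xml

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>transport</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.elasticsearch.plugin</groupId>
                <artifactId>transport-netty4-client</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>6.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.16.18</version>
    </dependency>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.47</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
        </plugin>
    </plugins>
</build>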

application.properties
# Name of the cluster to connect to
spring.data.elasticsearch.cluster-name=elasticsearch
# Addresses of the nodes in the cluster
spring.data.elasticsearch.cluster-nodes=192.168.247.150:9300,192.168.247.150:9301,192.168.247.151:9300
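The cluster-nodes value is a comma-separated list of host:port pairs. The ports are the transport ports (9300 by default), not the HTTP port 9200. To make the format concrete, here is a minimal sketch that splits such a value into its entries. The parser is hypothetical and only for illustration; Spring Boot handles this property itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for illustration; Spring Boot does its own handling of this property.
class ClusterNodes {

    // One host:port entry from the comma-separated list.
    static final class Node {
        final String host;
        final int port;

        Node(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    // Splits "host1:port1,host2:port2" into Node entries, ignoring empty segments.
    static List<Node> parse(String clusterNodes) {
        List<Node> nodes = new ArrayList<>();
        for (String entry : clusterNodes.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 1 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("expected host:port but got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            nodes.add(new Node(host, port));
        }
        return nodes;
    }

    public static void main(String[] args) {
        String value = "192.168.247.150:9300,192.168.247.150:9301,192.168.247.151:9300";
        for (Node node : parse(value)) {
            System.out.println(node);
        }
    }
}
```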

Create the entity
package com.wemall.elasticsearch_server;
import lombok.Data;
import lombok.ToString;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import java.io.Serializable;

/**
 * User entity.
 * date : 2017/10/31
 * time : 16:07
 * @author Nero
 */
@Data
@ToString
@Document(indexName = "my_test", type = "user", createIndex = false)
public class User implements Serializable {
    @Id
    private Long id;
    /** Name */
    private String name;
    /** Description */
    private String desc;
    /** Age */
    private Integer age;
    /** Department the user belongs to */
    private Integer hid;
}

Create the persistence layer
package com.wemall.elasticsearch_server;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import java.util.List;

/**
 * Basic search repository for user information.
 * date : 2017/11/27
 * time : 15:29
 * @author Nero
 */
interface UserRepositoryImpl extends ElasticsearchRepository<User, Long> {

    /**
     * Finds users by name.
     * @param name the name to search for
     * @return the matching users
     */
    List<User> findByName(String name);
}

Controller:
package com.wemall.elasticsearch_server;

import com.alibaba.fastjson.JSON;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.Optional;

@RestController
public class ElasticsearchController {
    @Autowired
    private UserRepositoryImpl userRepository;

    // Create
    @RequestMapping("/add")
    public String add() {
        User employee = new User();
        employee.setId(1L);
        employee.setAge(18);
        employee.setName("zhangsan");
        employee.setDesc("desc");
        userRepository.save(employee);

        System.err.println("add a obj");

        return "success";
    }

    // Delete
    @RequestMapping("/delete")
    public String delete() {
        userRepository.deleteById(1L);

        return "success";
    }

    // Update: load the document, change one field, and save it again
    @RequestMapping("/update")
    public String update() {
        Optional<User> employee = userRepository.findById(1L);
        if (!employee.isPresent()) {
            // Nothing to update when the document does not exist
            return "not found";
        }
        employee.get().setName("李四");
        userRepository.save(employee.get());
        System.err.println("update a obj");
        return "success";
    }

    // Query
    @RequestMapping("/query")
    public User query() {
        Optional<User> accountInfo = userRepository.findById(1L);
        // Falls back to null (an empty response) when the document does not exist
        User user = accountInfo.orElse(null);
        System.err.println(JSON.toJSONString(user));
        return user;
    }

}

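Chinese word segmentation
Download
https://github.com/medcl/elasticsearch-analysis-ik/tree/master (choose the release that matches your Elasticsearch version)
Or download the master branch directly:
https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
Extract it and build it with Maven:
mvn install

Create an ik directory under plugins
Put elasticsearch-analysis-ik-6.3.0.zip into the ik directory
unzip elasticsearch-analysis-ik-6.3.0.zip
Start Elasticsearch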
Code
package com.wemall.elasticsearch_server;
import lombok.Data;
import lombok.ToString;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.io.Serializable;

/**
 * User entity.
 * date : 2017/10/31
 * time : 16:07
 * @author Nero
 */
@Data
@ToString
@Document(indexName = "my_test", type = "user", createIndex = false)
public class User implements Serializable {
    @Id
    private Long id;
    /** Name, analyzed with the IK analyzer at index time and at search time */
    @Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_max_word")
    private String name;
    /** Description */
    private String desc;
    /** Age */
    private Integer age;
    /** Department the user belongs to */
    private Integer hid;
}

Controller:

// Add this method to ElasticsearchController.
// Requires: import org.elasticsearch.index.query.MatchQueryBuilder;
@RequestMapping("/queryGroup")
public void queryGroup() {
    // Full-text match on the name field
    Iterable<User> search = userRepository.search(new MatchQueryBuilder("name", "呵呵"));
    for (User user : search) {
        System.out.println(user.getAge());
        System.out.println(user.getName());
    }
}
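On the wire, a match query such as the one built with MatchQueryBuilder corresponds to a short JSON body that can be sent to the _search endpoint of the index (my_test here). The sketch below builds that body by hand to show its shape. The helper class is hypothetical and does only minimal escaping; real code would normally rely on a JSON library or on the query builders.

```java
// Hypothetical helper for illustration; real code would normally use a JSON library.
class MatchQueryJson {

    // Minimal escaping: only backslash and double quote are handled, control characters are not.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Builds {"query":{"match":{"<field>":"<text>"}}}
    static String matchQuery(String field, String text) {
        return "{\"query\":{\"match\":{\"" + escape(field) + "\":\"" + escape(text) + "\"}}}";
    }

    public static void main(String[] args) {
        // Searches the name field for the value saved by the /add endpoint
        System.out.println(matchQuery("name", "zhangsan"));
    }
}
```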

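Notes on the Elasticsearch configuration file
# Cluster name; the default is elasticsearch

cluster.name: my-application

# Node name
# In ES 2.x the default is picked at random from
# config/names.txt inside elasticsearch-2.4.3/lib/elasticsearch-2.4.3.jar

node.name: node-1

# Data directory; defaults to es_home/data

path.data: /path/to/data

# Log directory; defaults to es_home/logs

path.logs: /path/to/logs

# Lock the process memory so that Elasticsearch is not swapped out, i.e. keep ES off the swap partition

bootstrap.memory_lock: true

# Bind address for ES; the default is 127.0.0.1, so by default the node is reachable only via 127.0.0.1 or localhost
# ES 1.x bound to 0.0.0.0 by default, so no setting was needed; from ES 2.x on the default is 127.0.0.1 and this setting is required

# Set the bind address to a specific IP (IPv4 or IPv6):

network.host: 192.168.0.1

# Custom HTTP port; the default is 9200
# Note: when several ES nodes run on the same server, the port is incremented automatically, for example 9200, 9201, 9202...

# Set a custom port for HTTP:

http.port: 9200

# When a new node starts, it uses this list of addresses to discover other nodes and form a cluster
# Default host list:
# 127.0.0.1, the IPv4 loopback address
# [::1], the IPv6 loopback address

# ES 1.x used multicast by default and automatically discovered ES nodes on the same network segment to form a cluster.
# ES 2.x uses unicast by default; to form a cluster, list the nodes to discover here.
# Note: when discovering ES instances on other servers the port can be omitted [default 9300]; when discovering instances on the same server the port must be given.

# Pass an initial list of hosts to perform discovery when new node is started:

# The default list of hosts is ["127.0.0.1", "[::1]"]

discovery.zen.ping.unicast.hosts: ["host1", "host2"]

# Guards against split brain: (total number of master-eligible nodes / 2) + 1

# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):

discovery.zen.minimum_master_nodes: 3

# Data recovery is allowed only after N nodes of the cluster have started; the default is 1

gateway.recover_after_nodes: 3

# Forbid starting more than one ES instance on a single server

# Disable starting multiple nodes on a single system:

node.max_local_storage_nodes: 1

3. Note: this file is in YAML format
    (1) Keys start at the beginning of the line, with no leading spaces
    (2) Never indent with tab characters
    (3) The colon between a key and its value must be followed by a space
        network.host: 192.168.80.200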
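The three formatting rules above can be checked mechanically. The following sketch tests a single line against them. It is a hypothetical checker written for this article, it assumes the flat key: value style used in the examples here, and it is in no way a full YAML parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for illustration; it covers only the three rules above, not full YAML.
class YamlRuleCheck {

    // Returns the problems found in one line of a flat "key: value" file.
    static List<String> check(String line) {
        List<String> problems = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return problems; // blank lines and comments are ignored
        }
        if (line.indexOf('\t') >= 0) {
            problems.add("contains a tab character");
        }
        if (Character.isWhitespace(line.charAt(0))) {
            problems.add("key does not start at the beginning of the line");
        }
        int colon = line.indexOf(':');
        if (colon < 0) {
            problems.add("missing ':' between key and value");
        } else if (colon + 1 < line.length() && line.charAt(colon + 1) != ' ') {
            problems.add("no space after ':'");
        }
        return problems;
    }

    public static void main(String[] args) {
        String[] samples = {
            "network.host: 192.168.80.200",
            "network.host:192.168.80.200",
            " http.port: 9200"
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + check(sample));
        }
    }
}
```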

Reference: https://www.cnblogs.com/lizichao1991/p/7809156.html

unable to install syscall filter: java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
at org.elasticsearch.bootstrap.SystemCallFilter.linuxImpl(SystemCallFilter.java:329) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.bootstrap.SystemCallFilter.init(SystemCallFilter.java:617) ~[elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.bootstrap.JNANatives.tryInstallSystemCallFilter(JNANatives.java:260) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.bootstrap.Natives.tryInstallSystemCallFilter(Natives.java:113) [elasticsearch-6.6.2.jar:6.6.2]
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:108) [elasticsearch-6.6.2.jar:6.6.2]

Bug:
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/opt/elastic-data/data]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])

Bug:
ERROR: [3] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max number of threads [1024] for user [blank] is too low, increase to at least [4096]
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
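Each failed bootstrap check in the log above reports the current value and the minimum that Elasticsearch expects. As a reading aid, the sketch below extracts both numbers from such a line. The class name and the regular expression are assumptions based only on the three lines shown here, not on any Elasticsearch API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration: reads the current and required values from one log line.
class BootstrapCheckLine {

    // Shape assumed from the log above: "[n]: <text> [current] <text>is too low, increase to at least [required]"
    private static final Pattern VALUES = Pattern.compile(
            "^\\[\\d+\\]: .*? \\[(\\d+)\\] .*is too low, increase to at least \\[(\\d+)\\]$");

    // Returns {current, required}, or null if the line does not have the assumed shape.
    static long[] currentAndRequired(String line) {
        Matcher m = VALUES.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "[2]: max number of threads [1024] for user [blank] is too low, "
                + "increase to at least [4096]";
        long[] values = currentAndRequired(line);
        System.out.println("current=" + values[0] + ", required=" + values[1]);
    }
}
```

The three checks map to the settings changed earlier in this article: the file descriptor limit to nofile in /etc/security/limits.conf, the thread limit to nproc in /etc/security/limits.d/90-nproc.conf, and vm.max_map_count to /etc/sysctl.conf.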
