[C++][Third-Party Libraries][Elasticsearch] A Detailed Guide

DieSnowK 2024-10-14 17:05:01

Contents

1. Introduction
2. Installation
    1. ES
    2. Kibana
3. ES core concepts
    1. Index
    2. Type
    3. Field
    4. Mapping
    5. Document
4. Testing ES from Kibana
5. Installing the ES client
6. ES client interface
7. Usage


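1. Introduction

Elasticsearch (ES for short) is an open-source distributed search engine.

Features: distributed, zero configuration, automatic node discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, and automatic search load balancing.

- It stores and retrieves data in near real time.
- It scales well: a cluster can grow to hundreds of servers and handle petabytes of data.
- ES is written in Java and uses Lucene at its core for all indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API so that full-text search becomes easy.

Elasticsearch is **document oriented**:

- It can store whole objects or documents.
- It does more than store them: it also indexes the content of every document so that the content can be searched.
- You index, search, sort, and filter documents rather than rows and columns of data.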


2. Installation

1. ES

Add the repository key. Method 1 triggers an apt-key deprecation warning; use method 2 to avoid it.

# Method 1
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

# Method 2
curl -s https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
    sudo gpg --no-default-keyring \
    --keyring gnupg-ring:/etc/apt/trusted.gpg.d/icsearch.gpg --import

Add the package repository

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" \
    | sudo tee /etc/apt/sources.list.d/elasticsearch.list

Update the package list

sudo apt update

Install ES

sudo apt-get install elasticsearch=7.17.21

Start ES

sudo systemctl start elasticsearch

Install the IK analyzer plugin (ES must be restarted afterwards for the plugin to be loaded)

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install \
    https://get.infini.cloud/elasticsearch/analysis-ik/7.17.21

Check the status of the ES service

sudo systemctl status elasticsearch.service

Verify that ES was installed successfully

curl -X GET "http://localhost:9200/"

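Enable external access. By default ES is reachable only from the local machine; after the change below it can be reached from a browser at IP:PORT.

sudo vim /etc/elasticsearch/elasticsearch.yml

# Add the following settings
network.host: 0.0.0.0
http.port: 9200
# "node-1" must match this node's node.name
cluster.initial_master_nodes: ["node-1"]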

If ES fails to start with an error such as:

Job for elasticsearch.service failed because the control process exited with error code.
See "systemctl status elasticsearch.service" and "journalctl -xeu elasticsearch.service" for details.

Solution

# Raise the virtual memory limit for ES: the default maximum number of
# memory map areas is 65530, which does not meet ES requirements;
# it must be at least 262144
sudo sysctl -w vm.max_map_count=262144

# Add JVM heap settings
sudo vim /etc/elasticsearch/jvm.options

# Add the following lines
-Xms512m
-Xmx512m
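The vm.max_map_count requirement can also be checked from a program before ES is started. The following is a minimal sketch, assuming a Linux system that exposes the value at /proc/sys/vm/max_map_count; the function names are made up for illustration.

```cpp
#include <fstream>
#include <string>

// Minimum vm.max_map_count that ES requires, as stated in this guide.
constexpr long kRequiredMaxMapCount = 262144;

// Pure comparison, kept separate so it can be checked without reading /proc.
bool isMapCountSufficient(long current) {
    return current >= kRequiredMaxMapCount;
}

// Reads the current limit from procfs (Linux only).
// Returns -1 if the file cannot be opened or does not contain a number.
long readMaxMapCount(const std::string &path = "/proc/sys/vm/max_map_count") {
    std::ifstream in(path);
    long value = 0;
    if (in >> value) {
        return value;
    }
    return -1;
}
```

A caller could run `isMapCountSufficient(readMaxMapCount())` and, if it returns false, remind the operator to apply the sysctl command.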


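2. Kibana

Install Kibana

sudo apt install kibana

Configure Kibana (optional): adjust the configuration as needed. The configuration file is usually /etc/kibana/kibana.yml, where you may need to set the server host, the port, and the Elasticsearch URL.

Start Kibana

sudo systemctl start kibana

Enable start on boot (optional)

sudo systemctl enable kibana

Access Kibana

http://<ip>:5601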


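3. ES core concepts

1. Index

An index is a collection of documents that share somewhat similar characteristics.

For example, you might have one index for customer data, one for a product catalog, and another for order data.

- An index is identified by a name (which must be all lowercase). That name is used whenever you index, search, update, or delete documents in the index.
- A cluster can hold any number of indices.
- An index is analogous to a database in a relational system: a database is a collection of data, and an ES index is a collection of data with similar characteristics.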


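2. Type

Within an index you can define one or more types. A type is a logical category/partition of an index whose semantics are entirely up to the user. Typically, a type is defined for documents that share a common set of fields.

For example, suppose you run a blog platform and store all of its data in a single index. In that index you could define one type for user data, another for blog posts, and another for comments.

A type is analogous to a table in a database: it subdivides the data within an index. Types are now all but retired: they are deprecated in ES 7.x, where requests use the single type name _doc, and ES 8.0 removes them.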


3. Field

A field corresponds to a column in a database table: it labels one attribute of a document, and every field has a data type.


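4. Mapping

A mapping constrains how data is processed and by which rules.

- A field's data type, default value, analyzer, whether it is indexed, and so on can all be set in the mapping.
- The mapping tells ES which fields should be analyzed and indexed so that they can be searched.
- Other rules for how ES treats its data are also part of the mapping.
- Processing data under well-chosen rules improves performance considerably, which is why a mapping is defined and why it pays to think about how to design it.

Parameters

enabled: whether the field is parsed and indexed; false means it is only stored and cannot be searched or analyzed
    Values: true (default) / false

index: whether to build an inverted index for the field (this decides whether it is analyzed and whether it is searchable)
    Values: true (default) / false

index_options: which information is recorded in the inverted index

dynamic: controls automatic updates of the mapping
    Values: true (default) / false

doc_values: whether doc_values are enabled; they are used for aggregations and sorting and cannot be used on analyzed (text) fields
    Values: true (default) / false

fielddata: whether fielddata is enabled for a text field so that it can take part in sorting and aggregations
    Intended for analyzed fields that are sorted or aggregated on; for non-analyzed fields, use doc_values throughout.
    In ES 7.x the value is a boolean (false by default):

    "fielddata": false

store: whether the field value is stored separately from the _source field
    Values: true / false (default)

coerce: whether automatic type conversion is enabled, for example string to integer or float to integer
    Values: true (default) / false

analyzer: the analyzer to use; the default is the standard analyzer
    Example: "analyzer": "ik"

boost: field-level score weighting; the default is 1.0
    Example: "boost": 1.25

fields: indexes one field in several ways, for example the same value once analyzed and once not analyzed
    The legacy form "index": "not_analyzed" is no longer accepted; in ES 7.x the non-analyzed variant is a keyword sub-field:

    "fields": {
        "raw": {
            "type": "keyword"
        }
    }

date_detection: whether date types are detected automatically
    Values: true (default) / false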
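To make the relationship between these parameters and the JSON that ES receives more concrete, here is a small sketch that renders a few field definitions into the "properties" object of a mapping. The struct and function names are hypothetical, only the type, analyzer, and index parameters are covered, and no JSON escaping is performed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one field in a mapping (illustration only).
struct FieldMapping {
    std::string name;      // field name, e.g. "nickname"
    std::string type;      // ES data type, e.g. "text" or "keyword"
    std::string analyzer;  // analyzer name; empty means "use the default"
    bool index;            // false: keep the value in _source but do not index it
};

// Renders the "properties" object of a mapping as compact JSON.
// Names and values are assumed to contain no characters that need escaping.
std::string buildProperties(const std::vector<FieldMapping> &fields) {
    std::string out = "{";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        const FieldMapping &f = fields[i];
        if (i != 0) {
            out += ",";
        }
        out += "\"" + f.name + "\":{\"type\":\"" + f.type + "\"";
        if (!f.analyzer.empty()) {
            out += ",\"analyzer\":\"" + f.analyzer + "\"";
        }
        if (!f.index) {
            out += ",\"index\":false";
        }
        out += "}";
    }
    out += "}";
    return out;
}
```

Parameters left at their defaults are simply omitted from the output, which mirrors how mappings are usually written by hand.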


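5. Document

A document is the basic unit of information that can be indexed.

For example: a document for a single customer, a document for a single product, or a document for a single order.

- Documents are expressed in JSON, a ubiquitous data-interchange format on the Internet.
- Within an index/type you can store as many documents as you like.
- A document must be indexed into, that is assigned to, a type within an index.

Elasticsearch compared with a traditional relational database

| DB | Database | Table | Row | Column |
| --- | --- | --- | --- | --- |
| ES | Index | Type | Document | Field |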
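Since a document is just a JSON object, a C++ program has to serialize its own objects before handing them to ES. The sketch below shows one way to do that by hand, assuming a hypothetical User struct whose field names follow the sample user data used later in this guide. The escaping covers only the most common special characters; a real project would more likely rely on a JSON library.

```cpp
#include <string>

// Escapes the most common characters that may not appear raw inside a JSON
// string. Other control characters are not handled in this sketch.
std::string escapeJson(const std::string &s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// Hypothetical application object; one instance becomes one ES document.
struct User {
    std::string userId;
    std::string nickname;
    std::string phone;
    std::string description;
    std::string avatarId;
};

// Serializes a User into a single-line JSON document.
std::string toDocument(const User &u) {
    return "{\"user_id\":\"" + escapeJson(u.userId) +
           "\",\"nickname\":\"" + escapeJson(u.nickname) +
           "\",\"phone\":\"" + escapeJson(u.phone) +
           "\",\"description\":\"" + escapeJson(u.description) +
           "\",\"avatar_id\":\"" + escapeJson(u.avatarId) + "\"}";
}
```

UTF-8 text such as Chinese nicknames passes through unchanged, because only ASCII special characters are rewritten.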

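4. Testing ES from Kibana

Create the index. The request is PUT /user; POST /user/_doc would store this body as an ordinary document instead of creating the index. keyword fields take no analyzer, and enabled applies only to object fields, so the fields that are stored but not searched use "index": false.

PUT /user
{
    "settings": {
        "analysis": {
            "analyzer": {
                "ik": {
                    "tokenizer": "ik_max_word"
                }
            }
        }
    },
    "mappings": {
        "dynamic": true,
        "properties": {
            "nickname": {
                "type": "text",
                "analyzer": "ik_max_word"
            },
            "user_id": {
                "type": "keyword"
            },
            "phone": {
                "type": "keyword"
            },
            "description": {
                "type": "text",
                "index": false
            },
            "avatar_id": {
                "type": "keyword",
                "index": false
            }
        }
    }
}

With this explicit mapping, user_id is itself a keyword field. A user_id.keyword sub-field exists only when ES derives the mapping dynamically, in which case string values become text fields with a keyword sub-field.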

Add data

Request form. The _bulk body is newline-delimited JSON: each action line is followed by its source line, and the body must end with a newline.

POST /user/_doc/_bulk
{ "index":{ "_id":"1"}}
{ "user_id" : "USER4b862aaa-2df8654a-7eb4bb65e3507f66","nickname" : "昵称1","phone" : "手机号1","description" : "签名1","avatar_id" : "头像1"}
{ "index":{ "_id":"2"}}
{ "user_id" : "USER14eeeaa5-442771b9-0262e455e4663d1d","nickname" : "昵称2","phone" : "手机号2","description" : "签名2","avatar_id" : "头像2"}
{ "index":{ "_id":"3"}}
{ "user_id" : "USER484a6734-03a124f0-996c169dd05c1869","nickname" : "昵称3","phone" : "手机号3","description" : "签名3","avatar_id" : "头像3"}
{ "index":{ "_id":"4"}}
{ "user_id" : "USER186ade83-4460d4a6-8c08068f83127b5d","nickname" : "昵称4","phone" : "手机号4","description" : "签名4","avatar_id" : "头像4"}
{ "index":{ "_id":"5"}}
{ "user_id" : "USER6f19d074-c33891cf-23bf5a8357189a19","nickname" : "昵称5","phone" : "手机号5","description" : "签名5","avatar_id" : "头像5"}
{ "index":{ "_id":"6"}}
{ "user_id" : "USER97605c64-9833ebb7-d045535335a59195","nickname" : "昵称6","phone" : "手机号6","description" : "签名6","avatar_id" : "头像6"}

For readability, the same data as formatted JSON (this form is not valid _bulk input):

[
    {
        "index": { "_id": "1" },
        "user": {
            "user_id": "USER4b862aaa-2df8654a-7eb4bb65e3507f66",
            "nickname": "昵称1",
            "phone": "手机号1",
            "description": "签名1",
            "avatar_id": "头像1"
        }
    },
    {
        "index": { "_id": "2" },
        "user": {
            "user_id": "USER14eeeaa5-442771b9-0262e455e4663d1d",
            "nickname": "昵称2",
            "phone": "手机号2",
            "description": "签名2",
            "avatar_id": "头像2"
        }
    },
    {
        "index": { "_id": "3" },
        "user": {
            "user_id": "USER484a6734-03a124f0-996c169dd05c1869",
            "nickname": "昵称3",
            "phone": "手机号3",
            "description": "签名3",
            "avatar_id": "头像3"
        }
    },
    {
        "index": { "_id": "4" },
        "user": {
            "user_id": "USER186ade83-4460d4a6-8c08068f83127b5d",
            "nickname": "昵称4",
            "phone": "手机号4",
            "description": "签名4",
            "avatar_id": "头像4"
        }
    },
    {
        "index": { "_id": "5" },
        "user": {
            "user_id": "USER6f19d074-c33891cf-23bf5a8357189a19",
            "nickname": "昵称5",
            "phone": "手机号5",
            "description": "签名5",
            "avatar_id": "头像5"
        }
    },
    {
        "index": { "_id": "6" },
        "user": {
            "user_id": "USER97605c64-9833ebb7-d045535335a59195",
            "nickname": "昵称6",
            "phone": "手机号6",
            "description": "签名6",
            "avatar_id": "头像6"
        }
    }
]
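The _bulk request form, one action line followed by one source line per document, is easy to get wrong when assembled by hand in C++. The following sketch builds such a body from already serialized documents. The function name is made up for illustration, and the source documents are assumed to be valid single-line JSON.

```cpp
#include <string>
#include <utility>
#include <vector>

// Builds the newline-delimited body expected by the _bulk endpoint.
// Each pair holds a document id and an already serialized, single-line JSON
// source document (this sketch neither validates nor escapes it).
std::string buildBulkIndexBody(
        const std::vector<std::pair<std::string, std::string>> &docs) {
    std::string body;
    for (const auto &doc : docs) {
        // Action line: index the following source under the given _id.
        body += "{\"index\":{\"_id\":\"" + doc.first + "\"}}\n";
        // Source line; the trailing newline is required even after the last one.
        body += doc.second + "\n";
    }
    return body;
}
```

The result is meant to be sent as the request body of a _bulk call; a pretty-printed, multi-line source document would break the line-based format.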

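View and search data. The query below excludes three users by ID and returns the remaining documents that match 昵称 in at least one of user_id, nickname, or phone. user_id.keyword is the keyword sub-field that dynamic mapping generates for string values; if user_id is mapped explicitly as keyword, query user_id directly.

GET /user/_doc/_search?pretty
{
    "query": {
        "bool": {
            "must_not": [
                {
                    "terms": {
                        "user_id.keyword": [
                            "USER4b862aaa-2df8654a-7eb4bb65e3507f66",
                            "USER14eeeaa5-442771b9-0262e455e4663d1d",
                            "USER484a6734-03a124f0-996c169dd05c1869"
                        ]
                    }
                }
            ],
            "should": [
                {
                    "match": {
                        "user_id": "昵称"
                    }
                },
                {
                    "match": {
                        "nickname": "昵称"
                    }
                },
                {
                    "match": {
                        "phone": "昵称"
                    }
                }
            ]
        }
    }
}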
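When the same kind of query is issued from C++, the excluded IDs and the search keyword are usually only known at run time. The sketch below assembles an equivalent bool query as a compact JSON string. The function names are hypothetical and no JSON escaping is performed, so it is only suitable for trusted input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Wraps a value in double quotes (no escaping: sketch for trusted input only).
std::string quote(const std::string &s) {
    return "\"" + s + "\"";
}

// Builds a bool query that excludes documents whose idField is one of
// excludedIds (must_not + terms) and matches keyword against each of
// searchFields (should + match).
std::string buildSearchBody(const std::string &idField,
                            const std::vector<std::string> &excludedIds,
                            const std::vector<std::string> &searchFields,
                            const std::string &keyword) {
    std::string ids;
    for (std::size_t i = 0; i < excludedIds.size(); ++i) {
        if (i != 0) {
            ids += ",";
        }
        ids += quote(excludedIds[i]);
    }
    std::string should;
    for (std::size_t i = 0; i < searchFields.size(); ++i) {
        if (i != 0) {
            should += ",";
        }
        should += "{\"match\":{" + quote(searchFields[i]) + ":" + quote(keyword) + "}}";
    }
    return "{\"query\":{\"bool\":{"
           "\"must_not\":[{\"terms\":{" + quote(idField) + ":[" + ids + "]}}],"
           "\"should\":[" + should + "]}}}";
}
```

The returned string can be passed as the request body of a search call.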

Delete the index

DELETE /user

Query all data

POST /user/_doc/_search
{
    "query": {
        "match_all": {}
    }
}


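5. Installing the ES client

There are not many C++ client options for ES; this guide uses elasticlient.

Prerequisite: the MicroHTTPD dependency

sudo apt-get install libmicrohttpd-dev

Install

# Clone the repository
git clone https://github.com/seznam/elasticlient
# Change into the directory
cd elasticlient
# Update the submodules
git submodule update --init --recursive
# Build
mkdir build && cd build
cmake ..
make
# Install
sudo make install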


6. ES client interface

The following are member functions of elasticlient::Client.

/**
 * Perform search on nodes until it is successful. Throws exception if all nodes
 * has failed to respond.
 * \param indexName specification of an Elasticsearch index.
 * \param docType specification of an Elasticsearch document type.
 * \param body Elasticsearch request body.
 * \param routing Elasticsearch routing. If empty, no routing has been used.
 *
 * \return cpr::Response if any of node responds to request.
 * \throws ConnectionException if all hosts in cluster failed to respond.
 */
cpr::Response search(const std::string &indexName,
                     const std::string &docType,
                     const std::string &body,
                     const std::string &routing = std::string());

/**
 * Get document with specified id from cluster. Throws exception if all nodes
 * has failed to respond.
 * \param indexName specification of an Elasticsearch index.
 * \param docType specification of an Elasticsearch document type.
 * \param id Id of document which should be retrieved.
 * \param routing Elasticsearch routing. If empty, no routing has been used.
 *
 * \return cpr::Response if any of node responds to request.
 * \throws ConnectionException if all hosts in cluster failed to respond.
 */
cpr::Response get(const std::string &indexName,
                  const std::string &docType,
                  const std::string &id = std::string(),
                  const std::string &routing = std::string());

/**
 * Index new document to cluster. Throws exception if all nodes has failed to respond.
 * \param indexName specification of an Elasticsearch index.
 * \param docType specification of an Elasticsearch document type.
 * \param body Elasticsearch request body.
 * \param id Id of document which should be indexed. If empty, id will be generated
 *           automatically by Elasticsearch cluster.
 * \param routing Elasticsearch routing. If empty, no routing has been used.
 *
 * \return cpr::Response if any of node responds to request.
 * \throws ConnectionException if all hosts in cluster failed to respond.
 */
cpr::Response index(const std::string &indexName,
                    const std::string &docType,
                    const std::string &id,
                    const std::string &body,
                    const std::string &routing = std::string());

/**
 * Delete document with specified id from cluster. Throws exception if all nodes
 * has failed to respond.
 * \param indexName specification of an Elasticsearch index.
 * \param docType specification of an Elasticsearch document type.
 * \param id Id of document which should be deleted.
 * \param routing Elasticsearch routing. If empty, no routing has been used.
 *
 * \return cpr::Response if any of node responds to request.
 * \throws ConnectionException if all hosts in cluster failed to respond.
 */
cpr::Response remove(const std::string &indexName,
                     const std::string &docType,
                     const std::string &id,
                     const std::string &routing = std::string());
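To see how indexName, docType, id, and routing relate to an ES REST path, and why a host URL such as http://127.0.0.1:9200/ should end with a slash, consider the sketch below. It is an illustration under assumptions, not elasticlient's actual implementation: both functions are hypothetical, and the host URL and path are assumed to be joined by plain concatenation.

```cpp
#include <string>

// Builds "<index>/<docType>/<id>" and appends "?routing=<routing>" when a
// routing value is given. Hypothetical helper for illustration.
std::string buildDocumentPath(const std::string &indexName,
                              const std::string &docType,
                              const std::string &id,
                              const std::string &routing = std::string()) {
    std::string path = indexName + "/" + docType + "/" + id;
    if (!routing.empty()) {
        path += "?routing=" + routing;
    }
    return path;
}

// Joins host URL and path by plain concatenation (assumption of this sketch),
// which is why the host URL has to end with "/".
std::string buildUrl(const std::string &hostUrl, const std::string &path) {
    return hostUrl + path;
}
```

With a host URL that lacks the trailing slash, the port and the index name run together and the request goes to an invalid address.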


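7. Usage

Notes on using the ES client

- Do not forget the trailing slash (the root path) at the end of the host URL: http://127.0.0.1:9200/
- Wrap ES client API calls in exception handling; otherwise a failed operation makes the program exit abnormally.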

#include <exception>
#include <iostream>
#include <elasticlient/client.h>
#include <cpr/cpr.h>

int main()
{
    // 1. Construct the ES client (the trailing slash in the URL is required)
    elasticlient::Client client({"http://127.0.0.1:9200/"});

    // 2. Send a search request
    try
    {
        auto resp = client.search("user", "_doc",
                                  "{\"query\": { \"match_all\":{} }}");
        std::cout << resp.status_code << std::endl;
        std::cout << resp.text << std::endl;
    }
    catch (const std::exception &e)
    {
        // Thrown, for example, when no node responds
        std::cout << e.what() << std::endl;
        return -1;
    }

    return 0;
}
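The advice to catch exceptions around every client call can be collected in one place. The sketch below wraps an arbitrary request in a try/catch and also treats non-2xx status codes as failures. SimpleResponse and runRequest are hypothetical names introduced here; in a real program the callable would invoke the client and copy status_code and text from the returned cpr::Response.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical minimal view of a response: HTTP status code and body text.
struct SimpleResponse {
    long statusCode;
    std::string text;
};

// Runs the request and reports success or failure instead of letting an
// exception escape. On success the response body is copied to bodyOut.
bool runRequest(const std::function<SimpleResponse()> &request,
                std::string &bodyOut) {
    try {
        SimpleResponse resp = request();
        if (resp.statusCode < 200 || resp.statusCode >= 300) {
            std::cerr << "request failed with status " << resp.statusCode << "\n";
            return false;
        }
        bodyOut = resp.text;
        return true;
    } catch (const std::exception &e) {
        std::cerr << "request threw: " << e.what() << "\n";
        return false;
    }
}
```

Keeping the wrapper independent of the client type means it can be exercised with simple lambdas, without a running ES node.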



