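Automatically Inserting a Default UTC+8 Timestamp in Elasticsearch

1. Question 1: Can Elasticsearch set a default value automatically?

For example, defaulting update_time to the current time?

Recalling how Elasticsearch works: it has no mechanism for declaring a default field value when an index is created.

In other words, there is no equivalent of MySQL's column default.

Setting a default timestamp in MySQL is familiar to most people: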

CREATE TABLE example_table (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  update_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

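The statement above creates a table named example_table whose update_time column defaults to the current time.

Although Elasticsearch cannot set a field default directly, similar behavior can be achieved in other ways.

2. Options for adding default values automatically in Elasticsearch

Here are some possible approaches:

2.1 Option 1: Preprocess with an Ingest Pipeline

An Ingest Pipeline can add or modify field values automatically before a document is indexed.

In this case, it can set the update_time field to the current time.

An example of creating the Ingest Pipeline: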

## step1: create the pipeline
PUT _ingest/pipeline/default_time_pipeline
{
  "description": "Sets default update_time to now",
  "processors": [
    {
      "set": {
        "field": "update_time",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}

Then specify this pipeline when indexing a document:

## step2: specify the pipeline when indexing a document
PUT /example_index/_doc/1?pipeline=default_time_pipeline
{
  "id": 1,
  "name": "example name"
}

## step3: check that the document now carries the insert time
GET example_index/_search
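To make the effect of the set processor concrete, here is a minimal local sketch in Python. It does not talk to Elasticsearch; the function names apply_set_processor and run_default_time_pipeline are hypothetical and only emulate what the pipeline above is expected to do to a document. The millisecond precision of the timestamp is an assumption; the real _ingest.timestamp value may be formatted with a different precision.

```python
from datetime import datetime, timezone


def apply_set_processor(doc, field, value):
    """Emulate a `set` processor: return a copy of doc with field set.

    Like the real processor with its default settings, this overwrites
    any value the document already had for that field.
    """
    enriched = dict(doc)
    enriched[field] = value
    return enriched


def run_default_time_pipeline(doc, now=None):
    """Emulate default_time_pipeline: stamp update_time with the ingest time (UTC)."""
    now = now or datetime.now(timezone.utc)
    # Render as an ISO 8601 string ending in "Z" (assumed millisecond precision).
    ingest_timestamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return apply_set_processor(doc, "update_time", ingest_timestamp)


# A fixed instant is used here so the result is reproducible.
fixed_now = datetime(2024, 7, 26, 12, 38, 46, 713000, tzinfo=timezone.utc)
print(run_default_time_pipeline({"id": 1, "name": "example name"}, now=fixed_now))
# {'id': 1, 'name': 'example name', 'update_time': '2024-07-26T12:38:46.713Z'}
```

Note that the timestamp is in UTC, which is what the search in step 3 would be expected to show as well.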

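A better approach is recommended: specify default_pipeline when creating the index, which is leaner and more convenient.

That is, step 2 becomes:

Step 2: Create the index with default_pipeline set. With this setting, documents added to the index go through the pipeline by default.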

PUT /example_index_02
{
  "settings": {
    "index.default_pipeline": "default_time_pipeline"
  },
  "mappings": {
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "text" },
      "update_time": { "type": "date" }
    }
  }
}

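Step 3: Index a document:

Documents can now be added to the index directly, without specifying the pipeline each time; the update_time field is set to the current time automatically.

PUT /example_index_02/_doc/1
{
  "id": 1,
  "name": "example name"
}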

https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html

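2.2 Option 2: Handle the default value in the application layer

The application code can set the update_time field to the current time itself before sending the document to Elasticsearch.

For example, in Python with the elasticsearch client library: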

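from datetime import datetime, timezone

from elasticsearch import Elasticsearch

# Assumes a local, unsecured cluster; adjust the URL and authentication for your setup.
es = Elasticsearch("http://localhost:9200")

doc = {
    "id": 1,
    "name": "example name",
    # Timezone-aware UTC timestamp, set by the application before indexing.
    "update_time": datetime.now(timezone.utc),
}

# elasticsearch-py 8.x takes `document=`; older 7.x clients use `body=` instead.
es.index(index="example_index", id=1, document=doc)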
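If the application should only fill in update_time when the caller has not supplied one, a small helper can keep that logic in one place. The sketch below is only an illustration: with_update_time is a hypothetical name, and it uses datetime.now(timezone.utc) rather than datetime.utcnow(), because utcnow() returns a naive datetime without offset information and is deprecated since Python 3.12.

```python
from datetime import datetime, timezone


def with_update_time(doc, now=None):
    """Return a copy of doc, filling update_time only when it is missing or None."""
    stamped = dict(doc)
    if stamped.get("update_time") is None:
        current = now or datetime.now(timezone.utc)
        # isoformat() on an aware datetime keeps the +00:00 offset in the string.
        stamped["update_time"] = current.isoformat()
    return stamped


fixed_now = datetime(2024, 7, 26, 12, 38, 46, tzinfo=timezone.utc)

print(with_update_time({"id": 1, "name": "example name"}, now=fixed_now))
# {'id': 1, 'name': 'example name', 'update_time': '2024-07-26T12:38:46+00:00'}

# A value supplied by the caller is left untouched.
print(with_update_time({"id": 2, "update_time": "2024-01-01T00:00:00+00:00"}, now=fixed_now))
# {'id': 2, 'update_time': '2024-01-01T00:00:00+00:00'}
```

This differs slightly from the pipeline in option 1, which stamps every document regardless of what it already contains.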

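With these approaches, a default cannot be declared in the index itself as in MySQL, but the effect is similar: the update_time field is set to the current time automatically when a document is indexed.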

Compared with option 2, option 1 is more flexible and convenient!

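3. The default time now works, but it is in UTC. Can it be switched to UTC+8?

The root cause of the 8-hour lag is the time zone.

Related reading: "Elasticsearch 8-hour lag and other time zone problems, all in one place"

Elasticsearch works in UTC (offset 0) by default, while we are in UTC+8, so 8 hours must be added for the displayed times to match.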
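The arithmetic behind the 8-hour difference can be checked with a short Python sketch. It is only an illustration and makes one assumption: Asia/Shanghai is modeled as a fixed +08:00 offset, which holds as long as no daylight saving time applies there.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai modeled as a fixed UTC+8 offset (no daylight saving time).
SHANGHAI = timezone(timedelta(hours=8))

utc_time = datetime(2024, 7, 26, 12, 38, 46, tzinfo=timezone.utc)
local_time = utc_time.astimezone(SHANGHAI)

print(utc_time.isoformat())    # 2024-07-26T12:38:46+00:00
print(local_time.isoformat())  # 2024-07-26T20:38:46+08:00

# Both values describe the same instant; only the wall-clock rendering differs.
print(utc_time == local_time)  # True
```

So the "lag" is a matter of presentation: the stored instant is correct, but it is rendered in UTC rather than in local time.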

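How can this be done? For documents that already exist, the time field has to be updated with a script.

Approach: handle the time zone at index time with an Ingest Pipeline

If you would rather have the time zone handled at index time, create an Ingest Pipeline that uses a Painless script to convert it: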

PUT _ingest/pipeline/convert_to_shanghai_time
{
  "description": "Convert update_time to Asia/Shanghai timezone",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          ZonedDateTime utcDate = ZonedDateTime.parse(ctx.update_time);
          ZonedDateTime shanghaiDate = utcDate.withZoneSameInstant(ZoneId.of('Asia/Shanghai'));
          ctx.update_time = shanghaiDate.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss"));
        """
      }
    }
  ]
}

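In short, the script parses the UTC value of update_time, shifts it to the Asia/Shanghai zone with withZoneSameInstant, and writes it back formatted as yyyy-MM-dd'T'HH:mm:ss.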

Then use this pipeline when indexing a document:

## update_time below is the input time in UTC
PUT my-index/_doc/1?pipeline=convert_to_shanghai_time
{
  "field1": "value1",
  "update_time": "2024-07-26T12:38:46.713Z"
}
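To see what value the pipeline would write for this sample document, the script's logic can be emulated locally. The sketch below is a Python approximation of the Painless script, not the script itself: the function name is hypothetical, and Asia/Shanghai is again modeled as a fixed +08:00 offset.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai modeled as a fixed UTC+8 offset.
SHANGHAI = timezone(timedelta(hours=8))


def convert_to_shanghai_time(update_time):
    """Parse a UTC ISO 8601 string, shift it to UTC+8, and format it without an offset."""
    # Replace the trailing "Z" so fromisoformat() also works on Python < 3.11.
    utc_date = datetime.fromisoformat(update_time.replace("Z", "+00:00"))
    shanghai_date = utc_date.astimezone(SHANGHAI)
    # Same shape as the yyyy-MM-dd'T'HH:mm:ss pattern: no milliseconds, no offset.
    return shanghai_date.strftime("%Y-%m-%dT%H:%M:%S")


stored = convert_to_shanghai_time("2024-07-26T12:38:46.713Z")
print(stored)  # 2024-07-26T20:38:46

# If the offset-less string is later read back as UTC, it no longer names
# the original instant: it sits almost 8 hours after it.
reread = datetime.fromisoformat(stored).replace(tzinfo=timezone.utc)
original = datetime.fromisoformat("2024-07-26T12:38:46.713+00:00")
print(reread - original)  # 7:59:59.287000 (8 hours minus the dropped milliseconds)
```

One thing worth keeping in mind: the formatted value carries no offset, and a date field with the default format treats such a value as UTC. The stored wall-clock time matches local time, but range queries and date math on the field then operate on an instant shifted by 8 hours.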

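Alternatively, specify default_pipeline when creating the index; see 2.1 for how to do that.

4. Summary

Although Elasticsearch has no mechanism for setting a field default directly, similar behavior can be achieved in other ways.

The recommended approach is to use an Ingest Pipeline to add or modify field values before a document is indexed, and to set default_pipeline when creating the index so that every document goes through the pipeline automatically, which keeps the workflow simple.

If time zones need to be handled, a Painless script in the pipeline can convert the time to the required zone.

Field defaults can also be set manually in the application layer. Either way, the field is set to the expected value when the document is indexed.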
