Elasticsearch reports "400 - Rejected by Elasticsearch" errors
Environment
- OpenShift Container Platform 3.7.72
- Fluentd 0.12.42
- Related fluent gems:
  - 'fluent-plugin-elasticsearch' version '1.17.0'
  - 'fluent-plugin-kubernetes_metadata_filter' version '1.0.1'
Issue
Seeing "400 - Rejected by Elasticsearch" errors in fluentd logs like the following:
2019-01-23 14:23:56 +0100 [warn]: dump an error event: error_class=Fluent::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="output_tag" time=1548249831 record={"caller"=>"transactionsBusiness.go:269", "component"=>"<obfuscated>", "correlationId"=>"<obfuscated>", "dateTime"=>"2019-01-23T13:23:51.630668Z", "error"=>{"errorCategory"=>"validation", "errorCode"=>"monthlyValueLimitExceeded", "errorDateTime"=>"2019-01-23T13:23:51.623Z", "errorDescription"=>"validation_monthlyValueLimitExceeded", "errorParameters"=>[...]
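To gauge how often the error occurs, the rejection warnings can be counted from log text. The following is a minimal sketch and not part of the original solution: count_rejections is a hypothetical helper name, and the sample lines are shortened stand-ins for real fluentd output.

```shell
#!/bin/sh
# count_rejections (hypothetical helper): reads log text on stdin and prints
# the number of lines containing the rejection message.
count_rejections() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the helper
  # usable under "set -e" while still printing 0.
  grep -c '400 - Rejected by Elasticsearch' || true
}

# Shortened sample input, for illustration only.
rejected=$(count_rejections <<'EOF'
2019-01-23 14:23:56 +0100 [warn]: dump an error event: error_class=Fluent::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="output_tag"
2019-01-23 14:23:57 +0100 [info]: some unrelated line
EOF
)
echo "rejections: $rejected"
```

Against a live cluster, the same helper could be fed with `oc logs <fluentd-pod> | count_rejections`, taking the pod names from `oc get pods -l component=fluentd`.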
Resolution
3.9
For versions 3.9+, follow this related solution to disable MERGE_JSON_LOG.
3.7
Later releases expose an environment variable (MERGE_JSON_LOG) to disable this behavior, but for v3.7 this is the available workaround:
- Create a configmap (named fluentd-overrides, for example) containing the configuration files hosted on github.com.
- Edit filter-k8s-meta.conf to include merge_json_log false.
- Edit the daemonset to mount the configmap:
$ oc edit daemonset logging-fluentd
a) Add section to volumes:
   - name: config-overrides
     configMap:
       name: fluentd-overrides
b) Add section to volumeMounts:
   - name: config-overrides
     mountPath: /etc/fluent/configs.d/openshift
     readOnly: true
IMPORTANT NOTE: The changes to the daemonset will need to be re-applied after any upgrade.
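Because the workaround has to be re-applied after upgrades, it may help to script it. The following is a minimal sketch under several assumptions, not a verified procedure: OVERRIDES_DIR, DRY_RUN, and the run helper are hypothetical names; the stand-in filter-k8s-meta.conf is a placeholder for the real file; the current project is assumed to be the logging project; and `oc set volume` is used as a non-interactive alternative to `oc edit`, which should be checked against your `oc` version before use. By default the script only prints the `oc` commands.

```shell
#!/bin/sh
# Sketch only: prints the oc commands by default (DRY_RUN=1).
# Set DRY_RUN=0 to actually execute them.
set -eu

DRY_RUN="${DRY_RUN:-1}"
# Hypothetical directory holding local copies of the Fluentd config files.
OVERRIDES_DIR="${OVERRIDES_DIR:-$(mktemp -d)}"
CONF="$OVERRIDES_DIR/filter-k8s-meta.conf"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stand-in file for illustration only; in practice, use the real file.
if [ ! -f "$CONF" ]; then
  cat > "$CONF" <<'EOF'
<filter kubernetes.**>
  type kubernetes_metadata
</filter>
EOF
fi

# Insert "merge_json_log false" before the first closing </filter> tag,
# unless the setting is already present.
if ! grep -q 'merge_json_log' "$CONF"; then
  awk '/<\/filter>/ && !done { print "  merge_json_log false"; done = 1 } { print }' \
    "$CONF" > "$CONF.tmp"
  mv "$CONF.tmp" "$CONF"
fi

# Create the configmap from every file in the directory, then mount it.
run oc create configmap fluentd-overrides --from-file="$OVERRIDES_DIR"
run oc set volume daemonset/logging-fluentd --add \
  --name=config-overrides --type=configmap \
  --configmap-name=fluentd-overrides \
  --mount-path=/etc/fluent/configs.d/openshift --read-only
```

Since the mount covers the whole /etc/fluent/configs.d/openshift directory, OVERRIDES_DIR should hold the complete set of configuration files, not only the edited one.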
Root Cause
This is most likely caused by the merging of JSON logs, which is enabled by default. Applications are likely logging a JSON message payload whose fields are added to the payload fluentd submits to Elasticsearch. Adding the application fields to fluentd's payload exceeds the maximum number of allowed fields, and Elasticsearch rejects the messages.
See also BZ#1669223 for more details.
Diagnostic Steps
- Activate fluentd debug mode to investigate in depth:
a) Edit fluentd configmap:
$ oc edit configmap logging-fluentd
b) Comment out the include line and add the desired log_level:
...
# @include configs.d/openshift/system.conf
<system>
log_level trace
</system>
...
c) Delete fluentd pods:
$ oc delete pods -l component=fluentd
- Wait 10 to 15 minutes with debug mode enabled, then extract a full dump using the following script:
$ wget https://raw.githubusercontent.com/openshift/origin-aggregated-logging/release-3.11/hack/logging-dump.sh
$ chmod +x logging-dump.sh
$ oc login -u admin -p <password> https://openshift.example.com:8443
$ ./logging-dump.sh
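Before waiting for the dump, it can be worth confirming that the debug settings from step b) are actually in place. The following is a minimal sketch: check_debug_config is a hypothetical helper name, and the sample text mirrors the snippet shown in step b).

```shell
#!/bin/sh
# check_debug_config (hypothetical helper): reads Fluentd configuration text
# on stdin and reports whether the debug changes appear to be in place.
check_debug_config() {
  conf=$(cat)
  # An uncommented include of system.conf means the stock settings still apply.
  if printf '%s\n' "$conf" | grep -Eq '^[[:space:]]*@include[[:space:]]+configs\.d/openshift/system\.conf'; then
    echo "system.conf include is still active"
    return 1
  fi
  # The <system> block should set the trace log level.
  if ! printf '%s\n' "$conf" | grep -Eq '^[[:space:]]*log_level[[:space:]]+trace'; then
    echo "log_level trace not found"
    return 1
  fi
  echo "debug configuration looks applied"
}

# Sample input matching the edited configuration from step b).
result=$(check_debug_config <<'EOF'
# @include configs.d/openshift/system.conf
<system>
  log_level trace
</system>
EOF
)
echo "$result"
```

On a cluster, the helper could plausibly be fed with `oc get configmap logging-fluentd -o yaml | check_debug_config`, assuming the YAML output renders the configuration as plain multi-line text.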
This solution is part of a family of solutions related to error messages returned by Elasticsearch.