What is the update_mapping message in logging-es logs


Environment

  • Red Hat OpenShift Container Platform
    • 3.x

Issue

  • The logging-es pod logs contain many update_mapping messages, for example:
[2019-06-21 20:34:23,871][INFO ][cluster.metadata         ] [logging-es-data-master-3a6qmm33] [project.example.81e5b90c-763a-11e8-9102-0050569c31da.2019.06.21] update_mapping [com.redhat.viaq.common]
  • What is the cause of this metadata field change in Elasticsearch?
  • We see this constantly logged for one project but not for others
  • We enabled MERGE_JSON_LOG in our fluentd and now ES is updating its mapping frequently

Resolution

The update_mapping message indicates a problem only if new fields are being created repeatedly. In a normally functioning cluster it appears a few times, typically shortly after index creation, as part of normal Elasticsearch operation.

If this message is appearing for a particular index repeatedly, and especially if it is causing performance issues, there are a few options to resolve this.

  • Fix the application logs to not use data as keys.

    • As discussed in the Root Cause section, applications that log in JSON should use keys that stay the same from one log entry to the next.
    • Data that changes belongs in the values. IP addresses, usernames, and timestamps do not make good keys.
  • Disable MERGE_JSON_LOG

    • This stops fluentd from merging the JSON fields of the log message into the record, so Elasticsearch no longer creates new mapped fields from them. This can cause issues if users expect to search on these fields.
    • MERGE_JSON_LOG is set to false by default in later versions.
# oc set env ds/logging-fluentd MERGE_JSON_LOG=false --overwrite=true

Root Cause

This message means that a new field (a JSON key in the log JSON entry) has been spotted for the first time. By default (unless told differently), Elasticsearch tries to infer the type of this field and include it into existing index schema (aka mapping).

These messages are reported primarily by the master logging-es pod. Typically, one logging-es pod at a time is the elected master, and that pod manages the ES cluster state, including index mappings.

This is commonly caused when undefined fields are added via MERGE_JSON_LOG, an option that allows JSON fields in the log message to be indexed as metadata. If logs are structured improperly, this can lead to dozens or hundreds of new metadata mappings. For this and other reasons, MERGE_JSON_LOG is disabled by default in later versions.

If the message appears too frequently, applications are probably sending documents whose keys keep changing, which can lead to high resource consumption over the longer term. This happens, for example, when the JSON key is an increasing sequence or a unique value (such as a user name or an IP address). JSON best practices indicate that data should never be used as a JSON key (in the key:value pair).
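As a rough illustration of the difference between data used as keys and fixed keys, the following sketch counts the distinct keys in two sets of made-up log lines. The sample logs and the count_distinct_keys helper are assumptions for illustration only and are not part of OpenShift logging. Each distinct key corresponds to a field that Elasticsearch would have to add to the mapping.

```shell
#!/bin/sh
# Made-up application logs that use the user name (data) as the JSON key.
data_as_keys='{"alice":"login"}
{"bob":"login"}
{"carol":"logout"}'

# The same events with fixed keys; the changing data is kept in the values.
fixed_keys='{"user":"alice","action":"login"}
{"user":"bob","action":"login"}
{"user":"carol","action":"logout"}'

# Count the distinct keys in flat, single-line JSON objects read from stdin.
# Naive text matching for illustration only: it does not handle nested
# objects or values that contain quotes or colons.
count_distinct_keys() {
    grep -o '"[^"]*" *:' | sort -u | wc -l | tr -d ' '
}

bad_count=$(printf '%s\n' "$data_as_keys" | count_distinct_keys)
good_count=$(printf '%s\n' "$fixed_keys" | count_distinct_keys)

echo "distinct keys, data used as keys: $bad_count"   # 3, one per user
echo "distinct keys, fixed keys:        $good_count"  # 2, however many users log in
```

With data used as keys, the number of fields grows with every new user, and each new field triggers another update_mapping. With fixed keys the count stays at two.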

Diagnostic Steps

  • Choose the index to investigate. If the update_mapping message occurs repeatedly for a specific index, use that index. The index name is the second bracketed field after the log level:
[2019-06-21 20:34:23,871][INFO ][cluster.metadata         ] [logging-es-data-master-3a6qmm33] [project.example.81e5b90c-763a-11e8-9102-0050569c31da.2019.06.21] update_mapping [com.redhat.viaq.common]
  • Otherwise, get the list of indices:
$ oc exec -c elasticsearch $espod -- es_util --query=_cat/indices?v
  • Get a list of the keys and their mapping type
$ oc exec -c elasticsearch $espod -- es_util --query=$INDEX/_mapping?pretty
//For example
$ oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.06.26/_mapping?pretty
  • Get the mapping for a single document type (com.redhat.viaq.common in the message above) for comparison with the full mapping
$ oc exec -c elasticsearch $espod -- es_util --query=$INDEX/_mapping/com.redhat.viaq.common?pretty
//For example
$ oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.06.26/_mapping/com.redhat.viaq.common?pretty
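To judge whether a mapping is growing, it can help to count the fields it defines and compare the number across the daily indices of one project. The sketch below is one possible way to do that. The count_mapping_fields helper and the sample mapping are assumptions made for illustration, and the count is approximate because multi-fields that carry their own "type" entry are also counted.

```shell
#!/bin/sh
# Count field definitions in pretty-printed mapping JSON read from stdin.
# Each mapped field carries a line such as: "type" : "string"
# Matching the opening quote of the value avoids counting a field that is
# itself named "type".
count_mapping_fields() {
    grep -c '"type" *: *"'
}

# Made-up mapping output, shaped like an Elasticsearch _mapping?pretty response.
sample_mapping='{
  "project.example.2019.06.21" : {
    "mappings" : {
      "com.redhat.viaq.common" : {
        "properties" : {
          "message" : {
            "type" : "string"
          },
          "level" : {
            "type" : "string"
          }
        }
      }
    }
  }
}'

printf '%s\n' "$sample_mapping" | count_mapping_fields   # prints 2

# Against a live cluster, the mapping query from the steps above can be piped in:
# oc exec -c elasticsearch $espod -- es_util --query=$INDEX/_mapping?pretty | count_mapping_fields
```

A count that keeps rising from one daily index to the next, or that is far larger for one project than for others, suggests that the application is using data as keys.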

These steps are derived from the Elasticsearch documentation.

This issue is part of a family of solutions related to the Elasticsearch errors returned message.

