Malformed mapping can make index snapshots not restorable or mountable (searchable snapshots) #84146

Closed
lucabelluccini opened this issue Feb 17, 2022 · 8 comments
Labels
>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Distributed Meta label for distributed team (obsolete) Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@lucabelluccini
Contributor

lucabelluccini commented Feb 17, 2022

Elasticsearch Version

7.14.2, 7.17.0 (probably earlier too)

Installed Plugins

No response

Java Version

bundled

OS Version

not relevant

Problem Description

If a user accidentally ingests JSON documents with malformed bodies, dynamic mapping can generate field mappings that make a snapshot of the index fail on restore.

This can also happen during ILM (when the index moves to a phase that mounts it as a searchable snapshot) or during a normal snapshot restore operation.

Steps to Reproduce

DELETE myverybadindex

PUT myverybadindex
{
  "mappings": {
    "properties": {
      "query": {
        "properties": {
          "1": {
            "type": "text"
          },
          "\u0000": {
            "type": "text"
          },
          "\u0000\u0000\u0000\u0000": {
            "type": "text"
          }
        }
      }
    }
  }
}

POST _snapshot/found-snapshots/myverybadsnapshot
{
  "indices": "myverybadindex",
  "include_global_state": false
}

GET _snapshot/found-snapshots/myverybadsnapshot

POST /_snapshot/found-snapshots/myverybadsnapshot/_mount?wait_for_completion=true
{
  "index": "myverybadindex", 
  "renamed_index": "myverybadindex-mounted",
  "ignore_index_settings": [ "index.refresh_interval" ] 
}

POST /_snapshot/found-snapshots/myverybadsnapshot/_restore
{
  "indices": "myverybadindex",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_index_$1"
}

Both the mount and the restore operation fail with:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "i_o_exception",
        "reason" : "Duplicate field '\u0000'\n at [Source: (org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat$DeserializeMetaBlobInputStream); line: -1, column: 431]"
      }
    ],
    "type" : "i_o_exception",
    "reason" : "Duplicate field '\u0000'\n at [Source: (org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat$DeserializeMetaBlobInputStream); line: -1, column: 431]"
  },
  "status" : 500
}

Logs (if relevant)

No response

@lucabelluccini lucabelluccini added >bug needs:triage Requires assignment of a team area label :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Search Foundations/Mapping Index mappings, including merging and defining field types labels Feb 17, 2022
@elasticmachine elasticmachine added Team:Distributed Meta label for distributed team (obsolete) Team:Search Meta label for search team labels Feb 17, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@kunisen
Contributor

kunisen commented Feb 18, 2022

[1]

Is there an easy way to recover from this issue?

IIUC, we need to

  • find the indices that have bad mappings
  • delete those bad indices
  • take the snapshot again
  • mount or restore

Is this enough, and is there a good way to find indices with bad mappings?
(I wonder whether the problem is limited to these Unicode characters or extends to other patterns, which might not be easy to detect until the restore or mount fails.)
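
As a rough illustration of the "find the bad mappings" step, the sketch below walks a mapping's `properties` tree and reports every field path whose name contains a NUL character (U+0000). It assumes the mapping returned by `GET <index>/_mapping` has already been parsed into nested `Map` objects by a JSON library of your choice. The class and method names are hypothetical, and it only looks for NUL characters, not for any other pattern that might turn out to be problematic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: scans an already-parsed
// mapping for field names that contain the NUL character (U+0000).
final class NulFieldNameScanner {

    // Takes the "properties" object of a mapping and returns the dotted
    // paths of all fields whose own name contains U+0000.
    static List<String> findNulFieldNames(Map<String, Object> properties) {
        List<String> hits = new ArrayList<>();
        collect(properties, "", hits);
        return hits;
    }

    @SuppressWarnings("unchecked")
    private static void collect(Map<String, Object> properties, String prefix, List<String> hits) {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            String name = entry.getKey();
            String path = prefix.isEmpty() ? name : prefix + "." + name;
            if (name.indexOf('\0') >= 0) {
                hits.add(path);
            }
            // Descend into object fields; multi-fields ("fields") are not
            // inspected in this simplified sketch.
            if (entry.getValue() instanceof Map) {
                Object nested = ((Map<String, Object>) entry.getValue()).get("properties");
                if (nested instanceof Map) {
                    collect((Map<String, Object>) nested, path, hits);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Rebuild the mapping from the reproduction steps by hand.
        Map<String, Object> text = Map.of("type", "text");
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("1", text);
        inner.put("\0", text);
        inner.put("\0\0\0\0", text);
        Map<String, Object> query = new LinkedHashMap<>();
        query.put("properties", inner);
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("query", query);

        // Print the length of each offending leaf name, since NUL is invisible.
        for (String path : findNulFieldNames(properties)) {
            String leaf = path.substring(path.lastIndexOf('.') + 1);
            System.out.println("field under 'query' with " + leaf.length() + " NUL-containing chars");
        }
    }
}
```

Running the same scan over the mapping of every index before snapshotting would flag candidates early, but it does not repair a snapshot that has already been taken.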

[2]

I also think it would be good to block this at index creation, because it makes little sense for an index to be "OK to snapshot" but "not OK to restore/mount".

I had a chat with @Leaf-Lin. The reasoning seems to be that we should not prevent users from creating field names based on Unicode, because users of other languages have field names that are completely normal to them, even though ES is unable to process these particular names correctly.

However, this causes inconsistent behavior between index creation and snapshot/restore, and ideally the two should be aligned.

Is there a way around this? For example, could we apply some internal encoding instead of using Unicode directly
(much like URL encoding, which is widely used in many applications)?

@DaveCTurner
Contributor

This seems to be a SMILE bug, or at least something that's not supported properly in SMILE. The following test fails for SMILE (and CBOR) but passes for JSON and YAML.

diff --git a/server/src/test/java/org/elasticsearch/common/xcontent/BaseXContentTestCase.java b/server/src/test/java/org/elasticsearch/common/xcontent/BaseXContentTestCase.java
index 96b93568c66..9ab3cce8aa8 100644
--- a/server/src/test/java/org/elasticsearch/common/xcontent/BaseXContentTestCase.java
+++ b/server/src/test/java/org/elasticsearch/common/xcontent/BaseXContentTestCase.java
@@ -140,6 +140,7 @@ public abstract class BaseXContentTestCase extends ESTestCase {
         expectUnclosedException(() -> BytesReference.bytes(builder().startObject().field("foo")));

         assertResult("{'foo':'bar'}", () -> builder().startObject().field("foo").value("bar").endObject());
+        assertResult("{'\\u0000':'','\\u0000\\u0000':''}", () -> builder().startObject().field("\0", "").field("\0\0", "").endObject());
     }

     public void testNullField() throws IOException {

The trouble is that the SMILE parser treats these field names as short ASCII strings which get cached to avoid unnecessary instantiation, but the cache is keyed by an integer representation of the string and both of these strings map to 0.
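
To make the collision concrete, here is a minimal sketch of what "keyed by an integer representation of the string" can look like. This is not Jackson's actual code: the class and the `packAscii` helper are hypothetical, and the packing scheme (up to four bytes shifted into one `int`) is an assumption chosen only to show why names consisting solely of NUL bytes become indistinguishable.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: NOT Jackson's real implementation. The class and method
// names are hypothetical.
final class QuadKeyCollisionSketch {

    // Packs up to four ASCII bytes of a short name into a single int,
    // shifting earlier bytes left as later ones arrive.
    static int packAscii(String name) {
        byte[] bytes = name.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length > 4) {
            throw new IllegalArgumentException("name too long for a single int key");
        }
        int key = 0;
        for (byte b : bytes) {
            key = (key << 8) | (b & 0xFF);
        }
        return key;
    }

    public static void main(String[] args) {
        // Every NUL byte contributes only zero bits, so the length is lost.
        System.out.println(packAscii("\0"));          // 0
        System.out.println(packAscii("\0\0"));        // 0
        System.out.println(packAscii("\0\0\0\0"));    // 0
        // An ordinary one-character name gets a distinct, non-zero key.
        System.out.println(packAscii("1"));           // 49
    }
}
```

A cache that looks names up by such a key alone would hand back the first NUL-only name for every later one, which is consistent with the "Duplicate field" error reported on restore.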

I don't think this is a general Unicode problem; it's only going to affect field names that are made up of some short sequence of NUL bytes. I have reported this at FasterXML/jackson-dataformats-binary#312.

Can we perhaps forbid field names containing NUL bytes entirely? Are they ever anything but a mistake?

@lucabelluccini
Contributor Author

Thank you David for the prompt analysis.

> Can we perhaps forbid field names containing NUL bytes entirely? Are they ever anything but a mistake?

It could be a nice feature.
Most of the times I've seen this, it was a client (Fluentd or other products) trying to index garbage data.
I would always allow an escape hatch (an index setting or a cluster setting?).
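
A rough sketch of what such a check with an escape hatch could look like is shown below. Everything here is hypothetical: the class, the method, and the `allowNulInFieldNames` flag are illustrative names and do not correspond to any existing Elasticsearch API or setting.

```java
// Hypothetical validator; the names and the flag are illustrative and are not
// Elasticsearch APIs or settings.
final class FieldNameValidatorSketch {

    // Rejects field names containing U+0000 unless the escape hatch is enabled.
    static void validateFieldName(String name, boolean allowNulInFieldNames) {
        int nulIndex = name.indexOf('\0');
        if (!allowNulInFieldNames && nulIndex >= 0) {
            throw new IllegalArgumentException(
                "field name must not contain the NUL character (U+0000), found at index " + nulIndex);
        }
    }

    public static void main(String[] args) {
        validateFieldName("query", false); // ordinary name, accepted
        validateFieldName("\0", true);     // accepted only because the escape hatch is on
        try {
            validateFieldName("\0\0", false);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the name when the mapping is created or updated would surface the problem at ingest time instead of at restore time.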

@javanna javanna removed the needs:triage Requires assignment of a team area label label Feb 21, 2022
@SharpEdgeMarshall

Any news on this? We have an index that cannot be restored because of this bug.

@DaveCTurner
Contributor

DaveCTurner commented Jul 6, 2022

The Jackson bug is fixed upstream, but a fixed version (≥2.14.0) is yet to be released.

@javanna
Member

javanna commented Jun 13, 2024

As of Elasticsearch 8.6, we have upgraded Jackson to 2.14, so this should now be fixed.

@javanna javanna closed this as completed Jun 13, 2024
@javanna javanna added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 16, 2024

6 participants