Allow users to preserve document IDs between source and target with live capture #1087

sumobrian · 2024-10-22T19:43:13Z

Is your feature request related to a problem?

Live capture (capture and replay) has a limitation: if a user does not specify an ID when indexing a document on the source, the source auto-generates one, and the replayed request on the target ends up with a different ID. In some cases this does not matter, but it is a problem if application code that uses Elasticsearch/OpenSearch expects the IDs to be the same on both clusters. Updates are one example: to update a document you typically supply its document ID to target a specific record, so an update captured with the source's ID no longer points at the matching record on the target.

What solution would you like?

The ideal solution would add logic to capture and replay that keeps IDs consistent between the source and target clusters. This could be done either by capturing and reusing auto-generated IDs or by providing a mechanism to assign IDs explicitly. IDs should be preserved even when the user did not provide one in the original request. Specifically, the source's response contains the document ID that was generated, so the replayer can reuse that ID when it submits the request to the target.
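To make the proposal concrete, here is a minimal sketch of the rewrite step, not the replayer's actual code. It assumes the replayer can see the captured request line and the JSON body of the source's response; `rewrite_with_source_id` is a hypothetical helper name. It turns an auto-ID `POST /<index>/_doc` into `PUT /<index>/_doc/<_id>` using the `_id` field from the source response, and leaves every other request untouched.

```python
import json
import re
from urllib.parse import quote

# Matches auto-ID index requests such as POST /my-index/_doc (no ID in the path).
_AUTO_ID_PATH = re.compile(r"^/(?P<index>[^/]+)/_doc/?$")


def rewrite_with_source_id(method, path, source_response_body):
    """Hypothetical helper: return (method, path) to send to the target.

    If the captured request let the source auto-generate an ID, pin the
    target request to the `_id` reported in the source's response.
    """
    base, sep, query = path.partition("?")
    match = _AUTO_ID_PATH.match(base)
    if method.upper() != "POST" or match is None:
        return method, path  # ID already explicit, or not a single-doc index request

    try:
        doc_id = json.loads(source_response_body).get("_id")
    except (ValueError, AttributeError):
        doc_id = None
    if not doc_id:
        return method, path  # no usable ID (e.g. the source request failed)

    # Percent-encode the ID so it is safe as a single path segment.
    new_path = f"/{match.group('index')}/_doc/{quote(doc_id, safe='')}{sep}{query}"
    return "PUT", new_path


if __name__ == "__main__":
    source_response = '{"_index": "logs", "_id": "abc123", "result": "created"}'
    print(rewrite_with_source_id("POST", "/logs/_doc", source_response))
```

This sketch only covers single-document index requests; `_bulk` bodies with auto-generated IDs would need the same treatment per item, which is outside what it shows.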

What alternatives have you considered?

Generating and storing a unique mapping of source-to-target IDs during the capture phase to allow for ID consistency during the replay phase. This approach, however, could increase storage and lookup complexity.
Using custom logic in the application layer to track and align IDs between the source and target clusters, but this introduces extra development effort and may not be feasible in all use cases.
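For comparison, the first alternative (a stored source-to-target ID mapping) could look roughly like the sketch below. Everything here is an assumption made for illustration: `IdMap`, `record`, and `translate_path` are hypothetical names, and the map is an in-memory dict, whereas a real deployment would need durable storage. It shows where the extra storage and lookup cost comes from: every later request that names a source ID has to be translated before replay.

```python
import re

# Single-document paths that carry an ID, e.g. /logs/_doc/<id> or /logs/_update/<id>.
_DOC_PATH = re.compile(
    r"^/(?P<index>[^/]+)/(?P<op>_doc|_update)/(?P<id>[^/?]+)(?P<rest>.*)$"
)


class IdMap:
    """Hypothetical in-memory map of (index, source ID) -> target ID."""

    def __init__(self):
        self._source_to_target = {}

    def record(self, index, source_id, target_id):
        # Called once per auto-ID document, after both clusters have responded.
        self._source_to_target[(index, source_id)] = target_id

    def translate_path(self, path):
        # Rewrite a captured path so it addresses the target's copy of the document.
        match = _DOC_PATH.match(path)
        if match is None:
            return path
        key = (match.group("index"), match.group("id"))
        target_id = self._source_to_target.get(key)
        if target_id is None:
            return path  # ID was user-supplied or never recorded
        return f"/{key[0]}/{match.group('op')}/{target_id}{match.group('rest')}"


if __name__ == "__main__":
    ids = IdMap()
    ids.record("logs", "srcA", "tgtB")
    print(ids.translate_path("/logs/_update/srcA"))
```

Note that this only translates IDs in request paths; IDs embedded in request bodies or held by the application itself would still differ, which is part of why reusing the source's ID directly is the preferred solution in this request.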

Do you have any additional context?

In use cases where ID consistency is crucial (e.g., for updates or deletions), mismatched IDs between source and target can cause unintended results or errors in application logic. This enhancement would ensure that capture and replay keeps IDs consistent, reducing risk and making migrations more reliable. Applications that depend on matching IDs would also hit fewer edge cases during a migration, because source and target records would line up directly.

sumobrian added the enhancement and untriaged labels and removed the untriaged label Oct 22, 2024

sumobrian moved this to 3-6 Months in OpenSearch Migrations - Roadmap Oct 22, 2024

sumobrian added this to OpenSearch Migrations - Roadmap Oct 22, 2024

sumobrian changed the title ~~[FEATURE] Allow users to preserve document IDs between source and target with live capture~~ Allow users to preserve document IDs between source and target with live capture Oct 23, 2024

sumobrian added MAx3.x MAv4.0 and removed MAx3.x labels Nov 5, 2024
