Allow users to preserve document IDs between source and target with live capture #1087

Open
sumobrian opened this issue Oct 22, 2024 · 0 comments
Labels
enhancement New feature or request MAv4.0

Comments

@sumobrian
Collaborator

Is your feature request related to a problem?

Live capture (capture and replay) has a limitation: if a user does not specify a document ID when publishing to the source, the ID assigned on the target will differ from the one assigned on the source. In some cases this may not matter, but it is a problem if application code that uses Elasticsearch/OpenSearch expects the IDs to match. Updates are one example: to update a document, you typically supply its ID to target a specific record, so an update captured with the source's ID will not find the corresponding document on the target.

What solution would you like?

The ideal solution would implement logic within capture and replay that keeps IDs consistent between the source and target clusters, either by capturing and reusing auto-generated IDs or by providing a mechanism to handle ID assignment explicitly. IDs should be preserved even when the user did not provide one initially. Specifically, the replayer can take the document ID that was generated on the source from the captured response data and reuse it when submitting the request to the target.
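To make the proposal concrete, here is a minimal sketch of the rewrite step, not the replayer's actual code. It assumes the replayer has the captured request line and the source's JSON response body available together; the function name `rewrite_with_source_id` and the regex are hypothetical. It relies only on documented behavior: `POST /<index>/_doc` auto-generates an ID and returns it as `_id`, and `PUT /<index>/_doc/<id>` indexes with an explicit ID.

```python
import json
import re
from urllib.parse import quote

# Matches an index request that omits the document ID, e.g. POST /my-index/_doc
_AUTO_ID_PATH = re.compile(r"^/(?P<index>[^/_][^/]*)/_doc/?$")


def rewrite_with_source_id(method, path, source_response_body):
    """Return (method, path) for the target request.

    If the source auto-generated the document ID, reuse that ID on the
    target; otherwise return the request line unchanged.
    """
    bare_path, sep, query = path.partition("?")
    match = _AUTO_ID_PATH.match(bare_path)
    if method.upper() != "POST" or match is None:
        return method, path  # ID was explicit, or this is not an index request

    try:
        doc_id = json.loads(source_response_body).get("_id")
    except (ValueError, TypeError, AttributeError):
        doc_id = None  # response missing, not JSON, or not a JSON object
    if not doc_id:
        return method, path  # nothing to reuse; replay as captured

    new_path = f"/{match.group('index')}/_doc/{quote(doc_id, safe='')}"
    return "PUT", new_path + sep + query


print(rewrite_with_source_id(
    "POST", "/logs/_doc", '{"_index": "logs", "_id": "abc123", "result": "created"}'
))
```

This sketch covers only single-document index requests. `_bulk` requests carry one action per line and would need the same substitution applied per item, which is not shown here.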

What alternatives have you considered?

  • Generating and storing a unique mapping of source-to-target IDs during the capture phase to allow for ID consistency during the replay phase. This approach, however, could increase storage and lookup complexity.
  • Using custom logic in the application layer to track and align IDs between the source and target clusters, but this introduces extra development effort and may not be feasible in all use cases.
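For comparison, the first alternative (a stored source-to-target ID mapping) might look roughly like the sketch below. The class `IdMap` and its methods are hypothetical, and an in-memory dictionary stands in for whatever durable store a real implementation would need.

```python
class IdMap:
    """Maps (index, source ID) to the ID the target assigned for the same document."""

    def __init__(self):
        self._source_to_target = {}

    def record(self, index, source_id, target_id):
        # Called once per auto-ID index request, after both responses are known.
        self._source_to_target[(index, source_id)] = target_id

    def target_id(self, index, source_id):
        # Documents indexed with an explicit ID were never remapped,
        # so fall back to the source ID when no entry exists.
        return self._source_to_target.get((index, source_id), source_id)


ids = IdMap()
ids.record("logs", "src-abc", "tgt-xyz")

# A later update captured against the source ID is redirected on replay.
print(ids.target_id("logs", "src-abc"))
print(ids.target_id("logs", "user-supplied-42"))
```

The map gains one entry per auto-ID document and must be consulted for every later request that names an ID, which is the storage and lookup cost noted in the first bullet. It also leaves the two clusters with different IDs, so it does not help applications that read IDs back from the target.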

Do you have any additional context?

In use cases where ID consistency is crucial (e.g., for updates or deletions), mismatched IDs between source and target can cause unintended results or errors within application logic. This enhancement would ensure that the capture and replay process maintains ID consistency, reducing risks and enabling a more reliable migration experience. Additionally, applications depending on this feature would experience fewer edge-case issues during migrations, allowing them to align source and target records seamlessly.

@sumobrian sumobrian added enhancement New feature or request untriaged and removed untriaged labels Oct 22, 2024
@sumobrian sumobrian changed the title [FEATURE] Allow users to preserve document IDs between source and target with live capture Allow users to preserve document IDs between source and target with live capture Oct 23, 2024
Projects
Status: 6 Months - 1 Year
Development

No branches or pull requests

1 participant