[Doc] Clarify full-compaction changelog integrality #4551

Open
2 tasks done
zhongyujiang opened this issue Nov 19, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@zhongyujiang
Contributor

zhongyujiang commented Nov 19, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, the documentation for the full-compaction changelog producer states that "Full compaction changelog producer can produce complete changelog for any type of source". However, when full-compaction.delta-commits is greater than 1, the intermediate changes across multiple snapshots are ignored.

Iceberg CDC refers to this as net changes, and Snowflake refers to it as minimum-delta changes; both differ from a "complete" changelog. I think this is also worth clarifying in the Paimon doc, because net changes and complete changes are usually considered to be different things.

Solution

I think we should clarify that the full-compaction changelog producer only outputs complete changes when full-compaction.delta-commits is set to 1; when full-compaction.delta-commits is set to a value greater than 1, intermediate changes across the several delta snapshots are ignored.
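To illustrate the difference, here is a toy sketch in Python. It is not Paimon's implementation: the snapshot dictionaries and the `diff` and `changelog` helpers are hypothetical, and the model simply assumes that a changelog is produced by comparing the table state at every `delta_commits`-th snapshot with the previously compared state.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, int]      # key -> value as of one commit (hypothetical model)
Change = Tuple[str, str, int]  # (row kind, key, value)


def diff(before: Snapshot, after: Snapshot) -> List[Change]:
    """Emit the changes needed to turn `before` into `after`."""
    changes: List[Change] = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old == new:
            continue
        if old is None:
            changes.append(("+I", key, new))
        elif new is None:
            changes.append(("-D", key, old))
        else:
            changes.append(("-U", key, old))
            changes.append(("+U", key, new))
    return changes


def changelog(snapshots: List[Snapshot], delta_commits: int) -> List[Change]:
    """Diff the table state once every `delta_commits` snapshots.

    Assumption: the table starts empty, and snapshots between two
    comparison points are never looked at individually.
    """
    out: List[Change] = []
    previous: Snapshot = {}
    for i in range(delta_commits - 1, len(snapshots), delta_commits):
        out.extend(diff(previous, snapshots[i]))
        previous = snapshots[i]
    return out


snapshots = [
    {"k": 1},  # commit 1: insert k = 1
    {"k": 2},  # commit 2: update k to 2
    {"k": 3},  # commit 3: update k to 3
    {},        # commit 4: delete k
]

complete = changelog(snapshots, delta_commits=1)
net = changelog(snapshots, delta_commits=2)

print(complete)  # six changes, one diff per commit
print(net)       # two changes: the values 1 and 3 never appear
```

In this model, a value of 1 yields every intermediate update, while a value of 2 yields only the net effect between the compared snapshots, which is the behavior the doc should spell out.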

cc @JingsongLi What do you think?

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@zhongyujiang zhongyujiang added the enhancement New feature or request label Nov 19, 2024
@JingsongLi
Contributor

@zhongyujiang I think it is a good idea~ Also, in fact, we may overlook the repeated changes within a commit, which is worth emphasizing.

@zhongyujiang
Contributor Author

In fact, we may also overlook the repeated changes within a commit, which is worth emphasizing.

Thanks for replying, I'll try to provide a PR to clarify this.
