Support for exactly-once #268

Closed
darrenhaken opened this issue Apr 28, 2020 · 3 comments


@darrenhaken

Kafka has supported exactly-once delivery for several versions, and the S3 connector supports it as well:
https://www.confluent.io/blog/apache-kafka-to-amazon-s3-exactly-once/

Does this connector support it too?

@C0urante
Collaborator

C0urante commented Apr 28, 2020

Unfortunately not; exactly-once in a connector is kind of tricky. The S3 connector achieves exactly-once through idempotent writes to the external system (at least when it's configured in a certain way).
The BigQuery connector does give each row an `insertId`, which BigQuery uses to dedupe on a best-effort basis, but the guarantees are limited.
Exactly-once might not be that big of a deal with BigQuery anyway, since as far as I know, one common practice is to allow duplicate writes to a table and then deduplicate with a custom query.
For what it's worth, I've opened #264 to propose support for a logical UPSERT operation, which would perform deduplication for you on the fly (as long as each row can be identified by a unique field or set of fields).
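
To make the deduplication idea concrete, here is a minimal sketch in plain Python of what "logical UPSERT" semantics amount to: when several rows share the same key, only the last one written survives. This is not the connector's code and not the design proposed in #264; the function name `upsert_rows` and the sample field names are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def upsert_rows(rows: Iterable[Row], key_fields: Tuple[str, ...]) -> List[Row]:
    """Keep only the last row seen for each key (hypothetical helper).

    Keys are ordered by first appearance, so a redelivered row does not
    change the position of its key in the output.
    """
    latest: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        # Build the identifying key from the unique field(s) of the row.
        key = tuple(row[field] for field in key_fields)
        # A duplicate or updated row replaces the earlier one with the same key.
        latest[key] = row
    return list(latest.values())


if __name__ == "__main__":
    # Simulate an at-least-once delivery where the row with id=1 arrives twice.
    delivered = [
        {"id": 1, "value": "a"},
        {"id": 2, "value": "b"},
        {"id": 1, "value": "a"},
    ]
    print(upsert_rows(delivered, ("id",)))
```

The sketch only works when every row carries a field (or set of fields) that uniquely identifies it, which is the same precondition mentioned above.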

@darrenhaken
Author

Thanks for answering!

@cemo

cemo commented Jan 7, 2021

@C0urante I guess you were talking about streaming inserts. What about load job operations?
