DOC-12484 XDCR Conflict Logging feature #3806

rao-shwe · 2025-05-09T14:07:11Z

Link to the preview doc: https://preview.docs-test.couchbase.com/DOC-12484/server/current/learn/clusters-and-availability/xdcr-conflict-logging-feature.html

Get login credentials from here.

Preview pages:

XDCR Conflict Logging
Replication Settings for XDCR Conflict Logging and Create an XDCR by Enabling Conflict Logging
Added conflictLogging to the curl syntax, with a brief explanation of conflictLogging in the paragraphs that follow.

PR pages:
New page: XDCR Conflict Logging.

Updated the following pages for "XDCR Conflict Logging":

Added the section Edit a Bucket to Enable Cross Cluster Versioning in Edit a Bucket.
Added a NOTE in Create a Bucket.
Added the sections Replication Settings for XDCR Conflict Logging and Create an XDCR by Enabling Conflict Logging in Create a Replication.
conflictLogging entry in Managing Advanced Settings.

Don't review the following files. They are 7.6.6 release docs that were missing from the release/8.0 branch.

XDCR Active-Active with Sync Gateway
Diagram 1 of XDCR with SGW
Diagram 2 of XDCR with SGW
XDCR Conflict Resolution
XDCR enableCrossClusterVersioning
Section "Create an XDCR Replication with mobile=Active" in Create a Replication.
Creating and Editing Buckets
mobile entry, enableCrossClusterVersioning, and "Change Settings for an Existing Replication to Set mobile=Active" in Managing Advanced Settings.
XDCR Advanced Settings.

modules/manage/pages/manage-xdcr/create-xdcr-replication.adoc

sumukhbhat2701 · 2025-07-21T07:08:52Z

modules/manage/pages/manage-xdcr/create-xdcr-replication.adoc

+
+. xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#xdcr-conflict-detection[*Conflict Detection*]: During the replication, XDCR detects true conflicts by comparing the Hybrid Logical Vector (HLV) metadata of the source and target documents.
+
+. xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#conflict-logging-process[*Conflict Logging*]: When a true conflict is detected, XDCR logs the conflict details, such as document ID, document contents, and conflicting document histories, into the designated conflict collection.


Do we need to define what a true conflict is?
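
One way to picture the distinction, as a minimal sketch only: the function below applies the generic version-vector rule to two per-writer counters. It is not XDCR's HLV format or its comparison logic. The name `compare_versions`, the dict layout, and the cluster names are all assumptions made for illustration.

```python
from typing import Dict


def compare_versions(source: Dict[str, int], target: Dict[str, int]) -> str:
    """Compare two per-writer version maps using the generic version-vector rule."""
    writers = set(source) | set(target)
    # A side is "ahead" if it has seen a newer version from at least one writer.
    source_ahead = any(source.get(w, 0) > target.get(w, 0) for w in writers)
    target_ahead = any(target.get(w, 0) > source.get(w, 0) for w in writers)
    if source_ahead and target_ahead:
        # Each side holds a change the other has not seen.
        return "true_conflict"
    if source_ahead:
        return "source_newer"
    if target_ahead:
        return "target_newer"
    return "equal"


# Hypothetical clusters A and B each updated the document independently.
print(compare_versions({"clusterA": 2, "clusterB": 1}, {"clusterA": 1, "clusterB": 2}))
```

In this simplified picture, a true conflict is the case where neither side's history contains the other's, which is different from one side simply being newer.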

sumukhbhat2701 · 2025-07-21T07:09:38Z

modules/manage/pages/manage-xdcr/create-xdcr-replication.adoc

+
+. xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#conflict-logging-process[*Conflict Logging*]: When a true conflict is detected, XDCR logs the conflict details, such as document ID, document contents, and conflicting document histories, into the designated conflict collection.
+
+. xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#conflict-access-and-management[*Conflict Access and Management*]: Administrators can access and review the logged conflicts. Then manually resolve the conflicts by selecting the appropriate mutations for replication and upsert the documents.


nit: It's not just Administrators. Technically, any user who has RW access to the bucket can view and access the logged conflicts.

modules/rest-api/pages/rest-xdcr-adv-settings.adoc

sumukhbhat2701 · 2025-07-21T07:15:18Z

modules/rest-api/pages/rest-xdcr-create-replication.adoc

@@ -28,6 +28,14 @@ curl -v -X POST -u [admin]:[password]
  -d fromBucket=[bucket-name]
  -d toCluster=[cluster-name]
  -d toBucket=[bucket-name]
+  -d conflictLogging='{ 


I think this should be URL-encoded. Passing string-encoded JSON with -d will throw an error.

One can use --data-urlencode like https://couchbase.slack.com/archives/C0963TSUU0N/p1752763203434269.

Another option is to follow the way colMappingRules [JSON-Document] is presented below (for example, conflictLogging [JSON-Document]) and then explain the format of the JSON document as it is explained now.
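
To illustrate the encoding point, here is a minimal sketch that builds a URL-encoded form body containing the conflictLogging JSON. It sends no request. The bucket, cluster, scope, and collection values are hypothetical placeholders; only the field names come from the diff above.

```python
import json
from urllib.parse import parse_qs, urlencode

# Placeholder values; these names are hypothetical.
conflict_logging = {
    "disabled": False,
    "bucket": "conflict-bucket",
    "collection": "conflict-scope.conflict-collection",
}

form_fields = {
    "fromBucket": "source-bucket",
    "toCluster": "remote-cluster",
    "toBucket": "target-bucket",
    "replicationType": "continuous",
    "conflictLogging": json.dumps(conflict_logging),
}

# urlencode percent-encodes the braces, quotes, and colons in the JSON value,
# comparable in effect to passing that one field with curl's --data-urlencode.
body = urlencode(form_fields)
print(body)

# Decoding the body recovers the original JSON unchanged.
decoded = parse_qs(body)["conflictLogging"][0]
assert json.loads(decoded) == conflict_logging
```

The round trip through `parse_qs` is only there to show that the encoded value still carries the same JSON document.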

sumukhbhat2701 · 2025-07-21T07:16:53Z

modules/rest-api/pages/rest-xdcr-create-replication.adoc

@@ -28,6 +28,14 @@ curl -v -X POST -u [admin]:[password]
  -d fromBucket=[bucket-name]
  -d toCluster=[cluster-name]
  -d toBucket=[bucket-name]
+  -d conflictLogging='{ 
+	    "disabled": [true | false], "bucket": [conflict-bucket-name], "collection": [conflict-scope-name].[conflict-collection-name], "loggingRules": { 
+	      	[custom-conflict-scope-name]: { 


The LHS is a source scope or collection of the replication, not a conflict scope:
[custom-conflict-scope-name]: { -> [source-scope-name]: {

The LHS can also be [source-collection-name].

The RHS can also be {} or null.
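
The shape described in this comment could be sketched as follows. This checks structure only and says nothing about what {} or null mean to XDCR. The dotted scope.collection form for a collection key is an assumption borrowed from the top-level collection field, and `check_logging_rules` and every name in the sample are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def check_logging_rules(rules: Dict[str, Optional[Dict[str, Any]]]) -> None:
    """Check only the key and value shapes described in the review comment."""
    for source_key, rule in rules.items():
        parts = source_key.split(".")
        # Key is either "scope" or "scope.collection" on the source side.
        if len(parts) not in (1, 2) or not all(parts):
            raise ValueError(f"bad source key: {source_key!r}")
        # Value is an object (possibly empty) or null (None in Python).
        if rule is not None and not isinstance(rule, dict):
            raise ValueError(f"bad rule for {source_key!r}")


# Hypothetical source scope and collection names.
logging_rules = {
    "source-scope-1": {},
    "source-scope-2.source-collection-a": None,
}
check_logging_rules(logging_rules)
print(json.dumps({"loggingRules": logging_rules}))
```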

sumukhbhat2701 · 2025-07-21T07:17:34Z

modules/rest-api/pages/rest-xdcr-create-replication.adoc

 ----

 The `type` value must be `xmem`; which is sometimes referred to as *Version 2*, and corresponds to the _Memcached Binary_ protocol, used in XDCR communications.

 The `replicationType` value is always `continuous`.
 This value must be specified.

+The `conflictLogging` flag enables or disables conflict logging for the replication.
+When enabled (`disabled=false`), you can specify the target bucket, scope, and collection for logging conflicts, as well as custom logging rules for specific collections.


"as well as custom logging rules for specific collections" -> "as well as custom logging rules for specific source collections of the replication"

sumukhbhat2701 · 2025-07-21T07:18:37Z

modules/rest-api/pages/rest-xdcr-create-replication.adoc

 ----

 The `type` value must be `xmem`; which is sometimes referred to as *Version 2*, and corresponds to the _Memcached Binary_ protocol, used in XDCR communications.

 The `replicationType` value is always `continuous`.
 This value must be specified.

+The `conflictLogging` flag enables or disables conflict logging for the replication.
+When enabled (`disabled=false`), you can specify the target bucket, scope, and collection for logging conflicts, as well as custom logging rules for specific collections.
+This helps track and resolve document conflicts during replication.


Mention these words:

conflicts -> true conflicts

manually, if needed: resolve document conflicts manually

modules/rest-api/pages/rest-xdcr-create-replication.adoc

sumukhbhat2701 · 2025-07-21T07:21:14Z

modules/rest-api/pages/rest-xdcr-adv-settings.adoc

+| `conflictLogging`
+| disabled (true/false)
+| Configuration settings for conflict logging. This configuration setting defines objects/parameters and options used to control how conflicts are logged within the application.
+It includes settings such as log levels, output destinations, and thresholds for logging conflict events.


Sorry, but what do log levels and thresholds mean here?

modules/learn/pages/clusters-and-availability/xdcr-enable-crossclusterversioning.adoc

sumukhbhat2701

Some review feedback that holds true for all pages:

1. For this feature specifically, we need to use the term "true conflicts" more often, rather than just "conflicts". That means we need to first define what a true conflict is and set the expectation.
2. There should be a warning that this feature is best effort (and that the number of true conflicts is assumed to be very low). Everything that's in this slide - https://couchbase.slack.com/archives/C0963TSUU0N/p1752763776316649.
3. The setting is quite complex to understand from a textual description alone. An example would help a lot for someone new reading this.
4. There should be a mention that, for every true conflict detected, XDCR logs 3 documents to the conflict collection: the CRD (conflict record document, which contains metadata of the detected true conflict), the source document in conflict, and the target document in conflict. It should be mentioned that the CRD contains the document IDs of the logged source and target documents. Maybe an example of source and target document IDs in the CRD.
5. Continuing from (3), I think there should be some examples of how to make use of the detected and logged conflicts. Eg: Use SDK, N1QL, range scan, eventing etc.
6. There should be a mention that the logged documents will not be replicated by XDCR if the conflict collection is a source collection of any XDCR.

sumukhbhat2701 · 2025-07-21T07:47:00Z

I think I missed one of the pages while reviewing, so if some of the points in my last comment are already addressed, please ignore them.

sumukhbhat2701 · 2025-07-21T07:49:54Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+
+* xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#upgrade-xdcr-setup-conflict-logging[*Upgrading an Existing Active-Passive XDCR Setup*]: Configure an existing active-passive XDCR setup into an active-active XDCR setup.
+
+[#hlv]


Do we need this detail of a section for HLV?
cc: @hyunjuV for your thoughts.

This page is mainly for conceptual information. Users may not find HLV related information anywhere else.

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

sumukhbhat2701 · 2025-07-21T07:57:38Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+* In conflict with the one at target, so a merge needs to be performed. This happens when the target has mutations in the document not included in the source document. This is true conflict detection.
+However, comparing documents’ CAS values is straightforward (by comparing integers), whereas comparing HLVs is complex. HLVs combine CAS with per-source version history, so clear HLV properties and rules are defined for “greater than” and “equal”.
+
+[#compare-hlvs-to-detect-conflicts]


I think this is way too much design / implementation detail and can be skipped.

Please specify the lines that need to be removed.

The whole section is not needed.

sumukhbhat2701 · 2025-07-21T08:01:45Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+* *Dynamic Rebalancing:* You can temporarily increase the resource allocation for conflict logging using a dedicated “boost” option via curl command called `ClogBoost`, to handle an increased number of conflict events.
+
+[#conflict-logger-data-flow]
+==== Data Flow and Processing


Again, I feel this section is not needed because it goes deep into implementation detail.

sumukhbhat2701 · 2025-07-21T08:02:21Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+* *Hibernation Mechanism:* If the logger cannot process requests due to persistent errors, such as misconfiguration, slow IO, resource exhaustion, logging is temporarily disabled or hibernated. This prevents replication performance from being degraded. Logging is re-enabled after a set interval or once errors are resolved.
+
+[#shared-resources-logger]
+==== Shared Resources and Connection Handling


modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

sumukhbhat2701 · 2025-07-21T08:08:28Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+
+* *Token-Based Throttling:* Logging tasks receive a minimal percentage of tokens (default allocation: 89% high-priority replication, 8% low-priority, 3% for logging). If insufficient tokens are available, logging requests are throttled to avoid impacting replication performance.
+
+* *Dynamic Rebalancing:* You can temporarily increase the resource allocation for conflict logging using a dedicated “boost” option via curl command called `ClogBoost`, to handle an increased number of conflict events.


ClogBoost is an internal setting. Not sure if we want to document it. cc: @staticgc
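
For reviewers following the numbers in the quoted bullet, a rough arithmetic sketch of the default split. Only the 89/8/3 percentages come from the diff. The token budget of 1000, the function names, and the admit-or-throttle check are illustrative assumptions and do not describe how XDCR implements throttling.

```python
from typing import Dict

# Default percentages quoted in the diff under review.
DEFAULT_SHARES = {"high_priority": 89, "low_priority": 8, "conflict_logging": 3}


def split_tokens(total: int, shares: Dict[str, int] = DEFAULT_SHARES) -> Dict[str, int]:
    """Split a token budget by percentage, rounding down."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100")
    return {name: total * pct // 100 for name, pct in shares.items()}


def admit(available: int, cost: int = 1) -> bool:
    """Return True if a logging request fits the remaining tokens, else throttle it."""
    return available >= cost


budget = split_tokens(1000)  # hypothetical budget of 1000 tokens
print(budget)  # {'high_priority': 890, 'low_priority': 80, 'conflict_logging': 30}
print(admit(budget["conflict_logging"], cost=31))  # False: the request would be throttled
```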

rao-shwe

@sumukhbhat2701

I've implemented most of your review inputs and closed the comments.

rao-shwe · 2025-07-21T12:13:11Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+
+* xref:learn:clusters-and-availability/xdcr-conflict-logging-feature.adoc#upgrade-xdcr-setup-conflict-logging[*Upgrading an Existing Active-Passive XDCR Setup*]: Configure an existing active-passive XDCR setup into an active-active XDCR setup.
+
+[#hlv]


This page is mainly for conceptual information. Users may not find HLV related information anywhere else.

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

rao-shwe · 2025-07-21T12:14:51Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+* In conflict with the one at target, so a merge needs to be performed. This happens when the target has mutations in the document not included in the source document. This is true conflict detection.
+However, comparing documents’ CAS values is straightforward (by comparing integers), whereas comparing HLVs is complex. HLVs combine CAS with per-source version history, so clear HLV properties and rules are defined for “greater than” and “equal”.
+
+[#compare-hlvs-to-detect-conflicts]


Please specify the lines that need to be removed.

rao-shwe · 2025-07-21T12:16:58Z

modules/learn/pages/clusters-and-availability/xdcr-conflict-logging-feature.adoc

+* *Best-Effort Logging:* Logging is attempted on a best-effort basis, reflecting the lowest operational priority compared to the main data transfer tasks.
+
+[#resource-management-logger]
+==== Resource Management


True. I've mentioned Resource Management as a generic term and not as a Server component.

rao-shwe · 2025-07-21T12:39:33Z

@sumukhbhat2701

Point 1 is fixed.
Point 2 needs to be addressed.
Point 3: Already has examples and descriptions. It's not okay to repeat the same content in multiple locations, so I've added a link to the examples wherever necessary.
Points 4, 5, and 6: Already exist.

Some review feedback that holds true for all pages:

1. For this feature specifically, we need to use the term "true conflicts" more often, rather than just "conflicts". That means we need to first define what a true conflict is and set the expectation.

2. There should be a warning that this feature is best effort (and that the number of true conflicts is assumed to be very low). Everything that's in this slide - https://couchbase.slack.com/archives/C0963TSUU0N/p1752763776316649.

3. The setting is quite complex to understand from a textual description alone. An example would help a lot for someone new reading this.

4. There should be a mention that, for every true conflict detected, XDCR logs 3 documents to the conflict collection: the CRD (conflict record document, which contains metadata of the detected true conflict), the source document in conflict, and the target document in conflict. It should be mentioned that the CRD contains the document IDs of the logged source and target documents. Maybe an example of source and target document IDs in the CRD.

5. Continuing from (3), I think there should be some examples of how to make use of the detected and logged conflicts. Eg: Use SDK, N1QL, range scan, eventing etc. @hyunjuV I think you had a document prepared for this, was that for public docs?

6. There should be a mention that the logged documents will not be replicated by XDCR if the conflict collection is a source collection of any XDCR.

hyunjuV · 2025-07-22T00:42:15Z

modules/learn/pages/clusters-and-availability/xdcr-active-active-sgw.adoc

+If you try to use the feature _XDCR Active-Active with Sync Gateway_ when you have more than 10 user xattrs in your document, the XDCR replication **silently skips** replicating that document.
+As a result, the data in the replication-skipped document will not be consistent between the target and source clusters.
+The only way you will know this skip occured is because the Prometheus stat `subdoc_cmd_docs_skipped` will be incremented and the document will _not_ be consistent between the target and source.
+* Eventing Service cannot be used with Sync Gateway in bi-directional XDCR.


If you are using Eventing Service functions that update documents in the XDCR replicated buckets, you must take care that the deployed Eventing functions do not cause XDCR to ping-pong and never stop replicating.

hyunjuV · 2025-07-22T00:48:45Z

modules/learn/pages/clusters-and-availability/xdcr-active-active-sgw.adoc

+As a result, the data in the replication-skipped document will not be consistent between the target and source clusters.
+The only way you will know this skip occured is because the Prometheus stat `subdoc_cmd_docs_skipped` will be incremented and the document will _not_ be consistent between the target and source.
+* Eventing Service cannot be used with Sync Gateway in bi-directional XDCR.
+If used with the _Sync Gateway in a bi-directional, active-active XDCR_ environment, the updates of Eventing Service metadata in the source and the target clusters causes XDCR to ping-pong and never stop replicating.


If you are using Eventing functions that update the documents in the XDCR replicated buckets (also referred to as Eventing source bucket mutations), ensure that the deployed functions behave as desired in the replication environment. Within a bi-directional, active-active XDCR environment, the deployed Eventing functions can cause XDCR to ping-pong and never stop replicating if you do not include logic to prevent the infinite loop. In general, for active-active, avoid redundant updates with appropriate logic within the Eventing functions. See XDCR Active-Active and Eventing for more information.

Note for @rao-shwe :
Fortune Ikechi is working on DOC-13300, which will add a page called "XDCR Active-Active and Eventing" in 7.6.x documentation. One of the changes for that work is to update this note in lines 25-26.
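
The "avoid redundant updates" guidance in the suggested note can be pictured with a generic sketch. This is not Eventing function code or the Eventing API: a plain dict stands in for a bucket, and `apply_if_changed` and `add_status` are hypothetical names. The idea is that a handler writes only when the document would actually change, so a replicated copy arriving back does not trigger another write.

```python
from typing import Any, Callable, Dict


def apply_if_changed(store: Dict[str, Any], key: str,
                     transform: Callable[[Any], Any]) -> bool:
    """Write the transformed document only when it differs from the stored one."""
    current = store.get(key)
    updated = transform(current)
    if updated == current:
        return False  # no write, so nothing new for replication to pick up
    store[key] = updated
    return True


def add_status(doc: Dict[str, Any]) -> Dict[str, Any]:
    # Returns a new dict; applying it twice yields an equal document.
    return {**doc, "status": "processed"}


bucket = {"order::1": {"total": 10}}  # dict standing in for a bucket
first = apply_if_changed(bucket, "order::1", add_status)   # writes
second = apply_if_changed(bucket, "order::1", add_status)  # skipped as redundant
print(first, second)  # True False
```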

rao-shwe added 8 commits May 9, 2025 19:35

first-commit

d3c3a1c

small change

2fcf652

hlv-info

7b22571

conflict-logging-first-updates

6b20668

test

fc6ee7f

test-nested-content

75c108f

test2-nested-content

d8dcb3b

full-draft-document

18e7837

rao-shwe requested review from hyunjuV, sumukhbhat2701 and staticgc July 21, 2025 06:53

sumukhbhat2701 reviewed Jul 21, 2025

modules/manage/pages/manage-xdcr/create-xdcr-replication.adoc

sumukhbhat2701 reviewed Jul 21, 2025

modules/rest-api/pages/rest-xdcr-adv-settings.adoc

sumukhbhat2701 reviewed Jul 21, 2025

modules/rest-api/pages/rest-xdcr-create-replication.adoc

sumukhbhat2701 reviewed Jul 21, 2025

modules/learn/pages/clusters-and-availability/xdcr-enable-crossclusterversioning.adoc

sumukhbhat2701 suggested changes Jul 21, 2025

review-implementation-engg-1

3cfb3e2

rao-shwe commented Jul 21, 2025


rao-shwe added 2 commits July 21, 2025 18:18

minor-edit-in-image-dimension

6cb2d6a

minor-edit-in-image-2

d0f00b1

title-edit

d29a7d5

hyunjuV reviewed Jul 22, 2025
