Wide Record reader fix. #288


Open · wants to merge 4 commits into develop

Conversation


@prince-cs prince-cs commented Jun 10, 2025

Currently, records from a single batch are read with parallel stream processing and the result is stored in its entirety, which can lead to out-of-memory (OOM) errors.

The fix ensures that batches are read through an iterator rather than storing the entire result. The changes were made accordingly in SalesforceWideRecordReader.java.

Validation was done by running a query more than 20k characters long and verifying that the pipeline passes.
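The iterator-based approach described above can be sketched as follows. This is a minimal, self-contained illustration, not the PR's actual code: `LazyBatchReader` and its batch source are hypothetical stand-ins for SalesforceWideRecordReader and Salesforce's queryMore() paging.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: rather than collecting every record from all batches
// into one list (risking OOM), drain one batch at a time through an Iterator.
public class LazyBatchReader<T> implements Iterator<T> {
  private final Iterator<List<T>> batchSource; // stands in for queryMore() paging
  private Iterator<T> currentBatch = Collections.emptyIterator();

  public LazyBatchReader(Iterator<List<T>> batchSource) {
    this.batchSource = batchSource;
  }

  @Override
  public boolean hasNext() {
    // Fetch the next batch only after the current one is fully consumed,
    // so at most one batch is resident in memory at a time.
    while (!currentBatch.hasNext() && batchSource.hasNext()) {
      currentBatch = batchSource.next().iterator();
    }
    return currentBatch.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return currentBatch.next();
  }
}
```

Each hasNext()/next() call pulls at most one batch from the source, so memory usage is bounded by the batch size rather than the full result set.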

JIRA: https://cdap.atlassian.net/browse/PLUGIN-1897

@prince-cs prince-cs requested a review from MrRahulSharma June 11, 2025 05:18
@MrRahulSharma
Contributor

Please add more details to the description: the issue, the RCA, the fix, how we validated it, etc.

@@ -47,7 +47,7 @@ public class SalesforceConstants {
   public static final int RANGE_FILTER_MIN_VALUE = 0;
   public static final int SOQL_MAX_LENGTH = 20000;
 
-  public static final int DEFAULT_CONNECTION_TIMEOUT_MS = 30000;
+  public static final int DEFAULT_CONNECTION_TIMEOUT_MS = 120_000;
Contributor


This change does not look related to this failure; can you confirm?


The read calls started failing with timeouts after reading some batches and did not recover even after retries. We observed this in our environment, and local testing reproduced the same behavior in both CDF and DTS, so increasing the timeout resolved the failures.

Contributor


We do have a user-provided value for connectTimeout; are we not using that for some flow?


Yes, the plugin does use the user-provided connectTimeout. However, it fails with the default connectTimeout, so the default was increased to handle that configuration.

Member

@itsankit-google itsankit-google left a comment


Please add the JIRA ID to the PR title.

Comment on lines +58 to +62
private PartnerConnection partnerConnection;
private SObjectDescriptor sObjectDescriptor;
private List<String> fieldsNames;
private String fields;
private String sObjectName;
Member


Why do all of these need to be added as instance variables?

Author


Since we are now fetching records batch by batch, this state must persist across read calls, so these are declared as instance variables.

LOG.debug("Number of partitions to be fetched for wide object: '{}'", partitions.size());

// Process partitions with batches sized to adhere to API limits and optimize memory usage.
Member


Can we hit the API limit earlier now, after this change?


No, the logic remains the same; the only change is that it reads 2K records at a time instead of the entire batch in one go.
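For context, the 2K figure here matches the maximum query batch size of the Salesforce SOAP API (pages of up to 2,000 records). The chunking pattern can be sketched as follows; `BatchSplitter` is a hypothetical helper for illustration, not the PR's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: walk a large record set in fixed-size chunks
// (Salesforce SOAP queries return at most 2,000 records per page), so
// only one chunk needs to be processed at a time.
public class BatchSplitter {
  public static <T> List<List<T>> split(List<T> records, int batchSize) {
    List<List<T>> chunks = new ArrayList<>();
    for (int start = 0; start < records.size(); start += batchSize) {
      int end = Math.min(start + batchSize, records.size());
      chunks.add(records.subList(start, end)); // a view, not a copy
    }
    return chunks;
  }
}
```

Because subList() returns views rather than copies, splitting adds no extra memory beyond the chunk index bookkeeping.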

4 participants