tap-marketo - static schema for leads if required by users to support… #90
Hi @guptaa3, thanks for your contribution! In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.
You did it @guptaa3! Thank you for signing the Singer Contribution License Agreement.
Hello @guptaa3, thank you for your contribution.
The Singer tap supports field selection via the catalog file. Have you tried selecting or deselecting fields in the catalog? This function only requests the fields that are selected or have the automatic inclusion type:

    def get_or_create_export_for_leads(client, state, stream, export_start, config):
        export_id = bookmarks.get_bookmark(state, "leads", "export_id")
        # check if export is still valid
        if export_id is not None and not client.export_available("leads", export_id):
            singer.log_info("Export %s no longer available.", export_id)
            export_id = None

        if export_id is None:
            # Corona mode is required to query by "updatedAt", otherwise a full
            # sync is required using "createdAt".
            query_field = "updatedAt" if client.use_corona else "createdAt"
            max_export_days = int(config.get('max_export_days',
                                             MAX_EXPORT_DAYS))
            export_end = get_export_end(export_start,
                                        end_days=max_export_days)
            query = {query_field: {"startAt": export_start.isoformat(),
                                   "endAt": export_end.isoformat()}}

            # Create the new export and store the id and end date in state.
            # Does not start the export (must POST to the "enqueue" endpoint).
            fields = []
            for entry in stream['metadata']:
                if len(entry['breadcrumb']) > 0 and (entry['metadata'].get('selected') or entry['metadata'].get('inclusion') == 'automatic'):
                    fields.append(entry['breadcrumb'][-1])

            export_id = client.create_export("leads", fields, query)
            state = update_state_with_export_info(
                state, stream, export_id=export_id, export_end=export_end.isoformat())
        else:
            export_end = pendulum.parse(bookmarks.get_bookmark(state, "leads", "export_end"))

        return export_id, export_end

ref: https://github.com/singer-io/tap-marketo/blob/master/tap_marketo/sync.py#L156-L187
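To make the catalog-based selection concrete, here is a minimal sketch of how a catalog entry could be edited so that only a few lead fields are requested. It is not code from the tap: `select_fields`, `export_fields`, and the field names in `leads_stream` are made up for illustration, and the catalog layout is assumed to follow the usual Singer metadata shape (a list of entries, each with a `breadcrumb` and a `metadata` dict). `export_fields` mirrors the filter in the function quoted above.

```python
def select_fields(stream, wanted):
    """Mark only the fields named in `wanted` as selected in a catalog stream entry."""
    for entry in stream["metadata"]:
        breadcrumb = entry["breadcrumb"]
        if len(breadcrumb) == 0:
            # Stream-level entry (empty breadcrumb): select the stream itself.
            entry["metadata"]["selected"] = True
        elif entry["metadata"].get("inclusion") != "automatic":
            # Field-level entry: select it only if the user asked for it.
            entry["metadata"]["selected"] = breadcrumb[-1] in wanted
    return stream


def export_fields(stream):
    """Same filter as the quoted function: selected fields plus automatic ones."""
    return [
        entry["breadcrumb"][-1]
        for entry in stream["metadata"]
        if len(entry["breadcrumb"]) > 0
        and (entry["metadata"].get("selected")
             or entry["metadata"].get("inclusion") == "automatic")
    ]


# Hypothetical catalog entry; the field names are invented for this example.
leads_stream = {
    "tap_stream_id": "leads",
    "metadata": [
        {"breadcrumb": [], "metadata": {}},
        {"breadcrumb": ["properties", "id"], "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "email"], "metadata": {"inclusion": "available"}},
        {"breadcrumb": ["properties", "notes"], "metadata": {"inclusion": "available"}},
    ],
}

select_fields(leads_stream, {"email"})
print(export_fields(leads_stream))
```

With this catalog, only `id` (automatic) and `email` (selected) would be passed to the export, while `notes` is left out.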
Hi @Vi6hal, I am using Meltano to configure and run the tap. The issue is that Meltano runs discovery and then runs the job against the catalog it generates on the fly. I did consider keeping a static catalog and passing it to Meltano, but that seemed more cumbersome for users than letting Meltano generate the catalog as it does by default.
… 1500 mb limit
Description of change
This change adds the option of a static schema for the leads object when a user requires one. Our Marketo lead objects are too big, and we hit the 1500 MB limit when loading even a single day of data. To handle this, we added the ability to enable a static schema for the leads object, so users can select and pull only the fields they need for analysis.
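As a rough illustration of the idea, and not necessarily how this PR implements it, a static schema file could be used to narrow the list of lead fields before the export is created. Everything here is an assumption made for the sketch: the `leads_schema_path` config key, the `fields_from_static_schema` helper, and the field names.

```python
import json
import os
import tempfile


def fields_from_static_schema(config, discovered_fields):
    """Return the lead fields to export, narrowed by an optional static schema file."""
    schema_path = config.get("leads_schema_path")  # hypothetical config key
    if not schema_path:
        # No static schema configured: keep every discovered field.
        return list(discovered_fields)
    with open(schema_path) as schema_file:
        schema = json.load(schema_file)
    wanted = schema.get("properties", {})
    # Keep discovery order and drop anything the static schema does not list.
    return [name for name in discovered_fields if name in wanted]


# Illustrative usage with made-up field names.
discovered = ["id", "email", "notes", "updatedAt"]
static_schema = {"type": "object", "properties": {"id": {}, "email": {}, "updatedAt": {}}}

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(static_schema, tmp)

narrowed = fields_from_static_schema({"leads_schema_path": tmp.name}, discovered)
unchanged = fields_from_static_schema({}, discovered)
os.remove(tmp.name)
print(narrowed)
print(unchanged)
```

When no schema path is configured, the helper falls back to the full discovered field list, which corresponds to running the tap without a leads schema file.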
Manual QA steps
Tested loads both with and without the leads schema file; both work.
Risks
Rollback steps