This is a Singer tap that produces JSON-formatted data following the Singer spec.
This tap:
- Pulls raw data from the Pendo API.
- Supports the following two subscriptions:
- US Subscription
- EU Subscription
- Extracts the following resources:
- Accounts
- Features
- Guides
- Pages
- Visitors
- Visitor History
- Syncs for this endpoint may be very long running if extracting anonymous visitors; see the Visitors config option `include_anonymous_visitors`.
- Track Types
- Feature Events
- Events
- Page Events
- Guide Events
- Poll Events
- Track Events
- Metadata Accounts
- Metadata Visitors
- Outputs the schema for each resource
- Incrementally pulls data based on the input state
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `account_id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `lastupdated`
- Transformations: Camel to snake case; `metadata.auto.lastupdated` denested to root as `lastupdated`; `metadata` objects denested (`metadata_agent`, `metadata_auto`, `metadata_custom`, etc.)
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `last_updated_at`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `last_updated_at`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `last_updated_at`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `visitor_id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `lastupdated`
- Config option: `include_anonymous_visitors`: `true`/`false`
- Default: `false`, to eliminate anonymous visitors - https://developers.pendo.io/docs/?bash#source-specification
- Transformations: Camel to snake case; `metadata.auto.lastupdated` denested to root as `lastupdated`; `metadata` objects denested (`metadata_agent`, `metadata_auto`, `metadata_custom`, etc.)
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `visitor_id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `modified_ts` (max from `ts` or `lastTs`)
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `feature_id`, `visitor_id`, `account_id`, `remote_ip`, `user_agent`, `day` or `hour`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `day` or `hour`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `visitor_id`, `account_id`, `remote_ip`, `user_agent`, `day` or `hour`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `day` or `hour`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `page_id`, `visitor_id`, `account_id`, `remote_ip`, `user_agent`, `day` or `hour`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `day` or `hour`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `guide_id`, `guide_step_id`, `visitor_id`, `type`, `account_id`, `browser_time`, `url`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `browserTime`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `visitor_id`, `account_id`, `poll_id`, `browser_time`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `browserTime`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `track_type_id`, `visitor_id`, `account_id`, `remote_ip`, `user_agent`, `day` or `hour`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `day` or `hour`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/aggregation
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/aggregation
- Primary key fields: `id`
- Replication strategy: INCREMENTAL (query filtered)
- Bookmark: `last_updated_at`
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/metadata/schema/account
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/metadata/schema/account
- Replication strategy: FULL_TABLE
- Transformations: Camel to snake case.
- US Subscription Endpoint: https://app.pendo.io/api/v1/metadata/schema/visitor
- EU Subscription Endpoint: https://app.eu.pendo.io/api/v1/metadata/schema/visitor
- Replication strategy: FULL_TABLE
- Transformations: Camel to snake case.
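The "Camel to snake case" transformation and the `metadata` denesting listed for the streams above can be illustrated with a minimal sketch. This is not the tap's actual code: the function names `camel_to_snake`, `convert_keys`, and `transform_record` are hypothetical, and the sketch assumes that nested keys are converted as well.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase key such as 'lastUpdatedAt' to 'last_updated_at'."""
    # Put an underscore before each uppercase letter that follows a
    # lowercase letter or digit, then lowercase the whole key.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def convert_keys(value):
    """Recursively convert dict keys to snake case (assumed behaviour)."""
    if isinstance(value, dict):
        return {camel_to_snake(k): convert_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(v) for v in value]
    return value


def transform_record(record):
    """Snake-case a record and denest its `metadata` object (hypothetical helper)."""
    out = {}
    for key, value in record.items():
        if key == "metadata" and isinstance(value, dict):
            # metadata.auto -> metadata_auto, metadata.custom -> metadata_custom, ...
            for group, fields in value.items():
                out["metadata_" + camel_to_snake(group)] = convert_keys(fields)
            # metadata.auto.lastupdated is also copied to the record root.
            auto = value.get("auto")
            if isinstance(auto, dict) and "lastupdated" in auto:
                out["lastupdated"] = auto["lastupdated"]
        else:
            out[camel_to_snake(key)] = convert_keys(value)
    return out
```

With this sketch, a record containing `metadata.auto.lastupdated` ends up with both a `metadata_auto` object and a root-level `lastupdated` field, which is the field the accounts and visitors streams use as their bookmark.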
Authentication is managed by integration keys. An integration key may be created in the Pendo website: Settings -> Integrations -> Integration Keys.
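As a rough illustration of how the integration key and the subscription endpoints fit together, the sketch below builds a request URL and headers without sending anything. The `eu` flag and the `build_request` helper are hypothetical, and the `x-pendo-integration-key` header name is an assumption based on Pendo's public API documentation rather than something stated in this README.

```python
# Base URLs taken from the endpoints listed for each stream.
US_BASE = "https://app.pendo.io"
EU_BASE = "https://app.eu.pendo.io"


def build_request(integration_key, path="/api/v1/aggregation", eu=False):
    """Return (url, headers) for a Pendo API call (hypothetical helper)."""
    base = EU_BASE if eu else US_BASE
    headers = {
        # Assumed header name; verify against the Pendo API docs.
        "x-pendo-integration-key": integration_key,
        "content-type": "application/json",
    }
    return base + path, headers


url, headers = build_request("YOUR_INTEGRATION_KEY", eu=True)
print(url)
```

How the tap itself decides between the US and EU subscription is not described here, so the boolean flag is only a stand-in for that decision.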
{
"currently_syncing": null,
"bookmarks": {
"pages": { "lastUpdatedAt": "2020-09-26T00:00:00.000000Z" },
"page_events": { "day": "2020-09-27T04:00:00.000000Z" },
"accounts": { "lastupdated": "2020-09-28T01:30:29.237000Z" },
"feature_events": { "day": "2020-09-27T04:00:00.000000Z" },
"guides": { "lastUpdatedAt": "2020-09-26T00:00:00.000000Z" },
"poll_events": { "day": "2020-09-26T00:00:00.000000Z" },
"events": { "day": "2020-09-27T04:00:00.000000Z" },
"visitors": { "lastupdated": "2020-09-28T01:30:29.199000Z" },
"features": { "lastUpdatedAt": "2020-09-26T00:00:00.000000Z" },
"guide_events": { "day": "2020-09-26T00:00:00.000000Z" },
"track_types": { "lastUpdatedAt": "2020-09-26T00:00:00.000000Z" }
}
}
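To make the role of the state file concrete, here is a minimal sketch of how an incremental sync could pick its starting point: use the stream's bookmark when one exists, otherwise fall back to `start_date` from the config. The helper names are hypothetical and this is not the tap's actual implementation.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an RFC 3339 UTC timestamp such as '2020-09-26T00:00:00.000000Z'."""
    # datetime.fromisoformat() on older Python versions does not accept 'Z'.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def get_sync_start(state, stream, bookmark_key, start_date):
    """Return the stream's bookmark, or start_date if no bookmark exists."""
    bookmark = state.get("bookmarks", {}).get(stream, {}).get(bookmark_key)
    return parse_ts(bookmark if bookmark is not None else start_date)


state = {
    "currently_syncing": None,
    "bookmarks": {
        "pages": {"lastUpdatedAt": "2020-09-26T00:00:00.000000Z"},
        "events": {"day": "2020-09-27T04:00:00.000000Z"},
    },
}

# Bookmarked stream: resumes from the bookmark.
print(get_sync_start(state, "pages", "lastUpdatedAt", "2020-09-18T00:00:00Z"))
# Stream without a bookmark: falls back to start_date.
print(get_sync_start(state, "track_events", "day", "2020-09-18T00:00:00Z"))
```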
Interrupted syncs for Event type streams are resumed via a bookmark placed during processing, `last_processed`, which holds the value of the parent GUID, as in the `page_events` entry of the following state:
{
"bookmarks": {
"guides": { "lastUpdatedAt": "2020-09-22T20:23:44.514000Z" },
"poll_events": { "day": "2020-09-20T00:00:00.000000Z" },
"feature_events": { "day": "2020-09-27T04:00:00.000000Z" },
"visitors": { "lastupdated": "2020-09-27T15:40:02.729000Z" },
"pages": { "lastUpdatedAt": "2020-09-20T00:00:00.000000Z" },
"track_types": { "lastUpdatedAt": "2020-09-20T00:00:00.000000Z" },
"features": { "lastUpdatedAt": "2020-09-20T00:00:00.000000Z" },
"accounts": { "lastupdated": "2020-09-27T15:39:50.585000Z" },
"guide_events": { "day": "2020-09-20T00:00:00.000000Z" },
"page_events": {
"day": "2020-09-27T04:00:00.000000Z",
"last_processed": "_E9IwR8tFCTQryv_hCzGVZvsgcg"
},
"events": { "day": "2020-09-27T04:00:00.000000Z" }
},
"currently_syncing": "track_events"
}
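One way to picture resuming from a `last_processed` bookmark is sketched below. It assumes the parent objects are visited in a stable order and that the sync restarts at the bookmarked parent (inclusive) so that no records are missed; both points are assumptions, and `parents_to_process` is a hypothetical helper rather than the tap's own code.

```python
def parents_to_process(parent_ids, stream_bookmark):
    """Return the parent GUIDs still to be synced for an event stream."""
    last = stream_bookmark.get("last_processed")
    if last is None or last not in parent_ids:
        # No interruption marker (or an unknown GUID): process every parent.
        return list(parent_ids)
    # Restart at the parent that was in progress when the sync was interrupted.
    return list(parent_ids[parent_ids.index(last):])


bookmark = {
    "day": "2020-09-27T04:00:00.000000Z",
    "last_processed": "_E9IwR8tFCTQryv_hCzGVZvsgcg",
}
# The first and last GUIDs here are made-up placeholders.
pages = ["page-a", "_E9IwR8tFCTQryv_hCzGVZvsgcg", "page-c"]
print(parents_to_process(pages, bookmark))
```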
- Install
Clone this repository, and then install using setup.py. We recommend using a virtualenv:
> virtualenv -p python3 venv
> source venv/bin/activate
> python setup.py install
OR
> cd .../tap-pendo
> pip install .
- Dependent libraries. The following dependent libraries were installed.
> pip install singer-python
> pip install jsonlines
> pip install singer-tools
> pip install target-stitch
> pip install target-json
- Create your tap's `config.json` file. The tap config file for this tap should include these entries:
- `start_date` (rfc3339 date string): the default value to use if no bookmark exists for an endpoint.
- `x_pendo_integration_key` (string, `ABCdef123`): an integration key from Pendo.
- `period` (string): `dayRange` or `hourRange`.
- `lookback_window` (integer, `10`): for event objects. Default: 0.
- `request_timeout` (integer, `300`): timeout passed to the request. Default: 300.
- `record_limit` (integer, `100000`): maximum number of records the Pendo API can retrieve in a single request. Default: 100000 records.
- `app_ids` (string, `8877665523, 1234545`): comma-separated app IDs. If this parameter is not provided, data is collected from all the apps.
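The entries above can be summarized as a small validation step. The sketch below is illustrative only: the `load_config` helper is hypothetical, the choice of which keys are required is an assumption, and only the defaults stated in the list are applied.

```python
# Defaults as stated in the config entry list.
DEFAULTS = {"lookback_window": 0, "request_timeout": 300, "record_limit": 100000}
# Assumed to be required; the README does not say so explicitly.
REQUIRED = ("start_date", "x_pendo_integration_key", "period")
PERIODS = ("dayRange", "hourRange")


def load_config(raw):
    """Validate a config dict and fill in documented defaults (hypothetical)."""
    missing = [key for key in REQUIRED if key not in raw]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    config = {**DEFAULTS, **raw}
    if config["period"] not in PERIODS:
        raise ValueError("period must be one of: " + ", ".join(PERIODS))
    for key in DEFAULTS:
        config[key] = int(config[key])
    # app_ids is a comma-separated string; an empty list means "all apps".
    config["app_ids"] = [
        part.strip() for part in raw.get("app_ids", "").split(",") if part.strip()
    ]
    return config


config = load_config({
    "x_pendo_integration_key": "YOUR_INTEGRATION_KEY",
    "start_date": "2020-09-18T00:00:00Z",
    "period": "dayRange",
    "app_ids": "1234545, 8877665523",
})
print(config["record_limit"], config["app_ids"])
```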
Note: It is important to set the `record_limit` parameter to an appropriate value, as selecting a smaller value may have a negative effect on the Pendo API's performance, while a larger value may result in connection errors, request timeouts, or memory overflows.

```json
{
  "x_pendo_integration_key": "YOUR_INTEGRATION_KEY",
  "start_date": "2020-09-18T00:00:00Z",
  "period": "dayRange",
  "lookback_window": 10,
  "request_timeout": 300,
  "record_limit": 100000,
  "include_anonymous_visitors": "true",
  "app_ids": "1234545, 8877665523"
}
```
- Run the Tap in Discovery Mode. This creates a catalog.json for selecting objects/fields to integrate:
tap-pendo --config config.json --discover > catalog.json
See the Singer docs on discovery mode.
- Run the Tap in Sync Mode (with catalog) and write out to state file
For Sync mode:
> tap-pendo --config tap_config.json --catalog catalog.json > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
To load to json files to verify outputs:
> tap-pendo --config tap_config.json --catalog catalog.json | target-json > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
To pseudo-load to Stitch Import API with dry run:
> tap-pendo --config tap_config.json --catalog catalog.json | target-stitch --config target_config.json --dry-run > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
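The `tail -1` step keeps only the last line of the output as the new state. A rough Python counterpart is sketched below for output that consists of Singer messages, one JSON object per line. Unlike `tail -1`, it skips non-STATE lines and returns the `value` of the last STATE message, so it is a variation on the commands above rather than an exact equivalent; `last_state_value` is a hypothetical helper.

```python
import json


def last_state_value(lines):
    """Return the `value` of the last Singer STATE message, or None."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # Singer STATE messages have the form {"type": "STATE", "value": {...}}.
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


output = [
    '{"type": "RECORD", "stream": "pages", "record": {"id": "p1"}}',
    '{"type": "STATE", "value": {"bookmarks": {"pages": {"lastUpdatedAt": "2020-09-20T00:00:00.000000Z"}}}}',
    '{"type": "STATE", "value": {"bookmarks": {"pages": {"lastUpdatedAt": "2020-09-26T00:00:00.000000Z"}}}}',
]
print(json.dumps(last_state_value(output)))
```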
- Test the Tap
While developing the Pendo tap, the following utilities were run in accordance with Singer.io best practices.
Pylint to improve code quality:
> pylint tap_pendo -d missing-docstring -d logging-format-interpolation -d too-many-locals -d too-many-arguments
Pylint test resulted in the following score:
Your code has been rated at 9.67/10
To check the tap and verify working:
> tap-pendo --config tap_config.json --catalog catalog.json | singer-check-tap > state.json
> tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
Check tap resulted in the following:
Checking stdin for valid Singer-formatted data
The output is valid.
It contained 3734 messages for 11 streams.

13 schema messages
3702 record messages
19 state messages

Details by stream:
+----------------+---------+---------+
| stream         | records | schemas |
+----------------+---------+---------+
| accounts       | 1       | 1       |
| features       | 29      | 1       |
| feature_events | 158     | 2       |
| guides         | 0       | 1       |
| pages          | 34      | 1       |
| page_events    | 830     | 2       |
| reports        | 2       | 1       |
| visitors       | 1902    | 1       |
| events         | 746     | 1       |
| guide_events   | 0       | 1       |
| poll_events    | 0       | 1       |
+----------------+---------+---------+
Unit tests may be run with the following.
python -m pytest --verbose
Note, you may need to install test dependencies.
pip install -e .'[dev]'
Copyright © 2020 Stitch