Events Guide

Envelope includes a framework for acting on events that occur during the lifecycle of an Envelope application. This framework can be used for operational purposes such as logging and alerting, and to provide hooks into lifecycle events that require custom code to be executed.

Note	It is generally not necessary to use this framework to build an Envelope data pipeline.

Each instance of an event corresponds to an event type, and also includes a message description of the event, and zero or more metadata items that are defined either for all events or per event type.

An Envelope application can register event handlers to be notified and act on events as they occur. Envelope provides one default event handler and one non-default handler. Custom event handlers can also be provided.

Note	The event framework currently only notifies on a minimal set of possible event types. More event types will be added in future releases as they are required.

Registering event handlers

Event handlers can be registered by including them in the optional application.event-handlers configuration list.

For example,

application {
  name = "Example pipeline"
  ...
  event-handlers = [
    {
      type = com.example.envelope.CustomEventHandler
      config1 = hello
    },
    {
      type = com.example.envelope.AnotherCustomEventHandler
      config2 = world
    }
  ]
}

steps {
 ...

Provided event handlers

Log

The log event handler is used to log events to stderr. This event handler is registered by Envelope by default. It can be included in the event-handlers configuration list but it does not currently use any additional configurations.

Output

The output event handler is used to write out events to an external system as defined by an Envelope output, such as Kudu. The Envelope output must be a random output that supports INSERTs.

The output must support records with the fields:

event_id (string)
timestamp_utc (timestamp)
event_type (string)
message (string)
pipeline_id (string)
application_id (string)

For example,

application {
  name = "Example pipeline with output event handler"
  ...
  event-handlers = [
    {
      type = output
      output {
        type = kudu
        connection = "kudumasterhostname:7051"
        table.name = "impala::default.envelope_events"
      }
    }
  ]
}

steps {
 ...

Custom event handlers

Event handlers can be created by implementing the Envelope EventHandler interface.

Each event handler must declare which event types it will handle. The event types created by Envelope core are provided in the CoreEventTypes class.

Event types

The following table specifies the list of event types that Envelope will notify on, and the list of metadata items that are specific for each event type.

Note that class names have been simplified for brevity. Event types map to constants within CoreEventTypes. Metadata item keys map to constants within CoreEventMetadataKeys.

Event type	Description	Metadata item key	Metadata item description	Metadata item class
PIPELINE_STARTED	The pipeline has started	none
PIPELINE_FINISHED	The pipeline has finished without any exception	none
PIPELINE_EXCEPTION_OCCURRED	The pipeline has failed because of a propagated exception	PIPELINE_EXCEPTION_OCCURRED_EXCEPTION	The exception that occurred	Exception
STEPS_EXTRACTED	The steps have been instantiated from the pipeline configuration	STEPS_EXTRACTED_CONFIG	The configuration that the steps were extracted from	Config
		STEPS_EXTRACTED_STEPS	The set of steps that were extracted	Set<Step>
		STEPS_EXTRACTED_TIME_TAKEN_NS	The number of nanoseconds taken to extract the steps	long
EXECUTION_MODE_DETERMINED	The execution mode for the pipeline (e.g. batch, streaming) has been determined	EXECUTION_MODE_DETERMINED_MODE	The execution mode	ExecutionMode
DATA_STEP_WRITTEN_TO_OUTPUT	The data step has written its data to its output	DATA_STEP_WRITTEN_TO_OUTPUT_STEP_NAME	The name of the step that wrote to its output	String
DATA_STEP_WRITTEN_TO_OUTPUT	The data step has written its data to its output	DATA_STEP_WRITTEN_TO_OUTPUT_TIME_TAKEN_NS	The number of nanoseconds taken to write to the output	long
DATA_STEP_DATA_GENERATED	The data step has generated its data from its input or its deriver. Note that when handling this event Spark is forced to execute steps one at a time, which can lead to slower performance because Spark can not merge steps together. Good citizen event handlers should allow users to optionally ignore this event for best performance.	DATA_STEP_DATA_GENERATED_STEP_NAME	The name of the step that generated its data	String
		DATA_STEP_DATA_GENERATED_ROW_COUNT	The number of rows of data generated	long
		DATA_STEP_DATA_GENERATED_TIME_TAKEN_NS	The number of nanoseconds taken to generate the data	long

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

events.adoc

events.adoc

Events Guide

Registering event handlers

Provided event handlers

Log

Output

Custom event handlers

Event types

Files

events.adoc

Latest commit

History

events.adoc

File metadata and controls

Events Guide

Registering event handlers

Provided event handlers

Log

Output

Custom event handlers

Event types