Cosmos SDK: Validator Status Alerts API


ℹ️ Overview

Most validator status reporting software for Cosmos SDK is designed to be run directly by a node operator to monitor their own nodes. Such software typically pulls data from the Tendermint Prometheus metrics sink exposed on a node.

We wanted to build a way to monitor the status of validator nodes globally across the cheqd mainnet, and raise alerts when validator nodes are missing blocks. (Validator nodes can get jailed, and their stake slashed, if they miss too many blocks.)

This custom API pulls data for all validator nodes from a BigDipper block explorer (e.g., explorer.cheqd.io) and repurposes/wraps the validator condition results into a JSON array.
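
As a rough illustration of this wrapping step, the sketch below reduces a list of validator records to a JSON array of degraded validators. The record fields (`moniker`, `operatorAddress`, `missedBlocks`), the `status` label, and the threshold are hypothetical placeholders. They are not the actual BigDipper schema or this API's real output format.

```javascript
// Hypothetical sketch: the record shape and threshold are assumptions,
// not the real BigDipper schema or this API's actual response format.
function toDegradedList(validators, missedBlocksThreshold) {
  return validators
    // Keep only validators at or above the missed-blocks threshold.
    .filter((v) => v.missedBlocks >= missedBlocksThreshold)
    // Wrap each one in a small, alert-friendly object.
    .map((v) => ({
      moniker: v.moniker,
      operatorAddress: v.operatorAddress,
      missedBlocks: v.missedBlocks,
      status: 'degraded',
    }));
}

// Example input, invented purely for illustration.
const sample = [
  { moniker: 'validator-a', operatorAddress: '<operator-address-a>', missedBlocks: 0 },
  { moniker: 'validator-b', operatorAddress: '<operator-address-b>', missedBlocks: 120 },
];

// Serialize the filtered list as the JSON array returned to callers.
const degradedJson = JSON.stringify(toDegradedList(sample, 100));
```

With the sample input above, only `validator-b` ends up in the serialized array, since it is the only record at or above the assumed threshold of 100 missed blocks.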

The API itself can be deployed using Cloudflare Workers or compatible serverless platforms. Alerting is then achieved using Zapier (a low-code/no-code automation platform) to pipe these alerts to Slack, Discord, etc.

🚨 Alerting via Zapier

To simplify the task of alerting via various channels (and to keep it extensible to other channels), we take the output of our validator status API and parse it via Zapier. This is done as a two-stage process via two separate "Zaps".

Right now, our setup sends these details to the cheqd Community Slack and the cheqd Community Discord.

  1. Our validator status API sends a webhook call to a Zapier "Zap" that listens for newly-degraded validators every hour.

  2. A Zapier "Sub-Zap" processes the data from the API into a usable format, compiling a list of degraded validators that is stored in a "digest".

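    // Parse the JSON string passed in from the previous Zap step.
    const body = JSON.parse(inputData.VALIDATOR_CONDITION);
    const degraded = [];

    // Copy each validator entry into the output list.
    for (let k = 0; k < body.length; k++) {
      degraded.push(body[k]);
    }
    return { degraded };
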
  3. "Release" any unreleased digests by using the manual release feature in Digest by Zapier.

  4. If execution has proceeded to this step, use the Zapier App for Slack and Zapier App for Discord to send a message (with formatting) to designated alert channels.

You can copy this Zap to configure a similar setup for other alert channels, such as Email by Zapier.
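
In the setup above, message formatting is handled by the Zapier apps for Slack and Discord. If you preferred to build the alert text in a code step instead, a minimal sketch might look like the following. The field names (`moniker`, `missedBlocks`) and the message wording are assumptions made for illustration, not the format our Zaps actually use.

```javascript
// Hypothetical sketch: field names and message layout are assumptions.
function formatAlert(degraded) {
  if (degraded.length === 0) {
    return null; // nothing to report, so no message should be sent
  }
  // One bullet line per degraded validator.
  const lines = degraded.map(
    (v) => `- ${v.moniker}: ${v.missedBlocks} missed blocks`
  );
  return [`${degraded.length} validator(s) degraded:`, ...lines].join('\n');
}

// Example input, invented purely for illustration.
const message = formatAlert([
  { moniker: 'validator-b', missedBlocks: 120 },
]);
```

Returning `null` for an empty list gives a later step a simple value to check before deciding whether to post anything.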

🧑‍💻🛠 Developer Guide

Architecture

This API was developed to work with Cloudflare Workers, a serverless and highly-scalable platform.

Originally, this project was discussed as potentially being deployed using a serverless platform such as AWS Lambda. However, AWS Lambda has a cold-start problem if the API doesn't receive much traffic or is only accessed infrequently. This can lead to start times of several seconds, or even tens of seconds, which many client applications would treat as an API timeout.

Using Cloudflare Workers, these APIs can be served in a highly-scalable fashion with much lower cold-start times, in the range of less than 10 milliseconds.

Setup

The recommended method of interacting with this repository is using Cloudflare Wrangler CLI.

Dependencies can be installed using NPM or any other package manager.

npm install

While our deployment uses Cloudflare Wrangler, the application itself could be modified to run on other platforms with some refactoring.

Configuration

Wrangler CLI uses wrangler.toml for configuring the application. If you're using this for your own purposes, you will need to replace the values for account_id, route, etc., and supply your own Cloudflare API tokens, for the application to work correctly.

Crucially, you must provide a publicly-accessible BigDipper GraphQL endpoint using the environment variable GRAPHQL_API in the wrangler.toml file.
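
To show how such a variable might be consumed, here is a minimal sketch that validates `GRAPHQL_API` and prepares a GraphQL POST request. How variables from wrangler.toml reach your code depends on the Worker format, so `env` is assumed here to be a plain object carrying them. The helper name `buildGraphqlRequest`, the endpoint URL, and the query are placeholders, not taken from this repository.

```javascript
// Sketch only: `env` is assumed to be a plain object holding the
// variables defined in wrangler.toml. buildGraphqlRequest is a
// hypothetical helper, not part of this repository.
function buildGraphqlRequest(env, query) {
  if (!env.GRAPHQL_API) {
    throw new Error('GRAPHQL_API is not set in wrangler.toml');
  }
  // new URL() throws a TypeError if the value is not a valid absolute URL.
  const url = new URL(env.GRAPHQL_API);
  return {
    url: url.toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query }),
    },
  };
}

// Placeholder endpoint and query, for illustration only.
const request = buildGraphqlRequest(
  { GRAPHQL_API: 'https://graphql.example.com/v1/graphql' },
  '{ __typename }'
);
// The result could then be passed to fetch(request.url, request.init).
```

Failing fast on a missing or malformed endpoint makes configuration mistakes show up at request time with a clear message, rather than as an opaque fetch error.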

Local Development

Wrangler CLI can serve a preview in which the code and KV pairs are served from Cloudflare. This also automatically runs a build so that the app can be served.

wrangler dev

This option will bind itself to the preview_id KV namespace binding (if defined).

Wrangler CLI also allows a degree of local development by running the web framework locally, but this option still relies on the Cloudflare backend for aspects such as Cloudflare Workers KV.

wrangler dev --local

If you want completely standalone local development, this can be achieved using an emulator framework like Miniflare.

Deploy

Modify the required variables in wrangler.toml for publishing to Cloudflare Workers, then run the following command to build the application and deploy it to production.

wrangler publish

Other environments can be targeted (if defined in wrangler.toml) by specifying the --env flag:

wrangler publish --env staging

CI/CD deployments can be achieved using the wrangler GitHub Action. The deploy.yml GitHub Action in this repo provides an example of how this can be achieved in practice.

🐞 Bug reports & 🤔 feature requests

If you notice anything not behaving how you expected, or would like to make a suggestion / request for a new feature, please create a new issue and let us know.

💬 Community

The cheqd Community Slack is our primary chat channel for the open-source community, software developers, and node operators.

Please reach out to us there for discussions, help, and feedback on the project.

🙋 Find us elsewhere

Telegram Discord Twitter LinkedIn Slack Medium YouTube