The dataeng project is dedicated to storing all data interactions across an organization as Infrastructure as Code (IaC). It provides a command-line interface for interacting with the various software tools that are used on a daily basis. It also provides an analysis perspective, with a wide variety of tooling for analyzing the overall operational status of the organization across all products.
This repository is loosely organized into 11 main categories:
- An assets directory housing images and assets used throughout the repository (each file has an MD5 sum to verify file integrity)
- A configuration and secrets directory for all things Docker
- A configuration and secrets directory for all things GitHub
- A configuration and secrets directory for all things Kubernetes
- A configuration and secrets directory for all things secret
- The Analyses directory houses a variety of analyses across multiple toolings
- The Applications directory houses any and all applications affiliated with the housing repository
- The Automation directory contains automation and tooling for deploying and utilizing the repository
- The Charts directory contains Helm charts related to this repository
- The Infrastructure directory contains information related to Ansible and Terraform for continuously deployed infrastructure
- The Lakes directory houses the information related to any and all data lakes within the organization
- This directory contains any pipelines related to the overall repository and is tooling dependent
- The Queries directory contains SQL-like queries that are stored for IaC purposes
- The Warehouse directory contains information on how to interact with the Data Warehouse for the respective organization
- The dataengctl directory houses the binary that allows you to work with a variety of tooling at the organization's disposal
The list above shows the directories contained within the repository. All directories prefixed with `.`, such as `.secrets`, are meant to be ignored, and nothing should ever be committed inside these directories. Each main directory has a `README.md` that explains how to interact with the repository in that particular directory.
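The assets directory is described as carrying an MD5 sum for each file. Where and how those sums are stored is not specified here, so the following is only a minimal sketch of the check itself: it takes a file path and an expected digest on the command line (both hypothetical inputs) and compares them. The names `md5Hex` and `verifyFile` are illustrative and are not part of `dataengctl`.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"os"
	"strings"
)

// md5Hex returns the lowercase hexadecimal MD5 digest of data.
func md5Hex(data []byte) string {
	sum := md5.Sum(data)
	return hex.EncodeToString(sum[:])
}

// verifyFile reports whether the file at path matches the expected digest.
// The comparison ignores letter case and surrounding whitespace.
func verifyFile(path, expected string) (bool, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	return strings.EqualFold(md5Hex(data), strings.TrimSpace(expected)), nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: verify <file> <expected-md5>")
		os.Exit(2)
	}
	ok, err := verifyFile(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !ok {
		fmt.Println("MISMATCH")
		os.Exit(1)
	}
	fmt.Println("OK")
}
```

Saved as a standalone file, this could be run with `go run verify.go <file> <expected-md5>`; a non-zero exit status signals a mismatch or a read error.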
To get started with `dataengctl`, create a configuration directory within your home directory. It should be a hidden directory named `~/.dataeng`. In this directory, create a file called `config.yaml`, which contains the secrets and credentials needed to work with `dataengctl`. The config should look something like this:

    jiraConfig:
      token: YOURJIRATOKEN
      url: YOURJIRAURL
      username: YOURUSERNAME
    salesForceConfig:
      url: YOURSALESFORCEURL
      username: YOURSALESFORCEUSERNAME
      password: YOURSALESFORCEPASSWORD
      token: YOURSALESFORCETOKEN
      clientID: YOURSALESFORCEORGCLIENTID
      apiVersion: YOURSALESFORCEDEFAULTAPI

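The setup described above (a hidden `~/.dataeng` directory holding `config.yaml`) could be scripted. The following is a minimal sketch, not part of `dataengctl`: the helper names `configPath` and `writeSkeleton` are hypothetical, the nesting of keys under `jiraConfig` and `salesForceConfig` is assumed from the sample, and the restrictive permissions (`0700` for the directory, `0600` for the file) are a suggestion made because the file holds credentials.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// skeleton mirrors the sample config; every value is a placeholder to replace.
const skeleton = `jiraConfig:
  token: YOURJIRATOKEN
  url: YOURJIRAURL
  username: YOURUSERNAME
salesForceConfig:
  url: YOURSALESFORCEURL
  username: YOURSALESFORCEUSERNAME
  password: YOURSALESFORCEPASSWORD
  token: YOURSALESFORCETOKEN
  clientID: YOURSALESFORCEORGCLIENTID
  apiVersion: YOURSALESFORCEDEFAULTAPI
`

// configPath returns <home>/.dataeng/config.yaml.
func configPath(home string) string {
	return filepath.Join(home, ".dataeng", "config.yaml")
}

// writeSkeleton creates the hidden directory and a placeholder config file.
// O_EXCL makes it fail rather than overwrite an existing config.
func writeSkeleton(home string) (string, error) {
	path := configPath(home)
	if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil {
		return "", err
	}
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.WriteString(skeleton); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	path, err := writeSkeleton(home)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote", path)
}
```

After running it once, edit the generated file and replace each placeholder with your own credentials.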
To install `dataengctl`, make sure you have two things installed on your local host:
- `jq`
- `go`
To work with `dataengctl`, you can use the makefile that is housed within the `dataengctl` directory, or you can build it directly in that directory. All binary files are ignored by default within that directory.
- Build the `dataengctl` binary:

      go build -o dataengctl

- Afterwards, run the binary with `--help`:

      ./dataengctl --help
      gets you data from different sources

      Usage:
        dataengctl [command]

      Available Commands:
        analyze     Analyze something
        completion  generate the autocompletion script for the specified shell
        help        Help about any command
        jira        Interact with jira
        salesforce  Interact with salesforce

      Flags:
            --config-file string   path to config file
            --debug                specify debug level
        -h, --help                 help for dataengctl

      Use "dataengctl [command] --help" for more information about a command.

As a result, you can see that several commands are available:
- `analyze`
- `completion`
- `help`
- `jira`
- `salesforce`
To interact with Jira and Salesforce, or anything else that uses the `config.yaml` mappings, specify the command as follows:
- Interacting with the `analyze` command, specifying the config path:

      ./dataengctl analyze --issue-type Escalation --project-key FIELD --config-file ${HOME}/.dataeng/config.yaml | jq "."

- Interacting with the `jira` command, specifying the config path:

      ./dataengctl jira issue list --issue-type Escalation --project-key FIELD --config-file ${HOME}/.dataeng/config.yaml | jq ".pri"

- Interacting with the `salesforce` command, specifying the config path:

      ./dataengctl salesforce query "SELECT+name+from+Account" --config-file ${HOME}/.dataeng/config.yaml | jq "."

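The Salesforce example passes its query with `+` in place of spaces, which matches standard URL query escaping. Assuming the `salesforce query` command expects the query already encoded in this way (an assumption; this README does not confirm it), a longer query could be prepared with Go's `net/url` package. The helper name `encodeQuery` is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeQuery applies URL query escaping: spaces become "+", and reserved
// characters such as "," or "'" are percent-encoded.
func encodeQuery(q string) string {
	return url.QueryEscape(q)
}

func main() {
	// prints: SELECT+name+from+Account
	fmt.Println(encodeQuery("SELECT name from Account"))
}
```

Whether the command accepts percent-encoded characters in more complex queries would need to be checked against `dataengctl salesforce --help`.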
Without `jq` installed on your machine, the default JSON output will not be pretty-printed. All output defaults to JSON format.
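If `jq` is not available, the JSON output can still be indented with Go's standard library alone. The following is a minimal sketch that reads JSON from standard input and prints it with two-space indentation. It is a standalone helper, not a feature of `dataengctl`, and the name `prettyJSON` is illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// prettyJSON re-indents a JSON document with two spaces per level.
// It returns an error if raw is not valid JSON.
func prettyJSON(raw []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Indent(&buf, raw, "", "  "); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	raw, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Trim surrounding whitespace so a trailing newline is not copied through.
	out, err := prettyJSON(bytes.TrimSpace(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid JSON:", err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Saved as, for example, `pretty.go`, it can stand in for `jq "."` at the end of a pipeline: `... --config-file ${HOME}/.dataeng/config.yaml | go run pretty.go`.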