Skip to content

badal-io/bi_sandbox

Repository files navigation

LookML CI/CD Pipeline: Complete Analysis

Pipeline Overview

CI/CD pipeline is a comprehensive LookML Validation Pipeline that automatically tests and validates LookML code changes. It runs on GitHub Actions and performs 5 different types of validation to ensure code quality, functionality, and reliability.

Trigger Events

The pipeline runs on:

  • Pull Requests to main, master, or develop branches
  • Direct pushes to main or master branches
  • Only when LookML-related files are changed (.lkml, .lookml, .view, etc.)

File Structure & Responsibilities

Main Workflow File

  • .github/workflows/lookml-validation.yml: The orchestrator that defines the entire pipeline flow

Validation Scripts (Python)

  • looker-validator.py: LookML syntax validation using Looker API
  • lookml-linter.py: Custom coding standards and best practices checker
  • sql-execution-validator.py: Tests actual SQL execution against databases
  • lookml-data-tests.py: Runs LookML data tests and model validation
  • lookml-content-validator.py: Validates Looker content (dashboards, looks)

Utility Scripts

  • report-generator.py: Creates comprehensive validation reports
  • check-results.py: Determines overall pipeline success/failure

Configuration Files

  • linting-rules.yaml: Defines coding standards and validation rules
  • requirements.txt: Python package dependencies
  • looker-config.ini.template: Template for Looker API configuration

Pipeline Steps Breakdown

Step 1: Environment Setup

  • Checkout code (fetch-depth: 0 for full history)
  • Setup Python 3.9
  • Cache pip dependencies for performance
  • Install Python packages from requirements.txt
  • Configure Looker API credentials from GitHub secrets

Step 2: Change Detection

  • Uses tj-actions/changed-files@v44 to detect modified LookML files
  • If no LookML files changed, pipeline skips validation steps
  • Displays list of changed files for transparency

Step 3: LookML Syntax Validation

Script: looker-validator.py

  • Purpose: Validates LookML syntax using official Looker API
  • Process:
    • Authenticates with Looker API
    • Runs project validation for bi_sandbox
    • Handles both 200 (with errors) and 204 (success) responses
    • Falls back to manifest.lkml validation if project not found
  • Output: validation_results.json
  • Blocking: ❌ YES (pipeline fails if syntax errors found)

Step 4: Custom LookML Linting

Script: lookml-linter.py

  • Purpose: Enforces coding standards and best practices
  • Checks:
    • Mandatory fields: Ensures dimensions/measures have required properties (label, description, type)
    • Naming conventions: Enforces snake_case, boolean prefixes (is_, has_)
    • Performance rules: Detects SELECT *, excessive nesting
    • Security rules: Identifies potential hardcoded secrets, PII without access filters
  • Configuration: Rules defined in linting-rules.yaml
  • Output: linting_results.json
  • Blocking: ❌ YES (errors fail pipeline, warnings don't)

Step 5: SQL Execution Testing

Script: sql-execution-validator.py

  • Purpose: Tests that SQL actually executes against real databases
  • Process:
    • Extracts SQL from derived tables in LookML files
    • Uses 2-step Looker API workflow:
      • Create SQL query (POST /api/4.0/sql_queries)
      • Execute query (POST /api/4.0/sql_queries/{slug}/run/json)
    • Detects database connections from model files
    • Tests SQL syntax, table existence, permissions
  • Advanced Features:
    • Handles multiline SQL formatting
    • Detects connection from model files or derived_table
    • Parses execution results for hidden errors (even with 200 status)
  • Output: sql_validation_results.json
  • Blocking: ❌ YES (SQL execution errors fail pipeline)

Step 6: LookML Data Tests

Script: lookml-data-tests.py

  • Purpose: Validates LookML tests and model integrity
  • Process:
    • Runs model validation for all models in project
    • Validates explore definitions
    • Checks LookML test definitions (like in case21_tests.lkml)
    • Verifies model-explore relationships
  • Tests Your Cases:
    • org_value_value_check: Tests value > 0
    • org_value_dep_length_check: Tests department length = 5
    • org_value_depid_notnull_check: Tests NOT NULL constraints
    • dep_a_good_number_limit: Tests total_number < 10000
  • Output: data_tests_results.json
  • Blocking: ⚠️ PARTIAL (continues on error, but affects final status)

Step 7: Content Validation

Script: lookml-content-validator.py

  • Purpose: Validates Looker content (dashboards, looks) in BI Sandbox folder
  • Process:
    • Uses Looker's content validation API
    • Filters results for "BI Sandbox" folder only
    • Checks dashboard elements, scheduled plans, alerts
    • Validates content dependencies and model references
  • Output: content_validation_results.json
  • Blocking: ⚠️ NO (allows pipeline to continue with warnings)

Step 8: Report Generation 📋

Script: report-generator.py

  • Purpose: Creates comprehensive validation reports
    • Generates:
    • GitHub Actions step summary (visible in workflow)
    • PR comment with detailed results and status icons
    • Markdown summary file
  • Features:
    • Status icons (✅ ❌ ⚪) for each validation step
    • Error details with file paths and line numbers
    • Performance metrics (execution times, query counts)

Step 9: Artifact Upload & Results Check

  • Uploads: All JSON results and summary files to GitHub
  • Retention: 30 days for debugging and history
  • Final Check: check-results.py determines overall success/failure

Step 10: PR Commenting 💬

  • Updates existing PR comments or creates new ones
  • Shows validation status with visual indicators
  • Provides direct links to detailed workflow results

Output Artifacts

Each validation step produces detailed JSON results:

  1. validation_results.json - LookML syntax validation
  2. linting_results.json - Coding standards results
  3. sql_validation_results.json - SQL execution test results
  4. data_tests_results.json - LookML data tests results
  5. content_validation_results.json - Content validation results
  6. validation_summary.md - Human-readable summary
  7. pr_comment.md - PR comment content

Error Handling & Resilience

Blocking vs Non-Blocking Steps

  • Blocking (Pipeline Fails): Syntax errors, linting errors, SQL execution errors
  • Non-Blocking (Warnings Only): Content validation issues, some data test failures

Error Categories

  • Syntax Errors: Malformed LookML, missing required fields
  • Execution Errors: SQL that fails to run, missing tables/permissions
  • Content Errors: Broken dashboards, invalid model references
  • Security Errors: Hardcoded secrets, unprotected PII fields

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages