Orchestrator and runner #3

efrecon · 2024-02-13T23:00:17Z

Add an orchestrator and a runner.

The orchestrator creates a microVM and starts a number of runner loops, which are implemented by the runner script.
The runner script loops forever: each iteration starts a microVM that runs an ephemeral runner until it has picked up a job and exited.
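
As a rough illustration of the loop described above, here is a minimal sketch. The names `start_ephemeral_vm`, `random_id` and `runner_loop` are hypothetical, and the microVM start is replaced by a stub because the actual krunvm invocation is not part of this description.

```shell
#!/bin/sh
# Hypothetical stand-in for "create a microVM, run one ephemeral runner, wait
# until it has picked up a job and exited". A real script would call krunvm here.
start_ephemeral_vm() {
  printf 'runner %s: started and finished one job\n' "$1"
}

# Generate a short random identifier for each iteration (assumed format).
random_id() {
  LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 8
}

# Loop forever when $1 is 0 (the default); otherwise stop after $1 iterations,
# which is only there to make the sketch easy to try out.
runner_loop() {
  max=${1:-0}
  n=0
  while [ "$max" -eq 0 ] || [ "$n" -lt "$max" ]; do
    n=$((n + 1))
    id=$(random_id)
    # A failing VM should not end the loop: report it and start the next one.
    start_ephemeral_vm "$id" || printf 'runner %s: failed\n' "$id" >&2
  done
}
```

Calling `runner_loop 3` runs three iterations; `runner_loop` without argument never returns.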

Runner-specific variables are all named RUNNER_* in every process to make them easy to pass along. When starting the "container", the preferred way is to create an .env file with those variables; the file is read from within the "entrypoint" of the container, i.e. the process run in the VM. This is a security measure: secrets such as a PAT are not exposed to other processes through the process table, and the file is removed as soon as it has been read.
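
The read-then-remove idea can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `load_envfile` is a hypothetical helper and `RUNNER_PAT` is an assumed variable name.

```shell
#!/bin/sh
# Hypothetical helper: load RUNNER_* variables from an env file, then delete it.
load_envfile() {
  envfile=$1
  [ -f "$envfile" ] || return 1
  # Source the file in the current shell so the variables become available.
  # Pass a path containing a slash, otherwise "." searches PATH.
  . "$envfile"
  # Remove the file right away so processes started later cannot read it.
  rm -f "$envfile"
}

# Usage: write the variables with restrictive permissions, then load them.
demo_env=$(mktemp)
chmod 600 "$demo_env"
# RUNNER_PAT is an assumed name; the value is a placeholder, not a real token.
printf 'RUNNER_PAT=%s\nRUNNER_LABELS=%s\n' 'example-token' 'krunvm' > "$demo_env"
load_envfile "$demo_env"
```

Because the secret travels through a file rather than a command-line argument, it never shows up in the output of `ps`.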

The runner is run under the runner user and the default is to start podman in emulation mode, making available a /var/run/docker.sock socket file (owned by the docker group, of which the runner user is a member).

To ensure that several runners can be started from the same host, the runner implementation (binaries) is copied into unique directory hierarchies. This is necessary because otherwise the same token is reused and registration fails.

Summary by CodeRabbit

New Features
- Implemented a GitHub Actions workflow for building Docker images, enhancing automation for development.
- Introduced a script for orchestrating micro virtual machines (VMs) to run GitHub actions, improving isolation and management of runner environments.
- Added functionality for creating and managing ephemeral GitHub runners via a microVM, offering more control over runner behavior and environment setup.
Enhancements
- Updated the installation process for a GitHub runner environment to include additional Docker plugins, enhancing the runner's capabilities.
- Improved documentation on the organization of GitHub runners, providing clearer guidance on user information and configuration.
Refactor
- Revised various scripts to utilize updated variable names and introduced new functions for better project description, command verification, environment variable retrieval, and logging.
Bug Fixes
- Adjusted the installation directory path and variable names in runner scripts, ensuring smoother installation and configuration processes.
- Ensured essential command jq is checked before executing token-related operations, preventing potential errors.

This passes the environment, where the PAT is located, through an environment file instead of a command line argument. This is a security improvement, as it avoids leaking the PAT through the process list. The environment file is automatically removed before the runner process is created, ensuring that it cannot be accessed from within workflows.

Improve logging so the identifier of the runner (locally and at GitHub) is present in each log message to ease identification.
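
A minimal sketch of identifier-prefixed logging, assuming the identifier is available in `RUNNER_ID`. The function name `log_with_id` and the message layout are hypothetical and do not reflect the actual `_log` implementation.

```shell
#!/bin/sh
# Hypothetical logger: prefix every message with the runner identifier.
log_with_id() {
  level=$1
  shift
  # Fall back to "-" when RUNNER_ID is unset or empty.
  printf '[%s] [%s] %s\n' "${RUNNER_ID:--}" "$level" "$*" >&2
}
```

With `RUNNER_ID=abc123`, calling `log_with_id INFO "Starting runner"` writes a line beginning with `[abc123]` to stderr, which makes interleaved output from several runners easier to tell apart.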

When starting several runners, the same token will be returned. To run several runners, they need to be configured from different working directories, and each directory needs to contain an entire copy of the runner installation. Arrange for two sub-directories to be created in the working directory: one called `runner` contains the copy of the installation tree, the other called `work` is where the runner does its work, e.g. checks out repositories.
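
The directory layout described above could be prepared along these lines. This is a sketch under assumptions: `prepare_runner_dirs` and the `runner-XXXXXX` naming are hypothetical, and only the `runner` and `work` sub-directory names come from the description.

```shell
#!/bin/sh
# Hypothetical helper: prepare a unique per-runner directory tree.
# $1: installation tree to copy; $2: root under which unique directories live.
prepare_runner_dirs() {
  install_dir=$1
  root=$2
  mkdir -p "$root"
  # mktemp -d picks a directory name that is not already in use.
  unique=$(mktemp -d "$root/runner-XXXXXX") || return 1
  mkdir -p "$unique/runner" "$unique/work"
  # Copy the whole installation so each runner is configured from its own tree.
  cp -R "$install_dir/." "$unique/runner/"
  # Print the new directory so the caller can use it.
  printf '%s\n' "$unique"
}
```

Each call returns a fresh directory, so two runners started from the same host never share configuration files.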

coderabbitai · 2024-02-13T23:00:28Z

Walkthrough

The updates across these files focus on enhancing the infrastructure for GitHub Actions by introducing a new workflow for Docker image builds, refining GitHub runner management through scripts, and improving Docker plugin support. Key improvements include streamlined Docker image building triggered by specific branches, detailed configuration for microVM-based GitHub runners, and enhanced command verification and logging within the scripts. These changes aim to optimize the development and deployment process, ensuring a more efficient and organized CI/CD pipeline.

Changes

File(s)	Summary
`.github/workflows/dev.yml`	Introduced a GitHub Actions workflow for building Docker images triggered by `feature/` branch pushes, including `build-base` and `build-main` jobs with secrets for Docker configurations.
`base/base.sh`	Updated variable description and added Docker plugins installation alongside `docker-ce-cli` for GitHub runner environment setup in Fedora.
`docs/NOTES.md`	Added details on GitHub runner organization, including user info, directory structure, versioned files, configuration, and tool cache locations.
`lib/common.sh`	Enhanced with new functions for command verification (`check_command`), environment variable retrieval (`get_env`), and `krunvm` execution (`run_krunvm`); updated logging to include `KRUNVM_RUNNER_BIN`.
`orchestrator.sh`, `runner.sh`	Scripts for orchestrating microVM creation for GitHub runners and managing runner environments, including VM resources, DNS, mounts, and runner details.
`runner/install.sh`, `runner/runner.sh`	Updated installation directory paths, variable names, and download link handling for runner tar files; introduced functions for runner management and configuration adjustments based on environment files.
`runner/token.sh`	Modified the description variable for runner token acquisition and added a `jq` command check to ensure its availability.


coderabbitai

Review Status

Actionable comments generated: 19

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 97c3526 and b4beca8.

Files selected for processing (9)

.github/workflows/dev.yml (1 hunks)
base/base.sh (2 hunks)
docs/NOTES.md (1 hunks)
lib/common.sh (1 hunks)
orchestrator.sh (1 hunks)
runner.sh (1 hunks)
runner/install.sh (4 hunks)
runner/runner.sh (9 hunks)
runner/token.sh (2 hunks)

Additional comments: 23

.github/workflows/dev.yml (1)

1-31: The workflow definition correctly sets up two jobs, build-base and build-main, with dependencies and secrets properly configured. The use of ${{ secrets.GITHUB_TOKEN }} for authentication in Docker builds is a secure practice.

lib/common.sh (6)

42-45: The usage function correctly utilizes KRUNVM_RUNNER_DESCR for dynamic description. Ensure that all scripts using this function define KRUNVM_RUNNER_DESCR to avoid empty descriptions.

49-54: The check_command function properly verifies the accessibility of commands, enhancing the script's robustness by ensuring necessary dependencies are met before proceeding.
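
For readers without the diff at hand, a command check of this kind might look like the sketch below. It is an assumption about the general shape, not the code under review; the name `check_command_sketch` and the message are hypothetical.

```shell
#!/bin/sh
# Hypothetical variant: return non-zero when a command cannot be found.
check_command_sketch() {
  if ! command -v "$1" >/dev/null 2>&1; then
    printf '%s is not accessible, please install it\n' "$1" >&2
    return 1
  fi
}
```

A caller would typically write `check_command_sketch jq || exit 1` before the first use of `jq`.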

56-67: The get_env function securely retrieves environment variable values from a file, correctly isolating the sourcing process to prevent variable leakage. This is a good security practice.
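
The isolation mentioned here can be illustrated with a subshell. The following is a sketch of the technique under assumptions, not the reviewed function; `get_env_sketch` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical variant: print the value of variable $2 as defined in file $1.
# The body runs in a subshell (note the parentheses), so nothing the file
# defines leaks into the calling shell.
get_env_sketch() (
  file=$1
  var=$2
  [ -f "$file" ] || exit 0
  # Ignore any value inherited from the caller's environment.
  unset "$var" 2>/dev/null
  # Pass a path containing a slash, otherwise "." searches PATH.
  . "$file"
  # Indirect expansion: print the variable whose name is stored in $var.
  eval "printf '%s\n' \"\${$var:-}\""
)
```

Note that sourcing executes the file, so this approach only suits files from a trusted origin such as `/etc/os-release`.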

69-72: The run_krunvm function correctly executes krunvm with specified arguments using buildah unshare, aligning with best practices for containerized environments.

74-91: The wait_path function implements a robust mechanism for waiting on a file or directory, including timeout and interval handling. This is useful for synchronization in scripts.
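
A polling wait with timeout and interval can be sketched as follows. This is an illustration of the pattern, not the reviewed code; `wait_path_sketch`, its argument order and its defaults are assumptions.

```shell
#!/bin/sh
# Hypothetical variant: wait until path $1 exists.
# $2: timeout in whole seconds (default 60); $3: polling interval in whole
# seconds (default 1). Succeeds when the path appears, fails on timeout.
# The body runs in a subshell so its variables stay local.
wait_path_sketch() (
  path=$1
  timeout=${2:-60}
  interval=${3:-1}
  waited=0
  while [ ! -e "$path" ]; do
    # Give up once the accumulated waiting time reaches the timeout.
    [ "$waited" -lt "$timeout" ] || exit 1
    sleep "$interval"
    waited=$((waited + interval))
  done
)
```

For example, `wait_path_sketch /var/run/docker.sock 30 1` would poll once per second for up to 30 seconds.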

97-103: The modification to the _log function to dynamically set KRUNVM_RUNNER_BIN if not already set is a good enhancement for flexibility in logging. Ensure KRUNVM_RUNNER_BIN is appropriately named in all contexts where logging is used.

runner/token.sh (2)

59-59: The change from KRUNVM_RUNNER_MAIN to KRUNVM_RUNNER_DESCR for the description variable aligns with the standardized naming convention across scripts. Ensure consistency in all references.

90-90: The addition of a check for the jq command before proceeding with token acquisition is a good practice, ensuring the necessary tool is available for JSON parsing.

base/base.sh (2)

57-57: The update to KRUNVM_RUNNER_DESCR for the script description maintains consistency with the naming convention used in other scripts.

118-118: The installation of docker-ce-cli alongside Docker plugins docker-buildx-plugin and docker-compose-plugin is a significant enhancement, ensuring compatibility and extended functionality for Docker operations within the environment.

runner/install.sh (3)

57-57: The change to KRUNVM_RUNNER_DESCR for the script description aligns with the standardized naming convention across scripts.

43-43: Changing the installation directory to a versioned path under ../share/runner improves organization and version management of the runner installations.

108-110: Adjusting the download and extraction process to use versioned .tgz filenames enhances clarity and version tracking. Removing the deletion of the tar file after extraction is acceptable given the move to a versioned filename, but consider cleanup strategies for older versions to manage disk space.

docs/NOTES.md (1)

121-141: The added documentation provides clear and detailed information about the organization of GitHub runners, including user details, directory structure, versioned files, and tool cache locations. This enhances understanding and manageability of the runner environment.

runner.sh (2)

86-86: The description variable KRUNVM_RUNNER_DESCR is correctly set to describe the script's purpose, maintaining consistency across the project.

147-179: The loop logic for continuously starting microVMs to run ephemeral GitHub runners is correctly implemented, including the handling of environment variables and the execution of krunvm. The use of random_string for unique runner IDs and the conditional handling of environment files are well-designed.

orchestrator.sh (2)

30-30: Validate the output of command -v -- "$(abspath "$0")" to ensure it returns a valid path before proceeding.

66-70: The description of ORCHESTRATOR_ISOLATION implies a security feature but does not detail how it's implemented. Ensure the implementation securely handles environment variables without exposing sensitive information.

runner/runner.sh (4)

53-53: The default value for RUNNER_USER has been changed to "runner". Confirm this user exists and has the necessary permissions.

77-77: Validate the default value for RUNNER_INSTALL to ensure it points to a valid directory containing the runner binaries.

82-84: The handling of RUNNER_ENVFILE should ensure the file's integrity and security, given it may contain sensitive information.

375-375: Verify that the docker_daemon function is called only when necessary and that it correctly handles the daemon's lifecycle.

coderabbitai · 2024-02-13T23:02:48Z

runner/runner.sh

+# Setup variables that would have been missing. These depend on the main
+# variables, so we do it here rather than at the top of the script.
 debug "Setting up missing defaults"
-[ -f "/etc/os-release" ] && . /etc/os-release
-RUNNER_DISTRO=${RUNNER_DISTRO:-"${ID:-"unknown"}"}
+distro=$(get_env "/etc/os-release" "ID")
+RUNNER_DISTRO=${RUNNER_DISTRO:-"${distro:-"unknown"}"}
 RUNNER_NAME_PREFIX=${RUNNER_NAME_PREFIX:-"${RUNNER_DISTRO}-krunvm"}
-RUNNER_NAME=${RUNNER_NAME:-"${RUNNER_NAME_PREFIX}-$(random_string)"}
+RUNNER_NAME=${RUNNER_NAME:-"${RUNNER_NAME_PREFIX}-$RUNNER_ID"}

 RUNNER_WORKDIR=${RUNNER_WORKDIR:-"/_work/${RUNNER_NAME}"}
-if [ -n "${ID:-}" ]; then
+if [ -n "${distro:-}" ]; then
  RUNNER_LABELS=${RUNNER_LABELS:-"krunvm,${RUNNER_DISTRO}"}
 else
  RUNNER_LABELS=${RUNNER_LABELS:-"krunvm"}
 fi

+RUNNER_TAR=$(find "$RUNNER_INSTALL" -type f -name "*.tgz" | sort -r | head -n 1)
+if [ -z "$RUNNER_TAR" ]; then
+  error "No runner tar file found under $RUNNER_INSTALL"
+fi
+


The setup of missing defaults and construction of the runner URL should validate the values to prevent configuration errors.


efrecon added 30 commits February 10, 2024 23:40

Enforce presence of jq

0d22f09

Add utility function

68210f1

Use new functions and enforce good defaults

f4c95fe

Add an "orchestrator" and a runner

00c9486

Rewrite to more generic env file "parser"

7b77388

Add simple runner test

1f52637

Only print vars

c3af32c

Add ps command to check isolation

8a4a2aa

Improve logging

bb16216

Remove comment. Signalling not working (yet?)

53bbda6

Ignore more variables

4f6a170

Change mount->directory

2ee807b

Enforce ephemeral runners

918101b

Implement host->VM mount points

6d7b4fc

Improve logging: use identifier

a0a7d7d

Improve logging so the identifier of the runner (locally and at GitHub) is present in each log message to ease identification.

Install debug shell

1d4d2bf

Run on regular runners

5fd2cdc

Add waiting debug

5a09519

Revert to self-host test

c4f743e

Add GH runner analysis

8fcb628

Add sleep between runner startups

cd100ca

Add dev CI

cc3e455

Make docker dependencies explicit

3f39442

Keep tgz file and store under version name

715bedb

Install from tgz and fix permissions docker sock

95810d4

Change workflow name

caefd6e

Wait for podman to be running

9166f03

Change workflow name

332856d

efrecon added 4 commits February 13, 2024 23:34

Change default user to runner

f36c86a

Add docker test

bb5dad1

Remove mount when no DIR specified

9fe4ea8

Remove test

b4beca8

efrecon self-assigned this Feb 13, 2024

coderabbitai bot reviewed Feb 13, 2024

efrecon merged commit 50fd848 into main Feb 13, 2024
6 checks passed

efrecon deleted the feature/krunvm-runner branch February 13, 2024 23:17
