isolated loops #20

Merged
merged 4 commits into from
Mar 27, 2024
52 changes: 28 additions & 24 deletions CONTRIBUTING.md
@@ -11,40 +11,43 @@ This document contains notes about the internals of the implementation.

When environment isolation is turned on, i.e. when the variable
`ORCHESTRATOR_ISOLATION` is turned on, the processes will communicate through a
common (temporary) directory created in the orchestrator and stored in the
variable `ORCHESTRATOR_ENVIRONMENT`. That directory is mounted into the microVM
at `/_environment`.
common (temporary) directory created by the orchestrator and stored in the
variable `ORCHESTRATOR_ENVIRONMENT`. Each runner loop will be associated to a
separate sub-directory (the `RUNNER_ENVIRONMENT` variable) and that directory is
@coderabbitai (bot, Contributor) commented on Mar 27, 2024:

The phrase "associated to" is typically used in British English. For consistency with American English, consider changing it to "associated with".

- Each runner loop will be associated to a separate sub-directory
+ Each runner loop will be associated with a separate sub-directory

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
separate sub-directory (the `RUNNER_ENVIRONMENT` variable) and that directory is
Each runner loop will be associated with a separate sub-directory (the `RUNNER_ENVIRONMENT` variable) and that directory is

Owner (author) replied:

The entire file is written in British English!

Contributor replied:

Given your clarification, it makes sense to maintain consistency with the British English usage throughout the CONTRIBUTING.md file. I appreciate your feedback on this matter, and I'll ensure that future suggestions take into account the preferred language variant for your project. Thank you for pointing this out!

[LEARN]

  • For the CONTRIBUTING.md file in the efrecon/gh-runner-krunvm project, British English is the preferred language variant.
    [/LEARN]

mounted into the microVM at `/_environment`. This provides isolation between the
different running loops.

Runners are identified using a loop iteration, e.g. `1`, `2`, etc. followed by a
random string (and separated by a `-` (dash sign))

The orchestrator will wait for a file with the `.tkn` extension and named after
the loop iteration, i.e. independently of the random string. That token file is
set by the `runner.sh` script running inside the microVM. This file is created
by the microVM once the runner has been registered, but not started, at GitHub.
It contains the result of the `token.sh` script, i.e. the runner registration
token.
set by the `entrypoint.sh` script running inside the microVM. This file is
created by the microVM once the runner has been registered, but not started, at
GitHub. It contains the result of the `token.sh` script, i.e. the runner
registration token.
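
The token synchronisation described above can be sketched as a small polling helper. This is a minimal illustration only: `wait_for_token` is a hypothetical function, not the project's `wait_path` or `find_pattern`, and the one-second polling interval is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the first file matching "$1/$2-*.tkn", polling
# once per second for at most $3 seconds. Returns 1 when the timeout expires.
# $1: shared directory, $2: loop iteration, $3: timeout in seconds
wait_for_token() {
  dir=$1; iteration=$2; timeout=$3
  elapsed=0
  while :; do
    # The random string after the dash is unknown, hence the glob.
    for candidate in "$dir/$iteration"-*.tkn; do
      if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Demonstration in a throwaway directory standing in for the shared one.
demo_dir=$(mktemp -d)
: > "$demo_dir/1-k3x9.tkn"   # illustrative file name
token=$(wait_for_token "$demo_dir" 1 5) && echo "runner 1 registered: $token"
rm -rf "$demo_dir"
```

Matching on the iteration followed by a dash keeps iteration `1` from picking up the token of iteration `10`.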

Each runner loop implemented in the `runner.sh` script is allocated a "secret"
(a random string). When a termination signal is caught inside the `runner.sh`
script inside the microVM, a file with the same name (and location) as the token
file, but the extension `.brk` (break) is created with the content of the
secret. Once a microVM has ended, the `runner.sh` loop script will detect if the
`.brk` file exists and contains the secret. If it does, it will abort the loop
-- instead of creating yet another runner. Using a random secret is for security
and to avoid that workflows are able to actually force end the runner loop.
Since the value of the secret is passed through the `.env` file that is
automatically removed as soon as the microVM has booted is running the
`runner.sh` script, workflows are not able to break the external loop: they are
able to create files in the `/_environment` directory, but they cannot know the
value of the secret to put into the file to force the exiting handshake.
(a random string). When a termination signal is caught inside the
`entrypoint.sh` script inside the microVM, a file with the same name (and
location) as the token file, but the extension `.brk` -- for "break" -- is
created with the content of the secret. Once a microVM has ended, the
`runner.sh` loop script will detect if the `.brk` file exists and contains the
secret. If it does, it will abort the loop -- instead of creating yet another
runner. Using a random secret is for security and to avoid that workflows are
Contributor commented:

Consider using "can" instead of "are able to" for conciseness and readability.

- Using a random secret is for security and to avoid that workflows are able to actually force end the runner loop.
+ Using a random secret is for security and to avoid that workflows can actually force end the runner loop.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
runner. Using a random secret is for security and to avoid that workflows are
runner. Using a random secret is for security and to avoid that workflows can actually force end the runner loop.

able to actually force end the runner loop. Since the value of the secret is
passed through the `.env` file that is automatically removed as soon as the
microVM has booted and is running the `entrypoint.sh` script, workflows are not
Contributor commented:

Consider using "cannot" instead of "are not able to" for conciseness and readability.

- workflows are not able to break the external loop
+ workflows cannot break the external loop

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
microVM has booted and is running the `entrypoint.sh` script, workflows are not
microVM has booted and is running the `entrypoint.sh` script, workflows cannot

able to break the external loop: they are able to create files in the
Contributor commented:

Consider using "can" instead of "are able to" for conciseness and readability.

- they are able to create files in the `/_environment` directory
+ they can create files in the `/_environment` directory

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
able to break the external loop: they are able to create files in the
can create files in the `/_environment` directory

`/_environment` directory, but they cannot know the value of the secret to put
into the file to force the exiting handshake.
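
The exit handshake above boils down to comparing the content of the `.brk` file with the secret held by the loop. The sketch below illustrates that check under stated assumptions: `break_requested` is a hypothetical helper name, and the file names and secret value are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the break file exists and holds
# exactly the secret allocated to this loop.
# $1: path to the .brk file, $2: the secret
break_requested() {
  [ -f "$1" ] || return 1
  # Command substitution strips the trailing newline, if any.
  [ "$(cat "$1")" = "$2" ]
}

# Demonstration in a throwaway directory standing in for the shared one.
demo_dir=$(mktemp -d)
secret="known-only-to-the-loop"                    # illustrative value
printf '%s\n' "$secret" > "$demo_dir/1-abcd.brk"   # written by the entrypoint
printf '%s\n' "a-guess" > "$demo_dir/2-efgh.brk"   # forged by a workflow

if break_requested "$demo_dir/1-abcd.brk" "$secret"; then
  echo "loop 1: secret matches, aborting the loop"
fi
if ! break_requested "$demo_dir/2-efgh.brk" "$secret"; then
  echo "loop 2: secret does not match, creating another runner"
fi
rm -rf "$demo_dir"
```

A workflow can create the second file, but without the secret its content never matches, so the loop carries on.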

The same type of handshaking happens when the main runner loop is terminating,
for example after the life-time period provided with the command-line option
Contributor commented:

Consider adding a comma after "terminating" for better readability.

- The same type of handshaking happens when the main runner loop is terminating for example after the life-time period provided with the command-line option
+ The same type of handshaking happens when the main runner loop is terminating, for example after the life-time period provided with the command-line option

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
for example after the life-time period provided with the command-line option
for example, after the life-time period provided with the command-line option

`-k`. In that case, a file containing the secret and ending with the `.trm`
extension is created in what the VM sees as the `/_environment` directory. When
such a file is present, the main `runner.sh` script inside the VM will kill the
GitHub runner process and unregister it.
`-k`. In that case, a file containing the secret and ending with the `.trm` --
for "terminate" -- extension is created in what the VM sees as the
`/_environment` directory. When such a file is present, the main `entrypoint.sh`
script inside the VM will kill the GitHub runner process and unregister it.

## Changes to the Installation Scripts

@@ -62,4 +65,5 @@ Note that when changing the logic of the "entrypoints", i.e. the scripts run at
microVM initialisation, you do not need to wait for the image to be created.
Instead, pass `-D /local` to the [`runner.sh`](./runner.sh) script. This will
mount the [`runner`](./runner/) directory into the microVM at `/local` and run
the scripts that it contains from there instead.
the scripts that it contains from there instead. Which "entrypoint" to use is
driven by the `RUNNER_ENTRYPOINT` variable in [`runner.sh`](./runner.sh).
59 changes: 31 additions & 28 deletions README.md
@@ -39,15 +39,14 @@ the base repository, e.g. `ubuntu` and `krunvm`. The GitHub runner
implementation will automatically add other labels in addition to those.

In the example above, the double-dash `--` separates options given to the
user-facing [orchestrator] from options to the loop implementation
[runner](./runner.sh) script. All options appearing after the `--` will be
blindly passed to the [runner] loop and script. All scripts within the project
accepts short options only and can either be controlled through options or
environment variables -- but CLI options have precedence. Running scripts with
the `-h` option will provide help and a list of those variables. Variables
starting with `ORCHESTRATOR_` will affect the behaviour of the [orchestrator],
while variables starting with `RUNNER_` will affect the behaviour of each
[runner] (loop).
user-facing [orchestrator] from options to the loop implementation [runner]
script. All options appearing after the `--` will be blindly passed to the
[runner] loop and script. All scripts within the project accepts short options
only and can either be controlled through options or environment variables --
but CLI options have precedence. Running scripts with the `-h` option will
provide help and a list of those variables. Variables starting with
`ORCHESTRATOR_` will affect the behaviour of the [orchestrator], while variables
starting with `RUNNER_` will affect the behaviour of each [runner] (loop).
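
The precedence rule above (environment variable as default, short option as override) can be sketched as follows. This is not taken from the project's scripts: `DEMO_RUNNERS`, the `-n` flag and `parse_runners` are made-up names used purely for illustration.

```shell
#!/bin/sh
# Illustrative only: DEMO_RUNNERS and -n are hypothetical names, not the
# project's actual variable or option.
parse_runners() {
  # Start from the environment, falling back to a built-in default.
  runners=${DEMO_RUNNERS:-1}
  OPTIND=1
  while getopts "n:" opt; do
    case "$opt" in
      n) runners=$OPTARG ;;   # the CLI option overrides the environment
      *) return 2 ;;
    esac
  done
  printf '%s\n' "$runners"
}

DEMO_RUNNERS=4
parse_runners          # prints 4 (taken from the variable)
parse_runners -n 2     # prints 2 (the option wins)
```

Reading the variable first and letting `getopts` overwrite it afterwards is what gives the command-line option the last word.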

[orchestrator]: ./orchestrator.sh
[runner]: ./runner.sh
@@ -68,9 +67,12 @@ while variables starting with `RUNNER_` will affect the behaviour of each
+ Ability to mount local directories to cache local runner-based requirements or
critical software tools.
+ Good compatibility with the regular GitHub [runners]: same user ID, member of
the `docker` group, etc.
+ In theory, the main [image] should be able to be used in more traditional
container-based solutions -- perhaps [sysbox]? Reports/changes are welcome.
the `docker` group, password-less `sudo`, etc.
+ In theory, the main [ubuntu] and [fedora] images should be able to be used in
more traditional container-based solutions -- perhaps [sysbox]? Reports and/or
changes are welcome.
+ Relaying of the container daemon logs to provide for improved debugging of
complex workflows.

[sysbox]: https://github.com/nestybox/sysbox

@@ -90,6 +92,8 @@ installed on the host. Installation is easiest on Fedora
+ `buildah`
+ `krunvm` (and its [requirements])

Note: You do not need `podman`.

[built]: ./.github/workflows/ci.yml
[requirements]: https://github.com/containers/krunvm#installation

@@ -122,13 +126,12 @@ permissions.

## Architecture and Design

The [orchestrator](./orchestrator.sh) creates as many loops of ephemeral runners
as requested. These loops are implemented as part of the
[runner.sh](./runner.sh) script: the script will create a microVM based on the
default image (see below), memory and vCPU requirement. It will then start that
microVM using `krunvm` and that will start an (ephemeral) [runner][self]. As
soon as a job has been executed on that runner, the microVM will end and a new
will be created.
The [orchestrator] creates as many loops of ephemeral runners as requested.
These loops are implemented as part of the [runner.sh][runner] script: the
script will create a microVM based on the default image (see below), memory and
vCPU requirement. It will then start that microVM using `krunvm` and that will
start an (ephemeral) GitHub [runner][self]. As soon as a job has been executed
on that runner, the microVM will end and a new will be created.

The OCI image is built in two parts:

@@ -150,15 +153,15 @@ containers with the `--network host` option. This is made transparent through a
docker CLI [wrapper](./base/docker.sh) that will automatically add this option
to all (relevant) commands.

When the microVM starts, the [runner.sh](./runner/runner.sh) script will be
started. This script will pick its options using an `.env` file, shared from the
host. The file will be sourced and removed at once. This ensures that secrets
are not leaked to the workflows through the process table or a file. Upon start,
the script will [request](./runner/token.sh) a runner token, configure the
runner and then start the actions runner .NET implementation, under the `runner`
user. The `runner` user shares the same id as the one at GitHub and is also a
member of the `docker` group. Similarily to GitHub runners, the user is capable
of `sudo` without a password.
When the microVM starts, the [entrypoint.sh](./runner/entrypoint.sh) script will
be started. This script will pick its options using an `.env` file, shared from
the host. The file will be sourced and removed at once. This ensures that
secrets are not leaked to the workflows through the process table or a file.
Upon start, the script will [request](./runner/token.sh) a runner token,
configure the runner and then start the actions runner .NET implementation,
under the `runner` user. The `runner` user shares the same id as the one at
GitHub and is also a member of the `docker` group. Similarily to GitHub runners,
the user is capable of `sudo` without a password.
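
The "sourced and removed at once" behaviour described above might look like the sketch below. It is an assumption-laden illustration rather than the content of `entrypoint.sh`: `load_and_remove_env` and `DEMO_SECRET` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: source the file at $1, then delete it immediately so
# that later processes cannot read it from disk.
load_and_remove_env() {
  [ -f "$1" ] || return 1
  # Variables are set in the current shell only; they are not exported, so
  # child processes do not inherit them through their environment.
  . "$1"
  rm -f "$1"
}

# Demonstration with a throwaway file standing in for the shared .env file.
env_file=$(mktemp)
printf 'DEMO_SECRET=%s\n' "illustrative-value" > "$env_file"
load_and_remove_env "$env_file"
[ -e "$env_file" ] || echo "env file removed, secret kept in shell memory only"
```

Because the path produced by `mktemp` contains a slash, the `.` command reads that exact file instead of searching `PATH`.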

Runner tokens are written to the directory that is shared with the host. This is
used during initial synchronisation, to avoid starting up several runners at the
2 changes: 1 addition & 1 deletion lib/common.sh
@@ -203,7 +203,7 @@ error() { _log ERR "$@" && exit 1; }
sublog() {
# Eagerly wait for the log file to exist
while ! [ -f "${1-0}" ]; do sleep 0.1; done
verbose "$1 now present on disk"
debug "$1 now present on disk"

# Then reroute its content through our logging printf style
tail -n +0 -f "$1" 2>/dev/null | while IFS= read -r line; do
19 changes: 15 additions & 4 deletions orchestrator.sh
@@ -132,7 +132,6 @@ trap cleanup EXIT
# Pass essential variables, verbosity and log configuration to main runner
# script.
RUNNER_PREFIX=$ORCHESTRATOR_PREFIX
RUNNER_ENVIRONMENT="${ORCHESTRATOR_ENVIRONMENT:-}"
RUNNER_VERBOSE=$ORCHESTRATOR_VERBOSE
RUNNER_LOG=$ORCHESTRATOR_LOG
export RUNNER_PREFIX RUNNER_ENVIRONMENT RUNNER_VERBOSE RUNNER_LOG
@@ -141,6 +140,18 @@ export RUNNER_PREFIX RUNNER_ENVIRONMENT RUNNER_VERBOSE RUNNER_LOG
# indefinitely create ephemeral runners. Looping is implemented in runner.sh,
# in the same directory as this script.
for i in $(seq 1 "$ORCHESTRATOR_RUNNERS"); do
# Create a separate environment for each runner loop, to further isolate
# runners from one another.
if [ -n "$ORCHESTRATOR_ENVIRONMENT" ]; then
RUNNER_ENVIRONMENT="$ORCHESTRATOR_ENVIRONMENT/${ORCHESTRATOR_PREFIX}-$(printf %.3d\\n "${i}")"
if ! [ -d "$RUNNER_ENVIRONMENT" ]; then
mkdir -p "$RUNNER_ENVIRONMENT"
fi
else
RUNNER_ENVIRONMENT=""
fi
export RUNNER_ENVIRONMENT

# Launch a runner loop in the background and collect its PID in the
# ORCHESTRATOR_PIDS variable.
verbose "Creating runner loop $i"
@@ -156,9 +167,9 @@ for i in $(seq 1 "$ORCHESTRATOR_RUNNERS"); do
if [ "$i" -lt "$ORCHESTRATOR_RUNNERS" ]; then
# Wait for the runner token to be ready before starting the next runner,
# or, at least, sleep for some time.
if [ -n "${ORCHESTRATOR_ENVIRONMENT:-}" ]; then
wait_path -f "${ORCHESTRATOR_ENVIRONMENT}/${i}-*.tkn" -1 5
token=$(find_pattern "${ORCHESTRATOR_ENVIRONMENT}/${i}-*.tkn")
if [ -n "${RUNNER_ENVIRONMENT:-}" ]; then
wait_path -f "${RUNNER_ENVIRONMENT}/${i}-*.tkn" -1 5
token=$(find_pattern "${RUNNER_ENVIRONMENT}/${i}-*.tkn")
rm -f "$token"
verbose "Removed token file $token"
elif [ -n "$ORCHESTRATOR_SLEEP" ] && [ "$ORCHESTRATOR_SLEEP" -gt 0 ]; then
6 changes: 3 additions & 3 deletions runner.sh
@@ -195,10 +195,10 @@ check_positive_number "$RUNNER_MEMORY" "Memory (in MB)"
# Decide which runner.sh implementation (this is the "entrypoint" of the
# microVM) to use: the one from the mount point, or the built-in one.
if [ -z "$RUNNER_DIR" ]; then
RUNNER_ENTRYPOINT=/opt/gh-runner-krunvm/bin/runner.sh
RUNNER_ENTRYPOINT=/opt/gh-runner-krunvm/bin/entrypoint.sh
else
check_command "${RUNNER_ROOTDIR}/runner/runner.sh"
RUNNER_ENTRYPOINT=${RUNNER_DIR%/}/runner/runner.sh
check_command "${RUNNER_ROOTDIR}/runner/entrypoint.sh"
RUNNER_ENTRYPOINT=${RUNNER_DIR%/}/runner/entrypoint.sh
fi

# Create the VM used for orchestration. Add --volume options for all necessary
File renamed without changes.