Add optional saving of test output #12184
Draft: saintstack wants to merge 3 commits into apple:main from saintstack:TH_ARCHIVE_LOGS_ON_FAILURE
+2,285
−521
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,237 @@ | ||
-#!/bin/sh | ||
#!/bin/bash | ||
|
||
# Entry point for running FoundationDB correctness tests | ||
# using Python-based TestHarness2 (invoked as `python3 -m test_harness.app`). | ||
# It is designed to be called by the Joshua testing framework. | ||
# For detailed documentation on TestHarness2 features, including log archival, | ||
# see contrib/TestHarness2/README.md. | ||
# | ||
# Key Responsibilities: | ||
# 1. Sets up unique temporary directories for test outputs (`APP_JOSHUA_OUTPUT_DIR`) | ||
# and runtime artifacts (`APP_RUN_TEMP_DIR`) based on JOSHUA_SEED or a timestamp. | ||
# 2. Gathers necessary environment variables and parameters (e.g., JOSHUA_SEED, | ||
# OLDBINDIR, JOSHUA_TEST_FILES_DIR) and translates them into command-line | ||
# arguments for the Python test harness application (`app.py`). | ||
# 3. Executes the Python test harness application, capturing its stdout (expected to be | ||
# a single XML summary line for Joshua) and stderr. | ||
# 4. Forwards relevant environment variables like `FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY` | ||
# and `TH_JOB_ID` to the Python application. | ||
# 5. Provides default values for some TestHarness2 arguments if not explicitly passed. | ||
# 6. Conditionally preserves or cleans up the top-level output directory | ||
# (which contains `APP_JOSHUA_OUTPUT_DIR` and `APP_RUN_TEMP_DIR`) based on the | ||
# Python application's exit code and the `TH_ARCHIVE_LOGS_ON_FAILURE` and | ||
# `TH_PRESERVE_TEMP_DIRS_ON_EXIT` environment variables. If | ||
# `TH_ARCHIVE_LOGS_ON_FAILURE` is exactly 'true' and the Python application | ||
# exits with a non-zero status, the directory is NOT deleted, preserving all | ||
# generated artifacts for debugging. Copy them to your local machine promptly, | ||
# before the pod goes away, e.g. 'kubectl cp podname:/tmp .'. If | ||
# `TH_PRESERVE_TEMP_DIRS_ON_EXIT` is 'true', the directory is always kept. | ||
# The Python harness also reads `TH_ARCHIVE_LOGS_ON_FAILURE` to control its | ||
# own, more specific log archival behavior. | ||
# 7. Exits with the same exit code as the Python test harness application. | ||
|
||
# ============================================================================= | ||
# Cleanup logic | ||
# ============================================================================= | ||
# The cleanup function is defined first so it is available to the 'trap' command. | ||
cleanup() { | ||
# Clean up the temporary directories unless preservation is requested. | ||
|
||
echo "--- correctnessTest.sh cleanup routine starting ---" >&2 | ||
echo "PYTHON_EXIT_CODE: '${PYTHON_EXIT_CODE}'" >&2 | ||
echo "TH_ARCHIVE_LOGS_ON_FAILURE: '${TH_ARCHIVE_LOGS_ON_FAILURE}'" >&2 | ||
echo "TH_PRESERVE_TEMP_DIRS_ON_EXIT: '${TH_PRESERVE_TEMP_DIRS_ON_EXIT}'" >&2 | ||
|
||
local archive_on_failure=false | ||
if [ "${TH_ARCHIVE_LOGS_ON_FAILURE}" = "true" ]; then | ||
archive_on_failure=true | ||
fi | ||
|
||
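# PYTHON_EXIT_CODE is unset if the script exits before the Python app runs; | ||
# default it to 0 so the numeric test below stays valid. | ||
if [ "${TH_PRESERVE_TEMP_DIRS_ON_EXIT}" = "true" ] || { [ "${PYTHON_EXIT_CODE:-0}" -ne 0 ] && [ "${archive_on_failure}" = "true" ]; }; then | ||
echo "Cleanup: Condition to PRESERVE files was met." >&2 | ||
if [ "${PYTHON_EXIT_CODE:-0}" -ne 0 ] && [ "${archive_on_failure}" = "true" ]; then | ||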
echo "Python app exited with error (code ${PYTHON_EXIT_CODE}). ARCHIVE ON: NOT cleaning up unified output directory for inspection." >&2 | ||
echo " All run artifacts retained in: ${TOP_LEVEL_OUTPUT_DIR}" >&2 | ||
else | ||
echo "TH_PRESERVE_TEMP_DIRS_ON_EXIT is true. NOT cleaning up unified output directory." >&2 | ||
echo " All run artifacts retained in: ${TOP_LEVEL_OUTPUT_DIR}" >&2 | ||
fi | ||
else | ||
echo "Cleanup: Condition to PRESERVE files was NOT met. Deleting directory: ${TOP_LEVEL_OUTPUT_DIR}" >&2 | ||
rm -rf "${TOP_LEVEL_OUTPUT_DIR}" | ||
fi | ||
} | ||
|
||
# ============================================================================= | ||
# Script Main Body | ||
# ============================================================================= | ||
|
||
# Set a trap to run the cleanup function upon script exit. | ||
trap cleanup EXIT | ||
|
||
# Check if DIAG_LOG_DIR is set and non-empty, otherwise default to /tmp | ||
if [ -z "${DIAG_LOG_DIR}" ]; then | ||
DIAG_LOG_DIR="/tmp" | ||
fi | ||
|
||
# New: Define a single top-level directory for all TestHarnessV2 outputs for this run. | ||
# This directory's location can be controlled by the TH_OUTPUT_DIR env var. | ||
TH_OUTPUT_BASE_DIR="${TH_OUTPUT_DIR:-${DIAG_LOG_DIR}}" | ||
UNIQUE_RUN_SUFFIX="${JOSHUA_SEED:-$(date +%s%N)}" | ||
TOP_LEVEL_OUTPUT_DIR="${TH_OUTPUT_BASE_DIR}/th_run_${UNIQUE_RUN_SUFFIX}" | ||
|
||
# 1. Sets up unique temporary directories for test outputs (`APP_JOSHUA_OUTPUT_DIR`) | ||
# and the FDB cluster files (`APP_RUN_TEMP_DIR`). | ||
# These are now subdirectories of the new TOP_LEVEL_OUTPUT_DIR. | ||
APP_JOSHUA_OUTPUT_DIR="${TOP_LEVEL_OUTPUT_DIR}/joshua_output" | ||
APP_RUN_TEMP_DIR="${TOP_LEVEL_OUTPUT_DIR}/run_files" | ||
|
||
# We no longer use `set -e` because we want to guarantee that the | ||
# script runs to completion to cat the output files before cleanup. | ||
# trap 'echo "FATAL: error in correctnessTest.sh" >&2; cleanup' ERR | ||
|
||
# Ensure directories exist | ||
mkdir -p "${APP_JOSHUA_OUTPUT_DIR}" | ||
mkdir -p "${APP_RUN_TEMP_DIR}" | ||
|
||
# Check that directories were created successfully. | ||
if [ ! -d "${APP_JOSHUA_OUTPUT_DIR}" ]; then | ||
echo "FATAL: Failed to create APP_JOSHUA_OUTPUT_DIR (path: ${APP_JOSHUA_OUTPUT_DIR})" >&2 | ||
exit 1 | ||
fi | ||
if [ ! -d "${APP_RUN_TEMP_DIR}" ]; then | ||
echo "FATAL: Failed to create APP_RUN_TEMP_DIR (path: ${APP_RUN_TEMP_DIR})" >&2 | ||
exit 1 | ||
fi | ||
|
||
# Make sure the python application can write to them | ||
chmod 777 "${TOP_LEVEL_OUTPUT_DIR}" | ||
chmod 777 "${APP_JOSHUA_OUTPUT_DIR}" | ||
chmod 777 "${APP_RUN_TEMP_DIR}" | ||
|
||
echo "Created unified output directory: ${TOP_LEVEL_OUTPUT_DIR}" >&2 | ||
|
||
# --- Diagnostic Logging for this script --- | ||
DIAG_LOG_FILE="${DIAG_LOG_DIR}/correctness_test_sh_diag.${UNIQUE_RUN_SUFFIX}.log" | ||
|
||
# Redirect all of this script's stderr to the diagnostic log file | ||
# AND ensure the tee'd output also goes to stderr, not stdout. | ||
exec 2> >(tee -a "${DIAG_LOG_FILE}" 1>&2) | ||
|
||
# Now that stderr is redirected, log the definitive messages | ||
echo "--- correctnessTest.sh execution started at $(date) --- " >&2 | ||
echo "Using UNIQUE_RUN_SUFFIX: ${UNIQUE_RUN_SUFFIX}" >&2 | ||
echo "Diagnostic log for this script: ${DIAG_LOG_FILE}" >&2 | ||
echo "Script PID: $$" >&2 | ||
echo "Running as user: $(whoami)" >&2 | ||
echo "Bash version: $BASH_VERSION" >&2 | ||
echo "Initial PWD: $(pwd)" >&2 | ||
echo "Initial environment variables relevant to TestHarness:" >&2 | ||
echo " JOSHUA_SEED: ${JOSHUA_SEED}" >&2 | ||
echo " OLDBINDIR: ${OLDBINDIR}" >&2 | ||
echo " JOSHUA_TEST_FILES_DIR: ${JOSHUA_TEST_FILES_DIR}" >&2 | ||
echo " FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY: ${FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY}" >&2 | ||
echo " TH_ARCHIVE_LOGS_ON_FAILURE: ${TH_ARCHIVE_LOGS_ON_FAILURE}" >&2 | ||
echo "-----------------------------------------------------" >&2 | ||
|
||
# Simulation currently has memory leaks. We need to investigate before we can enable leak detection in joshua. | ||
-export ASAN_OPTIONS="detect_leaks=0" | ||
export ASAN_OPTIONS="${ASAN_OPTIONS:-detect_leaks=0}" | ||
echo "ASAN_OPTIONS set to: ${ASAN_OPTIONS}" >&2 | ||
|
||
# --- Prepare arguments for the Python application --- | ||
# Default values are mostly handled by the Python app's config.py, | ||
# but we provide what Joshua gives us. | ||
|
||
# JOSHUA_SEED is mandatory for the python app | ||
if [ -z "${JOSHUA_SEED}" ]; then | ||
echo "FATAL: JOSHUA_SEED environment variable is not set." >&2 | ||
# Output a TestHarnessV1-style error XML to stdout for Joshua | ||
echo '<Test Ok="0" Error="InternalError"><JoshuaMessage Severity="40" Message="FATAL: JOSHUA_SEED environment variable is not set in correctnessTest.sh." /></Test>' | ||
exit 1 | ||
fi | ||
|
||
# OLDBINDIR: Default if not set by Joshua | ||
# The Python app's config.py has its own default, but we prefer Joshua's if available. | ||
APP_OLDBINDIR="${OLDBINDIR:-/app/deploy/global_data/oldBinaries}" # Default from original script if not set by env | ||
echo "Using OLDBINDIR for Python app: ${APP_OLDBINDIR}" >&2 | ||
|
||
# JOSHUA_TEST_FILES_DIR: This is the directory containing test definitions (.toml files). | ||
# The Python app takes this as --test-source-dir. If not set, the Python app uses its default. | ||
APP_TEST_DIR="${JOSHUA_TEST_FILES_DIR}" | ||
if [ -z "${APP_TEST_DIR}" ]; then | ||
echo "WARNING: JOSHUA_TEST_FILES_DIR environment variable is not set. Python app will use its default test_source_dir (typically 'tests/' relative to CWD)." >&2 | ||
# We allow this to proceed, Python app will handle default or fail if no tests found there. | ||
else | ||
echo "Using JOSHUA_TEST_FILES_DIR for Python app (--test-source-dir): ${APP_TEST_DIR}" >&2 | ||
fi | ||
|
||
# Job ID from Joshua, if provided. | ||
APP_JOB_ID="${TH_JOB_ID-}" | ||
|
||
PYTHON_EXE="${PYTHON_EXE:-python3}" # Allow overriding the python executable | ||
|
||
# Construct Python command arguments | ||
PYTHON_CMD_ARGS=() | ||
PYTHON_CMD_ARGS+=("--joshua-seed" "${JOSHUA_SEED}") | ||
PYTHON_CMD_ARGS+=("--joshua-output-dir" "${APP_JOSHUA_OUTPUT_DIR}") | ||
PYTHON_CMD_ARGS+=("--run-temp-dir" "${APP_RUN_TEMP_DIR}") | ||
|
||
# Only pass --test-source-dir if APP_TEST_DIR (from JOSHUA_TEST_FILES_DIR) is set. | ||
if [ -n "${APP_TEST_DIR}" ]; then | ||
PYTHON_CMD_ARGS+=("--test-source-dir" "${APP_TEST_DIR}") | ||
fi | ||
|
||
if [ -n "${APP_OLDBINDIR}" ]; then | ||
PYTHON_CMD_ARGS+=("--old-binaries-path" "${APP_OLDBINDIR}") | ||
fi | ||
|
||
# Forward FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY if set | ||
if [ -n "${FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY}" ]; then | ||
PYTHON_CMD_ARGS+=("--external-client-library" "${FDB_NETWORK_OPTION_EXTERNAL_CLIENT_DIRECTORY}") | ||
fi | ||
|
||
# TH_ARCHIVE_LOGS_ON_FAILURE is not passed on the command line: the Python app | ||
# reads it directly from the environment. To pass it explicitly instead, | ||
# uncomment the following: | ||
# if [ -n "${TH_ARCHIVE_LOGS_ON_FAILURE}" ]; then | ||
# PYTHON_CMD_ARGS+=("--archive-logs-on-failure" "${TH_ARCHIVE_LOGS_ON_FAILURE}") | ||
# fi | ||
|
||
# Forward TH_JOB_ID if set (Python app reads this from env if not on CLI) | ||
if [ -n "${APP_JOB_ID}" ]; then | ||
PYTHON_CMD_ARGS+=("--job-id" "${APP_JOB_ID}") | ||
fi | ||
|
||
echo "Python app executable: ${PYTHON_EXE} -m test_harness.app" >&2 | ||
echo "Python app arguments:" >&2 | ||
printf " %s\n" "${PYTHON_CMD_ARGS[@]}" >&2 | ||
echo "-----------------------------------------------------" >&2 | ||
|
||
|
||
# --- Execute the Python Test Harness Application --- | ||
PYTHON_APP_STDOUT_FILE="${APP_RUN_TEMP_DIR}/python_app_stdout.log" # Temporary capture | ||
PYTHON_APP_STDERR_FILE="${APP_RUN_TEMP_DIR}/python_app_stderr.log" # Temporary capture | ||
|
||
# Execute the Python app using the (overridable) PYTHON_EXE interpreter. | ||
# stdout and stderr are captured to the files above; the stdout file is | ||
# replayed to this script's stdout (which goes to Joshua) after the app exits. | ||
echo "Executing Python app..." >&2 | ||
"${PYTHON_EXE}" -m test_harness.app "${PYTHON_CMD_ARGS[@]}" > "${PYTHON_APP_STDOUT_FILE}" 2> "${PYTHON_APP_STDERR_FILE}" | ||
PYTHON_EXIT_CODE=$? | ||
echo "Python app execution finished. Exit code: ${PYTHON_EXIT_CODE}" >&2 | ||
|
||
# If the python app failed, log it for clarity. The script will continue, | ||
# print any available stdout, and then exit with the failure code. | ||
if [ "${PYTHON_EXIT_CODE}" -ne 0 ]; then | ||
echo "Error: Python application returned a non-zero exit code." >&2 | ||
fi | ||
|
||
OLDBINDIR="${OLDBINDIR:-/app/deploy/global_data/oldBinaries}" | ||
#mono bin/TestHarness.exe joshua-run "${OLDBINDIR}" false | ||
# Output the Python app's stdout (the single XML line) to this script's stdout | ||
if [ -f "${PYTHON_APP_STDOUT_FILE}" ]; then | ||
cat "${PYTHON_APP_STDOUT_FILE}" | ||
else | ||
echo "WARNING: Python app stdout file (${PYTHON_APP_STDOUT_FILE}) not found." >&2 | ||
# Output a fallback XML if Python produced no stdout | ||
echo '<Test Ok="0" Error="PythonAppNoStdout"><JoshuaMessage Severity="40" Message="Python application produced no stdout file." /></Test>' | ||
fi | ||
|
||
# export RARE_PRIORITY=20 | ||
-python3 -m test_harness.app -s ${JOSHUA_SEED} --old-binaries-path ${OLDBINDIR} | ||
exit ${PYTHON_EXIT_CODE} |
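
The cleanup routine in the script above compares `TH_ARCHIVE_LOGS_ON_FAILURE` and `TH_PRESERVE_TEMP_DIRS_ON_EXIT` against the literal string `true`. The following is a minimal sketch, not part of the PR, of how the same preserve-or-delete decision could accept other true-like spellings such as `1` or `yes`. The helper names `is_truthy` and `cleanup_decision` are hypothetical, and the set of accepted spellings is an assumption.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): succeeds for true-like values,
# compared case-insensitively. The accepted spellings are an assumption.
is_truthy() {
    case "$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')" in
        1|true|yes) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper (not in the PR): prints "preserve" or "delete".
#   $1 = exit code of the Python app (defaults to 0 if empty)
#   $2 = value of TH_ARCHIVE_LOGS_ON_FAILURE
#   $3 = value of TH_PRESERVE_TEMP_DIRS_ON_EXIT
cleanup_decision() {
    local exit_code="${1:-0}" archive="${2:-}" preserve="${3:-}"
    if is_truthy "${preserve}"; then
        echo "preserve"
    elif [ "${exit_code}" -ne 0 ] && is_truthy "${archive}"; then
        echo "preserve"
    else
        echo "delete"
    fi
}

cleanup_decision 1 yes ""    # failure with archival on: prints "preserve"
cleanup_decision 0 true ""   # success with archival on: prints "delete"
cleanup_decision 0 "" TRUE   # preserve-always set: prints "preserve"
```

Keeping the decision in a function that only prints its verdict makes it possible to exercise every combination of inputs without touching the filesystem; the `rm -rf` would stay in the caller.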
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# FoundationDB TestHarness2 | ||
|
||
This directory contains TestHarness2, a Python-based test harness for FoundationDB, designed to be invoked by the Joshua testing framework via scripts like `correctnessTest.sh`. In typical FoundationDB testing setups orchestrated by Joshua, this harness and the tests it runs are executed within Kubernetes pods. | ||
|
||
## Key Features | ||
* Parses FoundationDB trace event logs (`trace.*.xml` or `trace.*.json`). | ||
* Generates summary XML (`joshua.xml`) compatible with Joshua's expectations. | ||
* Supports configuration via command-line arguments and environment variables. | ||
* Includes an optional feature for preserving detailed logs on test failure to aid in debugging. | ||
|
||
## TestHarness2 Operation and Outputs | ||
|
||
Understanding how TestHarness2 operates and where it stores its output is essential for interpreting test results and debugging issues. | ||
|
||
### Unified Output Directory | ||
|
||
For each invocation, TestHarness2 (via its `correctnessTest.sh` wrapper) creates a single, consolidated output directory. This makes all artifacts from a single run easy to find. | ||
|
||
* **Location:** The base location is `TH_OUTPUT_DIR` if that environment variable is set, otherwise `DIAG_LOG_DIR`, otherwise `/tmp`. | ||
* **Naming Convention:** The directory is named `th_run_<seed>`, where `<seed>` is the unique Joshua seed for the run (e.g., `/tmp/th_run_6709478271895344724`). | ||
|
||
### Directory Structure | ||
|
||
Inside each `th_run_<seed>` directory, you will find a standardized structure: | ||
|
||
* `joshua_output/`: | ||
* **`joshua.xml`**: A comprehensive XML file containing detailed results and parsed events from all test parts. This is the most important file for a detailed analysis of the run. | ||
* **`app_log.txt`**: The main log file for the Python test harness application itself. Check this file first to debug issues with the harness, such as configuration errors or crashes. | ||
* Other summary files like `stats.json` or `run_times.json` if configured. | ||
|
||
* `run_files/`: | ||
* This directory contains a subdirectory for each individual test part that was executed. | ||
* Each per-test-part subdirectory contains: | ||
* `logs/`: The raw FoundationDB trace event logs (`trace.*.json`). | ||
* `command.txt`: The exact `fdbserver` command used for that test part. | ||
* `stdout.txt` / `stderr.txt`: The raw standard output/error from the `fdbserver` process for that part. | ||
|
||
### V1 Compatibility vs. Archival Mode | ||
|
||
TestHarness2 has two primary modes of operation, controlled by the `TH_ARCHIVE_LOGS_ON_FAILURE` environment variable. | ||
|
||
#### Default Behavior (`TH_ARCHIVE_LOGS_ON_FAILURE` is unset or `false`) | ||
|
||
* **V1 `stdout` Emulation:** For every test part (both success and failure), a single-line XML summary is printed to standard output. This is captured by Joshua and serves as the primary, persistent record of the test outcome. | ||
* **Cleanup:** The entire `th_run_<seed>` directory is **deleted** after the run completes, regardless of success or failure, unless `TH_PRESERVE_TEMP_DIRS_ON_EXIT=true` is set. | ||
|
||
#### Archival Mode (`TH_ARCHIVE_LOGS_ON_FAILURE=true`) | ||
|
||
This mode is designed to help debug failures by preserving all detailed logs and linking them directly from the summary. | ||
|
||
* **V1 `stdout` Emulation:** The harness continues to print the single-line XML summary to `stdout` for every test part, just like in the default mode. | ||
* **Log Referencing on Failure:** If a test part fails, special `<FDBClusterLogDir>`, `<HarnessLogFile>`, and other reference tags are injected into that test part's summary within the main `joshua_output/joshua.xml` file. These tags contain the absolute paths to the preserved log files and directories. | ||
* **Conditional Cleanup:** | ||
* If the test run is **successful**, the `th_run_<seed>` directory is **deleted**. | ||
* If the test run **fails**, the entire `th_run_<seed>` directory is **preserved**, allowing you to inspect all the artifacts and follow the paths referenced in the `joshua.xml`. | ||
|
||
**Example of enabling archival mode:** | ||
```bash | ||
joshua start --env TH_ARCHIVE_LOGS_ON_FAILURE=true --tarball /path/to/your/test.tar.gz | ||
``` | ||
|
||
### Summary of Outputs and Preservation: | ||
|
||
* **Joshua `stdout` (Always):** | ||
* Contains the official single-line XML summaries for each test part. This is the "V1 compatible" output. | ||
* **`/tmp/th_run_<seed>/` (Or `$TH_OUTPUT_DIR/th_run_<seed>/`):** | ||
* Contains all detailed artifacts: FDB traces, `joshua.xml`, `app_log.txt`, etc. | ||
* **Default Mode:** Deleted after every run. | ||
* **Archival Mode:** Preserved **only if** the run fails. Deleted on success. |
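
To make the directory layout described in the README concrete, here is a minimal sketch of locating a preserved run directory and listing its key artifacts. It assumes the same base-directory precedence as the wrapper script (`TH_OUTPUT_DIR`, then `DIAG_LOG_DIR`, then `/tmp`). The helper names `th_run_dir` and `list_run_artifacts` are hypothetical and are not part of TestHarness2.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): prints the run directory for a seed,
# using the wrapper's precedence: TH_OUTPUT_DIR, then DIAG_LOG_DIR, then /tmp.
th_run_dir() {
    local seed="$1"
    local base="${TH_OUTPUT_DIR:-${DIAG_LOG_DIR:-/tmp}}"
    printf '%s/th_run_%s\n' "${base}" "${seed}"
}

# Hypothetical helper (not in the PR): lists the artifacts the README
# describes, if they exist. Returns 1 if the run directory was not preserved.
list_run_artifacts() {
    local run_dir f
    run_dir="$(th_run_dir "$1")"
    if [ ! -d "${run_dir}" ]; then
        echo "No preserved run directory at ${run_dir}" >&2
        return 1
    fi
    # Summary XML and harness log live under joshua_output/.
    for f in joshua_output/joshua.xml joshua_output/app_log.txt; do
        if [ -f "${run_dir}/${f}" ]; then
            echo "${run_dir}/${f}"
        fi
    done
    # One subdirectory per executed test part lives under run_files/.
    if [ -d "${run_dir}/run_files" ]; then
        find "${run_dir}/run_files" -mindepth 1 -maxdepth 1 -type d
    fi
    return 0
}

# Prints /data/th_run_6709478271895344724
TH_OUTPUT_DIR=/data th_run_dir 6709478271895344724
```

After a failed run in archival mode, calling `list_run_artifacts <seed>` inside the pod would show which files are available to copy out before the pod goes away.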