Skip to content

Conversation

mabbott-aurorasolar
Copy link

See #4746 for full context.

In short, this PR will add metrics (optionally) to record whether or not a given runner pool was sufficient to handle a job when it arrives. This will allow users (specifically, my company) to better understand whether we're right-sizing our pools or not.

@mabbott-aurorasolar mabbott-aurorasolar requested review from a team as code owners September 4, 2025 14:46
@npalm npalm requested a review from Copilot September 4, 2025 18:42
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds optional pool sufficiency metrics to track whether runner pools are adequately sized to handle incoming jobs. The feature allows users to monitor if their pool configurations are right-sized by recording metrics when jobs arrive.

  • Adds a new enable_pool_sufficiency configuration option across all relevant modules
  • Implements metric collection logic in the scale-up function to track pool adequacy
  • Includes comprehensive tests to verify the metric behavior under different scenarios

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
variables.tf Adds enable_pool_sufficiency option to root module metrics configuration
modules/runners/variables.tf Adds enable_pool_sufficiency option to runners module metrics configuration
modules/runners/scale-up.tf Passes pool sufficiency metric environment variable to scale-up Lambda
modules/runners/job-retry/variables.tf Adds enable_pool_sufficiency option to job-retry module configuration
modules/runners/job-retry/main.tf Passes pool sufficiency metric environment variable to job-retry Lambda
modules/multi-runner/variables.tf Adds enable_pool_sufficiency option to multi-runner module metrics configuration
lambdas/functions/control-plane/src/scale-runners/scale-up.ts Implements pool sufficiency metric collection logic
lambdas/functions/control-plane/src/scale-runners/scale-up.test.ts Adds comprehensive tests for pool sufficiency metric functionality
examples/multi-runner/main.tf Documents the new enable_pool_sufficiency option in example
examples/default/main.tf Documents the new enable_pool_sufficiency option in example

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +484 to +493
function createPoolSufficiencyMetric(environment: string, payload: ActionRequestMessage, wasSufficient: boolean) {
if (yn(process.env.ENABLE_METRIC_POOL_SUFFICIENCY, { default: false })) {
const metric = createSingleMetric('SufficientPoolHosts', MetricUnit.Count, wasSufficient ? 1.0 : 0.0, {
Environment: environment,
});
metric.addMetadata('Environment', environment);
metric.addMetadata('RepositoryName', payload.repositoryName);
metric.addMetadata('RepositoryOwner', payload.repositoryOwner);
}
}
Copy link
Preview

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Environment dimension is added twice - once in the metric creation and again as metadata. The dimension in the metric creation (line 487) should be sufficient for grouping metrics, making the duplicate metadata on line 489 redundant.

Copilot uses AI. Check for mistakes.

return {
...actual,
// eslint-disable-next-line @typescript-eslint/no-unused-vars
createSingleMetric: vi.fn((name: string, unit: string, value: number, dimensions?: Record<string, string>) => {
Copy link
Preview

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mock function parameter types don't match the actual createSingleMetric function signature. The second parameter should be MetricUnit enum type, not string. This could lead to test inconsistencies.

Copilot uses AI. Check for mistakes.

@npalm
Copy link
Member

npalm commented Sep 4, 2025

@mabbott-aurorasolar thx for our PR. I have the next day not much time to dig in. Are you considering the PR ready? If not you can mark it as draft. I see no documentation, at least this part of the docs needs to be adjusted: https://github-aws-runners.github.io/terraform-aws-github-runner/configuration/#multiple-runner-module-in-your-aws-account

@mabbott-aurorasolar
Copy link
Author

@npalm great callout on the docs. I'll update those as well, and follow-up on copilot's suggestions. I honestly threw this together in an hour this morning and wasn't sure what the usual process on this repo is, but yeah I do think this is mostly feature complete.

@npalm
Copy link
Member

npalm commented Sep 23, 2025

Sorry for keep you wainting. Will do my best to put the PR on the list for the next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants