Option to prevent startup if tag fetching fails #2886

stevenmatthewt · 2024-07-12T20:25:08Z

Is your feature request related to a problem? Please describe.
We use --tags-from-ec2-tags to set several tags automatically based on the EC2 instance the agent is running on. One of the tags we set this way is the queue tag. Recently, one of our agents failed to fetch those EC2 tags on startup (unfortunately I don't have logs showing the underlying error), but the agent started anyway. As a result it came up in the default queue, which was incorrect and caused us a bit of a headache.

Describe the solution you'd like
Currently, the agent just logs errors when fetching tags. Since that functionality is pretty critical to us, I'd love to have an option to block startup by erroring/panicking instead.

Blocking startup would be a good safe default, but it's also a breaking change, so just having another config option to enable the "strict" behavior here would work well for us.
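To make the requested behavior concrete, here is a minimal sketch of the difference between the current "log and continue" handling and an opt-in strict mode. It is not the agent's actual code: `resolve_tags`, `TagFetchError`, and the `strict` parameter are hypothetical names chosen for illustration, and a real implementation would presumably wait between attempts.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger("agent-startup")


class TagFetchError(RuntimeError):
    """Hypothetical error raised when tags cannot be fetched in strict mode."""


def resolve_tags(
    fetch: Callable[[], Dict[str, str]],
    strict: bool = False,
    attempts: int = 5,
) -> Dict[str, str]:
    """Try `fetch` up to `attempts` times.

    If every attempt fails, strict mode raises so the caller can abort
    startup; otherwise the error is logged and an empty tag set is returned.
    """
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return fetch()
        except Exception as exc:  # any failure counts as a failed attempt
            last_error = exc
    if strict:
        raise TagFetchError(
            f"could not fetch tags after {attempts} attempts"
        ) from last_error
    logger.error("failed to fetch tags, continuing without them: %s", last_error)
    return {}
```

With `strict=False` the caller gets an empty dict and carries on, which corresponds to the situation described above; with `strict=True` the exception would stop startup before the agent registers on the wrong queue.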

Describe alternatives you've considered
Alternatively, if there were a way to configure buildkite-agent to require a queue to be configured, and not to automatically use the default queue, that would work. We tried configuring tags="queue=not-functional-queue" in the config file, in the hope that it would be "overridden" by the EC2 tag. But it seems that just causes the agent to listen on both queues.
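A rough sketch of what a "require a queue" check could look like, assuming tags are handled as plain `key=value` strings. Both helpers are hypothetical and not part of buildkite-agent. The first also shows why a placeholder queue plus an EC2-provided queue ends up as two queues rather than one: nothing in a simple merge of tag lists makes one entry replace the other.

```python
from typing import Iterable, List


def queues_from_tags(tags: Iterable[str]) -> List[str]:
    """Return every queue named by a 'queue=<name>' entry, in order."""
    queues = []
    for tag in tags:
        key, sep, value = tag.partition("=")
        if sep and key.strip() == "queue" and value.strip():
            queues.append(value.strip())
    return queues


def require_single_queue(tags: Iterable[str]) -> str:
    """Return the one configured queue, or raise if there are zero or several."""
    queues = queues_from_tags(tags)
    if len(queues) != 1:
        raise ValueError(f"expected exactly one queue tag, found {queues!r}")
    return queues[0]
```

Running `require_single_queue` over the merged tag list before the agent registers would reject both the "no queue tag at all" case and the "placeholder plus real queue" case.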

Or, if there was a way to inspect the config of the Buildkite Agent after it starts, we could use that to verify that the queue was set properly.

patrobinson · 2024-07-17T04:05:15Z

Hi @stevenmatthewt, thanks for the details. It looks like we use the API to get the tags and retry 5 times, so this should only fail if Amazon's API was having a bad time.

It looks like we could instead use the metadata endpoint to retrieve tags, which should be a lot more reliable and wouldn't be a breaking change.
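For reference, a minimal sketch of reading instance tags from the EC2 instance metadata service (IMDSv2) rather than the EC2 API. This is an illustration, not the agent's implementation: the function names are made up here, and it assumes that access to tags in instance metadata has been enabled on the instance, since the `tags/instance` path is not served otherwise.

```python
import urllib.request
from typing import Callable, Dict

IMDS = "http://169.254.169.254/latest"


def imds_get(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata path using an IMDSv2 session token."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def instance_tags(get: Callable[[str], str] = imds_get) -> Dict[str, str]:
    """List tag keys under tags/instance, then read each key's value."""
    keys = get("tags/instance").splitlines()
    return {key: get(f"tags/instance/{key}") for key in keys if key}
```

The `get` parameter exists only so the lookup can be exercised without a real instance, for example by passing the `__getitem__` of a dict that maps paths to canned responses.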

stevenmatthewt · 2024-07-30T13:17:56Z

Yeah, I think using the metadata API would be a pretty reasonable change to make internally. I'd honestly still love a way to make failures block startup, as I think the existing behavior is a little backwards and confusing.
