Conversation

mergify[bot]
Contributor

@mergify mergify bot commented Nov 20, 2024

What does this PR do?

Use the failure_threshold setting introduced in elastic/beats#41570 in the self-monitoring configuration, so that elastic-agent does not report DEGRADED when it fails to fetch metrics because a component is starting or stopping.
The failure threshold defaults to 2, and it can be configured via the config file or Fleet policy.

Why is it important?

This avoids misrepresenting the agent status when a single metrics fetch fails.
See #5332

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • [ ] I have added an integration test or an E2E test

Disruptive User Impact

How to test this PR locally

Related issues

Questions to ask yourself

  • How are we going to support this in production?
  • How are we going to measure its adoption?
  • How are we going to debug this?
  • What are the metrics I should take care of?
  • ...

This is an automatic backport of pull request #5999 done by [Mergify](https://mergify.com).

* Add failureThreshold to elastic-agent self-monitoring config

(cherry picked from commit 2a46509)

# Conflicts:
#	internal/pkg/agent/application/monitoring/v1_monitor.go
@mergify mergify bot added the backport and conflicts (There is a conflict in the backported pull request) labels Nov 20, 2024
@mergify mergify bot requested a review from a team as a code owner November 20, 2024 21:37
@mergify mergify bot requested review from michalpristas and pchila and removed request for a team November 20, 2024 21:37
@mergify mergify bot assigned pchila Nov 20, 2024
Contributor Author

mergify bot commented Nov 20, 2024

Cherry-pick of 2a46509 has failed:

On branch mergify/bp/8.15/pr-5999
Your branch is up to date with 'origin/8.15'.

You are currently cherry-picking commit 2a4650974e.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   internal/pkg/agent/application/coordinator/diagnostics_test.go
	modified:   internal/pkg/agent/application/monitoring/v1_monitor_test.go
	modified:   internal/pkg/core/monitoring/config/config.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   internal/pkg/agent/application/monitoring/v1_monitor.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@pchila pchila removed the request for review from michalpristas November 22, 2024 10:10
Member

pchila commented Nov 22, 2024

Closing this backport as elastic/beats#40565 is not on 8.15 branch

@pchila pchila closed this Nov 22, 2024
@v1v v1v deleted the mergify/bp/8.15/pr-5999 branch July 24, 2025 08:58