I have created a 'cloudwatch-exporter.yml' file to fetch metrics from CloudWatch for RDS, Lambda, VPN-tunnel, ALB, CLB, and NLB. We are successfully obtaining metrics for RDS and Lambda, and we can see them in Prometheus. When there is an issue with RDS or Lambda, the alert rules go into a firing state and generate alerts. Unfortunately, we are not receiving alerts for VPN-tunnel, ALB, CLB, or NLB. Can you please help identify the reason? Below, you'll find the 'cloudwatch-exporter.yml' file and alert rules.
groups:
- name: VPNAlerts
  rules:
  # Alert if the average VPN tunnel state is less than 1 (indicating down) for 5 minutes
  - alert: VPNDownCritical
    expr: aws_vpn_tunnel_state_average < 1
    for: 5m
    labels:
      severity: critical
    annotations:
      LABELS: '{{ $labels }}'
      VALUE: '{{ $value }}'
      summary: 'VPN Tunnel Down Critical'
      description: 'At least one VPN tunnel is down.'
  # Alert if the average VPN tunnel state is less than 1 for 1 minute
  - alert: VPNDownWarning
    expr: aws_vpn_tunnel_state_average < 1
    for: 1m
    labels:
      severity: warning
    annotations:
      LABELS: '{{ $labels }}'
      VALUE: '{{ $value }}'
      summary: 'VPN Tunnel Down Warning'
      description: 'At least one VPN tunnel is down.'
  # Alert if there are changes in VPN tunnel state indicating flapping for 5 minutes
  - alert: VPNFlapping
    expr: changes(aws_vpn_tunnel_state_average[5m]) > 1
    for: 5m
    labels:
      severity: critical
    annotations:
      LABELS: '{{ $labels }}'
      VALUE: '{{ $value }}'
      summary: 'VPN Tunnel Flapping'
      description: 'At least one VPN tunnel is experiencing flapping.'
Cloudwatch Metrics here
rajualap changed the title from "[metrics]: short description here" to "[metrics]: Not able to go into a firing state when VPN tunnel is down for VPN-tunnel and ALB, CLB, & NLB" on Jan 19, 2024
What does aws_vpn_tunnel_state_average look like in the /metrics endpoint? What does it look like in the Prometheus graph and table views?
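It seems that you are using the default delay_seconds and set_timestamp. With those defaults the samples carry their original CloudWatch timestamps, which lag behind the scrape time, so they are not visible to an instant query evaluated at "now" in Prometheus, which is what your rules use; see the documentation for details.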
Try min_over_time(aws_vpn_tunnel_state_average[15m]) < 1 and changes(aws_vpn_tunnel_state_average[30m]) > 1 to look back further.
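As a rough sketch of that suggestion, the critical and flapping rules might look like the following. The rule names, labels, and `for` durations are carried over unchanged from the rules in the issue; only the expressions differ, and the 15m and 30m windows are the starting points suggested above, not tuned values.

```yaml
groups:
- name: VPNAlerts
  rules:
  # Look back 15 minutes so that samples carrying delayed CloudWatch
  # timestamps are still inside the query window.
  - alert: VPNDownCritical
    expr: min_over_time(aws_vpn_tunnel_state_average[15m]) < 1
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: 'VPN Tunnel Down Critical'
      description: 'At least one VPN tunnel is down.'
  # Count state changes over a 30 minute window for the same reason.
  - alert: VPNFlapping
    expr: changes(aws_vpn_tunnel_state_average[30m]) > 1
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: 'VPN Tunnel Flapping'
      description: 'At least one VPN tunnel is experiencing flapping.'
```

Note that a wider window also means the alert keeps matching for longer after the tunnel recovers, since the low sample stays inside the range until it ages out.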
Hi,
Please assist in resolving this issue
cloudwatch-exporter.yml file here :-
####################################
Prometheus VPNtunnel alerts file here 👎
Cloudwatch Metrics here