Count pods probe is inconsistent. #32

Open
chaosdudu opened this issue Feb 14, 2019 · 0 comments

Hi, we have been trying the count_pods probe and found its behaviour inconsistent.
When we give a range as the tolerance, the pod count increases, and the count returned by the probe is also incorrect.
The phase check is inconsistent as well: pods_in_phase does not recognise running pods and reports them as Pending, even though they are in the Running state when checked with kubectl.
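
For reference, here is a minimal sketch of how I would expect a two-element tolerance to be evaluated, assuming it is treated as an inclusive [lower, upper] range. The function name within_range is hypothetical and this is not the Chaos Toolkit's own code; the values are the ones reported by count_pods in the logs below.

```python
def within_range(value, tolerance):
    """Return True if value lies in the inclusive [lower, upper] range.

    Hypothetical helper: assumes a two-element tolerance means a range.
    """
    lower, upper = tolerance
    return lower <= value <= upper


# Values reported by count_pods in the logs
print(within_range(2, [1, 3]))  # first run, before and after terminate_pods -> True
print(within_range(2, [1, 2]))  # second run, before terminate_pods -> True
print(within_range(4, [1, 2]))  # second run, after terminate_pods -> False
```

Under that reading the range check itself seems to behave as expected; what looks wrong to us is the count of 4 returned after the two pods were terminated.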

[2019-02-01 14:49:52 DEBUG] The Chaos Toolkit settings file could not be found at '/home/jakob/.chaostoolkit/settings.yaml'.
[2019-02-01 14:49:52 DEBUG] Building activity cache...
[2019-02-01 14:49:52 DEBUG] Cached 2 activities
[2019-02-01 14:49:52 INFO] Validating the experiment's syntax
[2019-02-01 14:49:52 DEBUG] Loading configuration...
[2019-02-01 14:49:52 DEBUG] Loading secrets...
[2019-02-01 14:49:52 DEBUG] Secrets loaded
[2019-02-01 14:49:52 INFO] Experiment looks valid
[2019-02-01 14:49:52 DEBUG] Clearing activities cache
[2019-02-01 14:49:52 DEBUG] Building activity cache...
[2019-02-01 14:49:52 DEBUG] Cached 2 activities
[2019-02-01 14:49:52 INFO] Running experiment: Test janie-nginx Resilience - at least one pod
[2019-02-01 14:49:52 DEBUG] Loading configuration...
[2019-02-01 14:49:52 DEBUG] Loading secrets...
[2019-02-01 14:49:52 DEBUG] Secrets loaded
[2019-02-01 14:49:52 DEBUG] Initializing controls
[2019-02-01 14:49:52 INFO] Steady state hypothesis: Prometheus running as expected
[2019-02-01 14:49:52 INFO] Probe: count_pods
[2019-02-01 14:49:52 DEBUG] Activity 'count_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py'
[2019-02-01 14:49:52 DEBUG] Using Kubernetes context: default
[2019-02-01 14:49:52 DEBUG] Found 2 pods matching label 'app=janie-nginx' in ns 'chaos'
[2019-02-01 14:49:52 DEBUG] => succeeded with '2'
[2019-02-01 14:49:52 DEBUG] allowed tolerance is [1, 3]
[2019-02-01 14:49:52 INFO] Steady state hypothesis is met!
[2019-02-01 14:49:52 INFO] Action: terminate_pods
[2019-02-01 14:49:52 DEBUG] Activity 'terminate_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/actions.py'
[2019-02-01 14:49:52 DEBUG] Using Kubernetes context: default
[2019-02-01 14:49:52 DEBUG] Found 2 pods labelled 'app=janie-nginx' in ns chaos
[2019-02-01 14:49:52 DEBUG] Pod 'janie-nginx-5795fbf867-l6l4b' match pattern
[2019-02-01 14:49:52 DEBUG] Pod 'janie-nginx-5795fbf867-vcv2p' match pattern
[2019-02-01 14:49:52 DEBUG] Picked pods 'janie-nginx-5795fbf867-l6l4b,janie-nginx-5795fbf867-vcv2p' to be terminated
[2019-02-01 14:49:52 DEBUG] => succeeded without any result value
[2019-02-01 14:49:52 INFO] Pausing after activity for 5s...
[2019-02-01 14:49:57 INFO] Steady state hypothesis: Prometheus running as expected
[2019-02-01 14:49:57 INFO] Probe: count_pods
[2019-02-01 14:49:57 DEBUG] Activity 'count_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py'
[2019-02-01 14:49:57 DEBUG] Using Kubernetes context: default
[2019-02-01 14:49:58 DEBUG] Found 2 pods matching label 'app=janie-nginx' in ns 'chaos'
[2019-02-01 14:49:58 DEBUG] => succeeded with '2'
[2019-02-01 14:49:58 DEBUG] allowed tolerance is [1, 3]
[2019-02-01 14:49:58 INFO] Steady state hypothesis is met!
[2019-02-01 14:49:58 INFO] Let's rollback...
[2019-02-01 14:49:58 INFO] No declared rollbacks, let's move on.
[2019-02-01 14:49:58 INFO] Experiment ended with status: completed
[2019-02-01 14:49:58 DEBUG] Cleaning up controls
[2019-02-01 14:49:58 DEBUG] Clearing activities cache
[2019-02-01 14:51:00 DEBUG] ###############################################################################
[2019-02-01 14:51:00 DEBUG] Running command 'run'
[2019-02-01 14:51:00 DEBUG] Using settings file '/home/jakob/.chaostoolkit/settings.yaml'
[2019-02-01 14:51:01 WARNING]
There is a new version (1.0.0rc3) of the chaostoolkit available.
You may upgrade by typing:

$ pip install -U chaostoolkit

Please review changes at https://github.com/chaostoolkit/chaostoolkit/blob/master/CHANGELOG.md

[2019-02-01 14:51:01 DEBUG] The Chaos Toolkit settings file could not be found at '/home/jakob/.chaostoolkit/settings.yaml'.
[2019-02-01 14:51:01 DEBUG] Building activity cache...
[2019-02-01 14:51:01 DEBUG] Cached 2 activities
[2019-02-01 14:51:01 INFO] Validating the experiment's syntax
[2019-02-01 14:51:01 DEBUG] Loading configuration...
[2019-02-01 14:51:01 DEBUG] Loading secrets...
[2019-02-01 14:51:01 DEBUG] Secrets loaded
[2019-02-01 14:51:01 INFO] Experiment looks valid
[2019-02-01 14:51:01 DEBUG] Clearing activities cache
[2019-02-01 14:51:01 DEBUG] Building activity cache...
[2019-02-01 14:51:01 DEBUG] Cached 2 activities
[2019-02-01 14:51:01 INFO] Running experiment: Test janie-nginx Resilience - at least one pod
[2019-02-01 14:51:01 DEBUG] Loading configuration...
[2019-02-01 14:51:01 DEBUG] Loading secrets...
[2019-02-01 14:51:01 DEBUG] Secrets loaded
[2019-02-01 14:51:01 DEBUG] Initializing controls
[2019-02-01 14:51:01 INFO] Steady state hypothesis: Prometheus running as expected
[2019-02-01 14:51:01 INFO] Probe: count_pods
[2019-02-01 14:51:01 DEBUG] Activity 'count_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py'
[2019-02-01 14:51:01 DEBUG] Using Kubernetes context: default
[2019-02-01 14:51:03 DEBUG] Found 2 pods matching label 'app=janie-nginx' in ns 'chaos'
[2019-02-01 14:51:03 DEBUG] => succeeded with '2'
[2019-02-01 14:51:03 DEBUG] allowed tolerance is [1, 2]
[2019-02-01 14:51:03 INFO] Steady state hypothesis is met!
[2019-02-01 14:51:03 INFO] Action: terminate_pods
[2019-02-01 14:51:03 DEBUG] Activity 'terminate_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/actions.py'
[2019-02-01 14:51:03 DEBUG] Using Kubernetes context: default
[2019-02-01 14:51:03 DEBUG] Found 2 pods labelled 'app=janie-nginx' in ns chaos
[2019-02-01 14:51:03 DEBUG] Pod 'janie-nginx-5795fbf867-7wjdt' match pattern
[2019-02-01 14:51:03 DEBUG] Pod 'janie-nginx-5795fbf867-zkbvf' match pattern
[2019-02-01 14:51:03 DEBUG] Picked pods 'janie-nginx-5795fbf867-7wjdt,janie-nginx-5795fbf867-zkbvf' to be terminated
[2019-02-01 14:51:03 DEBUG] => succeeded without any result value
[2019-02-01 14:51:03 INFO] Pausing after activity for 10s...
[2019-02-01 14:51:13 INFO] Steady state hypothesis: Prometheus running as expected
[2019-02-01 14:51:13 INFO] Probe: count_pods
[2019-02-01 14:51:13 DEBUG] Activity 'count_pods' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py'
[2019-02-01 14:51:13 DEBUG] Using Kubernetes context: default
[2019-02-01 14:51:14 DEBUG] Found 4 pods matching label 'app=janie-nginx' in ns 'chaos'
[2019-02-01 14:51:14 DEBUG] => succeeded with '4'
[2019-02-01 14:51:14 DEBUG] allowed tolerance is [1, 2]
[2019-02-01 14:51:14 CRITICAL] Steady state probe 'count_pods' is not in the given tolerance so failing this experiment
[2019-02-01 14:51:14 INFO] Let's rollback...
[2019-02-01 14:51:14 INFO] No declared rollbacks, let's move on.
[2019-02-01 14:51:14 INFO] Experiment ended with status: deviated
[2019-02-01 14:51:14 INFO] The steady-state has deviated, a weakness may have been discovered
[2019-02-01 14:51:14 DEBUG] Cleaning up controls
[2019-02-01 14:51:14 DEBUG] Clearing activities cache
[2019-02-01 14:52:21 DEBUG] ###############################################################################
[2019-02-01 14:52:21 DEBUG] Running command 'run'
[2019-02-01 14:52:21 DEBUG] Using settings file '/home/jakob/.chaostoolkit/settings.yaml'
[2019-02-01 14:52:22 WARNING]
There is a new version (1.0.0rc3) of the chaostoolkit available.
You may upgrade by typing:

---------------____________________________------------------------------_____________________________
[2019-02-01 14:31:47 DEBUG] Activity 'pods_in_phase' loaded from '/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py'
[2019-02-01 14:31:47 DEBUG] Using Kubernetes context: default
[2019-02-01 14:31:47 DEBUG] Found 4 pods matching label 'app=janie-nginx' in ns 'chaos'
[2019-02-01 14:31:47 DEBUG] Activity failed
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/chaoslib/provider/python.py", line 57, in run_python_activity
    return func(**arguments)
  File "/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py", line 105, in pods_in_phase
    name=label_selector, s=d.status.phase, p=phase))
chaoslib.exceptions.ActivityFailed: pod 'app=janie-nginx' is in phase 'Pending' but should be 'Running'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/chaoslib/activity.py", line 224, in run_activity
    result = run_python_activity(activity, configuration, secrets)
  File "/usr/lib/python3.7/site-packages/chaoslib/provider/python.py", line 62, in run_python_activity
    sys.exc_info()[2])
  File "/usr/lib/python3.7/site-packages/chaoslib/provider/python.py", line 57, in run_python_activity
    return func(**arguments)
  File "/usr/lib/python3.7/site-packages/chaosk8s/pod/probes.py", line 105, in pods_in_phase
    name=label_selector, s=d.status.phase, p=phase))
chaoslib.exceptions.ActivityFailed: chaoslib.exceptions.ActivityFailed: pod 'app=janie-nginx' is in phase 'Pending' but should be 'Running'

[2019-02-01 14:31:47 ERROR] => failed: chaoslib.exceptions.ActivityFailed: pod 'app=janie-nginx' is in phase 'Pending' but should be 'Running'
[2019-02-01 14:31:47 WARNING] Probe terminated unexpectedly, so its tolerance could not be validated
[2019-02-01 14:31:47 CRITICAL] Steady state probe 'pods_in_phase' is not in the given tolerance so failing this experiment
[2019-02-01 14:31:47 INFO] Let's rollback...
[2019-02-01 14:31:47 INFO] No declared rollbacks, let's move on.
[2019-02-01 14:31:47 INFO] Experiment ended with status: deviated
[2019-02-01 14:31:47 INFO] The steady-state has deviated, a weakness may have been discovered
[2019-02-01 14:31:47 DEBUG] Cleaning up controls
[2019-02-01 14:31:47 DEBUG] Clearing activities cache
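
The traceback suggests that the probe fails as soon as one matching pod reports a phase other than the expected one. Below is a rough sketch of that behaviour, using plain dicts as stand-ins for pod objects. The helper names, the exception class, and the sample pods are all made up for illustration; this is not the chaosk8s code and not data from our cluster.

```python
from collections import Counter


class PhaseMismatch(Exception):
    """Hypothetical stand-in for the failure raised by the probe."""


def tally_phases(pods):
    """Count how many pods report each phase."""
    return Counter(pod["phase"] for pod in pods)


def check_all_in_phase(pods, phase="Running"):
    """Raise on the first pod whose phase differs from the expected one."""
    for pod in pods:
        if pod["phase"] != phase:
            raise PhaseMismatch(
                "pod '{}' is in phase '{}' but should be '{}'".format(
                    pod["name"], pod["phase"], phase
                )
            )
    return True


# Illustrative data only: four matching pods, one of them still Pending
sample_pods = [
    {"name": "pod-a", "phase": "Running"},
    {"name": "pod-b", "phase": "Running"},
    {"name": "pod-c", "phase": "Running"},
    {"name": "pod-d", "phase": "Pending"},
]

print(dict(tally_phases(sample_pods)))
try:
    check_all_in_phase(sample_pods)
except PhaseMismatch as exc:
    print("failed:", exc)
```

If the probe works along these lines, a single Pending pod among the four matched would be enough to fail the check, even when the other pods show as Running in kubectl.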
