[Refactor] pod_failure_by_litmus to use ansible k8s module #824

Merged
merged 3 commits into litmuschaos:master on Oct 10, 2019

Conversation

@kmjayadeep (Contributor) commented Oct 9, 2019

Signed-off-by: Jayadeep KM [email protected]

What this PR does / why we need it:

Many of the ansible modules use shell commands with kubectl to access Kubernetes resources. This PR contributes towards litmuschaos/litmus-ansible#39 and #703 by converting them to the k8s and k8s_facts modules provided by Ansible.

Checklist

  • Does this PR have a corresponding GitHub issue?
  • Have you included relevant README for the chaoslib/experiment with details?
  • Have you added debug messages where necessary?
  • Have you added task comments where necessary?
  • Have you tested the changes for possible failure conditions?
  • Have you provided the positive & negative test logs for the litmusbook execution?
  • Does the litmusbook ensure idempotency of cluster state, i.e., is the cluster restored to its original state?
  • Have you used non-shell/command modules for Kubernetes tasks?
  • Have you (jinja) templatized custom scripts that are run by the litmusbook, if any?
  • Have you (jinja) templatized Kubernetes deployment manifests used by the litmusbook, if any?
  • Have you reused/created util functions instead of repeating tasks in the litmusbook?
  • Do the artifacts follow the appropriate directory structure?
  • Have you isolated storage (e.g., OpenEBS) specific implementations and checks?
  • Have you isolated platform (e.g., baremetal kubeadm/openshift/aws/gcloud) specific implementations and checks?
  • Are the ansible facts well defined? Is the scope explicitly set for playbook & included utils?
  • Have you ensured minimum/careful usage of shell utilities (awk, grep, sed, cut, xargs, etc.)?
  • Can the litmusbook be executed both from within & outside a container (configurable paths, no hardcoding)?
  • Can you suggest the minimal resource requirements for the litmusbook execution?
  • Does the litmusbook job artifact carry comments/default options/range for the ENV tunables?
  • Have the litmusbooks been linted?

Special notes for your reviewer:
Could not change the kill_pod task to the k8s module, as the module doesn't support the --grace-period parameter. I have also moved the pod-kill logic to a separate file, since Ansible doesn't support looping over blocks.
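For illustration, a minimal sketch of the refactored pattern: the task names match the run log below and the variable names follow the test playbook, but the loop construct and field values here are illustrative; the authoritative versions are in the PR diff.

# chaoslib/litmus/pod_failure_by_litmus.yml (sketch)
# Ansible cannot loop over a block, so the per-iteration kill logic
# lives in its own file and is included once per chaos iteration.
- name: Kill random pod
  include_tasks: kill_random_pod.yml
  with_sequence: start=1 end={{ chaos_iterations }}

# chaoslib/litmus/kill_random_pod.yml (sketch)
# k8s_facts replaces the earlier kubectl shell task for listing pods.
- name: Get a list of all pods from given namespace
  k8s_facts:
    kind: Pod
    namespace: "{{ a_ns }}"
    label_selectors:
      - "{{ a_label }}"
  register: pod_list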
Tested locally in minikube by invoking the following playbook with various parameters:

- hosts: localhost
  connection: local
  gather_facts: no

  vars:
    c_experiment: "pod-delete"
    c_duration: "15"
    c_interval: "5"
    c_force: "false"
    a_ns: "default"
    a_label: "run=myserver"
  
  tasks:
    - include_tasks: "./pod_failure_by_litmus.yml"
      vars:
        c_svc_acc: "litmus"

@ksatchit (Member) commented Oct 9, 2019

@kmjayadeep! Thanks for the refactor. Have a few comments, PTAL!

Would be great if you could post a final test output while running as a job (with a dev image for the runner)!

@kmjayadeep (Contributor, Author)

@ksatchit
Sure, I'll test and post the logs soon.

@kmjayadeep (Contributor, Author)

Here is the log from pod-delete, following the example from the docs:

ansible-playbook 2.7.3
  config file = None
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 2.7.15+ (default, Jul  9 2019, 16:51:35) [GCC 7.4.0]
No config file found; using defaults
/etc/ansible/hosts did not meet host_list requirements, check plugin documentation if this is unexpected
/etc/ansible/hosts did not meet script requirements, check plugin documentation if this is unexpected
statically imported: /experiments/generic/pod_delete/pod_delete_ansible_prerequisites.yml

PLAYBOOK: pod_delete_ansible_logic.yml *****************************************
1 plays in ./experiments/generic/pod_delete/pod_delete_ansible_logic.yml

PLAY [localhost] ***************************************************************
2019-10-10T04:15:32.039107 (delta: 0.034401)         elapsed: 0.034401 ******** 
=============================================================================== 

TASK [Gathering Facts] *********************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:2
2019-10-10T04:15:32.046337 (delta: 0.007205)         elapsed: 0.041631 ******** 
ok: [127.0.0.1]
META: ran handlers

TASK [Identify the chaos util to be invoked] ***********************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_prerequisites.yml:1
2019-10-10T04:15:36.838447 (delta: 4.792081)         elapsed: 4.833741 ******** 
changed: [127.0.0.1] => {"changed": true, "checksum": "c272806e35212135d7ba563c282a0ab52a9d56a4", "dest": "./chaosutil.yml", "gid": 0, "group": "root", "md5sum": "d138cf6836284fb164fe8bdf4032cafc", "mode": "0644", "owner": "root", "size": 54, "src": "/root/.ansible/tmp/ansible-tmp-1570680936.87-222949968912080/source", "state": "file", "uid": 0}

TASK [include_vars] ************************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:21
2019-10-10T04:15:37.253905 (delta: 0.41543)         elapsed: 5.249199 ********* 
ok: [127.0.0.1] => {"ansible_facts": {"c_util": "/chaoslib/litmus/pod_failure_by_litmus.yml"}, "ansible_included_var_files": ["/experiments/generic/pod_delete/chaosutil.yml"], "changed": false}

TASK [Construct chaos result name (experiment_name)] ***************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:27
2019-10-10T04:15:37.309037 (delta: 0.055098)         elapsed: 5.304331 ******** 
ok: [127.0.0.1] => {"ansible_facts": {"c_experiment": "engine-nginx-pod-delete"}, "changed": false}

TASK [include_tasks] ***********************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:34
2019-10-10T04:15:37.372436 (delta: 0.063374)         elapsed: 5.36773 ********* 
included: /utils/runtime/update_chaos_result_resource.yml for 127.0.0.1

TASK [Generate the litmus result CR to reflect SOT (Start of Test)] ************
task path: /utils/runtime/update_chaos_result_resource.yml:3
2019-10-10T04:15:37.439807 (delta: 0.067343)         elapsed: 5.435101 ******** 
changed: [127.0.0.1] => {"changed": true, "checksum": "762fa33f316adce43a6b17b938253877f285396a", "dest": "./chaos-result.yaml", "gid": 0, "group": "root", "md5sum": "16f40e6ba355987016c0dcb2d50b31dc", "mode": "0644", "owner": "root", "size": 342, "src": "/root/.ansible/tmp/ansible-tmp-1570680937.47-115629602203873/source", "state": "file", "uid": 0}

TASK [Apply the litmus result CR] **********************************************
task path: /utils/runtime/update_chaos_result_resource.yml:12
2019-10-10T04:15:37.694683 (delta: 0.254848)         elapsed: 5.689977 ******** 
changed: [127.0.0.1] => {"changed": true, "cmd": "kubectl apply -f chaos-result.yaml -n litmus", "delta": "0:00:00.678454", "end": "2019-10-10 04:15:38.596426", "failed_when_result": false, "rc": 0, "start": "2019-10-10 04:15:37.917972", "stderr": "", "stderr_lines": [], "stdout": "chaosresult.litmuschaos.io/engine-nginx-pod-delete created", "stdout_lines": ["chaosresult.litmuschaos.io/engine-nginx-pod-delete created"]}

TASK [Update the litmus result CR to reflect EOT (End of Test)] ****************
task path: /utils/runtime/update_chaos_result_resource.yml:22
2019-10-10T04:15:38.657713 (delta: 0.963008)         elapsed: 6.653007 ******** 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Apply the litmus result CR] **********************************************
task path: /utils/runtime/update_chaos_result_resource.yml:31
2019-10-10T04:15:38.698612 (delta: 0.04087)         elapsed: 6.693906 ********* 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Verify that the AUT (Application Under Test) is running] *****************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:41
2019-10-10T04:15:38.741010 (delta: 0.042366)         elapsed: 6.736304 ******** 
included: /utils/common/status_app_pod.yml for 127.0.0.1

TASK [Get the container status of application.] ********************************
task path: /utils/common/status_app_pod.yml:2
2019-10-10T04:15:38.805382 (delta: 0.064344)         elapsed: 6.800676 ******** 
changed: [127.0.0.1] => {"attempts": 1, "changed": true, "cmd": "kubectl get pod -n litmus -l run=\"myserver\" -o custom-columns=:..containerStatuses[].state --no-headers | grep -w \"running\"", "delta": "0:00:00.449115", "end": "2019-10-10 04:15:39.362912", "rc": 0, "start": "2019-10-10 04:15:38.913797", "stderr": "", "stderr_lines": [], "stdout": "map[running:map[startedAt:2019-10-10T04:14:54Z]]", "stdout_lines": ["map[running:map[startedAt:2019-10-10T04:14:54Z]]"]}

TASK [Checking {{ application_name }} pod is in running state] *****************
task path: /utils/common/status_app_pod.yml:13
2019-10-10T04:15:39.414519 (delta: 0.609116)         elapsed: 7.409813 ******** 
changed: [127.0.0.1] => {"attempts": 1, "changed": true, "cmd": "kubectl get pods -n litmus -o jsonpath='{.items[?(@.metadata.labels.run==\"myserver\")].status.phase}'", "delta": "0:00:00.445006", "end": "2019-10-10 04:15:39.971377", "rc": 0, "start": "2019-10-10 04:15:39.526371", "stderr": "", "stderr_lines": [], "stdout": "Running", "stdout_lines": ["Running"]}

TASK [include_tasks] ***********************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:52
2019-10-10T04:15:40.020742 (delta: 0.606197)         elapsed: 8.016036 ******** 
included: /chaoslib/litmus/pod_failure_by_litmus.yml for 127.0.0.1

TASK [Derive chaos iterations] *************************************************
task path: /chaoslib/litmus/pod_failure_by_litmus.yml:1
2019-10-10T04:15:40.091005 (delta: 0.070237)         elapsed: 8.086299 ******** 
ok: [127.0.0.1] => {"ansible_facts": {"chaos_iterations": "3"}, "changed": false}

TASK [Set min chaos count to 1 if interval > duration] *************************
task path: /chaoslib/litmus/pod_failure_by_litmus.yml:5
2019-10-10T04:15:40.150182 (delta: 0.059095)         elapsed: 8.145476 ******** 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Kill random pod] *********************************************************
task path: /chaoslib/litmus/pod_failure_by_litmus.yml:10
2019-10-10T04:15:40.190597 (delta: 0.040387)         elapsed: 8.185891 ******** 
included: /chaoslib/litmus/kill_random_pod.yml for 127.0.0.1 => (item=1)
included: /chaoslib/litmus/kill_random_pod.yml for 127.0.0.1 => (item=2)
included: /chaoslib/litmus/kill_random_pod.yml for 127.0.0.1 => (item=3)

TASK [Get a list of all pods from given namespace] *****************************
task path: /chaoslib/litmus/kill_random_pod.yml:1
2019-10-10T04:15:40.290337 (delta: 0.099709)         elapsed: 8.285631 ******** 
ok: [127.0.0.1] => {"changed": false, "resources": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"creationTimestamp": "2019-10-10T04:14:46Z", "generateName": "myserver-54c896dd7b-", "labels": {"pod-template-hash": "54c896dd7b", "run": "myserver"}, "name": "myserver-54c896dd7b-slfnq", "namespace": "litmus", "ownerReferences": [{"apiVersion": "apps/v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicaSet", "name": "myserver-54c896dd7b", "uid": "7eb6b098-eb14-11e9-9684-025000000001"}], "resourceVersion": "194453", "selfLink": "/api/v1/namespaces/litmus/pods/myserver-54c896dd7b-slfnq", "uid": "7eb77f1e-eb14-11e9-9684-025000000001"}, "spec": {"containers": [{"image": "nginx", "imagePullPolicy": "Always", "name": "myserver", "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "default-token-nrwrt", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "enableServiceLinks": true, "nodeName": "docker-desktop", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "default", "serviceAccountName": "default", "terminationGracePeriodSeconds": 30, "tolerations": [{"effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 300}, {"effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 300}], "volumes": [{"name": "default-token-nrwrt", "secret": {"defaultMode": 420, "secretName": "default-token-nrwrt"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:46Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:55Z", "status": "True", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:55Z", "status": "True", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:46Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"containerID": "docker://b7f3c2d8a77d507858d1825999ac0a8c64cd1ad71bba22ccf4c4c2093e992131", "image": "nginx:latest", "imageID": "docker-pullable://nginx@sha256:aeded0f2a861747f43a01cf1018cf9efe2bdd02afd57d2b11fcc7fcadc16ccd1", "lastState": {}, "name": "myserver", "ready": true, "restartCount": 0, "state": {"running": {"startedAt": "2019-10-10T04:14:54Z"}}}], "hostIP": "192.168.65.3", "phase": "Running", "podIP": "10.1.0.117", "qosClass": "BestEffort", "startTime": "2019-10-10T04:14:46Z"}}]}

TASK [Select a random pod to kill] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:9
2019-10-10T04:15:41.077600 (delta: 0.787232)         elapsed: 9.072894 ******** 
ok: [127.0.0.1] => {"ansible_facts": {"a_pod_to_kill": "myserver-54c896dd7b-slfnq"}, "changed": false}

TASK [debug] *******************************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:13
2019-10-10T04:15:41.137303 (delta: 0.059672)         elapsed: 9.132597 ******** 
ok: [127.0.0.1] => {
    "msg": "Killing pod myserver-54c896dd7b-slfnq"
}

TASK [Force Kill application pod] **********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:16
2019-10-10T04:15:41.197409 (delta: 0.060075)         elapsed: 9.192703 ******** 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Kill application pod] ****************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:24
2019-10-10T04:15:41.241614 (delta: 0.044175)         elapsed: 9.236908 ******** 
changed: [127.0.0.1] => {"changed": true, "cmd": "kubectl delete pod -n litmus --grace-period=0 --wait=false myserver-54c896dd7b-slfnq", "delta": "0:00:00.451195", "end": "2019-10-10 04:15:41.803372", "rc": 0, "start": "2019-10-10 04:15:41.352177", "stderr": "", "stderr_lines": [], "stdout": "pod \"myserver-54c896dd7b-slfnq\" deleted", "stdout_lines": ["pod \"myserver-54c896dd7b-slfnq\" deleted"]}

TASK [Wait for the interval timer] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:32
2019-10-10T04:15:41.895581 (delta: 0.653936)         elapsed: 9.890875 ******** 
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [127.0.0.1] => {"changed": false, "delta": 5, "echo": true, "rc": 0, "start": "2019-10-10 04:15:41.939472", "stderr": "", "stdout": "Paused for 5.0 seconds", "stop": "2019-10-10 04:15:46.939766", "user_input": ""}

TASK [Get a list of all pods from given namespace] *****************************
task path: /chaoslib/litmus/kill_random_pod.yml:1
2019-10-10T04:15:46.964414 (delta: 5.068799)         elapsed: 14.959708 ******* 
ok: [127.0.0.1] => {"changed": false, "resources": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"creationTimestamp": "2019-10-10T04:15:41Z", "generateName": "myserver-54c896dd7b-", "labels": {"pod-template-hash": "54c896dd7b", "run": "myserver"}, "name": "myserver-54c896dd7b-4fcct", "namespace": "litmus", "ownerReferences": [{"apiVersion": "apps/v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicaSet", "name": "myserver-54c896dd7b", "uid": "7eb6b098-eb14-11e9-9684-025000000001"}], "resourceVersion": "194544", "selfLink": "/api/v1/namespaces/litmus/pods/myserver-54c896dd7b-4fcct", "uid": "9f6b628f-eb14-11e9-9684-025000000001"}, "spec": {"containers": [{"image": "nginx", "imagePullPolicy": "Always", "name": "myserver", "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "default-token-nrwrt", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "enableServiceLinks": true, "nodeName": "docker-desktop", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "default", "serviceAccountName": "default", "terminationGracePeriodSeconds": 30, "tolerations": [{"effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 300}, {"effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 300}], "volumes": [{"name": "default-token-nrwrt", "secret": {"defaultMode": 420, "secretName": "default-token-nrwrt"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "message": "containers with unready status: [myserver]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "message": "containers with unready status: [myserver]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "nginx", "imageID": "", "lastState": {}, "name": "myserver", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "192.168.65.3", "phase": "Pending", "qosClass": "BestEffort", "startTime": "2019-10-10T04:15:41Z"}}, {"apiVersion": "v1", "kind": "Pod", "metadata": {"creationTimestamp": "2019-10-10T04:14:46Z", "deletionGracePeriodSeconds": 1, "deletionTimestamp": "2019-10-10T04:15:42Z", "generateName": "myserver-54c896dd7b-", "labels": {"pod-template-hash": "54c896dd7b", "run": "myserver"}, "name": "myserver-54c896dd7b-slfnq", "namespace": "litmus", "ownerReferences": [{"apiVersion": "apps/v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicaSet", "name": "myserver-54c896dd7b", "uid": "7eb6b098-eb14-11e9-9684-025000000001"}], "resourceVersion": "194549", "selfLink": "/api/v1/namespaces/litmus/pods/myserver-54c896dd7b-slfnq", "uid": "7eb77f1e-eb14-11e9-9684-025000000001"}, "spec": {"containers": [{"image": "nginx", "imagePullPolicy": "Always", "name": "myserver", "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": 
"default-token-nrwrt", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "enableServiceLinks": true, "nodeName": "docker-desktop", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "default", "serviceAccountName": "default", "terminationGracePeriodSeconds": 30, "tolerations": [{"effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 300}, {"effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 300}], "volumes": [{"name": "default-token-nrwrt", "secret": {"defaultMode": 420, "secretName": "default-token-nrwrt"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:46Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:43Z", "message": "containers with unready status: [myserver]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:43Z", "message": "containers with unready status: [myserver]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:14:46Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "nginx", "imageID": "", "lastState": {}, "name": "myserver", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "192.168.65.3", "phase": "Pending", "qosClass": "BestEffort", "startTime": "2019-10-10T04:14:46Z"}}]}

TASK [Select a random pod to kill] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:9
2019-10-10T04:15:47.458410 (delta: 0.493968)         elapsed: 15.453704 ******* 
ok: [127.0.0.1] => {"ansible_facts": {"a_pod_to_kill": "myserver-54c896dd7b-slfnq"}, "changed": false}

TASK [debug] *******************************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:13
2019-10-10T04:15:47.516377 (delta: 0.05794)         elapsed: 15.511671 ******** 
ok: [127.0.0.1] => {
    "msg": "Killing pod myserver-54c896dd7b-slfnq"
}

TASK [Force Kill application pod] **********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:16
2019-10-10T04:15:47.572658 (delta: 0.056251)         elapsed: 15.567952 ******* 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Kill application pod] ****************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:24
2019-10-10T04:15:47.616056 (delta: 0.043326)         elapsed: 15.61135 ******** 
changed: [127.0.0.1] => {"changed": true, "cmd": "kubectl delete pod -n litmus --grace-period=0 --wait=false myserver-54c896dd7b-slfnq", "delta": "0:00:00.445696", "end": "2019-10-10 04:15:48.210796", "rc": 0, "start": "2019-10-10 04:15:47.765100", "stderr": "", "stderr_lines": [], "stdout": "pod \"myserver-54c896dd7b-slfnq\" deleted", "stdout_lines": ["pod \"myserver-54c896dd7b-slfnq\" deleted"]}

TASK [Wait for the interval timer] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:32
2019-10-10T04:15:48.256716 (delta: 0.640628)         elapsed: 16.25201 ******** 
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [127.0.0.1] => {"changed": false, "delta": 4, "echo": true, "rc": 0, "start": "2019-10-10 04:15:48.293678", "stderr": "", "stdout": "Paused for 4.97 seconds", "stop": "2019-10-10 04:15:53.259709", "user_input": ""}

TASK [Get a list of all pods from given namespace] *****************************
task path: /chaoslib/litmus/kill_random_pod.yml:1
2019-10-10T04:15:53.285095 (delta: 5.02835)         elapsed: 21.280389 ******** 
ok: [127.0.0.1] => {"changed": false, "resources": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"creationTimestamp": "2019-10-10T04:15:41Z", "generateName": "myserver-54c896dd7b-", "labels": {"pod-template-hash": "54c896dd7b", "run": "myserver"}, "name": "myserver-54c896dd7b-4fcct", "namespace": "litmus", "ownerReferences": [{"apiVersion": "apps/v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicaSet", "name": "myserver-54c896dd7b", "uid": "7eb6b098-eb14-11e9-9684-025000000001"}], "resourceVersion": "194560", "selfLink": "/api/v1/namespaces/litmus/pods/myserver-54c896dd7b-4fcct", "uid": "9f6b628f-eb14-11e9-9684-025000000001"}, "spec": {"containers": [{"image": "nginx", "imagePullPolicy": "Always", "name": "myserver", "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "default-token-nrwrt", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "enableServiceLinks": true, "nodeName": "docker-desktop", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "default", "serviceAccountName": "default", "terminationGracePeriodSeconds": 30, "tolerations": [{"effect": "NoExecute", "key": "node.kubernetes.io/not-ready", "operator": "Exists", "tolerationSeconds": 300}, {"effect": "NoExecute", "key": "node.kubernetes.io/unreachable", "operator": "Exists", "tolerationSeconds": 300}], "volumes": [{"name": "default-token-nrwrt", "secret": {"defaultMode": 420, "secretName": "default-token-nrwrt"}}]}, "status": {"conditions": [{"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:49Z", "status": "True", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:49Z", "status": "True", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2019-10-10T04:15:41Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"containerID": "docker://95cce2e27ebe5594aa547516d672d34d7363a39df184048ba97a4dcb3b4a17e9", "image": "nginx:latest", "imageID": "docker-pullable://nginx@sha256:aeded0f2a861747f43a01cf1018cf9efe2bdd02afd57d2b11fcc7fcadc16ccd1", "lastState": {}, "name": "myserver", "ready": true, "restartCount": 0, "state": {"running": {"startedAt": "2019-10-10T04:15:49Z"}}}], "hostIP": "192.168.65.3", "phase": "Running", "podIP": "10.1.0.120", "qosClass": "BestEffort", "startTime": "2019-10-10T04:15:41Z"}}]}

TASK [Select a random pod to kill] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:9
2019-10-10T04:15:53.789193 (delta: 0.504064)         elapsed: 21.784487 ******* 
ok: [127.0.0.1] => {"ansible_facts": {"a_pod_to_kill": "myserver-54c896dd7b-4fcct"}, "changed": false}

TASK [debug] *******************************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:13
2019-10-10T04:15:53.852998 (delta: 0.063774)         elapsed: 21.848292 ******* 
ok: [127.0.0.1] => {
    "msg": "Killing pod myserver-54c896dd7b-4fcct"
}

TASK [Force Kill application pod] **********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:16
2019-10-10T04:15:53.914315 (delta: 0.061284)         elapsed: 21.909609 ******* 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Kill application pod] ****************************************************
task path: /chaoslib/litmus/kill_random_pod.yml:24
2019-10-10T04:15:53.958732 (delta: 0.044369)         elapsed: 21.954026 ******* 
changed: [127.0.0.1] => {"changed": true, "cmd": "kubectl delete pod -n litmus --grace-period=0 --wait=false myserver-54c896dd7b-4fcct", "delta": "0:00:00.444983", "end": "2019-10-10 04:15:54.524053", "rc": 0, "start": "2019-10-10 04:15:54.079070", "stderr": "", "stderr_lines": [], "stdout": "pod \"myserver-54c896dd7b-4fcct\" deleted", "stdout_lines": ["pod \"myserver-54c896dd7b-4fcct\" deleted"]}

TASK [Wait for the interval timer] *********************************************
task path: /chaoslib/litmus/kill_random_pod.yml:32
2019-10-10T04:15:54.602713 (delta: 0.64395)         elapsed: 22.598007 ******** 
Pausing for 5 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [127.0.0.1] => {"changed": false, "delta": 5, "echo": true, "rc": 0, "start": "2019-10-10 04:15:54.638877", "stderr": "", "stdout": "Paused for 5.0 seconds", "stop": "2019-10-10 04:15:59.639149", "user_input": ""}

TASK [Verify AUT liveness post fault-injection] ********************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:58
2019-10-10T04:15:59.665218 (delta: 5.062468)         elapsed: 27.660512 ******* 
included: /utils/common/status_app_pod.yml for 127.0.0.1

TASK [Get the container status of application.] ********************************
task path: /utils/common/status_app_pod.yml:2
2019-10-10T04:15:59.727409 (delta: 0.062164)         elapsed: 27.722703 ******* 
FAILED - RETRYING: Get the container status of application. (150 retries left).
changed: [127.0.0.1] => {"attempts": 2, "changed": true, "cmd": "kubectl get pod -n litmus -l run=\"myserver\" -o custom-columns=:..containerStatuses[].state --no-headers | grep -w \"running\"", "delta": "0:00:00.432895", "end": "2019-10-10 04:16:02.814363", "rc": 0, "start": "2019-10-10 04:16:02.381468", "stderr": "", "stderr_lines": [], "stdout": "map[running:map[startedAt:2019-10-10T04:16:00Z]]", "stdout_lines": ["map[running:map[startedAt:2019-10-10T04:16:00Z]]"]}

TASK [Checking {{ application_name }} pod is in running state] *****************
task path: /utils/common/status_app_pod.yml:13
2019-10-10T04:16:02.860502 (delta: 3.133067)         elapsed: 30.855796 ******* 
changed: [127.0.0.1] => {"attempts": 1, "changed": true, "cmd": "kubectl get pods -n litmus -o jsonpath='{.items[?(@.metadata.labels.run==\"myserver\")].status.phase}'", "delta": "0:00:00.456678", "end": "2019-10-10 04:16:03.426813", "rc": 0, "start": "2019-10-10 04:16:02.970135", "stderr": "", "stderr_lines": [], "stdout": "Running", "stdout_lines": ["Running"]}

TASK [set_fact] ****************************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:67
2019-10-10T04:16:03.476420 (delta: 0.615891)         elapsed: 31.471714 ******* 
ok: [127.0.0.1] => {"ansible_facts": {"flag": "pass"}, "changed": false}

TASK [include_tasks] ***********************************************************
task path: /experiments/generic/pod_delete/pod_delete_ansible_logic.yml:77
2019-10-10T04:16:03.530486 (delta: 0.053984)         elapsed: 31.52578 ******** 
included: /utils/runtime/update_chaos_result_resource.yml for 127.0.0.1

TASK [Generate the litmus result CR to reflect SOT (Start of Test)] ************
task path: /utils/runtime/update_chaos_result_resource.yml:3
2019-10-10T04:16:03.597460 (delta: 0.066853)         elapsed: 31.592754 ******* 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Apply the litmus result CR] **********************************************
task path: /utils/runtime/update_chaos_result_resource.yml:12
2019-10-10T04:16:03.635415 (delta: 0.037877)         elapsed: 31.630709 ******* 
skipping: [127.0.0.1] => {"changed": false, "skip_reason": "Conditional result was False"}

TASK [Update the litmus result CR to reflect EOT (End of Test)] ****************
task path: /utils/runtime/update_chaos_result_resource.yml:22
2019-10-10T04:16:03.674933 (delta: 0.039442)         elapsed: 31.670227 ******* 
changed: [127.0.0.1] => {"changed": true, "checksum": "7bd582f3deb9392050eef348952bd8ab5c04664b", "dest": "./chaos-result.yaml", "gid": 0, "group": "root", "md5sum": "0e766db40cb0d0418c002f63e27d264b", "mode": "0644", "owner": "root", "size": 339, "src": "/root/.ansible/tmp/ansible-tmp-1570680963.71-184898221330502/source", "state": "file", "uid": 0}

TASK [Apply the litmus result CR] **********************************************
task path: /utils/runtime/update_chaos_result_resource.yml:31
2019-10-10T04:16:04.704416 (delta: 1.029453)         elapsed: 32.69971 ******** 
changed: [127.0.0.1] => {"changed": true, "cmd": "kubectl apply -f chaos-result.yaml -n litmus", "delta": "0:00:00.485783", "end": "2019-10-10 04:16:05.313841", "failed_when_result": false, "rc": 0, "start": "2019-10-10 04:16:04.828058", "stderr": "", "stderr_lines": [], "stdout": "chaosresult.litmuschaos.io/engine-nginx-pod-delete configured", "stdout_lines": ["chaosresult.litmuschaos.io/engine-nginx-pod-delete configured"]}
META: ran handlers
META: ran handlers

PLAY RECAP *********************************************************************
127.0.0.1                  : ok=37   changed=12   unreachable=0    failed=0   

2019-10-10T04:16:05.340573 (delta: 0.636126)         elapsed: 33.335867 ******* 
=============================================================================== 

And here is the chaosresult

Name:         engine-nginx-pod-delete
Namespace:    litmus
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"litmuschaos.io/v1alpha1","kind":"ChaosResult","metadata":{"annotations":{},"name":"engine-nginx-pod-delete","namespace":"li...
API Version:  litmuschaos.io/v1alpha1
Kind:         ChaosResult
Metadata:
  Creation Timestamp:  2019-10-10T04:15:38Z
  Generation:          2
  Resource Version:    194606
  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/litmus/chaosresults/engine-nginx-pod-delete
  UID:                 9d80f0c9-eb14-11e9-9684-025000000001
Spec:
  Experimentstatus:
    Phase:    <nil>
    Verdict:  pass
Events:       <none>

@kmjayadeep (Contributor, Author)

It took me a while to figure out how to run the experiment with the changed code :)
I pushed the image to kmjayadeep/ansible-runner and changed the experiments.yaml file to use this image. Please suggest if there is a way to force the use of the local image instead of pulling it every time; in that case, I can just build and tag it as litmuschaos/ansible-runner:ci.

I had to modify the ansible-runner Dockerfile to include two extra dependencies (openshift and jmespath), which are needed to run the k8s module.
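A minimal sketch of that Dockerfile change; the surrounding Dockerfile layout is an assumption, and only the two package names come from this PR:

# added to the ansible-runner Dockerfile (sketch):
# openshift is the Python client required by the k8s/k8s_facts modules;
# jmespath backs Ansible's json_query filter
RUN pip install openshift jmespath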

@ksatchit (Member) commented Oct 10, 2019

Typically, the approach is to run the playbook on localhost (i.e., your work environment, which has access to the cluster via kubeconfig) before testing the final runs from a dev image as a job, similar to what you have done. You could also just run the dev image with a long sleep and test these playbooks from inside the pod, if you wish to avoid setting up Ansible in your local workspace.
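A sketch of that sleep-based approach; the pod name, sleep duration, and verbosity flag are illustrative, while the image and playbook path match this PR's run log:

# keep the dev runner image alive, then run the playbook from inside the pod
kubectl run ansible-dev --image=kmjayadeep/ansible-runner --restart=Never -- sleep 36000
kubectl exec -it ansible-dev -- ansible-playbook /experiments/generic/pod_delete/pod_delete_ansible_logic.yml -vv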

Once you are done with the test, you can submit the jobs as is (with the official tag); Travis CI will take care of building and tagging the image.

@ksatchit ksatchit merged commit cf3675f into litmuschaos:master Oct 10, 2019