From 2fe9891fb6df6ae8adde530c72a5211bf240ee95 Mon Sep 17 00:00:00 2001 From: Matheus Nascimento Date: Mon, 20 May 2024 16:34:44 -0300 Subject: [PATCH 1/3] makefile: Add `docs-live` target This enables faster iteration on documentation. Signed-off-by: Matheus Nascimento --- Makefile | 5 +++++ docs/Makefile | 12 ++++++++++++ 2 files changed, 17 insertions(+) diff --git a/Makefile b/Makefile index de9a9424..cb3027da 100644 --- a/Makefile +++ b/Makefile @@ -93,6 +93,11 @@ docs: ## Generate documentation $(MAKE) -C docs clean $(MAKE) -C docs html +.PHONY: docs-live +docs-live: ## Serve documentation on localhost:8000, with live-reload + $(MAKE) -C docs clean + $(MAKE) -C docs livehtml + .PHONY: gh-pages gh-pages: ## Publish documentation on GitHub Pages $(eval GIT_REMOTE := $(shell git remote get-url $(UPSTREAM_GIT_REMOTE))) diff --git a/docs/Makefile b/docs/Makefile index e718d771..e76826bb 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -1,6 +1,8 @@ # Minimal makefile for Sphinx documentation # +SPHINXAUTOBUILD = sphinx-autobuild + # You can set these variables from the command line. SPHINXOPTS = -n -W --keep-going SPHINXBUILD = sphinx-build @@ -22,3 +24,13 @@ html: $(SPHINXBUILD) -W -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html @echo @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." + +.PHONY: man +man: + $(SPHINXBUILD) -b man -D exclude_patterns= $(ALLSPHINXOPTS) $(BUILDDIR)/man + @echo + @echo "Build finished. The man pages are in $(BUILDDIR)/man." 
+ +.PHONY: livehtml +livehtml: + $(SPHINXAUTOBUILD) -a $(ALLSPHINXOPTS) $(BUILDDIR)/html From bb99d86f0ae0311f414ebe0f2cf5b014e6cdbec0 Mon Sep 17 00:00:00 2001 From: Matheus Nascimento Date: Mon, 20 May 2024 16:35:42 -0300 Subject: [PATCH 2/3] docs: Add a tutorial on debugging a deadlock Signed-off-by: Matheus Nascimento --- docs/tutorials/deadlock.py | 30 +++++ docs/tutorials/deadlock.rst | 237 ++++++++++++++++++++++++++++++++++++ 2 files changed, 267 insertions(+) create mode 100644 docs/tutorials/deadlock.py create mode 100644 docs/tutorials/deadlock.rst diff --git a/docs/tutorials/deadlock.py b/docs/tutorials/deadlock.py new file mode 100644 index 00000000..48d38cdb --- /dev/null +++ b/docs/tutorials/deadlock.py @@ -0,0 +1,30 @@ +import os +import threading +import time + + +def background(first_lock, second_lock): + with first_lock: + print(" First lock acquired") + time.sleep(1) + with second_lock: + print(" Second lock acquired") + + +if __name__ == "__main__": + print(f"Process ID: {os.getpid()}") + lock_a = threading.Lock() + lock_b = threading.Lock() + + t1 = threading.Thread(target=background, args=(lock_a, lock_b)) + t2 = threading.Thread(target=background, args=(lock_b, lock_a)) + + print("Starting First Thread") + t1.start() + print("Starting Second Thread") + t2.start() + + t1.join() + t2.join() + + print("Finished execution") diff --git a/docs/tutorials/deadlock.rst b/docs/tutorials/deadlock.rst new file mode 100644 index 00000000..30135707 --- /dev/null +++ b/docs/tutorials/deadlock.rst @@ -0,0 +1,237 @@ +Deadlock +======== + +Intro +----- + +This lesson is meant to familiarize you with PyStack with a classic problem: lock acquisition. + +In this exercise, we will intentionally create a lock ordering issue, which is a common way of +causing a deadlock, where two or more threads are all waiting for the others to release resources, +causing the program to hang indefinitely. 
+ +Development Environment Setup +----------------------------- + +Navigate to the `PyStack GitHub repo `_ and get a copy of the +source code. You can either clone it or download the zip, whichever you prefer. + +You will also need a terminal with a reasonably recent version of ``python3`` installed. + +Once you have the repo ready, ``cd`` into the ``docs/tutorials`` folder: + +.. code:: shell + + cd docs/tutorials/ + +This is where we will run the tests and exercises for the remainder of the tutorial. + +Let's go ahead and set up our virtual environment. For reference, here are the official `python3 venv +docs `_. You can also just follow along with the +commands below. + +.. code:: shell + + python3 -m venv .venv + +Once your virtual environment has been created, you can activate it like so: + +.. code:: shell + + source .venv/bin/activate + +Your terminal prompt will be prefixed with ``(.venv)`` to show that activation was successful. +With our virtual environment ready, we can go ahead and install PyStack: + +.. code:: shell + + python3 -m pip install pystack + +Keep your virtual environment activated for the rest of the tutorial, and you should be able to run +any of the commands in the exercises. + +Debugging a running process +--------------------------- + +``pystack remote`` lets you analyze the status of a running ("remote") process. + +Triggering the deadlock +^^^^^^^^^^^^^^^^^^^^^^^ + +In the ``docs/tutorials`` directory, there is a script called ``deadlock.py``: + +.. literalinclude:: deadlock.py + :linenos: + +Since we navigated to that directory above, we can run the deadlock script with: + +.. code:: shell + + python3 deadlock.py & + +This script will intentionally deadlock. The ``&`` causes the process to be run in the background, +so that you're still able to run commands in the current terminal once it has deadlocked. The output +will contain the process ID, so this is the expected output: + +..
code:: shell + + Process ID: + Starting First Thread + First lock acquired + Starting Second Thread + First lock acquired + +You could also find the PID with: + +.. code:: shell + + ps aux | grep deadlock.py + +After the deadlock occurs we can use the ``pystack`` command to analyze the process (replace +```` with the process ID from the previous step): + +.. code:: shell + + pystack remote + +If you see ``Operation not permitted``, you may need to instead run it with: + +.. code:: shell + + sudo -E pystack remote + + +Understanding the results +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The expected result is output similar to this: + +.. code:: python + + Traceback for thread 789 (python3) [] (most recent call last): + (Python) File "//threading.py", line 966, in _bootstrap + self._bootstrap_inner() + (Python) File "//threading.py", line 1009, in _bootstrap_inner + self.run() + (Python) File "//threading.py", line 946, in run + self._target(*self._args, **self._kwargs) + (Python) File "//deadlock.py", line 10, in background + with second_lock: + + Traceback for thread 456 (python3) [] (most recent call last): + (Python) File "//threading.py", line 966, in _bootstrap + self._bootstrap_inner() + (Python) File "//threading.py", line 1009, in _bootstrap_inner + self.run() + (Python) File "//threading.py", line 946, in run + self._target(*self._args, **self._kwargs) + (Python) File "//deadlock.py", line 10, in background + with second_lock: + + Traceback for thread 123 (python3) [] (most recent call last): + (Python) File "//deadlock.py", line 27, in + t1.join() + (Python) File "//threading.py", line 1089, in join + self._wait_for_tstate_lock() + (Python) File "//threading.py", line 1109, in _wait_for_tstate_lock + if lock.acquire(block, timeout): + +Notice that each section is displaying a running thread, and there are three threads here: + +1. Thread 123 is the original thread that creates the other two, and waits for them +2. Thread 456 is ``t1`` +3. 
Thread 789 is ``t2`` + +Each thread has a stack trace: + +- Thread 789 is trying to acquire ``lock_a`` but is blocked because ``lock_a`` is already held + by thread 456. +- Thread 456 is trying to acquire ``lock_b`` but is blocked because ``lock_b`` is already held + by thread 789. +- The main thread 123 is calling ``join()`` on ``t1``, waiting for it to finish. However, ``t1`` + cannot finish because it is stuck waiting for ``lock_b``, which ``t2`` holds. + +We can see that this is a deadlock: every thread is willing to wait forever for some condition that +will never happen, due to the improper lock acquisition ordering. + +Exploring more features +^^^^^^^^^^^^^^^^^^^^^^^ + +PyStack has some features that can help us diagnose the problem. Using ``--locals`` you can obtain +a simple string representation of the local variables in the different frames as well as the +function arguments. + +When you run: + +.. code:: shell + + pystack remote --locals + +The expected result is: + +.. code:: shell + + Traceback for thread 789 (python3) [] (most recent call last): + (Python) File "//threading.py", line 966, in _bootstrap + self._bootstrap_inner() + Arguments: + self: + (Python) File "//threading.py", line 1009, in _bootstrap_inner + self.run() + Arguments: + self: + (Python) File "//threading.py", line 946, in run + self._target(*self._args, **self._kwargs) + Arguments: + self: + (Python) File "//deadlock.py", line 10, in background + with second_lock: + Arguments: + second_lock: <_thread.lock at 0x7f0c04b90900> + first_lock: <_thread.lock at 0x7f0c04b90b40> + + Traceback for thread 456 (python3) [] (most recent call last): + (Python) File "//threading.py", line 966, in _bootstrap + self._bootstrap_inner() + Arguments: + self: + (Python) File "//threading.py", line 1009, in _bootstrap_inner + self.run() + Arguments: + self: + (Python) File "//threading.py", line 946, in run + self._target(*self._args, **self._kwargs) + Arguments: + self: + (Python) File "//deadlock.py", line 10, in
background + with second_lock: + Arguments: + second_lock: <_thread.lock at 0x7f0c04b90b40> + first_lock: <_thread.lock at 0x7f0c04b90900> + + Traceback for thread 123 (python3) [] (most recent call last): + (Python) File "//deadlock.py", line 27, in + t1.join() + (Python) File "//threading.py", line 1089, in join + self._wait_for_tstate_lock() + Arguments: + timeout: None + self: + (Python) File "//threading.py", line 1109, in _wait_for_tstate_lock + if lock.acquire(block, timeout): + Arguments: + timeout: -1 + block: True + self: + Locals: + lock: <_thread.lock at 0x7f0c04bae100> + +Observe that we have the same format of result, with one section for each thread. +However, there is now more information: the local variables and function arguments. + +- In threads 789 and 456 we can see each lock's address: one thread's ``first_lock`` is the other's ``second_lock``. +- In the main thread 123 we can inspect the arguments of ``join()``, and see that no timeout + was set (``timeout: None``) and that ``self`` refers to the thread object ````. Moreover, in ``_wait_for_tstate_lock`` we see that ``timeout`` is ``-1``, which + represents an indefinite wait, and ``block`` is ``True``, meaning the call will block until the lock + is acquired. From 7b472371fb5adca1cb7b0faf53b70179fd9e8cd6 Mon Sep 17 00:00:00 2001 From: Matheus Nascimento Date: Mon, 20 May 2024 16:38:14 -0300 Subject: [PATCH 3/3] docs: Add tutorials section to the overview page Signed-off-by: Matheus Nascimento --- docs/overview.rst | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/overview.rst b/docs/overview.rst index cfbb38cd..a0d0da65 100644 --- a/docs/overview.rst +++ b/docs/overview.rst @@ -48,7 +48,16 @@ Contents :hidden: :caption: Project Information - changelog +.. toctree:: + :hidden: + :caption: Hands-on Tutorial + + tutorials/deadlock + +.. toctree:: + :caption: Project Information + + changelog Indices and tables
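The tutorial in PATCH 2/3 attributes the hang to improper lock acquisition ordering. As a companion to ``deadlock.py``, here is a minimal sketch of one common remedy: every thread acquires the locks in the same global order. It is not part of the patch series; the helper ``ordered``, the function ``run``, and the ``results`` list are hypothetical names introduced only for illustration, and sorting by ``id()`` is just one possible way to define a consistent order.

```python
import threading


def ordered(*locks):
    # Hypothetical helper: sort by id() so that every caller sees the
    # same acquisition order, regardless of argument order.
    return sorted(locks, key=id)


def background(first_lock, second_lock, results, name):
    # Unlike deadlock.py, the argument order no longer decides which
    # lock is taken first; the global order from ordered() does.
    outer, inner = ordered(first_lock, second_lock)
    with outer:
        with inner:
            results.append(name)


def run():
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    results = []

    # Same swapped arguments as in deadlock.py.
    t1 = threading.Thread(target=background, args=(lock_a, lock_b, results, "t1"))
    t2 = threading.Thread(target=background, args=(lock_b, lock_a, results, "t2"))

    t1.start()
    t2.start()
    # The join timeout is only a safety net for this sketch.
    t1.join(timeout=5)
    t2.join(timeout=5)
    return sorted(results)


if __name__ == "__main__":
    print(run())
```

Because both threads take the locks in the same order, neither can hold one lock while waiting for the other, so both threads should finish.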