Capture profiling statistics (#1110)
* Add placeholder function for recording stats

* scale_run returns the simulation object

* Record population dataframe size, memory usage in MB, and times extended (see the stats-recording sketch after this list)

* Sort imports

* Force ignore of profiling results directory, and force typing of stats

* Avoid list values in stats file

* Isort pass number 2

* Add psutil to requirements to capture disk I/O during profiling

* Apply suggestions from code review

Co-authored-by: Matt Graham <[email protected]>

* Capture disk statistics too

* Capture disk I/O status

* Alter output format to optionally include HTML, and default-include stats

* Avoid creation of pyisession file, manage through stats file

* Lint file

* Obey the linter

* isort hook

* Apply suggestions from code review

* Inline help string definition

* Remove separate paths and parameters modules

* Inline help string and use default arg formatter

* Simplify paths and sim outputs to profiling directory

* Use relative import

* Use passed directory rather than constant

* Factor out saving arguments to JSON

Also wrap ignore_warnings into scale_run function

* Use ignore_warnings argument

* Use relative import

* Change save args function call signature

* Refactor run_profiling output writing

* Revert relative imports

Only work when within a package

* Refactoring of run_profiling

Decrease chance of keyword arg name collisions
Handle missing value in command-line args
Remove unnecessary logic for dealing with missing args
Correct type annotations

* Make logging of population checksum optional

* Change profiling run arguments

* Correct scale_run type annotations

* Add option to save raw profiler output

* Expose key simulation params from run_profiling

* Parse key-value pairs directly in parse_args

* Rename CLI argument for consistency with function

* Unpack CLI args into function call

* Add basic check to key-value parsing

* Use variable to set profiling results directory in workflow

* Exclude profiling_results from distributions

* Remove whitespace to satisfy isort

* Ensure html_output Path converted to string to avoid JSON error

* Increase scheduled profiling job timeout
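
The bullets above describe the statistics that get recorded: population dataframe size, memory usage in MB, flat (non-list) values only, and html_output Paths converted to strings before JSON serialisation. The sketch below is a rough illustration only, with hypothetical function and key names rather than the actual API of the changed profiling scripts:

```python
import json
from pathlib import Path
from typing import Any, Dict

import pandas as pd
import psutil


def record_population_stats(population_df: pd.DataFrame) -> Dict[str, Any]:
    """Collect flat, JSON-serialisable statistics about the population dataframe.

    Hypothetical sketch: the real profiling scripts define their own keys.
    """
    process = psutil.Process()
    return {
        # Dataframe shape as plain integers (no list values in the stats file)
        "pop_df_rows": int(population_df.shape[0]),
        "pop_df_cols": int(population_df.shape[1]),
        # Dataframe memory footprint in MB
        "pop_df_mem_mb": float(population_df.memory_usage(index=True, deep=True).sum()) / 1e6,
        # Resident memory of the profiled process in MB (via psutil)
        "process_rss_mb": process.memory_info().rss / 1e6,
    }


def save_stats(stats: Dict[str, Any], output_file: Path) -> None:
    """Write stats to JSON, converting any Path values to strings to avoid serialisation errors."""
    serialisable = {
        key: str(value) if isinstance(value, Path) else value
        for key, value in stats.items()
    }
    with open(output_file, "w") as f:
        json.dump(serialisable, f, indent=2)
```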

---------

Co-authored-by: Matt Graham <[email protected]>
Co-authored-by: Matt Graham <[email protected]>
3 people authored Nov 27, 2023
1 parent 1dee535 commit e7cb080
Showing 10 changed files with 438 additions and 145 deletions.
35 changes: 29 additions & 6 deletions .github/workflows/run-profiling.yaml
@@ -32,8 +32,11 @@ jobs:
name: Create unique output file identifier and artifact name
runs-on: ubuntu-latest
outputs:
profiling-output-dir: profiling_results/
profiling-filename: ${{ steps.set-profiling-filename.outputs.name }}
artifact-name: ${{ steps.set-artifact-name.outputs.name }}
profiling-on-sha: ${{ steps.set-github-info.outputs.sha }}
profiling-event-trigger: ${{ steps.set-github-info.outputs.event }}
steps:
- id: set-profiling-filename
name: Set profiling output file name
@@ -45,6 +48,12 @@
run: |
echo "name=profiling_results_${GITHUB_RUN_NUMBER}" >> "${GITHUB_OUTPUT}"
- id: set-github-info
name: Fix Git and GitHub information when passing between workflows
run: |
echo "sha=${GITHUB_SHA}" >> "${GITHUB_OUTPUT}"
echo "event=${GITHUB_EVENT_NAME}" >> "${GITHUB_OUTPUT}"
profile-on-comment:
name: Comment triggered profiling
if: github.event_name == 'issue_comment'
@@ -54,11 +63,17 @@
runs-on: "[self-hosted, test]"
keyword: profiling
commands: |
tox -vv -e profile -- --output_name ${{ needs.set-variables.outputs.profiling-filename }}
tox -vv -e profile -- \
--html \
--root-output-dir ${{ needs.set-variables.outputs.profiling-output-dir }} \
--output-name ${{ needs.set-variables.outputs.profiling-filename }} \
--additional-stats \
sha=${{ needs.set-variables.outputs.profiling-on-sha }} \
trigger=${{ needs.set-variables.outputs.profiling-event-trigger }}
description: Profiled run of the model
timeout-minutes: 8640
application-organization: UCL
artifact-path: profiling_results/
artifact-path: ${{ needs.set-variables.outputs.profiling-output-dir }}
artifact-name: ${{ needs.set-variables.outputs.artifact-name }}
artifact-retention-days: 1
secrets:
@@ -70,6 +85,7 @@
if: ${{ github.event_name != 'issue_comment' }}
needs: set-variables
runs-on: [self-hosted, test]
timeout-minutes: 8640
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -78,14 +94,21 @@

## The profile environment produces outputs in the /results directory
- name: Run profiling in dev environment
run: tox -vv -e profile -- --output_name ${{ needs.set-variables.outputs.profiling-filename }}
run: |
tox -vv -e profile -- \
--html \
--output-dir ${{ needs.set-variables.outputs.profiling-output-dir }} \
--output-name ${{ needs.set-variables.outputs.profiling-filename }} \
--additional-stats \
sha=${{ needs.set-variables.outputs.profiling-on-sha }} \
trigger=${{ needs.set-variables.outputs.profiling-event-trigger }}
## Upload the output as an artifact so we can push it to the profiling repository
- name: Save results as artifact
uses: actions/upload-artifact@v3
with:
name: ${{ needs.set-variables.outputs.artifact-name }}
path: profiling_results/
path: ${{ needs.set-variables.outputs.profiling-output-dir }}
retention-days: 1

upload-profiling-results:
@@ -108,7 +131,7 @@ jobs:
uses: actions/download-artifact@v3
with:
name: ${{ needs.set-variables.outputs.artifact-name }}
path: profiling_results/
path: ${{ needs.set-variables.outputs.profiling-output-dir }}

## The token provided needs contents and pages access to the target repo
## Token can be (re)generated by a member of the UCL organisation,
@@ -119,7 +142,7 @@
env:
API_TOKEN_GITHUB: ${{ secrets.PROFILING_REPO_ACCESS }}
with:
source_file: profiling_results/
source_file: ${{ needs.set-variables.outputs.profiling-output-dir }}
destination_repo: UCL/TLOmodel-profiling
destination_folder: .
destination_branch: results
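
The workflow above passes --additional-stats as space-separated key=value pairs (here the commit sha and the triggering event). Below is a minimal sketch of parsing and checking such pairs with argparse; the argument names mirror the invocation above but are illustrative, not the actual run_profiling interface.

```python
import argparse
from typing import Dict, List


def parse_key_value_pairs(pairs: List[str]) -> Dict[str, str]:
    """Split KEY=VALUE strings into a dict, with a basic sanity check on each pair."""
    parsed = {}
    for pair in pairs:
        if pair.count("=") != 1:
            raise ValueError(f"Additional stats must be KEY=VALUE pairs, got: {pair!r}")
        key, value = pair.split("=")
        parsed[key] = value
    return parsed


parser = argparse.ArgumentParser(
    formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument("--html", action="store_true", help="Also write pyinstrument HTML output.")
parser.add_argument("--root-output-dir", type=str, default="profiling_results/")
parser.add_argument("--output-name", type=str, default=None)
parser.add_argument(
    "--additional-stats",
    nargs="*",
    default=[],
    metavar="KEY=VALUE",
    help="Extra key=value pairs recorded alongside the profiling statistics.",
)

# Example: roughly the scheduled-run invocation from the workflow above
args = parser.parse_args(
    ["--html", "--additional-stats", "sha=e7cb080", "trigger=schedule"]
)
print(parse_key_value_pairs(args.additional_stats))  # {'sha': 'e7cb080', 'trigger': 'schedule'}
```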
3 changes: 3 additions & 0 deletions .gitignore
@@ -120,5 +120,8 @@ tlo.conf
outputs/*
!outputs/README

# ignore all files in profiling_results directory
profiling_results/

# ignore _version.py file generated by setuptools_scm
src/**/_version.py
4 changes: 1 addition & 3 deletions MANIFEST.in
@@ -15,6 +15,7 @@ recursive-exclude .ci *
recursive-exclude deploy *
recursive-exclude outputs *
recursive-exclude resources *
recursive-exclude profiling_results *

# some files inside of docs are generated
recursive-exclude docs/reference tlo.*.rst
@@ -25,6 +26,3 @@ exclude docs/hsi_events.csv
exclude docs/_contributors_list.html

global-exclude *.py[cod] __pycache__ *.so *.dylib .ipynb_checkpoints/** *~

# exclude profiling scripts from minimal build
recursive-exclude profiling *
1 change: 1 addition & 0 deletions pyproject.toml
@@ -56,6 +56,7 @@ dev = [
"pylint",
"ruff",
# Profiling
"psutil",
"pyinstrument>=4.3",
# Building requirements files
"pip-tools",
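
psutil is added as a profiling dependency so the run can capture disk I/O alongside memory usage. The snippet below is only an illustration of the kind of measurement psutil enables; the key names are invented for this sketch and are not the ones the profiling scripts actually write.

```python
import psutil

# System-wide disk I/O counters, snapshotted before and after the profiled run
before = psutil.disk_io_counters()
# ... run the simulation under the profiler ...
after = psutil.disk_io_counters()

disk_stats = {
    "disk_reads": after.read_count - before.read_count,
    "disk_writes": after.write_count - before.write_count,
    "disk_read_mb": (after.read_bytes - before.read_bytes) / 1e6,
    "disk_write_mb": (after.write_bytes - before.write_bytes) / 1e6,
}
```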
26 changes: 25 additions & 1 deletion requirements/dev.txt
@@ -1,5 +1,5 @@
#
# This file is autogenerated by pip-compile with Python 3.11
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
# pip-compile --extra=dev --output-file=requirements/dev.txt
@@ -76,6 +76,8 @@ distlib==0.3.7
# via virtualenv
et-xmlfile==1.1.0
# via openpyxl
exceptiongroup==1.1.3
# via pytest
execnet==2.0.2
# via pytest-xdist
filelock==3.12.4
@@ -90,6 +92,10 @@ gitpython==3.1.36
# via tlo (pyproject.toml)
idna==3.4
# via requests
importlib-metadata==6.8.0
# via build
importlib-resources==6.1.1
# via matplotlib
iniconfig==2.0.0
# via pytest
isodate==0.6.1
@@ -154,6 +160,8 @@ pluggy==1.3.0
# tox
portalocker==2.8.2
# via msal-extensions
psutil==5.9.5
# via tlo (pyproject.toml)
pycparser==2.21
# via cffi
pyinstrument==4.5.3
@@ -211,17 +219,29 @@ smmap==5.0.1
# via gitdb
squarify==0.4.3
# via tlo (pyproject.toml)
tomli==2.0.1
# via
# build
# coverage
# pip-tools
# pylint
# pyproject-api
# pyproject-hooks
# pytest
# tox
tomlkit==0.12.1
# via pylint
tox==4.11.3
# via tlo (pyproject.toml)
typing-extensions==4.8.0
# via
# astroid
# azure-core
# azure-keyvault-certificates
# azure-keyvault-keys
# azure-keyvault-secrets
# azure-storage-file-share
# pylint
tzdata==2023.3
# via pandas
urllib3==2.0.4
@@ -232,6 +252,10 @@ virtualenv==20.24.5
# tox
wheel==0.41.2
# via pip-tools
zipp==3.17.0
# via
# importlib-metadata
# importlib-resources

# The following packages are considered to be unsafe in a requirements file:
# pip
20 changes: 0 additions & 20 deletions src/scripts/profiling/_parameters.py

This file was deleted.

8 changes: 0 additions & 8 deletions src/scripts/profiling/_paths.py

This file was deleted.
