Exp crccp sensitivity 01 #10

Closed · wants to merge 21 commits
3 changes: 3 additions & 0 deletions .gitignore
@@ -35,3 +35,6 @@ __pycache__/

# Other project-specific files
.DS_Store
.vscode
output.csv
results.csv
4 changes: 2 additions & 2 deletions README.md
@@ -14,7 +14,7 @@ See [docs/design.md](./docs/design.md) for a discussion of the package's design.

To set up a Python development environment for `crcsim`:

1. Create a Python virtual environment based on Python 3.6. (Currently, support is limited to Python 3.6, because that is the version installed on the RTI Cluster, which is where we expect to deploy and run the experiments.)
1. Create a Python virtual environment based on Python 3.11.
1. Activate the virtual environment.
1. Install development dependencies with `pip install -r requirements.txt`.
1. Install `crcsim` with `pip install -e .`. The `-e` option specifies "development" mode, meaning that any changes you make to the code are recognized immediately without having to reinstall the package.
@@ -25,7 +25,7 @@ Note that although a Docker setup is provided, it's designed to be used only for

**TODO: since open sourcing the model, only formatting and linting tests run in CI. We need to start running unit tests too.**

Tests are run automatically as part of GitHub's continuous integration process using Docker. However, you can also run them locally with `./run_tests.sh`, or you can mimic the CI process by running them in Docker with:

```
$ docker-compose build
```
10 changes: 5 additions & 5 deletions crcsim/analysis.py
@@ -679,7 +679,7 @@ def summarize(self):
stage_counts = clinical_detections.new_state.value_counts()
stage_counts.index = stage_counts.index.str.replace("CLINICAL_", "").str.lower()
onset_distrib = stage_counts / len(clinical_detections)
for stage, value in onset_distrib.iteritems():
for stage, value in onset_distrib.items():
replication_output_row[f"crc_onset_proportion_{stage}"] = value

# Among all individuals who died from CRC, mean time between the onset of CRC
@@ -779,7 +779,7 @@ def compute_status_arrays(self):
f"Unexpected: more than one death event for person {p}"
)
else:
death_age = int(death.time)
death_age = int(death.time.iloc[0])

alive = np.arange(max_age + 1)
alive = np.where(alive > death_age, 0, 1)
@@ -906,11 +906,11 @@ def compute_status_arrays(self):
f"Unexpected: more than one clinical onset event for person {p}"
)
# Clinical onset overall
clinical_detection_age = int(clinical_detection.time)
clinical_detection_age = int(clinical_detection.time.iloc[0])
clinical_onset[clinical_detection_age] = 1
# Five-year survival overall
clinical_detection_age_decimal = float(clinical_detection.time)
death_age_decimal = float(death.time)
clinical_detection_age_decimal = float(clinical_detection.time.iloc[0])
death_age_decimal = float(death.time.iloc[0])
crc_onset_to_death = death_age_decimal - clinical_detection_age_decimal
if crc_onset_to_death > 5:
five_year_survival[clinical_detection_age] = 1
16 changes: 16 additions & 0 deletions crcsim/experiment/Dockerfile
@@ -0,0 +1,16 @@
FROM python:3.11.6-bullseye

WORKDIR /code

# First install the package dependencies. By copying only requirements.txt into
# the image beforehand, Docker will re-run this step only when requirements.txt
# changes.
COPY ./requirements.txt ./
RUN pip install --upgrade pip \
&& pip install -r requirements.txt

# Copy the rest of the experiment files.
COPY ./ /code

# Declare the directory with experiment files as a volume
VOLUME /code
134 changes: 134 additions & 0 deletions crcsim/experiment/README.md
@@ -0,0 +1,134 @@
# Replication of the CRCCP intervention scenarios

This experiment is a replication of the CRCCP compliance intervention experiment, which was conducted prior to open-sourcing the model and making some changes to the AWS infrastructure. We are replicating this experiment to ensure continuity after those changes.

The CRCCP compliance intervention experiment examines the cost-effectiveness of interventions designed to improve compliance with routine screening.

The experiment is designed around 8 health centers (labeled FQHC1-FQHC8), each having its own baseline compliance rate, intervention cost, and post-intervention compliance rate.

We don't model the intervention explicitly. In other words, we didn't add any code to the model to implement the intervention. Instead, we model the intervention by assuming it leads to a change in the compliance rate, and so we run a pair of simulations: one using the baseline compliance rate and another using the post-intervention compliance rate. Any differences in outcomes can therefore be attributed to the intervention.

## Scenarios

The experiment includes two scenarios per health center: one baseline scenario and one intervention scenario. The baseline scenarios are based on real data, and the intervention scenarios include a hypothetical increase in screening compliance rates.

All scenarios use an Incidence Rate Ratio (IRR) of 1.19. This is implemented by multiplying the calibrated value of `lesion_risk_alpha` (0.47) by 1.19.

The scenarios are created by `prepare.py`. This script reads a set of base parameters defined in `crcsim/experiment/parameters.json`, modifies them to create the scenarios, and saves them in a directory structure that will eventually be copied to AWS.
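
The transformation applied by `prepare.py` can be sketched roughly as follows. This is an illustrative sketch, not the script's actual code: the parameter key `initial_compliance_rate` and the helper names are assumptions; only `lesion_risk_alpha` (calibrated to 0.47) and the 1.19 IRR multiplier come from the description above.

```python
import copy
import json
from pathlib import Path

IRR = 1.19  # Incidence Rate Ratio applied to all scenarios


def make_scenario_params(base_params: dict, compliance_rate: float) -> dict:
    """Derive one scenario's parameters from the base parameters.

    Every scenario applies the IRR by scaling the calibrated
    lesion_risk_alpha; the compliance rate is what differs between
    a baseline scenario and its paired intervention scenario.
    """
    params = copy.deepcopy(base_params)
    params["lesion_risk_alpha"] = params["lesion_risk_alpha"] * IRR
    params["initial_compliance_rate"] = compliance_rate  # hypothetical key name
    return params


def write_scenario(root: Path, name: str, params: dict) -> None:
    """Save one scenario's parameter file under the scenarios/ tree."""
    scenario_dir = root / name
    scenario_dir.mkdir(parents=True, exist_ok=True)
    (scenario_dir / "params.json").write_text(json.dumps(params, indent=2))
```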

## Defining new experiments

To use this code as the basis for a new experiment, you should edit the following:

- [./prepare.py](./prepare.py) - Change the scenarios to run and the associated parameter transformations.
- [./simulate.py](./simulate.py) - Change the AWS batch objects and parameters.
- [./run_iteration.sh](./run_iteration.sh) - Change the s3 bucket name, or if the experiment is substantially different, the series of commands each run entails.
- [./summarize.py](./summarize.py) - You may want to change the derived variables that are added to the model results.
- [./parameters.json](./parameters.json) - You may want to edit the base parameter values.

## Running the experiment on AWS

### Test runs

If you want to conduct a test run of the experiment, consider reducing the number of iterations in Step 3 and/or using a smaller population size in Step 5. You could also comment out some FQHCs in the `initial_compliance` dict in the `create_scenarios` function in `prepare.py`.

### 1. Setup

1. Clone this repo to your local machine
1. Set your working directory to `./simulator/crcsim/experiment`
1. Create and activate a Python 3.11 virtual environment
1. Install dependencies with `pip install -r requirements.txt`

### 2. (Optional) Build and Push Image

Unless you've made changes to files that will affect simulation runs, this step is not necessary, since the `crcsim` image has already been uploaded to ECR. If you've changed anything in `run_iteration.sh`, `requirements.txt`, etc., you will need to rebuild and push the image.

Run `bash deploy_to_aws.sh` to run a series of commands which build the image locally from the Dockerfile in this repo and push it to ECR.

### 3. Prepare the Experiment Files

The script `prepare.py` prepares the `scenarios/` directory, the parameter files that define each scenario, and the `seeds.txt` file that defines the seeds used for multiple iterations of each scenario.

Running this script with all default arguments will replicate the seed and number of iterations of this experiment's original run. You can vary the seed by editing the script, and you can vary the number of iterations with a command line argument, e.g. `python prepare.py --n=10`.
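
A reproducible `seeds.txt` could be generated along these lines (a minimal sketch: the helper names and the master seed value are assumptions, and the script's actual seeding logic may differ):

```python
import random


def generate_seeds(n_iterations: int, master_seed: int = 42) -> list[int]:
    """Generate one RNG seed per iteration, reproducibly.

    master_seed=42 is a placeholder; the experiment's actual master
    seed is set inside prepare.py.
    """
    rng = random.Random(master_seed)
    return [rng.randrange(2**31) for _ in range(n_iterations)]


def write_seeds(path: str, seeds: list[int]) -> None:
    """Write seeds one per line, as run_iteration.sh can consume them."""
    with open(path, "w") as f:
        f.write("\n".join(str(s) for s in seeds))
```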

### 4. Upload the Experiment Files to S3

The subdirectories and files in `scenarios/` must be uploaded to AWS S3 for the Docker containers running Batch jobs to access them. *(Note: it would be possible to avoid this step because `scenarios/` is in the build context, but we chose to rely on S3 so the Docker image does not have to be rebuilt every time `prepare.py` is run.)*

To upload the files to S3, run
```
aws s3 cp ./scenarios s3://crcsim-exp-crccp-sensitivity01/scenarios --recursive
```
*(Another note: this manual step is necessary because `boto3` does not include functionality to upload a directory to S3 recursively. Future experiments could improve this workflow by writing a function to upload the directory recursively in `prepare.py`. Or submit a patch to resolve https://github.com/boto/boto3/issues/358)*
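
A sketch of the recursive-upload helper the note envisions, using boto3's `upload_file` client method; the function names are hypothetical:

```python
import os
import posixpath


def iter_upload_pairs(local_root: str, key_prefix: str):
    """Yield (local_path, s3_key) for every file under local_root,
    mirroring the directory structure under key_prefix."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            local_path = os.path.join(dirpath, name)
            rel = os.path.relpath(local_path, local_root)
            # S3 keys always use forward slashes, regardless of OS.
            key = posixpath.join(key_prefix, *rel.split(os.sep))
            yield local_path, key


def upload_dir(local_root: str, bucket: str, key_prefix: str) -> None:
    """Upload a directory tree to S3, file by file."""
    import boto3  # deferred so the pure helper above works without boto3

    s3 = boto3.client("s3")
    for local_path, key in iter_upload_pairs(local_root, key_prefix):
        s3.upload_file(local_path, bucket, key)
```

`upload_dir("scenarios", "crcsim-exp-crccp-sensitivity01", "scenarios")` would then mirror the `aws s3 cp --recursive` command above.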

### 5. Launch the Jobs

The script `simulate.py` uses boto3 to launch jobs in AWS Batch. It relies on the structure of `scenarios/` generated by `prepare.py` to determine the jobs to launch and their parameters.

By default, each run uses a population size of 100,000, as with other experiments/batches. You can use the `n_people` argument to vary this parameter. For example, launch the jobs with a population size of 1,000 with the command `python simulate.py --n_people=1000`.

After launching, you can view job status and CloudWatch logs for individual jobs in the Batch console.
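
A sketch of the kind of `submit_job` call `simulate.py` makes. The queue and job definition names (`crcsim`) and the parameter names (`npeople`, `iteration`, `seed`, `scenario`) come from the Batch setup described later in this README; the job-name format and helper names are assumptions. Note that AWS Batch parameter values must be strings.

```python
def job_parameters(n_people: int, iteration: int, seed: int, scenario: str) -> dict:
    """Build the parameters dict matching the Ref:: placeholders in the
    crcsim job definition's command."""
    return {
        "npeople": str(n_people),
        "iteration": str(iteration),
        "seed": str(seed),
        "scenario": scenario,
    }


def submit_iteration(n_people: int, iteration: int, seed: int, scenario: str):
    """Submit one simulation run to AWS Batch."""
    import boto3

    batch = boto3.client("batch")
    return batch.submit_job(
        jobName=f"crcsim-{scenario}-{iteration}",  # hypothetical naming scheme
        jobQueue="crcsim",
        jobDefinition="crcsim",
        parameters=job_parameters(n_people, iteration, seed, scenario),
    )
```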

### 6. Check for Errors

Check the Batch console to see if any of the jobs failed. It is normal for a handful to fail due to Spot availability. Check the logs if you're concerned. If more than a few jobs failed, you may have a more serious issue. Use the CloudWatch logs to diagnose.

If you have only a few failed jobs and the reason looks innocuous, the easiest solution is to rerun the jobs manually via the Batch console.

### 7. Analyze the Results

Once all jobs have completed, run `summarize.py` to analyze the combined results of the model runs. This script uses pandas and s3fs to read and write files directly from S3 without saving them to your local machine. Like `simulate.py`, `summarize.py` relies on the structure of `scenarios/` to determine the files it fetches from S3.

This step generates `summary/` and its contents:
- `combined.csv` has one row per model run
- `summarized.xlsx` includes summary statistics for each scenario. Scenarios are separated into three sheets, one for each sensitivity test.
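
The combine-then-summarize step can be sketched as follows (the helper names, column names, and S3 path layout are assumptions; what the source confirms is that `summarize.py` uses pandas, and with `s3fs` installed pandas can read `s3://` paths directly):

```python
import pandas as pd


def combine(runs: list[pd.DataFrame]) -> pd.DataFrame:
    """Stack per-run result rows into one frame (one row per model run).

    Each input frame could be read straight from S3, e.g.
    pd.read_csv("s3://<bucket>/scenarios/<scenario>/<iteration>/results.csv").
    """
    return pd.concat(runs, ignore_index=True)


def summarize(combined: pd.DataFrame, value_cols: list[str]) -> pd.DataFrame:
    """Mean and standard deviation of each outcome column, per scenario."""
    return combined.groupby("scenario")[value_cols].agg(["mean", "std"])
```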

## AWS Architecture

The architecture relies on four AWS services - Batch, CloudWatch, Elastic Container Registry (ECR), and S3. The role of each service is as follows.

- Batch: high-level interface to launch Elastic Container Service (ECS) jobs
- CloudWatch: logging service to view logs for Batch jobs
- ECR: store Docker container used to run jobs
- S3: store output files generated by jobs

Most of the AWS architecture was built via the AWS Console. As such, there is not a script available to replicate the setup steps. This section outlines those steps.

**Important:** all AWS resources should be tagged following CDS protocols. For this project, all resources were tagged as follows.

- project-name: crcsim
- project-number: 0216648.001.001
- responsible-person: [email protected]

### S3

We used the S3 console to create the `crcsim-exp-template` bucket to store output files generated by simulation jobs.

### IAM

We used the IAM console to create the `crcsim-s3-access` IAM role. This IAM role has the `AmazonS3FullAccess` policy attached, which allows a service with this role to read and write to S3.

### Batch

We created the following Batch objects:

- Compute environment: `crcsim` using the FARGATE_SPOT provisioning model
- Job queue: `crcsim`
- Job definition: `crcsim`. Executes the following command in the `crcsim:latest` image.
```
["./run_iteration.sh","Ref::npeople","Ref::iteration","Ref::seed","Ref::scenario"]
```
- Important job definition properties:
- The `Ref::<name>` placeholders in the command define parameters. We vary these parameters across jobs.
- Job role ARN allows us to add the `crcsim-s3-access` IAM role which gives jobs access to S3.
- Enabling tag propagation passes the job's tags on to the underlying ECS resources. This is important to ensure costs are billed to the project.

Note that most of these resources were named `crcsim` rather than something like `crcsim-exp-template`. We expect that we will be able to use the same objects across experiments, since their structure is not specific to this experiment.

### CloudWatch

AWS Batch automatically sends log streams from jobs to AWS CloudWatch. Some logging info is viewable from within Batch by opening a job. However, the added detail of the complete logs may be useful, particularly for debugging. To view Batch logs in CloudWatch:
1. Navigate to the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/)
1. Open `Log groups` and the `/aws/batch/job` log group
1. Find the log stream for the job of interest.

### ECR

Pushing the Docker image to ECR is the only step of the architecture setup that was NOT completed via the AWS Console. The script `deploy_to_aws.sh` contains all commands necessary to build the `crcsim` Docker image and upload it to ECR.