Add automatic quality control of analyses for microbial samples #2754

Merged
merged 69 commits into from
Jan 2, 2024

@seallard seallard commented Dec 8, 2023

Description

This PR implements automatic QC for microsalt following the documented requirements in Atlas. Part of #1648.

This logic will likely be trashed once the JASEN pipeline replaces microsalt 😅

Originally implemented in #1655 and then disabled in #2505.

Description of implementation

The flow is the following:

  1. Given a list of cases that are ready for quality control
  2. For each case:
    • Check whether quality control is necessary for the case.
  3. For each sample in the case:
    • Check whether the sample passes the criteria listed in the documentation.
    • Track the failed samples for the overall case evaluation.
  4. Check whether the case passes the criteria listed in the documentation.
  5. If the case fails, set its status to failed in Trailblazer.
  6. If the case passes, include it in the cases to store.
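
The flow above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name and all injected callables (`needs_qc`, `get_sample_ids`, `sample_passes`, `case_passes`, `mark_failed`) are hypothetical, and the handling of cases that need no QC is an assumption, since the description does not say what happens to them.

```python
from typing import Callable, Iterable


def run_quality_control(
    case_ids: Iterable[str],
    needs_qc: Callable[[str], bool],
    get_sample_ids: Callable[[str], list[str]],
    sample_passes: Callable[[str], bool],
    case_passes: Callable[[str, list[str]], bool],
    mark_failed: Callable[[str], None],
) -> list[str]:
    """Return the ids of the cases that should be stored after QC."""
    cases_to_store: list[str] = []
    for case_id in case_ids:
        if not needs_qc(case_id):
            # Assumption: cases that need no QC are stored without evaluation.
            cases_to_store.append(case_id)
            continue
        # Track the failed samples for the overall case evaluation.
        failed_samples = [
            sample_id
            for sample_id in get_sample_ids(case_id)
            if not sample_passes(sample_id)
        ]
        if case_passes(case_id, failed_samples):
            cases_to_store.append(case_id)
        else:
            # Stand-in for setting the status to failed in Trailblazer.
            mark_failed(case_id)
    return cases_to_store
```

Passing the checks in as callables keeps the sketch independent of how samples, applications and Trailblazer are actually accessed in cg.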

Description of implemented sample quality controls

  • The number of reads is > 70 % of the application target reads
  • The number of reads for the negative control is < 20 % of the application target reads
  • The mapping rate is > 30 %
  • The duplication rate is < 80 %
  • The median insert size is > 100 bp
  • The average coverage is > 10x
  • The % BP with 10x coverage is > 75 %
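
As an illustration, the sample criteria above could be expressed like this. The thresholds are copied from the list; the `SampleMetrics` class, its field names and the function names are hypothetical, and the merged code may use different values or structure. The list does not say whether a negative control is held to the remaining checks, so the sketch simply reports every check and only swaps the reads rule for controls.

```python
from dataclasses import dataclass

# Thresholds taken from the list above, expressed as fractions.
READS_FRACTION_MIN = 0.70
CONTROL_READS_FRACTION_MAX = 0.20
MAPPING_RATE_MIN = 0.30
DUPLICATION_RATE_MAX = 0.80
MEDIAN_INSERT_SIZE_MIN = 100  # bp
AVERAGE_COVERAGE_MIN = 10.0  # x
COVERAGE_10X_FRACTION_MIN = 0.75


@dataclass
class SampleMetrics:
    """Hypothetical container for the metrics the checks need."""

    reads: int
    target_reads: int
    mapping_rate: float
    duplication_rate: float
    median_insert_size: int
    average_coverage: float
    coverage_10x_fraction: float
    is_control: bool = False


def sample_checks(metrics: SampleMetrics) -> dict[str, bool]:
    """Evaluate each criterion separately so failures can be reported."""
    if metrics.is_control:
        # A negative control should have few reads.
        reads_ok = metrics.reads < CONTROL_READS_FRACTION_MAX * metrics.target_reads
    else:
        reads_ok = metrics.reads > READS_FRACTION_MIN * metrics.target_reads
    return {
        "passes_reads_qc": reads_ok,
        "passes_mapping_qc": metrics.mapping_rate > MAPPING_RATE_MIN,
        "passes_duplication_qc": metrics.duplication_rate < DUPLICATION_RATE_MAX,
        "passes_inserts_qc": metrics.median_insert_size > MEDIAN_INSERT_SIZE_MIN,
        "passes_coverage_qc": metrics.average_coverage > AVERAGE_COVERAGE_MIN,
        "passes_10x_coverage_qc": metrics.coverage_10x_fraction > COVERAGE_10X_FRACTION_MIN,
    }


def sample_passes_qc(metrics: SampleMetrics) -> bool:
    """Assumption: a sample passes only if every individual check passes."""
    return all(sample_checks(metrics).values())
```

The keys returned by `sample_checks` mirror the `passes_*` fields in the generated report.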

Description of implemented case quality controls

  • The negative control sample passes the quality control
  • All urgent samples ("MWR") pass the quality control
  • At least 90 % of the non-urgent samples ("MWX") pass the quality control
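
A minimal sketch of the case-level rules above, under a few labeled assumptions: `SampleResult` and `case_passes_qc` are hypothetical names, urgency is derived from the application tag prefix ("MWR" or "MWX"), control samples are excluded from the urgent and non-urgent groups, and a group with no samples passes trivially (not every case has a negative control).

```python
from dataclasses import dataclass


@dataclass
class SampleResult:
    """Hypothetical per-sample outcome used for the case evaluation."""

    sample_id: str
    application_tag: str
    is_control: bool
    passes_qc: bool


def case_passes_qc(results: list[SampleResult]) -> bool:
    controls = [r for r in results if r.is_control]
    # Assumption: controls are judged on their own, not as urgent/non-urgent.
    regular = [r for r in results if not r.is_control]
    urgent = [r for r in regular if r.application_tag.startswith("MWR")]
    non_urgent = [r for r in regular if r.application_tag.startswith("MWX")]

    # all() on an empty list is True, so a case without a control passes this rule.
    control_ok = all(r.passes_qc for r in controls)
    urgent_ok = all(r.passes_qc for r in urgent)

    # "At least 90 %" in integer arithmetic to avoid floating point edge cases.
    passed = sum(r.passes_qc for r in non_urgent)
    non_urgent_ok = passed * 10 >= len(non_urgent) * 9

    return control_ok and urgent_ok and non_urgent_ok
```

The three intermediate booleans correspond to `control_passes_qc`, `urgent_passes_qc` and `non_urgent_passes_qc` in the report's `case` section.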

Questions

  • Do we want to change the threshold for the reads? Yes, skip constant and use percentage guaranteed reads.
  • Will all microsalt cases have a negative control? No
  • Can a case contain a mix of urgent and non urgent samples? If so, what validation rules should we apply? No.
  • What does expected reads mean and how does it differ from target reads for an application? Use target reads.
  • Do you want to change something about the structure of the generated report? Add summary to top of report.
  • Any additional information in trailblazer? Add short summary regardless of pass/fail.

Tested in stage on Hasta with cg workflow microsalt qc-microsalt gamelizard

@seallard

Blocked until meeting with production and help with manual testing.

seallard commented Dec 19, 2023

Tested in stage

[sebastian.allard@hasta:/home/proj/stage/microbial/results] [S_base] 1 $ cg workflow microsalt qc-microsalt gamelizard
Performing QC on case gamelizard
ACC10005A1 passed QC.
Control sample ACC10005A2 passed QC.
ACC10005A3 passed QC.
ACC10005A4 passed QC.
ACC10005A5 passed QC.
ACC10005A6 passed QC.
ACC10005A7 passed QC.
ACC10005A8 passed QC.
ACC10005A9 passed QC.
ACC10005A10 passed QC.
ACC10005A11 passed QC.
ACC10005A12 passed QC.
ACC10005A13 passed QC.
ACC10005A14 passed QC.
ACC10005A15 passed QC.
ACC10005A16 passed QC.
ACC10005A17 passed QC.
ACC10005A18 passed QC.
ACC10005A19 passed QC.
QC passed, see /home/proj/stage/microbial/results/ACC10005A14_2022.7.26_13.1.1/QC_done.json for details.
Sample results: 0 failed, 19 passed, 19 total.

The generated qc report:

{
    "case": {
        "passes_qc": true,
        "control_passes_qc": true,
        "urgent_passes_qc": true,
        "non_urgent_passes_qc": true
    },
    "samples": [{
        "sample_id": "ACC10005A1",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A2",
        "passes_qc": true,
        "is_control": true,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A3",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A4",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A5",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A6",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A7",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A8",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A9",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A10",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A11",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A12",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A13",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A14",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A15",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A16",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A17",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A18",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }, {
        "sample_id": "ACC10005A19",
        "passes_qc": true,
        "is_control": false,
        "application_tag": "MWRNXTR003",
        "passes_reads_qc": true,
        "passes_mapping_qc": true,
        "passes_duplication_qc": true,
        "passes_inserts_qc": true,
        "passes_coverage_qc": true,
        "passes_10x_coverage_qc": true
    }]
}
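
A report with this structure can be summarised with a few lines of Python. The sketch below relies only on the keys visible in the report above; the function name `summarise_report` is hypothetical and not part of cg.

```python
import json


def summarise_report(report: dict) -> str:
    """List the failed checks per failed sample, followed by the totals."""
    samples = report["samples"]
    failed = [sample for sample in samples if not sample["passes_qc"]]
    lines = []
    for sample in failed:
        # Collect the individual passes_* checks that are false.
        failed_checks = [
            key
            for key, value in sample.items()
            if key.startswith("passes_") and key != "passes_qc" and value is False
        ]
        lines.append(f"{sample['sample_id']} failed: {', '.join(failed_checks)}")
    passed_count = len(samples) - len(failed)
    lines.append(
        f"Sample results: {len(failed)} failed, {passed_count} passed, {len(samples)} total."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Path is a placeholder; point it at a QC_done.json file.
    with open("QC_done.json") as handle:
        print(summarise_report(json.load(handle)))
```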

@Clinical-Genomics Clinical-Genomics deleted a comment from sonarqubecloud bot Dec 19, 2023

@karlnyr karlnyr left a comment

Very nice work, I have some changes I would like to see before we implement this into production.

cg/constants/constants.py (outdated, resolved)
cg/meta/workflow/microsalt/quality_controller/utils.py (outdated, resolved)
cg/meta/workflow/microsalt/quality_controller/utils.py (outdated, resolved)
cg/meta/workflow/microsalt/constants.py (resolved)
cg/meta/workflow/microsalt/microsalt.py (outdated, resolved)
tests/meta/workflow/test_microsalt.py (outdated, resolved)

seallard commented Jan 2, 2024

Tested in stage

[sebastian.allard@hasta:~] [S_base] 22s 2 $ cg workflow microsalt qc gamelizard
Performing QC on case gamelizard
ACC10005A1 passed QC.
Control sample ACC10005A2 passed QC.
ACC10005A3 passed QC.
ACC10005A4 passed QC.
ACC10005A5 passed QC.
ACC10005A6 passed QC.
ACC10005A7 passed QC.
ACC10005A8 passed QC.
ACC10005A9 passed QC.
ACC10005A10 passed QC.
ACC10005A11 passed QC.
ACC10005A12 passed QC.
ACC10005A13 passed QC.
ACC10005A14 passed QC.
ACC10005A15 passed QC.
ACC10005A16 passed QC.
ACC10005A17 passed QC.
ACC10005A18 passed QC.
ACC10005A19 passed QC.
QC passed, see /home/proj/stage/microbial/results/ACC10005A14_2022.7.26_13.1.1/QC_done.json for details.
Sample results: 0 failed, 19 passed, 19 total

sonarqubecloud bot commented Jan 2, 2024

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code


@seallard seallard merged commit 0ae1afe into master Jan 2, 2024
9 checks passed
@seallard seallard deleted the add-microsalt-qc branch January 2, 2024 09:38

seallard commented Jan 2, 2024

Deployed to prod Hasta

4 participants