Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat - new sample sheet validation #2944

Closed
wants to merge 20 commits into from

Conversation

diitaz93
Copy link
Contributor

@diitaz93 diitaz93 commented Feb 13, 2024

Description

Fix #2922: Create a more strict validation for v2 sample sheets.

This PR implements a new validation for v2 sample sheets that takes into account 5 aspects:

  1. All sections are present
    1. [Header]
    2. [Reads]
    3. [BCLConvert_Settings]
    4. [BCLConvert_Data]
  2. IndexSettings is present in the [Header] of the sample sheet
  3. Run read and index cycles are present in the [Reads] section and are valid
  4. The [BCLConvert_Data] section has the correct columns (sample validation)
  5. The override cycle values are correct. This implies that the cycle values for the reads and the index match with the run cycles specified in the [Reads] section and the index2 cycles are in the correct format according to the IndexSettings (reverse or forward).

Added

  • A new class SampleSheetValidator with the endpoint function validate_sample_sheet which will be the new function to validate sample sheets.
  • A new class OverrideCyclesValidator with the endpoint function validate_sample which will validate if the override cycles for a single sample is correct. It is called for each sample inside the SampleSheetValidator.
  • Tests for new functions

Changed

  • The CLI command validate_sample_sheet (cg demultiplex samplesheet validate) uses now the new validation.

Fixed

  • Tests for CLI command, parametrised with new fixtures

How to prepare for test

  • Ssh to relevant server (depending on type of change)
  • Use stage: us
  • Paxa the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_cg -t cg -b [THIS-BRANCH-NAME] -a

How to test

  • Do ...

Expected test outcome

  • Check that ...
  • Take a screenshot and attach or copy/paste the output.

Review

  • Tests executed by
  • "Merge and deploy" approved by
    Thanks for filling in who performed the code review and the test!

This version is a

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Document in ...
  • Deploy this branch on ...
  • Inform to ...

@diitaz93 diitaz93 self-assigned this Feb 13, 2024
Copy link
Contributor

@karlnyr karlnyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Seems like most cases are covered in the validation. However, I think that there are a few pints that may need another look.

for row in self.content:
if not row:
continue
if SampleSheetBCLConvertSections.Data.HEADER in row[0]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't be in row one right? Will not SampleSheetBCLConvertSections.Header.HEADER be the first thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly, SampleSheetBCLConvertSections.Header.HEADER, which is [Header] will be the first thing, but SampleSheetBCLConvertSections.Data.HEADER which is [BCLConvert_Data] will be further down

cg/apps/demultiplex/sample_sheet/validator.py Outdated Show resolved Hide resolved
cg/apps/demultiplex/sample_sheet/validator.py Outdated Show resolved Hide resolved
cg/apps/demultiplex/sample_sheet/validator.py Outdated Show resolved Hide resolved
cg/apps/demultiplex/sample_sheet/validator.py Outdated Show resolved Hide resolved
cg/apps/demultiplex/sample_sheet/validator.py Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to create one validation that fails? We have here the one where the sample sheet is missing, but do we want to assume it just works as well?

Copy link

Quality Gate Passed Quality Gate passed

Issues
2 New issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@diitaz93
Copy link
Contributor Author

@karlnyr I increased this PR's scope slightly and implemented the changes in #2958. Just so you know – the changes you suggested are implemented there. There fore I am closing this PR

@diitaz93 diitaz93 closed this Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regenerate all v2 sample sheets from before 25-01-24
2 participants