-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Raredisease configurator - Create services for file creation #4260
base: dev-start-pipelines
Are you sure you want to change the base?
Conversation
cg/services/analysis_starter/configurator/file_creators/config_creator.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/params/raredisease.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/params/utils.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/sample_sheet/raredisease.py
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/raredisease.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/nextflow.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/nextflow.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/raredisease.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/utils.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will continue looking after lunch!
def __init__(self, store: Store, lims: LimsAPI, params: str): | ||
self.store = store | ||
self.lims = lims | ||
self.params = params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "params"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a path to a file in hasta. It is pipeline-specific. We get it from the RareDiseaseConfig class in the CGConfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of scope for this PR and probably not something we can change without cooperation from biodev, but it would be nice to have it type hinted as a Path and a name called parameters_file or something.
cg/services/analysis_starter/configurator/file_creators/abstract.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/params/utils.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/nextflow.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/nextflow.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/utils.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/file_creators/config_creator.py
Outdated
Show resolved
Hide resolved
cg/services/analysis_starter/configurator/implementations/nextflow.py
Outdated
Show resolved
Hide resolved
|
|
||
|
||
class PipelineExtension: | ||
def configure(self, case_id: str, case_path: Path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets make an effort to give this a better name
|
||
|
||
class RarediseaseExtension(PipelineExtension): | ||
"""Configurator for Raredisease analysis.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Configurator for Raredisease analysis.""" | |
"""Class holding all additional actions needed to create a RarediseaseCaseConfig not specified in the NextflowConfigurator.""" |
|
||
@staticmethod | ||
def get_file_path(case_id: str, case_path: Path) -> Path: | ||
"""Return the path to the nextflow config file.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Return the path to the nextflow config file.""" | |
"""Return the path to the Nextflow config file.""" |
return Path(case_path, f"{case_id}_nextflow_config").with_suffix(FileExtensions.JSON) | ||
|
||
def create(self, case_id: str, case_path: Path) -> None: | ||
"""Create the nextflow config file for a case.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Create the nextflow config file for a case.""" | |
"""Create the Nextflow config file for a case.""" |
content: str = self._get_content(case_path=case_path) | ||
write_json(file_path=file_path, content=content) | ||
|
||
def _get_content(self, case_path: Path) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _get_content(self, case_path: Path) -> str: | |
def _get_content(self, case_id: str, case_path: Path) -> str: |
Add gene panels combinations for gene panels being part of gene panel combination and | ||
return updated gene panels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤯
@staticmethod | ||
def get_file_path(case_path: Path) -> Path: | ||
return Path(case_path, "gene_panels").with_suffix(FileExtensions.BED) | ||
|
||
def create(self, case_path: Path) -> None: | ||
file_path: Path = self.get_file_path(case_path=case_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This static method is not used anywhere else, we could remove it
@staticmethod | |
def get_file_path(case_path: Path) -> Path: | |
return Path(case_path, "gene_panels").with_suffix(FileExtensions.BED) | |
def create(self, case_path: Path) -> None: | |
file_path: Path = self.get_file_path(case_path=case_path) | |
def create(self, case_path: Path) -> None: | |
file_path= Path(case_path, "gene_panels").with_suffix(FileExtensions.BED) |
def get_file_path(case_id: str, case_path: Path) -> Path: | ||
pass | ||
|
||
@abstractmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add _get_content
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super nice works with the tests!
self.account = account | ||
|
||
@staticmethod | ||
def get_file_path(case_id: str, case_path: Path) -> Path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get rid of passing both case_id and case_path one could consider setting a root path in servers and having two methods, get_case_path = Path(self.root_path, case_id)
and get_file_path=Path(self.root_path, case_id, <file_name>
.
vep_filters_scout_fmt=f"{case_path}/{ScoutExportFileName.PANELS}", | ||
) | ||
|
||
def _get_data_analysis_type(self, case_id: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this method should be in the store instead?
""" | ||
Return the target bed file from LIMS or use default capture kit for WHOLE_GENOME_SEQUENCING. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
Return the target bed file from LIMS or use default capture kit for WHOLE_GENOME_SEQUENCING. | |
""" | |
""" | |
Return the target bed file from LIMS or use default capture kit for WHOLE_GENOME_SEQUENCING. | |
Raises: | |
ValueError if not capture kit can be assigned to the case. | |
""" |
maternal_id=case_sample.get_maternal_sample_id, | ||
case_id=case_sample.case.internal_id, | ||
) | ||
return sample_sheet_entry.reformat_sample_content |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cryptic return statement to me. What does reformat sample content do?
if forward_read and not reverse_read: | ||
read_direction = 1 | ||
elif reverse_read and not forward_read: | ||
read_direction = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of scope, but surely we should only have one boolean as input for this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree
return case_path.name | ||
|
||
|
||
def get_genome_build(workflow: Workflow) -> GenePanelGenomeBuild: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider having this as a bare dict
from cg.constants.gene_panel import GenePanelGenomeBuild | ||
|
||
|
||
def get_case_id_from_path(case_path: Path) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe not anymore
def _get_case_priority(self, case_id: str) -> SlurmQos: | ||
"""Get case priority.""" | ||
case: Case = self.store.get_case_by_internal_id(case_id) | ||
return SlurmQos(case.slurm_priority) | ||
|
||
def _get_case_workflow(self, case_id: str) -> Workflow: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not feel like we should have these methods in this class. They seem like store methods to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinion for nor against
@@ -2962,6 +2942,7 @@ def raredisease_deliverables_file_path(raredisease_dir, raredisease_case_id) -> | |||
).with_suffix(FileExtensions.YAML) | |||
|
|||
|
|||
# TODO: Take a look at this for the tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be resolved in this PR?
|
||
@pytest.fixture | ||
def raredisease_extension() -> PipelineExtension: | ||
return PipelineExtension() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably do a small PR and test the actual RarediseaseExtension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true
write_yaml_nextflow_style(file_path=file_path, content=content) | ||
|
||
def _get_content(self, case_id: str, case_path: Path, sample_sheet_path: Path) -> dict: | ||
"""Create parameters file for a case.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Create parameters file for a case.""" | |
"""Return the content of a params file for a case.""" |
case_workflow_parameters: dict = self._get_case_parameters( | ||
case_id=case_id, case_path=case_path, sample_sheet_path=sample_sheet_path | ||
).model_dump() | ||
workflow_parameters: any = read_yaml(self.params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix this type hint
import re | ||
|
||
|
||
def replace_values_in_params_file(workflow_parameters: dict) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having an inheritance and utils module may be redundant. We could include these functions in the parent class and get rid of this module
|
||
class RarediseaseSampleSheetEntry(NextflowSampleSheetEntry): | ||
"""Raredisease sample model is used when building the sample sheet.""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heads up here, I removed a sex
attribute that was a str
from cg.store.models import Case, CaseSample, Sample | ||
from cg.store.store import Store | ||
|
||
LOG = logging.getLogger(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove LOG
def create(self, case_id: str, case_path: Path) -> None: | ||
"""Create the sample sheet for a case.""" | ||
file_path: Path = self.get_file_path(case_id=case_id, case_path=case_path) | ||
content: any = self._get_content(case_id=case_id, case_path=case_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
content: any = self._get_content(case_id=case_id, case_path=case_path) | |
content: list[list[str]] = self._get_content(case_id=case_id, case_path=case_path) |
content: any = self._get_content(case_id=case_id, case_path=case_path) | ||
write_csv(file_path=file_path, content=content) | ||
|
||
def _get_content(self, case_id: str, case_path: Path) -> list[list[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _get_content(self, case_id: str, case_path: Path) -> list[list[str]]: | |
def _get_content(self, case_id: str) -> list[list[str]]: |
|
||
def __init__( | ||
self, | ||
config: CommonAppConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config: CommonAppConfig, | |
pipeline_config: CommonAppConfig, |
params_file_creator: ParamsFileCreator, | ||
pipeline_extension: PipelineExtension = PipelineExtension(), | ||
): | ||
self.root_dir: str = config.root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can just inject the root dir instead of the whole pipeline config
self.params_file_creator = params_file_creator | ||
|
||
def create_config(self, case_id: str) -> NextflowCaseConfig: | ||
"""Create a Nextflow case config.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improve docstring with a numbered list of the steps involved in the configuration. It is not just "creating a config", it involves much more
|
||
|
||
@pytest.fixture | ||
def expected_raredisease_params_file_content( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is in the wrong file
"configurator_fixture, case_id_fixture", | ||
[("raredisease_configurator", "raredisease_case_id")], | ||
"configurator_fixture, case_id_fixture, case_path_fixture", | ||
[("raredisease_configurator", "raredisease_case_id", "raredisease_case_path")], | ||
ids=["raredisease"], | ||
) | ||
def test_create_nextflow_config_file_exists( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be an integration test that checks that all required files were created
Description
Alternate version of #4255
Need to solve: The Nextflow configurators need to create the following files:
Some pipelines need to create more files.
The process of file creation follows a common structure:
The content creation is specific for each pipeline and for each file type.
The file creation process was defined in the 3 steps mentioned above. In its abstract form, content creation (step 2) requires a class called FileContentCreator, which is implemented for every pipeline and file type.
Added
Changed
Fixed
How to prepare for test
us
paxa
How to test
Expected test outcome
Review
Thanks for filling in who performed the code review and the test!
This version is a
Implementation Plan