feature(streamline mutant delivery and upload) #3916

Merged
merged 96 commits into from
Dec 18, 2024
Changes from 19 commits
Commits
96 commits
a3f2a4c
add MutantFileFormatter class backbone for enhanced file formatting w…
eliottBo Nov 4, 2024
15372cb
add LIMS metadata handling to MutantFileFormatter for enhanced file f…
eliottBo Nov 4, 2024
c89b050
add fixture for LIMS naming metadata in tests
eliottBo Nov 4, 2024
6f6ebc4
Move MutantFileFormatter to sample_concatenation_service.py
eliottBo Nov 5, 2024
f1e9366
Move MutantFileFormatter and update SampleFileFormatter
eliottBo Nov 5, 2024
b088b07
Move lims_naming_metadata
eliottBo Nov 5, 2024
5a2aeb8
Add fixtures for LIMS naming metadata and test MutantFileFormatter fu…
eliottBo Nov 5, 2024
234ef71
Add test for MutantFileFormatter functionality in test_formatter_util…
eliottBo Nov 5, 2024
550d49d
Refactor test_mutant_file_formatter for clarity and consistency
eliottBo Nov 5, 2024
a215190
add fix
ChrOertlin Nov 6, 2024
267d026
register components
ChrOertlin Nov 6, 2024
a591564
Merge branch 'master' into mutant-file-formatter
eliottBo Nov 6, 2024
ddfa128
Merge branch 'mutant-file-formatter' of https://github.com/Clinical-G…
eliottBo Nov 6, 2024
5f532e0
Add lims_api mock to delivery service builder test
eliottBo Nov 6, 2024
356c33b
Change comment
eliottBo Nov 6, 2024
55f93de
Merge branch 'master' into mutant-file-formatter
eliottBo Nov 6, 2024
ff5ae8a
fix mutant formatter
ChrOertlin Nov 25, 2024
decb439
fix naming
ChrOertlin Nov 25, 2024
3d86914
fix mocking and make function clearer
ChrOertlin Nov 25, 2024
89688a3
remove fastq delivery from mutant (#3972)
ChrOertlin Nov 26, 2024
db0081c
fix(fohm upload from sample bundle) (#3970)
ChrOertlin Nov 26, 2024
eb1858c
Merge branch 'master' into mutant-file-formatter
ChrOertlin Nov 27, 2024
d6a6aeb
register formatter
ChrOertlin Nov 27, 2024
b4a861f
Merge branch 'mutant-file-formatter' of https://github.com/Clinical-G…
ChrOertlin Nov 27, 2024
bc984bc
add debug
ChrOertlin Nov 27, 2024
15f5d0f
fix test
ChrOertlin Nov 27, 2024
491daf6
Update cg/apps/lims/api.py
ChrOertlin Nov 27, 2024
e5469e4
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 2, 2024
a4ce215
Merge branch 'mutant-file-formatter' of https://github.com/Clinical-G…
ChrOertlin Dec 2, 2024
841fa43
linting
ChrOertlin Dec 2, 2024
fbd2234
add another debug
ChrOertlin Dec 2, 2024
5bc0bec
debug reports
ChrOertlin Dec 2, 2024
56c4d40
temp test fix
ChrOertlin Dec 2, 2024
3e79f8c
fix
ChrOertlin Dec 2, 2024
62b2183
update docstring
ChrOertlin Dec 2, 2024
cd8c3da
debug
ChrOertlin Dec 2, 2024
f28cc87
fix
ChrOertlin Dec 2, 2024
1fb4887
lint
ChrOertlin Dec 2, 2024
1a28d97
rework filemovers
ChrOertlin Dec 2, 2024
090862e
add dependencies in factory
ChrOertlin Dec 2, 2024
f9a5eb4
register things in factory add new parameter
ChrOertlin Dec 2, 2024
12d6236
pass param
ChrOertlin Dec 2, 2024
7a92203
fix
ChrOertlin Dec 2, 2024
899e922
fix
ChrOertlin Dec 4, 2024
fbf7284
revert some fixture changes
ChrOertlin Dec 4, 2024
ca5b52e
fix
ChrOertlin Dec 4, 2024
89731ad
Update cg/constants/constants.py
ChrOertlin Dec 4, 2024
075ba85
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 4, 2024
f03d56b
make concatenation sample specific
ChrOertlin Dec 4, 2024
bfdfc3e
Merge branch 'mutant-file-formatter' of https://github.com/Clinical-G…
ChrOertlin Dec 4, 2024
fb3950a
add test for concatenation fastq multiple sample in same dir
ChrOertlin Dec 5, 2024
3349fa3
eureka
ChrOertlin Dec 6, 2024
e53dadc
docstrings
ChrOertlin Dec 6, 2024
8403617
add debug
ChrOertlin Dec 6, 2024
f4c4b5c
add debugging
ChrOertlin Dec 6, 2024
12f7d76
add more debug
ChrOertlin Dec 6, 2024
7103ae9
add debug
ChrOertlin Dec 6, 2024
a07705a
add errr
ChrOertlin Dec 6, 2024
7312b32
add debug
ChrOertlin Dec 9, 2024
10c7551
debugs
ChrOertlin Dec 9, 2024
dd50ff5
fix
ChrOertlin Dec 9, 2024
62ecc41
debug
ChrOertlin Dec 9, 2024
1a7510d
more debug
ChrOertlin Dec 9, 2024
853f315
trace delivery path
ChrOertlin Dec 9, 2024
ae142af
fix path passing
ChrOertlin Dec 9, 2024
71c996c
fix
ChrOertlin Dec 9, 2024
264bc73
fix param
ChrOertlin Dec 9, 2024
1557a7b
don't ask
ChrOertlin Dec 9, 2024
0652fc4
feat(File fetching sample specific) (#4011)
ChrOertlin Dec 10, 2024
88122bb
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 10, 2024
d41c4e3
add mutant upload api (#4017)
ChrOertlin Dec 11, 2024
bb2f560
refactor and document formatters (#4014)
ChrOertlin Dec 11, 2024
41fd392
remove redundant formatter
ChrOertlin Dec 11, 2024
b00ee50
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 11, 2024
5600714
change delivery type fohm
ChrOertlin Dec 12, 2024
8d4a900
add fastq file check to concatenation map
ChrOertlin Dec 12, 2024
7713a8f
make FOHM fastq delivery again
ChrOertlin Dec 12, 2024
5ddce77
add(FOHM upload tags fetcher) (#4021)
ChrOertlin Dec 16, 2024
8b7573b
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 16, 2024
a495451
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 17, 2024
9054b1b
add(FOHM and GSAID to upload API) (#4028)
ChrOertlin Dec 17, 2024
4962b8b
update(covid orderform) (#4020)
ChrOertlin Dec 17, 2024
f6ca8b3
complete docstrings
ChrOertlin Dec 17, 2024
3f93600
complete docstrings
ChrOertlin Dec 17, 2024
1fbff5f
Update cg/services/deliver_files/deliver_files_service/deliver_files_…
ChrOertlin Dec 17, 2024
82c5ae1
Apply suggestions from code review
ChrOertlin Dec 18, 2024
0a82127
vincent review
ChrOertlin Dec 18, 2024
f8356e4
Merge branch 'master' into mutant-file-formatter
ChrOertlin Dec 18, 2024
ba9c4f5
Apply suggestions from code review
ChrOertlin Dec 18, 2024
fcf9ef9
Update cg/apps/lims/api.py
ChrOertlin Dec 18, 2024
95ab0b9
Update cg/services/deliver_files/file_fetcher/analysis_service.py
ChrOertlin Dec 18, 2024
c92ee47
Update cg/services/deliver_files/utils.py
ChrOertlin Dec 18, 2024
6f6e3b0
Update cg/services/deliver_files/file_formatter/files/abstract.py
ChrOertlin Dec 18, 2024
dc6c7ad
improve factory docstring
ChrOertlin Dec 18, 2024
7e3893a
fix function name
ChrOertlin Dec 18, 2024
7133948
remove comment
ChrOertlin Dec 18, 2024
6 changes: 6 additions & 0 deletions cg/apps/lims/api.py
Original file line number Diff line number Diff line change
@@ -555,3 +555,9 @@ def _get_negative_controls_from_list(samples: list[Sample]) -> list[Sample]:
):
negative_controls.append(sample)
return negative_controls

    def get_sample_region_and_lab_code(self, sample_id: str) -> str:
        """Return the region code and lab code for a sample formatted as a suffix string."""
        region_code = self.get_sample_attribute(lims_id=sample_id, key="region_code").split(" ")[0]
        lab_code = self.get_sample_attribute(lims_id=sample_id, key="lab_code").split(" ")[0]
        return f"{region_code}_{lab_code}_"
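The string-building step in `get_sample_region_and_lab_code` can be tried in isolation. The following is a minimal sketch, not the LIMS implementation: a plain dict stands in for `LimsAPI.get_sample_attribute`, the function name `build_region_and_lab_string` is hypothetical, and the attribute values `"01 Stockholm"` and `"SE100 Karolinska"` are assumptions chosen only so that the result matches the `"01_SE100_"` value used in this PR's test fixture.

```python
def build_region_and_lab_string(attributes: dict[str, str]) -> str:
    """Return '<region>_<lab>_' using the first space-delimited token of each attribute."""
    # Hypothetical stand-in for LimsAPI.get_sample_attribute(lims_id=..., key=...).
    region_code = attributes["region_code"].split(" ")[0]
    lab_code = attributes["lab_code"].split(" ")[0]
    return f"{region_code}_{lab_code}_"


# Assumed attribute values; only the leading token of each is kept.
sample_attributes = {"region_code": "01 Stockholm", "lab_code": "SE100 Karolinska"}
region_and_lab = build_region_and_lab_string(sample_attributes)
```

One thing the sketch makes visible: calling `.split(" ")` directly on the looked-up value would raise `AttributeError` if the lookup ever returned `None`. Whether that can happen depends on `get_sample_attribute`, which is not shown in this diff.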
1 change: 1 addition & 0 deletions cg/models/cg_config.py
Original file line number Diff line number Diff line change
@@ -755,6 +755,7 @@ def delivery_service_factory(self) -> DeliveryServiceFactory:
LOG.debug("Instantiating delivery service factory")
factory = DeliveryServiceFactory(
store=self.status_db,
lims_api=self.lims_api,
hk_api=self.housekeeper_api,
tb_service=self.trailblazer_api,
rsync_service=self.delivery_rsync_service,
Original file line number Diff line number Diff line change
@@ -3,6 +3,7 @@
from typing import Type

from cg.apps.housekeeper.hk import HousekeeperAPI
from cg.apps.lims import LimsAPI
from cg.apps.tb import TrailblazerAPI
from cg.constants import DataDelivery, Workflow
from cg.constants.constants import PrepCategory
@@ -49,12 +50,14 @@ class DeliveryServiceFactory:
    def __init__(
        self,
        store: Store,
        lims_api: LimsAPI,
        hk_api: HousekeeperAPI,
        rsync_service: DeliveryRsyncService,
        tb_service: TrailblazerAPI,
        analysis_service: AnalysisService,
    ):
        self.store = store
        self.lims_api = lims_api
        self.hk_api = hk_api
        self.rsync_service = rsync_service
        self.tb_service = tb_service
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
from pathlib import Path

from cg.apps.lims import LimsAPI
from cg.services.deliver_files.file_fetcher.models import SampleFile
from cg.services.deliver_files.file_formatter.models import FormattedFile
from cg.services.deliver_files.file_formatter.utils.sample_concatenation_service import (
    SampleFileConcatenationFormatter,
)
from cg.services.deliver_files.file_formatter.utils.sample_service import FileManagingService


class MutantFileFormatter:
    def __init__(
        self,
        lims_api: LimsAPI,
        file_formatter: SampleFileConcatenationFormatter,
        file_manager: FileManagingService,
    ):
        self.lims_api: LimsAPI = lims_api
        self.file_formatter: SampleFileConcatenationFormatter = file_formatter
        self.file_manager = file_manager

    def format_files(
        self, moved_files: list[SampleFile], ticket_dir_path: Path
    ) -> list[FormattedFile]:
        formatted_files: list[FormattedFile] = self.file_formatter.format_files(
            moved_files=moved_files, ticket_dir_path=ticket_dir_path
        )
        appended_formatted_files: list[FormattedFile] = self._add_lims_metadata_to_file_name(
            formatted_files=formatted_files, sample_files=moved_files
        )
        unique_formatted_files: list[FormattedFile] = self._filter_unique_path_combinations(
            appended_formatted_files
        )
        for unique_files in unique_formatted_files:
            self.file_manager.rename_file(
                src=unique_files.original_path, dst=unique_files.formatted_path
            )
        return unique_formatted_files

    def _add_lims_metadata_to_file_name(
        self, formatted_files: list[FormattedFile], sample_files: list[SampleFile]
    ) -> list[FormattedFile]:
        """Add the region and lab code to the file name of each formatted file."""
        appended_formatted_files: list[FormattedFile] = []
        for formatted_file in formatted_files:
            sample_id: str = self._get_sample_id_by_original_path(
                original_path=formatted_file.original_path, sample_files=sample_files
            )
            lims_meta_data = self.lims_api.get_sample_region_and_lab_code(sample_id)

            new_original_path: Path = formatted_file.formatted_path
            new_formatted_path = Path(
                formatted_file.formatted_path.parent,
                f"{lims_meta_data}{formatted_file.formatted_path.name}",
            )
            appended_formatted_files.append(
                FormattedFile(original_path=new_original_path, formatted_path=new_formatted_path)
            )
        return appended_formatted_files

    @staticmethod
    def _get_sample_id_by_original_path(original_path: Path, sample_files: list[SampleFile]) -> str:
        for sample_file in sample_files:
            if sample_file.file_path == original_path:
                return sample_file.sample_id
        raise ValueError(f"Could not find sample file with path {original_path}")

    @staticmethod
    def _filter_unique_path_combinations(
        formatted_files: list[FormattedFile],
    ) -> list[FormattedFile]:
        """
        During fastq concatenation, the Sample_R1 and Sample_R2 files are concatenated and moved to the same file, Concat_Sample.
        This means that the formatted_files list coming from the SampleFileConcatenationFormatter
        can contain multiple entries for the same concatenated file.
        This function filters out the duplicates to avoid moving the same file multiple times,
        which would raise an error the second time since the file is no longer at the original path.
        """
        unique_combinations = set()
        unique_files: list[FormattedFile] = []
        for formatted_file in formatted_files:
            combination = (formatted_file.original_path, formatted_file.formatted_path)
            if combination not in unique_combinations:
                unique_combinations.add(combination)
                unique_files.append(formatted_file)
        return unique_files
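The renaming and de-duplication steps of `MutantFileFormatter` can be illustrated without the `cg` package. This is a rough sketch under stated assumptions: `PathPair` is a hypothetical stand-in for `FormattedFile`, `add_prefix` and `filter_unique_pairs` are illustrative names, and the file path is made up. It uses `Path.with_name`, which gives the same result as the `Path(parent, name)` construction in the diff.

```python
from pathlib import Path
from typing import NamedTuple


class PathPair(NamedTuple):
    """Hypothetical stand-in for FormattedFile (original_path, formatted_path)."""

    original_path: Path
    formatted_path: Path


def add_prefix(path: Path, prefix: str) -> Path:
    """Return the same path with `prefix` prepended to the file name."""
    return path.with_name(f"{prefix}{path.name}")


def filter_unique_pairs(pairs: list[PathPair]) -> list[PathPair]:
    """Drop repeated (original, formatted) pairs while keeping first-seen order."""
    seen: set[tuple[Path, Path]] = set()
    unique: list[PathPair] = []
    for pair in pairs:
        key = (pair.original_path, pair.formatted_path)
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique


# Two entries pointing at the same concatenated file, as happens when the
# forward and reverse reads are both mapped to one concatenated output.
concatenated = Path("inbox/ticket/sample_1.fastq.gz")  # made-up path
pairs = [
    PathPair(concatenated, add_prefix(concatenated, "01_SE100_")),
    PathPair(concatenated, add_prefix(concatenated, "01_SE100_")),
]
unique_pairs = filter_unique_pairs(pairs)
```

With the duplicates removed, each file is renamed once, so the second rename cannot fail on a source path that no longer exists.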
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
from pathlib import Path

from cg.apps.lims import LimsAPI
from cg.constants.constants import ReadDirection, FileFormat, FileExtensions

from cg.services.fastq_concatenation_service.fastq_concatenation_service import (
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import os
from pathlib import Path

from cg.services.deliver_files.file_fetcher.models import SampleFile
from cg.services.deliver_files.file_formatter.models import FormattedFile

Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
from pathlib import Path
from pathlib import Path, PosixPath

import pytest

@@ -15,6 +15,7 @@
DeliveryMetaData,
SampleFile,
)
from cg.services.deliver_files.file_formatter.models import FormattedFile
from cg.store.models import Case
from cg.store.store import Store

@@ -243,3 +244,29 @@ def swap_file_paths_with_inbox_paths(
new_file_model.file_path = Path(inbox_dir_path, file_model.file_path.name)
new_file_models.append(new_file_model)
return new_file_models


@pytest.fixture
def lims_naming_matadata() -> str:
    return "01_SE100_"


@pytest.fixture
def expected_mutant_formatted_files(
    expected_concatenated_fastq_formatted_files, lims_naming_matadata
) -> list[FormattedFile]:
    unique_combinations = []
    for formatted_file in expected_concatenated_fastq_formatted_files:
        formatted_file.original_path = formatted_file.formatted_path
        formatted_file.formatted_path = Path(
            formatted_file.formatted_path.parent,
            f"{lims_naming_matadata}{formatted_file.formatted_path.name}",
        )
        if formatted_file not in unique_combinations:
            unique_combinations.append(formatted_file)
    return unique_combinations


@pytest.fixture
def mutant_moved_files(fastq_concatenation_sample_files) -> list[SampleFile]:
    return fastq_concatenation_sample_files
Original file line number Diff line number Diff line change
@@ -80,6 +80,7 @@ class DeliveryServiceScenario(BaseModel):
def test_build_delivery_service(scenario: DeliveryServiceScenario, request: FixtureRequest):
# GIVEN a delivery service builder with mocked store and hk_api
builder = DeliveryServiceFactory(
lims_api=MagicMock(),
store=request.getfixturevalue(scenario.store_name),
hk_api=MagicMock(),
rsync_service=MagicMock(),
Original file line number Diff line number Diff line change
@@ -1,7 +1,11 @@
import os
from unittest import mock
from unittest.mock import Mock
import pytest
from pathlib import Path

from cg.apps.lims import LimsAPI
from cg.services.deliver_files.file_formatter.utils.mutant_sample_service import MutantFileFormatter
from cg.services.fastq_concatenation_service.fastq_concatenation_service import (
FastqConcatenationService,
)
@@ -78,3 +82,43 @@ def test_file_formatter_utils(
for file in formatted_files:
assert file.formatted_path.exists()
assert not file.original_path.exists()


def test_mutant_file_formatter(
    mutant_moved_files: list[SampleFile],
    expected_mutant_formatted_files: list[FormattedFile],
    lims_naming_matadata: str,
):
    # GIVEN existing ticket directory path and a customer inbox
    ticket_dir_path: Path = mutant_moved_files[0].file_path.parent

    os.makedirs(ticket_dir_path, exist_ok=True)

    for moved_file in mutant_moved_files:
        moved_file.file_path.touch()

    lims_mock = Mock()
    lims_mock.get_sample_region_and_lab_code.return_value = lims_naming_matadata

    # Initialize file_formatter
    file_formatter = MutantFileFormatter(
        file_manager=FileManagingService(),
        file_formatter=SampleFileConcatenationFormatter(
            file_manager=FileManagingService(),
            file_formatter=SampleFileNameFormatter(),
            concatenation_service=FastqConcatenationService(),
        ),
        lims_api=lims_mock,
    )

    # WHEN formatting the files
    formatted_files: list[FormattedFile] = file_formatter.format_files(
        moved_files=mutant_moved_files,
        ticket_dir_path=ticket_dir_path,
    )

    # THEN the files should be formatted
    assert formatted_files == expected_mutant_formatted_files
    for file in formatted_files:
        assert file.formatted_path.exists()
        assert not file.original_path.exists()
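The test replaces the LIMS dependency with a `unittest.mock.Mock` whose `get_sample_region_and_lab_code` returns a fixed string, so no LIMS connection is needed. The stubbing pattern on its own looks roughly like this; the sample ID `"ACC0000A1"` is a made-up value, and the final call check is an optional extra that the test in this PR does not perform.

```python
from unittest.mock import Mock

# Stub the one LIMS method the formatter calls and fix its return value.
lims_mock = Mock()
lims_mock.get_sample_region_and_lab_code.return_value = "01_SE100_"

# Any call now returns the stubbed string, whatever the sample ID is.
prefix = lims_mock.get_sample_region_and_lab_code("ACC0000A1")  # made-up sample ID

# Optional: verify that the method was called exactly once with that ID.
lims_mock.get_sample_region_and_lab_code.assert_called_once_with("ACC0000A1")
```

Because the mock returns the same string for every sample, a test built this way checks the renaming mechanics but not that each sample receives its own region and lab code.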