Commit ff1d262

Merge pull request #54 from LUMC/release_1.0.0
Release 1.0.0
2 parents 6944edb + 84af28d commit ff1d262

18 files changed: +361 -217 lines changed

.pylintrc

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ disable=wrong-import-order, # This conflicts with flake8-import-order
         W0511, # We can figure out the fixme's later.
         line-too-long, # Already tested by flake8
         unused-import, # Already tested by flake8
+        no-else-return, # This is pylint opinionated to the hilt and annoying.
         missing-docstring
 # Sometimes docstrings are missing because they add visual clutter on self-documenting methods.
 # Adding `# pylint: disable missing-docstrings because this method is self-documenting` does not help with the visual clutter!

HISTORY.rst

Lines changed: 43 additions & 0 deletions
@@ -7,6 +7,49 @@ Changelog
 .. NOTE: This document is user facing. Please word the changes in such a way
 .. that users understand how the changes affect the new version.
 
+Version 1.0.0
+---------------------------
+Lots of small fixes that improve the usability of pytest-workflow are included
+in version 1.0.0.
+
++ Gzipped files can now also be checked for contents. Files with '.gz' as
+  extension are automatically decompressed.
++ ``stdout`` and ``stderr`` of workflows are now streamed to a file instead of
+  being kept in memory. This means you can check the progress of a workflow by
+  running ``tail -f <stdout or stderr>``. The location of ``stdout`` and
+  ``stderr`` is now reported at the start of each workflow. If the
+  ``--keep-workflow-wd`` flag is not set, the ``stdout`` and ``stderr`` files
+  will be deleted with the rest of the workflow files.
++ The log now reports when a workflow is starting, instead of when it is added
+  to the queue. This makes it easier to see which workflows are currently
+  running and if you forgot to use the ``--workflow-threads`` or ``--wt`` flag.
++ Workflow exit code failures now mention the name of the workflow. Previously
+  the generic name "Workflow" was used, which made it harder to figure out
+  which workflows failed.
++ When tests of file content fail because the file does not exist, a different
+  error message is given compared to when the file exists but the content is
+  not there, which makes debugging easier. Also the accompanying
+  "FileNotFound" error stacktrace is now suppressed, which keeps the test
+  output more pleasant.
++ When tests of stdout/stderr content or file content fail, a more informative
+  error message is given to allow for easier debugging.
++ All workflows now get their own folder within the `same` temporary directory.
+  This fixes a bug where if ``basetemp`` was not set, each workflow would get
+  its own folder in a separate temp directory. For example running workflows
+  'workflow1' and 'workflow2' would create two temporary folders:
+
+  '/tmp/pytest_workflow\_\ **33mrz5a5**/workflow1' and
+  '/tmp/pytest_workflow\_\ **b8m1wzuf**/workflow2'
+
+  This is now changed to have all workflows in one temporary directory per
+  pytest run:
+
+  '/tmp/pytest_workflow\_\ **33mrz5a5**/workflow1' and
+  '/tmp/pytest_workflow\_\ **33mrz5a5**/workflow2'
+
++ Disallow empty ``command`` and ``name`` keys. An empty ``command`` caused
+  pytest-workflow to hang. Empty names are also disallowed.
+
 Version 0.4.0
 ---------------------------
 + Added more information to the manual on how to debug pipelines and use
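
Note on the temporary directory change in the 1.0.0 changelog entry: the idea of one temporary root per pytest run, with one subfolder per workflow, can be sketched as follows. This is a minimal illustration and not the plugin's actual code; the function name `make_workflow_dirs` and its parameters are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def make_workflow_dirs(workflow_names, basetemp=None):
    """Hypothetical helper: create one folder per workflow under ONE root."""
    # The root is chosen once per run: either the user-supplied basetemp
    # or a single freshly created temporary directory.
    if basetemp is not None:
        root = Path(basetemp)
    else:
        root = Path(tempfile.mkdtemp(prefix="pytest_workflow_"))
    dirs = {}
    for name in workflow_names:
        workflow_dir = root / name
        workflow_dir.mkdir(parents=True)
        dirs[name] = workflow_dir
    return dirs


dirs = make_workflow_dirs(["workflow1", "workflow2"])
# Both workflow folders share the same parent directory.
print(dirs["workflow1"].parent == dirs["workflow2"].parent)
```

Creating the root before iterating over the workflows is what avoids the old behaviour, where each workflow ended up in a separately generated temp directory.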

README.rst

Lines changed: 8 additions & 4 deletions
@@ -44,10 +44,14 @@ Run ``pytest`` from an environment with pytest-workflow installed.
 Pytest will automatically gather files in the ``tests`` directory starting with
 ``test`` and ending in ``.yaml`` or ``.yml``.
 
-For debugging pipelines running ``pytest -v --keep-workflow-wd`` is
-recommended. This will save the logs and the workflow directory so it is
-possible to check where the pipeline crashed. It will also give a better
-overview of succeeded and failed tests.
+To check the progress of a workflow while it is running you can use ``tail -f``
+on the ``stdout`` or ``stderr`` file of the workflow. The locations of these
+files are reported in the log as soon as a workflow is started.
+
+For debugging pipelines using the ``--keep-workflow-wd`` flag is
+recommended. This will keep the workflow directory and logs after the test run
+so it is possible to check where the pipeline crashed. The ``-v`` flag can come
+in handy as well as it gives a complete overview of succeeded and failed tests.
 
 Below is an example of a YAML file that defines a test:
 
docs/running_pytest_workflow.rst

Lines changed: 10 additions & 7 deletions
@@ -7,13 +7,12 @@ Pytest will automatically gather files in the ``tests`` directory starting with
 ``test`` and ending in ``.yaml`` or ``.yml``.
 
 The workflows are run automatically. Each workflow gets its own temporary
-directory to run. These directories are cleaned up after the tests are
-completed. If you wish to inspect the output of a failing workflow you can use
-the ``--keep-workflow-wd`` flag to disable cleanup. This will also make sure
-the logs of the pipeline are saved in the temporary directory. When
-``--keep-workflow-wd`` is set, the paths to the logs and the temporary
-directory are reported in pytest's output. The `--keep-workflow-wd`` flag is
-highly recommended when debugging pipelines.
+directory to run. The ``stdout`` and ``stderr`` of the workflow command are
+also saved to this directory. The temporary directories are cleaned up after
+the tests are completed. If you wish to inspect the output of a failing
+workflow you can use the ``--keep-workflow-wd`` flag to disable cleanup. This
+will also make sure the logs of the pipeline are not deleted. The
+``--keep-workflow-wd`` flag is highly recommended when debugging pipelines.
 
 If you wish to change the temporary directory in which the workflows are run
 use ``--basetemp <dir>`` to change pytest's base temp directory.
@@ -33,6 +32,10 @@ To run multiple workflows simultaneously you can use
 of workflows that can be run simultaneously. This will speed up things if
 you have enough resources to process these workflows simultaneously.
 
+To check the progress of a workflow while it is running you can use ``tail -f``
+on the ``stdout`` or ``stderr`` file of the workflow. The locations of these
+files are reported in the log as soon as a workflow is started.
+
 Running specific workflows
 ----------------------------
 To run a specific workflow use the ``--tag`` flag. Each workflow is tagged with
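
Note on streaming ``stdout`` and ``stderr`` to files, as described in the documentation changes above: a rough sketch of the approach is to hand open file handles to the subprocess instead of capturing output in memory. This is not the plugin's implementation; the function `run_streaming` and the file names `log.out` and `log.err` are assumptions chosen for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_streaming(command, workdir):
    """Hypothetical runner: write stdout/stderr straight to files in workdir."""
    stdout_path = workdir / "log.out"  # assumed file name
    stderr_path = workdir / "log.err"  # assumed file name
    with stdout_path.open("wb") as out, stderr_path.open("wb") as err:
        # The child process writes to the files directly, so nothing is
        # buffered in this process and the files can be followed with
        # `tail -f` while the command is still running.
        process = subprocess.Popen(command, stdout=out, stderr=err,
                                   cwd=str(workdir))
        exit_code = process.wait()
    return exit_code, stdout_path, stderr_path


workdir = Path(tempfile.mkdtemp())
exit_code, stdout_path, stderr_path = run_streaming(
    [sys.executable, "-c", "print('hello from the workflow')"], workdir)
print(exit_code, stdout_path)
```

Because the log files live inside the workflow directory in this sketch, removing that directory after the run also removes the logs, which matches the cleanup behaviour the documentation describes when ``--keep-workflow-wd`` is not set.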

docs/writing_tests.rst

Lines changed: 2 additions & 1 deletion
@@ -55,7 +55,8 @@ A more advanced example:
     exit_code: 2 # What the exit code should be (optional, if not given defaults to 0)
     files:
       - path: "fail.log" # Multiple files can be tested for each workflow
-      - path: "TomCruise.txt"
+      - path: "TomCruise.txt.gz" # Gzipped files can also be searched, provided their extension is '.gz'
+        contains: "starring"
     stderr: # Options for testing stderr (optional)
       contains: # A list of strings which should be in stderr (optional)
         - "BSOD error, please contact the IT crowd"
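
Note on the gzipped file test above: what a ``contains`` check on a '.gz' file amounts to can be sketched with the standard library alone. The helper `read_lines` is a hypothetical name for this illustration; only the file name `TomCruise.txt.gz` and the search string come from the documentation example, and the file content is made up.

```python
import gzip
import tempfile
from pathlib import Path


def read_lines(filepath):
    """Hypothetical helper: yield text lines, decompressing '.gz' files."""
    opener = gzip.open if filepath.suffix == ".gz" else open
    # Text mode ('rt') makes both openers yield str lines instead of bytes.
    with opener(str(filepath), mode="rt") as handle:
        for line in handle:
            yield line


# Create a small gzipped file to search (content invented for the example).
tmpdir = Path(tempfile.mkdtemp())
gz_path = tmpdir / "TomCruise.txt.gz"
with gzip.open(str(gz_path), "wt") as handle:
    handle.write("Top Gun, starring Tom Cruise\n")

found = any("starring" in line for line in read_lines(gz_path))
print(found)
```

The decision to decompress is based purely on the file extension, which mirrors the documented rule that only files ending in '.gz' are treated as gzipped.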

setup.py

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@
 
 setup(
     name="pytest-workflow",
-    version="0.4.0",
+    version="1.0.0",
     description="A pytest plugin for configuring workflow/pipeline tests "
                 "using YAML files",
     author="Leiden University Medical Center",

src/pytest_workflow/content_tests.py

Lines changed: 63 additions & 38 deletions
@@ -19,10 +19,11 @@
 
 The design philosophy here was that each piece of text should only be read
 once."""
-
+import functools
+import gzip
 import threading
 from pathlib import Path
-from typing import Callable, Iterable, List, Set
+from typing import Iterable, List, Optional, Set
 
 import pytest
 
@@ -78,56 +79,64 @@ def check_content(strings: List[str],
 
 def file_to_string_generator(filepath: Path) -> Iterable[str]:
     """
-    Turns a file into a line generator.
+    Turns a file into a line generator. Files ending with .gz are automatically
+    decompressed.
     :param filepath: the file path
     :return: yields lines of the file
     """
-    # Use 'r' here explicitly as opposed to 'rb'
-    with filepath.open("r") as file_handler:
+    file_open = (functools.partial(gzip.open, str(filepath))
+                 if filepath.suffix == ".gz" else
+                 filepath.open)
+    # Use 'rt' here explicitly as opposed to 'rb'
+    with file_open(mode='rt') as file_handler:
         for line in file_handler:
             yield line
 
 
 class ContentTestCollector(pytest.Collector):
     def __init__(self, name: str, parent: pytest.Collector,
-                 content_generator: Callable[[], Iterable[str]],
+                 filepath: Path,
                  content_test: ContentTest,
-                 workflow: Workflow):
+                 workflow: Workflow,
+                 content_name: Optional[str] = None):
         """
         Creates a content test collector
         :param name: Name of the thing which contents are tested
         :param parent: a pytest.Collector object
-        :param content_generator: a function that should return the content as
-        lines. This function is a placeholder for the content itself. In other
-        words: instead of passing the contents of a file directly to the
-        ContentTestCollector, you pass a function that when called will return
-        the contents. This allows the pytest collection phase to finish before
-        the file is read. This is useful because the workflows are run after
-        the collection phase.
+        :param filepath: the file that contains the content
         :param content_test: a ContentTest object.
         :param workflow: the workflow that is running.
+        :param content_name: The name of the content that will be displayed if
+        the test fails. Defaults to filepath.
         """
         # pylint: disable=too-many-arguments
-        # it is still only 5 not counting self.
+        # Cannot think of a better way to do this.
         super().__init__(name, parent=parent)
-        self.content_generator = content_generator
+        self.filepath = filepath
         self.content_test = content_test
         self.workflow = workflow
         self.found_strings = None
         self.thread = None
+        # We check the contents of files. Sometimes files are not there. Then
+        # content can not be checked. We save FileNotFoundErrors in this
+        # boolean.
+        self.file_not_found = False
+        self.content_name = content_name or str(filepath)
 
     def find_strings(self):
-        """Find the strings that are looked for in the given content
-        The content_generator function shines here. It only starts looking
-        for lines of text AFTER the workflow is finished. So that is why a
-        function is needed here and not just a variable containing lines of
-        text."""
+        """Find the strings that are looked for in the given file
+
+        When a file we test is not produced, we save the FileNotFoundError so
+        we can give an accurate repr_failure."""
         self.workflow.wait()
         strings_to_check = (self.content_test.contains +
                             self.content_test.must_not_contain)
-        self.found_strings = check_content(
-            strings=strings_to_check,
-            text_lines=self.content_generator())
+        try:
+            self.found_strings = check_content(
+                strings=strings_to_check,
+                text_lines=file_to_string_generator(self.filepath))
+        except FileNotFoundError:
+            self.file_not_found = True
 
     def collect(self):
         # A thread is started that looks for the strings and collection can go
@@ -141,15 +150,17 @@ def collect(self):
             ContentTestItem(
                 parent=self,
                 string=string,
-                should_contain=True
+                should_contain=True,
+                content_name=self.content_name
             )
             for string in self.content_test.contains]
 
         test_items += [
             ContentTestItem(
                 parent=self,
                 string=string,
-                should_contain=False
+                should_contain=False,
+                content_name=self.content_name
             )
             for string in self.content_test.must_not_contain]
 
@@ -160,7 +171,7 @@ class ContentTestItem(pytest.Item):
     """Item that reports if a string has been found in content."""
 
     def __init__(self, parent: ContentTestCollector, string: str,
-                 should_contain: bool):
+                 should_contain: bool, content_name: str):
         """
         Create a ContentTestItem
         :param parent: A ContentTestCollector. We use a ContentTestCollector
@@ -169,36 +180,50 @@ def __init__(self, parent: ContentTestCollector, string: str,
         finished.
         :param string: The string that was searched for.
         :param should_contain: Whether the string should have been there
+        :param content_name: the name of the content which allows for easier
+        debugging if the test fails
         """
         contain = "contains" if should_contain else "does not contain"
         name = "{0} '{1}'".format(contain, string)
         super().__init__(name, parent=parent)
         self.should_contain = should_contain
         self.string = string
+        self.content_name = content_name
 
     def runtest(self):
         """Only after a workflow is finished the contents of files and logs are
-        read. The ContentTestCollector parent reads each file/log once. This is
+        read. The ContentTestCollector parent reads each file once. This is
         done in its thread. We wait for this thread to complete. Then we check
         all the found strings in the parent.
         This way we do not have to read each file one time per ContentTestItem
         this makes content checking much faster on big files (NGS > 1 GB files)
         where we are looking for multiple words (variants / sequences). """
         # Wait for thread to complete.
         self.parent.thread.join()
+        assert not self.parent.file_not_found
         assert ((self.string in self.parent.found_strings) ==
                 self.should_contain)
 
     def repr_failure(self, excinfo):
         # pylint: disable=unused-argument
         # excinfo needed for pytest.
-        message = (
-            "'{string}' was {found} in {content} "
-            "while it {should} be there."
-        ).format(
-            string=self.string,
-            found="not found" if self.should_contain else "found",
-            content=self.parent.name,
-            should="should" if self.should_contain else "should not"
-        )
-        return message
+        if self.parent.file_not_found:
+            return (
+                "'{content}' does not exist and cannot be searched "
+                "for {containing} '{string}'."
+            ).format(
+                content=self.content_name,
+                containing="containing" if self.should_contain
+                else "not containing",
+                string=self.string)
+
+        else:
+            return (
+                "'{string}' was {found} in {content} "
+                "while it {should} be there."
+            ).format(
+                string=self.string,
+                found="not found" if self.should_contain else "found",
+                content=self.content_name,
+                should="should" if self.should_contain else "should not"
+            )
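
Note on the ``file_not_found`` handling in this file: the pattern of searching a file in a background thread, recording a missing file as a flag instead of letting the exception propagate, and choosing the failure message afterwards can be shown in a pytest-free sketch. The class `ContentSearch` and its methods are hypothetical stand-ins, not part of pytest-workflow; only the two message templates follow the diff.

```python
import tempfile
import threading
from pathlib import Path


class ContentSearch:
    """Hypothetical stand-in for the collector: reads one file in a thread."""

    def __init__(self, filepath, strings):
        self.filepath = Path(filepath)
        self.strings = list(strings)
        self.found_strings = set()
        self.file_not_found = False
        self.thread = threading.Thread(target=self._search)

    def start(self):
        self.thread.start()

    def _search(self):
        try:
            with self.filepath.open("rt") as handle:
                for line in handle:  # the file is read only once
                    for string in self.strings:
                        if string in line:
                            self.found_strings.add(string)
        except FileNotFoundError:
            # Remember the problem; the message is built later.
            self.file_not_found = True

    def failure_message(self, string, should_contain=True):
        """Return a failure message, or None if the expectation is met."""
        self.thread.join()  # wait for the search to finish
        if self.file_not_found:
            return ("'{0}' does not exist and cannot be searched "
                    "for {1} '{2}'.").format(
                        self.filepath,
                        "containing" if should_contain else "not containing",
                        string)
        if (string in self.found_strings) == should_contain:
            return None
        return "'{0}' was {1} in {2} while it {3} be there.".format(
            string,
            "not found" if should_contain else "found",
            self.filepath,
            "should" if should_contain else "should not")


tmpdir = Path(tempfile.mkdtemp())
existing = tmpdir / "credits.txt"  # made-up file for the example
existing.write_text("Top Gun, starring Tom Cruise\n")

present = ContentSearch(existing, ["starring", "Oscar"])
present.start()
missing = ContentSearch(tmpdir / "missing.txt", ["starring"])
missing.start()

print(present.failure_message("Oscar"))
print(missing.failure_message("starring"))
```

Keeping the missing-file state as a boolean means every check for that file can report the same clear message without each one triggering its own stack trace.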

src/pytest_workflow/file_tests.py

Lines changed: 3 additions & 6 deletions
@@ -15,13 +15,12 @@
 # along with pytest-workflow. If not, see <https://www.gnu.org/licenses/
 
 """All tests for workflow files"""
-import functools
 import hashlib
 from pathlib import Path
 
 import pytest
 
-from .content_tests import ContentTestCollector, file_to_string_generator
+from .content_tests import ContentTestCollector
 from .schema import FileTest
 from .workflow import Workflow
 
@@ -63,12 +62,10 @@ def collect(self):
             tests += [ContentTestCollector(
                 name="content",
                 parent=self,
-                content_generator=functools.partial(file_to_string_generator,
-                                                    filepath),
+                filepath=filepath,
                 content_test=self.filetest,
                 # FileTest inherits from ContentTest. So this is valid.
-                workflow=self.workflow
-            )]
+                workflow=self.workflow)]
 
         if self.filetest.md5sum:
             tests += [FileMd5(
