Commit bc4374a

chore: address PR comments
Signed-off-by: behnazh-w <[email protected]>
1 parent 8788015 commit bc4374a

9 files changed

+196
-112
lines changed

docs/source/pages/developers_guide/index.rst

Lines changed: 93 additions & 40 deletions
Original file line number | Diff line number | Diff line change
@@ -32,68 +32,91 @@ triggered in the CI.
3232

3333
Another important aspect of our design is that all the check results are automatically mapped and stored in a local database.
3434
By performing this mapping, we make it possible to enforce use case-specific policies on the results of the checks. While storing
35-
the check results to the database happens automatically in Macaron's backend, the developer needs to add a brief specification
35+
the check results in the database happens automatically in Macaron's backend, the developer needs to add a brief specification
3636
to make that possible as we will see later.
3737

38+
Once you get familiar with writing a basic check, you can explore the check dependency feature in Macaron. The checks
39+
in our framework can be customized to only run if another check has run and returned a specific
40+
:class:`result type <macaron.slsa_analyzer.checks.check_result.CheckResultType>`. This feature can be used when some checks
41+
can be ordered and have a parent-child relationship, i.e., one check implements a weaker or stronger version of a
42+
security property in a parent check. Therefore, it might make sense to skip running the check and report a
43+
:class:`result type <macaron.slsa_analyzer.checks.check_result.CheckResultType>` based on the result of the parent check.
44+
3845
+++++++++++++++++++
3946
The Check Interface
4047
+++++++++++++++++++
4148

4249
Each check needs to be implemented as a Python class in a Python module under ``src/macaron/slsa_analyzer/checks``.
4350
A check class should subclass the :class:`BaseCheck <macaron.slsa_analyzer.checks.base_check.BaseCheck>` class.
4451

45-
You need to set the name, description, and other details of your new check in the ``__init__`` method. After implementing
46-
the initializer, you need to implement the ``run_check`` abstract method. This method provides the context object
47-
:class:`AnalyzeContext <macaron.slsa_analyzer.analyze_context.AnalyzeContext>`, which contains various
48-
intermediate representations and models. The ``dynamic_data`` property would be particularly useful as it contains
49-
data about the CI service, artifact registry, and build tool used for building the software component.
50-
51-
``component`` is another useful attribute in the :class:`AnalyzeContext <macaron.slsa_analyzer.analyze_context.AnalyzeContext>` object
52-
that you should know about. This attribute contains the information about a software component, such
53-
as it's corresponding ``repository`` and ``dependencies``. Note that ``component`` will also be stored into the database and its attributes
54-
such as ``repository`` are established as database relationships. You can see the existing tables and their
55-
relationships in our :mod:`data model <macaron.database.table_definitions>`.
56-
57-
Once you implement the logic of your check in the ``run_check`` method, you need to add a class to help
58-
Macaron handle your check's output:
59-
60-
* Add a class that subclasses ``CheckFacts`` to map your outputs to a table in the database. The class name should follow the ``<MyCheck>Facts`` pattern.
61-
* Specify the table name in the ``__tablename__ = "_my_check"`` class variable. Note that the table name should start with ``_`` and it should not have been used by other checks.
62-
* Add the ``id`` column as the primary key where the foreign key is ``_check_facts.id``.
63-
* Add columns for the check outputs that you would like to store into the database. If a column needs to appear as a justification in the HTML/JSON report, pass ``info={"justification": JustificationType.<TEXT or HREF>}`` to the column mapper.
64-
* Add ``__mapper_args__`` class variable and set ``"polymorphic_identity"`` key to the table name.
52+
The main logic of a check should be implemented in the :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` abstract method. It is important to understand the input
53+
parameters and output objects computed by this method.
6554

66-
Next, you need to create a ``result_tables`` list and append check facts as part of the ``run_check`` implementation.
67-
You should also specify a :class:`Confidence <macaron.slsa_analyzer.checks.check_result.Confidence>`
68-
score choosing one of the ``Confidence`` enum values, e.g., ``Confidence.HIGH`` and pass it via keyword
69-
argument ``confidence``. You should choose a suitable confidence score based on the accuracy
70-
of your check analysis.
55+
.. code-block:: python
56+
def run_check(self, ctx: AnalyzeContext) -> CheckResultData:
7157
72-
.. code-block:: python
58+
''''''''''''''''
59+
Input Parameters
60+
''''''''''''''''
7361

74-
result_tables.append(MyCheckFacts(col_foo=foo, col_bar=bar, confidence=Confidence.HIGH))
62+
The :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` method is a callback called by our checker framework. The framework pre-computes a context object,
63+
:class:`ctx: AnalyzeContext <macaron.slsa_analyzer.analyze_context.AnalyzeContext>` and makes it available as the input
64+
parameter to the function. The ``ctx`` object contains various intermediate representations and models.
65+
Most likely, you will need to use the following properties:
7566

76-
This list as well as the check result status should be stored in a :class:`CheckResultData <macaron.slsa_analyzer.checks.check_result.CheckResultData>`
77-
object and returned by ``run_check``.
67+
* :attr:`component <macaron.slsa_analyzer.analyze_context.AnalyzeContext.component>`
68+
* :attr:`dynamic_data <macaron.slsa_analyzer.analyze_context.AnalyzeContext.dynamic_data>`
7869

79-
Finally, you need to register your check by adding it to the :mod:`registry module <macaron.slsa_analyzer.registry>`:
70+
The :attr:`component <macaron.slsa_analyzer.analyze_context.AnalyzeContext.component>`
71+
object acts as a representation of a software component and contains data, such as its
72+
corresponding :class:`Repository <macaron.database.table_definitions.Repository>` and
73+
:data:`dependencies <macaron.database.table_definitions.components_association_table>`.
74+
Note that :attr:`component <macaron.slsa_analyzer.analyze_context.AnalyzeContext.component>` will also be stored
75+
in the database and its attributes, such as :attr:`repository <macaron.database.table_definitions.Component.repository>`
76+
are established as database relationships. You can see the existing tables and their relationships
77+
in our :mod:`data model <macaron.database.table_definitions>`.
8078

81-
.. code-block:: python
79+
The :attr:`dynamic_data <macaron.slsa_analyzer.analyze_context.AnalyzeContext.dynamic_data>` property would be particularly useful as it contains
80+
data about the CI service, artifact registry, and build tool used for building the software component.
81+
Note that this object is a shared state among checks. If a check runs before another check, it can
82+
make changes to this object, which will be accessible to the checks run subsequently.
8283

83-
registry.register(MyCheck())
84+
''''''
85+
Output
86+
''''''
8487

85-
And of course, make sure to add tests for your check by adding a module under ``tests/slsa_analyzer/checks/``.
88+
The :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` method returns a :class:`CheckResultData <macaron.slsa_analyzer.checks.check_result.CheckResultData>` object.
89+
This object consists of :attr:`result_tables <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_tables>` and
90+
:attr:`result_type <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_type>`.
91+
The :attr:`result_tables <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_tables>` object is the list of facts generated from the check. The :attr:`result_type <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_type>`
92+
value shows the final result type of the check.
8693

8794
+++++++
8895
Example
8996
+++++++
9097

91-
In this example, we show how to add a check determine if a software component has a source-code repository.
98+
In this example, we show how to add a check to determine if a software component has a source-code repository.
99+
Note that this is a simple example to just demonstrate how to add a check from scratch.
92100
Feel free to explore other existing checks under ``src/macaron/slsa_analyzer/checks`` for more examples.
93101

94-
1. First create a module called ``repo_check.py`` under ``src/macaron/slsa_analyzer/checks``.
102+
As discussed earlier, each check needs to be implemented as a Python class in a Python module under ``src/macaron/slsa_analyzer/checks``.
103+
A check class should subclass the :class:`BaseCheck <macaron.slsa_analyzer.checks.base_check.BaseCheck>` class.
104+
105+
'''''''''''''''
106+
Create a module
107+
'''''''''''''''
108+
First create a module called ``repo_check.py`` under ``src/macaron/slsa_analyzer/checks``.
95109

96-
2. Add a class and specify the columns that you want to store for the check outputs to the database.
110+
111+
''''''''''''''''''''''''''''
112+
Add a class for the database
113+
''''''''''''''''''''''''''''
114+
115+
* Add a class that subclasses :class:`CheckFacts <macaron.database.table_definitions.CheckFacts>` to map your outputs to a table in the database. The class name should follow the ``<MyCheck>Facts`` pattern.
116+
* Specify the table name in the ``__tablename__`` class variable. Note that the table name should start with ``_`` and it should not have been used by other checks.
117+
* Add the ``id`` column as the primary key where the foreign key is ``_check_facts.id``.
118+
* Add columns for the check outputs that you would like to store in the database. If a column needs to appear as a justification in the HTML/JSON report, pass ``info={"justification": JustificationType.<TEXT or HREF>}`` to the column mapper.
119+
* Add the ``__mapper_args__`` class variable and set the ``"polymorphic_identity"`` key to the table name.
97120

98121
.. code-block:: python
99122
@@ -113,10 +136,25 @@ Feel free to explore other existing checks under ``src/macaron/slsa_analyzer/che
113136
git_repo: Mapped[str] = mapped_column(String, nullable=True, info={"justification": JustificationType.HREF})
114137
115138
__mapper_args__ = {
116-
"polymorphic_identity": "__repo_check",
139+
"polymorphic_identity": "_repo_check",
117140
}
118141
119-
3. Add a class for your check, provide the check details in the initializer method, and implement the logic of the check in ``run_check``.
142+
'''''''''''''''''''
143+
Add the check class
144+
'''''''''''''''''''
145+
146+
Add a class for your check that subclasses :class:`BaseCheck <macaron.slsa_analyzer.checks.base_check.BaseCheck>`,
147+
provide the check details in the initializer method, and implement the logic of the check in
148+
:func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>`.
149+
150+
A ``check_id`` should meet the following requirements:
151+
152+
- The general format: ``mcn_<name>_<digits>``
153+
- In ``name``, only lowercase alphabetical letters are allowed. If ``name`` contains multiple \
154+
words, they must be separated by underscores.
155+
156+
157+
You can set the ``depends_on`` attribute in the initializer method to declare the parent checks that this check depends on. In this example, we leave this list empty.
120158

121159
.. code-block:: python
122160
@@ -156,13 +194,28 @@ Feel free to explore other existing checks under ``src/macaron/slsa_analyzer/che
156194
result_type=CheckResultType.PASSED,
157195
)
158196
159-
4. Register your check.
197+
As you can see, the result of the check is returned via the :class:`CheckResultData <macaron.slsa_analyzer.checks.check_result.CheckResultData>` object.
198+
You should specify a :class:`Confidence <macaron.slsa_analyzer.checks.check_result.Confidence>`
199+
score choosing one of the :class:`Confidence <macaron.slsa_analyzer.checks.check_result.Confidence>` enum values,
200+
e.g., :class:`Confidence.HIGH <macaron.slsa_analyzer.checks.check_result.Confidence.HIGH>` and pass it via keyword
201+
argument :attr:`confidence <macaron.database.table_definitions.CheckFacts.confidence>`. You should choose a suitable
202+
confidence score based on the accuracy of your check analysis.
203+
204+
'''''''''''''''''''
205+
Register your check
206+
'''''''''''''''''''
207+
208+
Finally, you need to register your check by adding it to the :mod:`registry module <macaron.slsa_analyzer.registry>` at the end of your check module:
160209

161210
.. code-block:: python
162211
163212
registry.register(RepoCheck())
164213
165214
215+
'''''''''''''''
216+
Test your check
217+
'''''''''''''''
218+
166219
Finally, you can add tests for your check by adding the ``tests/slsa_analyzer/checks/test_repo_check.py`` module. Macaron
167220
uses `pytest <https://docs.pytest.org>`_ and `hypothesis <https://hypothesis.readthedocs.io>`_ for testing. Take a look
168221
at other tests for inspiration!
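
The ``check_id`` rules listed in the documentation diff above (``mcn_<name>_<digits>``, lowercase letters only, words separated by underscores) can be expressed as a regular expression. The following is a minimal sketch, not code from Macaron: the pattern, the ``is_valid_check_id`` helper, and the example IDs are all hypothetical and only illustrate the stated format.

```python
import re

# Hypothetical pattern derived from the stated rules: the "mcn_" prefix, one or
# more lowercase words joined by underscores, then "_" and one or more digits.
CHECK_ID_PATTERN = re.compile(r"mcn_[a-z]+(?:_[a-z]+)*_[0-9]+")


def is_valid_check_id(check_id: str) -> bool:
    """Return True if check_id follows the mcn_<name>_<digits> format."""
    return CHECK_ID_PATTERN.fullmatch(check_id) is not None


# Illustrative IDs only; they are not taken from the Macaron code base.
for candidate in ("mcn_repo_check_1", "mcn_RepoCheck_1", "mcn_repo_check"):
    print(candidate, is_valid_check_id(candidate))
```

Using ``fullmatch`` rather than ``match`` rejects IDs with trailing characters after the digits.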

src/macaron/database/table_definitions.py

Lines changed: 1 addition & 25 deletions
@@ -15,7 +15,7 @@
1515
import string
1616
from datetime import datetime
1717
from pathlib import Path
18-
from typing import Any, Self
18+
from typing import Any
1919

2020
from packageurl import PackageURL
2121
from sqlalchemy import (
@@ -448,30 +448,6 @@ class CheckFacts(ORMBase):
448448
#: A many-to-one relationship with check results.
449449
checkresult: Mapped["MappedCheckResult"] = relationship(back_populates="checkfacts")
450450

451-
def __lt__(self, other: Self) -> bool:
452-
"""Compare two check facts using their confidence values.
453-
454-
This comparison function is intended to be used by a heapq, which is a Min-Heap data structure.
455-
The root element in a heapq is the minimum element in the queue and each `confidence` value is in [0, 1].
456-
Therefore, we need reverse the comparison function to make sure the fact with highest confidence is stored
457-
in the root element. This implementation compares `1 - confidence` to return True if the confidence of
458-
`fact_a` is greater than the confidence of `fact_b`.
459-
460-
.. code-block:: pycon
461-
462-
>>> fact_a = CheckFacts()
463-
>>> fact_b = CheckFacts()
464-
>>> fact_a.confidence = 0.2
465-
>>> fact_b.confidence = 0.7
466-
>>> fact_b < fact_a
467-
True
468-
469-
Return
470-
------
471-
bool
472-
"""
473-
return (1 - self.confidence) < (1 - other.confidence)
474-
475451
#: The polymorphic inheritance configuration.
476452
__mapper_args__ = {
477453
"polymorphic_identity": "CheckFacts",

src/macaron/slsa_analyzer/analyze_context.py

Lines changed: 19 additions & 3 deletions
@@ -64,7 +64,9 @@ def __init__(
6464
output_dir : str
6565
The output dir.
6666
"""
67-
self.component = component
67+
# The component attribute should be accessed via the `component` property.
68+
self._component = component
69+
6870
self.ctx_data: dict[ReqName, SLSAReqStatus] = create_requirement_status_dict()
6971

7072
self.slsa_level = SLSALevels.LEVEL0
@@ -92,6 +94,20 @@ def __init__(
9294
expectation=None,
9395
)
9496

97+
@property
98+
def component(self) -> Component:
99+
"""Return the object associated with a target software component.
100+
101+
This property contains the information about a software component, such as its
102+
corresponding repository and dependencies.
103+
104+
105+
Returns
106+
-------
107+
Component
108+
"""
109+
return self._component
110+
95111
@property
96112
def dynamic_data(self) -> ChecksOutputs:
97113
"""Return the `dynamic_data` object that contains various intermediate representations.
@@ -104,8 +120,8 @@ def dynamic_data(self) -> ChecksOutputs:
104120
are that what you try to implement is already implemented and the results are available in the
105121
`dynamic_data` object.
106122
107-
Return
108-
------
123+
Returns
124+
-------
109125
ChecksOutputs
110126
"""
111127
return self._dynamic_data
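
The change above moves ``component`` behind a property backed by ``_component``. A minimal sketch of that pattern follows, using hypothetical stand-in classes (``Component`` and ``Context`` here are not the Macaron classes). One effect of a property without a setter is that callers can read the attribute but cannot rebind it.

```python
class Component:
    """Hypothetical stand-in for the real component model."""

    def __init__(self, name: str) -> None:
        self.name = name


class Context:
    """Hypothetical stand-in that mirrors the property pattern in the diff."""

    def __init__(self, component: Component) -> None:
        # Stored privately; callers read it through the `component` property.
        self._component = component

    @property
    def component(self) -> Component:
        """Return the object associated with the target software component."""
        return self._component


ctx = Context(Component("example"))
print(ctx.component.name)

try:
    # No setter is defined, so rebinding the property raises AttributeError.
    ctx.component = Component("other")
except AttributeError:
    print("component is read-only")
```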

src/macaron/slsa_analyzer/checks/check_result.py

Lines changed: 9 additions & 11 deletions
@@ -4,7 +4,6 @@
44
"""This module contains the CheckResult class for storing the result of a check."""
55
from dataclasses import dataclass
66
from enum import Enum
7-
from heapq import heappush
87
from typing import TypedDict
98

109
from macaron.database.table_definitions import CheckFacts
@@ -78,19 +77,18 @@ class CheckResultData:
7877
@property
7978
def justification_report(self) -> list[tuple[Confidence, list]]:
8079
"""
81-
Return the list of justifications for the check result generated from the tables in the database.
80+
Return a sorted list of justifications based on confidence scores in descending order.
8281
83-
Note that the elements in the justification will be rendered different based on their types:
82+
These justifications are generated from the tables in the database.
83+
Note that the elements in the justification will be rendered differently based on their types:
8484
8585
* a :class:`JustificationType.TEXT` element is displayed in plain text in the HTML report.
8686
* a :class:`JustificationType.HREF` element is rendered as a hyperlink in the HTML report.
8787
88-
Return
89-
------
88+
Returns
89+
-------
9090
list[tuple[Confidence, list]]
9191
"""
92-
# Interestingly, mypy cannot infer the type of elements later at `heappush` if we specify
93-
# list[tuple[Confidence, list]]. But still, it insists on specifying the `list` type here.
9492
justification_list: list = []
9593
for result in self.result_tables:
9694
# The HTML report generator requires the justification elements that need to be rendered in HTML
@@ -112,15 +110,15 @@ def justification_report(self) -> list[tuple[Confidence, list]]:
112110
if dict_elements:
113111
list_elements.append(dict_elements)
114112

115-
# Use heapq to always keep the justification with the highest confidence score in the first element.
116113
if list_elements:
117-
heappush(justification_list, (result.confidence, list_elements))
114+
justification_list.append((result.confidence, list_elements))
118115

119116
# If there are no justifications available, return a default "Not Available" one.
120117
if not justification_list:
121118
return [(Confidence.HIGH, ["Not Available."])]
122119

123-
return justification_list
120+
# Sort the justification list based on the confidence score in descending order.
121+
return sorted(justification_list, key=lambda item: item[0], reverse=True)
124122

125123

126124
@dataclass(frozen=True)
@@ -147,7 +145,7 @@ def get_summary(self) -> dict:
147145
"check_id": self.check.check_id,
148146
"check_description": self.check.check_description,
149147
"slsa_requirements": [str(BUILD_REQ_DESC.get(req)) for req in self.check.eval_reqs],
150-
# The justification report is stored in a heapq where the first element has the highest confidence score.
148+
# The justification report is sorted and the first element has the highest confidence score.
151149
"justification": self.result.justification_report[0][1],
152150
"result_tables": self.result.result_tables,
153151
"result_type": self.result.result_type,
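
The switch from ``heappush`` to ``sorted(..., reverse=True)`` matters because ``get_summary`` reads ``justification_report[0]``. The sketch below contrasts the two approaches on plain tuples. The ``Confidence`` enum here is a hypothetical stand-in with illustrative values, assumed to compare like floats; it is not imported from Macaron.

```python
from enum import Enum
from heapq import heappush


class Confidence(float, Enum):
    """Hypothetical stand-in; the values are illustrative only."""

    LOW = 0.4
    MEDIUM = 0.7
    HIGH = 1.0


entries = [
    (Confidence.MEDIUM, ["medium justification"]),
    (Confidence.HIGH, ["high justification"]),
    (Confidence.LOW, ["low justification"]),
]

# heapq is a min-heap: it only guarantees that index 0 holds the smallest item.
heap: list = []
for entry in entries:
    heappush(heap, entry)

# Sorting on the confidence score in descending order puts the highest first,
# matching the new return statement in the diff.
ranked = sorted(entries, key=lambda item: item[0], reverse=True)

print(heap[0][0].name)    # LOW
print(ranked[0][0].name)  # HIGH
```

With plain tuples, the heap root is the entry with the lowest confidence, whereas the sorted list places the highest-confidence entry at index 0 and orders the rest as well.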

src/macaron/slsa_analyzer/checks/infer_artifact_pipeline_check.py

Lines changed: 2 additions & 2 deletions
@@ -36,10 +36,10 @@ class InferArtifactPipelineFacts(CheckFacts):
3636
id: Mapped[int] = mapped_column(ForeignKey("_check_facts.id"), primary_key=True) # noqa: A003
3737

3838
#: The workflow job that triggered deploy.
39-
deploy_job: Mapped[str] = mapped_column(String, nullable=False, info={"justification": JustificationType.HREF})
39+
deploy_job: Mapped[str] = mapped_column(String, nullable=False, info={"justification": JustificationType.TEXT})
4040

4141
#: The workflow step that triggered deploy.
42-
deploy_step: Mapped[str] = mapped_column(String, nullable=False, info={"justification": JustificationType.HREF})
42+
deploy_step: Mapped[str] = mapped_column(String, nullable=False, info={"justification": JustificationType.TEXT})
4343

4444
#: The workflow run URL.
4545
run_url: Mapped[str] = mapped_column(String, nullable=False, info={"justification": JustificationType.HREF})
