Commit 9aedac6

docs: add instructions for writing new checks
Signed-off-by: behnazh-w <[email protected]>
1 parent 7f92fe2 commit 9aedac6

4 files changed

+204
-86
lines changed

docs/source/pages/developers_guide/index.rst

Lines changed: 157 additions & 1 deletion
@@ -1,4 +1,4 @@
1-
.. Copyright (c) 2023 - 2023, Oracle and/or its affiliates. All rights reserved.
1+
.. Copyright (c) 2023 - 2024, Oracle and/or its affiliates. All rights reserved.
22
.. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.
33
44
=========================
@@ -11,6 +11,162 @@ To follow the project's code style, see the :doc:`Macaron Style Guide </pages/de
1111

1212
For API reference, see the :doc:`API Reference </pages/developers_guide/apidoc/index>` page.
1313

14+
-------------------
15+
Writing a New Check
16+
-------------------
17+
18+
As a contributor to Macaron, you will very likely need to write a new check or modify an existing one at some point. In this
19+
section, we explain how Macaron checks work and what you need to do to develop one.
20+
21+
+++++++++++++++++
22+
High-level Design
23+
+++++++++++++++++
24+
25+
Before jumping into coding, it is useful to understand how Macaron as a framework works. Macaron is an extensible
26+
framework designed to make writing new supply chain security analyses easy. It provides an interface
27+
that you can leverage to access existing models and abstractions instead of implementing everything from scratch. For
28+
instance, many security checks need to traverse the code in GitHub Actions configurations. Normally,
29+
you would need to find the right repository and commit, clone it, find the workflows, and parse them. With Macaron,
30+
you don't need to do any of that and can simply write your security check by using the parsed shell scripts that are
31+
triggered in the CI.
32+
33+
Another important aspect of our design is that all the check results are automatically mapped and stored in a local database.
34+
By performing this mapping, we make it possible to enforce flexible policies on the results of the checks. While Macaron's backend
35+
stores the check results in the database automatically, the developer needs to add a brief specification
36+
to make that possible, as we will see later.
37+
38+
+++++++++++++++++++
39+
The Check Interface
40+
+++++++++++++++++++
41+
42+
Each check needs to be implemented as a Python class in a Python module under ``src/macaron/slsa_analyzer/checks``.
43+
A check class should subclass the ``BaseCheck`` class in :ref:`base_check module <pages/developers_guide/apidoc/macaron\.slsa_analyzer\.checks:macaron.slsa\\_analyzer.checks.base\\_check module>`.
44+
45+
You need to set the name, description, and other details of your new check in the ``__init__`` method. After implementing
46+
the initializer, you need to implement the ``run_check`` abstract method. This method provides the context object
47+
:ref:`AnalyzeContext <pages/developers_guide/apidoc/macaron\.slsa_analyzer:macaron.slsa\\_analyzer.analyze\\_context module>`, which contains various
48+
intermediate representations and models. The ``dynamic_data`` property is particularly useful as it contains
49+
data about the CI service, artifact registry, and build tool used for building the software component.
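
As a rough illustration of how a check might read this property, the sketch below uses stand-in classes rather than Macaron's real ones. The key names mirror the fields passed to ``ChecksOutputs`` in ``analyze_context.py``; treating it as a dictionary-like object, and the names ``FakeContext`` and ``uses_ci``, are assumptions made for this example only.

```python
from typing import Any, TypedDict


class ChecksOutputs(TypedDict):
    """Stand-in for Macaron's ChecksOutputs (assumed to be dict-like here)."""

    git_service: Any
    build_spec: dict[str, list[Any]]
    ci_services: list[Any]
    package_registries: list[Any]


class FakeContext:
    """Hypothetical stand-in for AnalyzeContext with a read-only property."""

    def __init__(self) -> None:
        # The data lives in a private attribute and is exposed via the property.
        self._dynamic_data: ChecksOutputs = ChecksOutputs(
            git_service=None,
            build_spec={"tools": []},
            ci_services=[],
            package_registries=[],
        )

    @property
    def dynamic_data(self) -> ChecksOutputs:
        return self._dynamic_data


def uses_ci(ctx: FakeContext) -> bool:
    """Return True if at least one CI service was detected (illustrative only)."""
    return len(ctx.dynamic_data["ci_services"]) > 0
```

Because the property has no setter, a check can read and update the contents of ``dynamic_data`` but cannot rebind the attribute itself.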
50+
51+
``component`` is another useful attribute in the :ref:`AnalyzeContext <pages/developers_guide/apidoc/macaron\.slsa_analyzer:macaron.slsa\\_analyzer.analyze\\_context module>` object
52+
that you should know about. This attribute contains the information about a software component, such
53+
as its corresponding ``repository`` and ``dependencies``. Note that ``component`` will also be stored in the database and its attributes
54+
such as ``repository`` are established as database relationships. You can see the existing tables and their
55+
relationships in our :ref:`data model <pages/developers_guide/apidoc/macaron.database:macaron.database.table\\_definitions module>`.
56+
57+
Once you implement the logic of your check in the ``run_check`` method, you need to add a class to help
58+
Macaron handle your check's output:
59+
60+
* Add a class that subclasses ``CheckFacts`` to map your outputs to a table in the database. The class name should follow the ``<MyCheck>Facts`` pattern.
61+
* Specify the table name in the ``__tablename__ = "_my_check"`` class variable. Note that the table name should start with ``_`` and it should not have been used by other checks.
62+
* Add the ``id`` column as the primary key where the foreign key is ``_check_facts.id``.
63+
* Add columns for the check outputs that you would like to store into the database. If a column needs to appear as a justification in the HTML/JSON report, pass ``info={"justification": JustificationType.<TEXT or HREF>}`` to the column mapper.
64+
* Add the ``__mapper_args__`` class variable and set its ``"polymorphic_identity"`` key to the table name.
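
The naming rules in the list above are easy to get subtly wrong, for example by writing a polymorphic identity that differs from the table name. The following sketch checks them with plain Python; ``validate_facts_class`` is a hypothetical helper written for this guide, not part of Macaron, and the two sample classes are stand-ins that do not use SQLAlchemy.

```python
def validate_facts_class(cls: type) -> list[str]:
    """Return a list of convention violations for a check-facts class."""
    problems: list[str] = []
    # Rule: the class name follows the <MyCheck>Facts pattern.
    if not cls.__name__.endswith("Facts"):
        problems.append("class name should follow the <MyCheck>Facts pattern")
    # Rule: the table name starts with an underscore.
    table = getattr(cls, "__tablename__", None)
    if not isinstance(table, str) or not table.startswith("_"):
        problems.append("__tablename__ should be a string starting with '_'")
    # Rule: the polymorphic identity is set to the table name.
    identity = getattr(cls, "__mapper_args__", {}).get("polymorphic_identity")
    if identity != table:
        problems.append("polymorphic_identity should equal __tablename__")
    return problems


class GoodFacts:
    """Stand-in that follows the conventions."""

    __tablename__ = "_my_check"
    __mapper_args__ = {"polymorphic_identity": "_my_check"}


class BadFacts:
    """Stand-in with a mismatched identity (note the extra underscore)."""

    __tablename__ = "_my_check"
    __mapper_args__ = {"polymorphic_identity": "__my_check"}
```

Calling ``validate_facts_class(GoodFacts)`` yields an empty list, while ``BadFacts`` is reported for its mismatched identity. The helper does not verify that a table name is unique across checks; that would require inspecting every registered facts class.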
65+
66+
Next, you need to create a ``result_tables`` list and append check facts as part of the ``run_check`` implementation.
67+
You should also specify a :ref:`Confidence <pages/developers_guide/apidoc/macaron\.slsa_analyzer\.checks:macaron.slsa\\_analyzer.checks.check\\_result module>`
68+
score by choosing one of the ``Confidence`` enum values, e.g., ``Confidence.HIGH``, and pass it via the keyword
69+
argument ``confidence``. You should choose a suitable confidence score based on the accuracy
70+
of your check analysis.
71+
72+
.. code-block:: python
73+
74+
result_tables.append(MyCheckFacts(col_foo=foo, col_bar=bar, confidence=Confidence.HIGH))
75+
76+
This list as well as the check result status should be stored in a :ref:`CheckResultData <pages/developers_guide/apidoc/macaron\.slsa_analyzer\.checks:macaron.slsa\\_analyzer.checks.check\\_result module>`
77+
object and returned by ``run_check``.
78+
79+
Finally, you need to register your check by adding it to the :ref:`registry module <pages/developers_guide/apidoc/macaron\.slsa_analyzer:macaron.slsa\\_analyzer.registry module>`:
80+
81+
.. code-block:: python
82+
83+
registry.register(MyCheck())
84+
85+
And of course, make sure to add tests for your check by adding a module under ``tests/slsa_analyzer/checks/``.
86+
87+
+++++++
88+
Example
89+
+++++++
90+
91+
In this example, we show how to add a check to determine if a software component has a source-code repository.
92+
Feel free to explore other existing checks under ``src/macaron/slsa_analyzer/checks`` for more examples.
93+
94+
1. First create a module called ``repo_check.py`` under ``src/macaron/slsa_analyzer/checks``.
95+
96+
2. Add a class and specify the columns that you want to store for the check outputs to the database.
97+
98+
.. code-block:: python
99+
100+
# Add this line at the top of the file to create the logger object if you plan to use it.
101+
logger: logging.Logger = logging.getLogger(__name__)
102+
103+
104+
class RepoCheckFacts(CheckFacts):
105+
"""The ORM mapping for justifications in the repository check."""
106+
107+
__tablename__ = "_repo_check"
108+
109+
#: The primary key.
110+
id: Mapped[int] = mapped_column(ForeignKey("_check_facts.id"), primary_key=True)
111+
112+
#: The Git repository path.
113+
git_repo: Mapped[str] = mapped_column(String, nullable=True, info={"justification": JustificationType.HREF})
114+
115+
__mapper_args__ = {
116+
"polymorphic_identity": "_repo_check",
117+
}
118+
119+
3. Add a class for your check, provide the check details in the initializer method, and implement the logic of the check in ``run_check``.
120+
121+
.. code-block:: python
122+
123+
class RepoCheck(BaseCheck):
124+
"""This Check checks whether the target software component has a source-code repository."""
125+
126+
def __init__(self) -> None:
127+
"""Initialize instance."""
128+
check_id = "mcn_repo_exists_1"
129+
description = "Check whether the target software component has a source-code repository."
130+
depends_on: list[tuple[str, CheckResultType]] = [] # This check doesn't depend on any other checks.
131+
eval_reqs = [
132+
ReqName.VCS
133+
] # Choose a SLSA requirement that roughly matches this check from the ReqName enum class.
134+
super().__init__(check_id=check_id, description=description, depends_on=depends_on, eval_reqs=eval_reqs)
135+
136+
def run_check(self, ctx: AnalyzeContext) -> CheckResultData:
137+
"""Implement the check in this method.
138+
139+
Parameters
140+
----------
141+
ctx : AnalyzeContext
142+
The object containing processed data for the target software component.
143+
144+
Returns
145+
-------
146+
CheckResultData
147+
The result of the check.
148+
"""
149+
if not ctx.component.repository:
150+
logger.info("Unable to find a Git repository for %s", ctx.component.purl)
151+
# We do not store any results in the database if a check fails. So, just leave result_tables empty.
152+
return CheckResultData(result_tables=[], result_type=CheckResultType.FAILED)
153+
154+
return CheckResultData(
155+
result_tables=[RepoCheckFacts(git_repo=ctx.component.repository.remote_path, confidence=Confidence.HIGH)],
156+
result_type=CheckResultType.PASSED,
157+
)
158+
159+
4. Register your check.
160+
161+
.. code-block:: python
162+
163+
registry.register(RepoCheck())
164+
165+
166+
Finally, you can add tests for your check by adding the ``tests/slsa_analyzer/checks/test_repo_check.py`` module. Macaron
167+
uses `pytest <https://docs.pytest.org>`_ and `hypothesis <https://hypothesis.readthedocs.io>`_ for testing. Take a look
168+
at other tests for inspiration!
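
A test for the example check might look roughly like the sketch below. To keep it runnable without Macaron installed, it uses simplified stand-ins (``FakeRepository``, ``FakeComponent``, ``FakeContext``, and a ``run_repo_check`` function that mirrors the pass/fail logic of ``RepoCheck.run_check``); a real test would import ``RepoCheck`` and build a proper context object instead.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CheckResultType(Enum):
    """Stand-in for Macaron's CheckResultType (only the two values used here)."""

    PASSED = "PASSED"
    FAILED = "FAILED"


@dataclass
class FakeRepository:
    remote_path: str


@dataclass
class FakeComponent:
    purl: str
    repository: Optional[FakeRepository]


@dataclass
class FakeContext:
    component: FakeComponent


def run_repo_check(ctx: FakeContext) -> CheckResultType:
    """Mirror the pass/fail decision of the example check."""
    if not ctx.component.repository:
        return CheckResultType.FAILED
    return CheckResultType.PASSED


def test_repo_check_with_repo() -> None:
    """The check passes when the component has a repository."""
    repo = FakeRepository("https://github.com/package-url/purl-spec")
    ctx = FakeContext(FakeComponent("pkg:github/package-url/purl-spec", repo))
    assert run_repo_check(ctx) == CheckResultType.PASSED


def test_repo_check_without_repo() -> None:
    """The check fails when the component has no repository."""
    ctx = FakeContext(FakeComponent("pkg:github/package-url/purl-spec", None))
    assert run_repo_check(ctx) == CheckResultType.FAILED
```

Splitting the passing and failing cases into separate test functions keeps each failure message specific to one scenario.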
169+
14170
.. toctree::
15171
:maxdepth: 1
16172

src/macaron/slsa_analyzer/analyze_context.py

Lines changed: 20 additions & 1 deletion
@@ -82,7 +82,8 @@ def __init__(
8282
self.check_results: dict[str, CheckResult] = {}
8383

8484
# Add the data computed at runtime to the dynamic_data attribute.
85-
self.dynamic_data: ChecksOutputs = ChecksOutputs(
85+
# This attribute should be accessed via the `dynamic_data` property.
86+
self._dynamic_data: ChecksOutputs = ChecksOutputs(
8687
git_service=NoneGitService(),
8788
build_spec=BuildSpec(tools=[]),
8889
ci_services=[],
@@ -91,6 +92,24 @@ def __init__(
9192
expectation=None,
9293
)
9394

95+
@property
96+
def dynamic_data(self) -> ChecksOutputs:
97+
"""Return the `dynamic_data` object that contains various intermediate representations.
98+
99+
This object is used to pass various models and intermediate representations from the backend
100+
in Macaron to checks. A check can also store intermediate results in this object to be used
101+
by checks that depend on it. However, please avoid adding arbitrary attributes to this object!
102+
103+
We recommend taking a look at the attributes in this object before writing a new check. Chances
104+
are that what you try to implement is already implemented and the results are available in the
105+
`dynamic_data` object.
106+
107+
Returns
108+
-------
109+
ChecksOutputs
110+
"""
111+
return self._dynamic_data
112+
94113
@property
95114
def provenances(self) -> dict[str, list[InTotoV01Statement | InTotoV1Statement]]:
96115
"""Return the provenances data as a dictionary.
Lines changed: 2 additions & 38 deletions
@@ -1,42 +1,6 @@
11
# Defining Checks
22

3-
The checks defined in this directory are automatically loaded during the startup of Macaron and used during the analysis. This `README.md` shows how a Check can be created.
3+
The checks defined in this directory are automatically loaded during the startup of Macaron and used during the analysis. For detailed instructions on writing a new check, see our [website](https://oracle.github.io/macaron/pages/developers_guide/index.html).
44

5-
## Base Check
6-
The `BaseCheck` class (located at [base_check.py](./base_check.py)) is the abstract class to be inherited by other concrete checks.
7-
Please see [base_check.py](./base_check.py) for the attributes of a `BaseCheck` instance.
85

9-
## Writing a Macaron Check
10-
These are the steps for creating a Check in Macaron:
11-
1. Create a module with the name `<name>_check.py`. Note that Macaron **only** loads check modules that have this name format.
12-
2. Create a class that inherits `BaseCheck` and initiates the attributes of a `BaseCheck` instance.
13-
3. Register the newly created Check class to the Registry ([registry.py](../registry.py)). This will make the Check available to Macaron. For example:
14-
```python
15-
from macaron.slsa_analyzer.registry import registry
16-
17-
# Check class is defined here
18-
# class ExampleCheck(BaseCheck):
19-
# ...
20-
21-
registry.register(ExampleCheck())
22-
```
23-
4. Add an ORM mapped class for the check facts so that the policy engine can reason about the properties. To provide the mapped class, all you need to do is to add a class that inherits from `CheckFacts` class and add the following attributes (rename the `MyCheckFacts` check name and `__tablename__` as appropriate).
24-
25-
```python
26-
class MyCheckFacts(CheckFacts):
27-
"""The ORM mapping for justifications in my check."""
28-
29-
__tablename__ = "_my_check"
30-
31-
#: The primary key.
32-
id: Mapped[int] = mapped_column(ForeignKey("_check_facts.id"), primary_key=True) # noqa: A003
33-
34-
#: The name of the column (property) that becomes available to policy engine.
35-
my_column_name: Mapped[str] = mapped_column(String, nullable=False)
36-
37-
__mapper_args__ = {
38-
"polymorphic_identity": "_my_check",
39-
}
40-
```
41-
42-
For more examples, please see the existing Checks in [checks/](./).
6+
You can also have a look at the existing Checks in [this](./) directory for inspiration.
Lines changed: 25 additions & 46 deletions
@@ -1,62 +1,41 @@
1-
# Copyright (c) 2022 - 2023, Oracle and/or its affiliates. All rights reserved.
1+
# Copyright (c) 2022 - 2024, Oracle and/or its affiliates. All rights reserved.
22
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.
33

44
"""This module contains tests for the VCS check."""
55

66
import os
7+
from pathlib import Path
78

89
from macaron.database.table_definitions import Analysis, Component, Repository
9-
from macaron.slsa_analyzer.analyze_context import AnalyzeContext, ChecksOutputs
1010
from macaron.slsa_analyzer.checks.check_result import CheckResultType
1111
from macaron.slsa_analyzer.checks.vcs_check import VCSCheck
12-
from macaron.slsa_analyzer.git_service.base_git_service import NoneGitService
13-
from macaron.slsa_analyzer.slsa_req import SLSALevels
14-
from macaron.slsa_analyzer.specs.build_spec import BuildSpec
12+
from tests.conftest import MockAnalyzeContext
1513

16-
from ...macaron_testcase import MacaronTestCase
1714
from ..mock_git_utils import initiate_repo
1815

1916
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
2017
REPO_DIR = os.path.join(BASE_DIR, "mock_repos", "vcs_check_repo/sample_repo")
2118

2219

23-
# pylint: disable=super-init-not-called
24-
class MockAnalyzeContext(AnalyzeContext):
25-
"""This class can be initiated without a git obj."""
26-
27-
def __init__(self) -> None:
28-
# Make the VCS Check fails.
29-
self.component = Component(purl="pkg:invalid/invalid", analysis=Analysis(), repository=None)
30-
self.ctx_data: dict = {}
31-
self.slsa_level = SLSALevels.LEVEL0
32-
self.is_full_reach = False
33-
self.dynamic_data: ChecksOutputs = ChecksOutputs(
34-
git_service=NoneGitService(),
35-
build_spec=BuildSpec(tools=[]),
36-
ci_services=[],
37-
is_inferred_prov=True,
38-
expectation=None,
39-
package_registries=[],
40-
)
41-
self.wrapper_path = ""
42-
self.output_dir = ""
43-
44-
45-
class TestVCSCheck(MacaronTestCase):
46-
"""Test the vcs check."""
47-
48-
def test_vcs_check(self) -> None:
49-
"""Test the vcs check."""
50-
check = VCSCheck()
51-
initiate_repo(REPO_DIR)
52-
53-
component = Component(
54-
purl="pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c",
55-
analysis=Analysis(),
56-
repository=Repository(complete_name="github.com/package-url/purl-spec"),
57-
)
58-
use_git_repo = AnalyzeContext(component=component, macaron_path=REPO_DIR, output_dir="")
59-
assert check.run_check(use_git_repo).result_type == CheckResultType.PASSED
60-
61-
no_git_repo = MockAnalyzeContext()
62-
assert check.run_check(no_git_repo).result_type == CheckResultType.FAILED
20+
def test_vcs_check_valid_repo(macaron_path: Path) -> None:
21+
"""Test the vcs check for a valid repo."""
22+
check = VCSCheck()
23+
initiate_repo(REPO_DIR)
24+
use_git_repo = MockAnalyzeContext(macaron_path=macaron_path, output_dir="")
25+
use_git_repo.component = Component(
26+
purl="pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c",
27+
analysis=Analysis(),
28+
repository=Repository(complete_name="github.com/package-url/purl-spec"),
29+
)
30+
assert check.run_check(use_git_repo).result_type == CheckResultType.PASSED
31+
32+
33+
def test_vcs_check_invalid_repo(macaron_path: Path) -> None:
34+
"""Test the vcs check for an invalid repo."""
35+
check = VCSCheck()
36+
initiate_repo(REPO_DIR)
37+
no_git_repo = MockAnalyzeContext(macaron_path=macaron_path, output_dir="")
38+
no_git_repo.component = Component(
39+
purl="pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c", analysis=Analysis(), repository=None
40+
)
41+
assert check.run_check(no_git_repo).result_type == CheckResultType.FAILED
