feat!: introduce confidence scores for check facts (#620)
This PR changes the data model and allows specifying confidence scores for check results, which is especially useful when a check reports multiple candidate results. All of these confidence scores are added to the check tables in the database. However, only the fact with the highest confidence is shown in the HTML/JSON report.
The justifications no longer need to be added manually to the CheckResultData. Instead, they are curated directly from the results in the table. If a column specifies a JustificationType in its column mapping, it is picked up automatically and rendered as plain text or as an href, depending on the specified type. If a check fails or is skipped, we show a default "Not Available." justification. This makes it possible to create HTML/JSON reports from the database reproducibly.
Signed-off-by: behnazh-w <[email protected]>
docs/source/pages/developers_guide/index.rst: 208 additions & 1 deletion
@@ -1,4 +1,4 @@
-.. Copyright (c) 2023 - 2023, Oracle and/or its affiliates. All rights reserved.
+.. Copyright (c) 2023 - 2024, Oracle and/or its affiliates. All rights reserved.
.. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

=========================
@@ -11,6 +11,213 @@ To follow the project's code style, see the :doc:`Macaron Style Guide </pages/de

For API reference, see the :doc:`API Reference </pages/developers_guide/apidoc/index>` page.

-------------------
Writing a New Check
-------------------

Contributors to Macaron are very likely to need to write a new check or modify an existing one at some point. In this
section, we will explain how Macaron checks work. We will also show how to develop a new check.

+++++++++++++++++
High-level Design
+++++++++++++++++

Before jumping into coding, it is useful to understand how Macaron works as a framework. Macaron is an extensible
framework designed to make writing new supply chain security analyses easy. It provides an interface
that you can leverage to access existing models and abstractions instead of implementing everything from scratch. For
instance, many security checks require traversing the code in GitHub Actions configurations. Normally,
you would need to find the right repository and commit, clone it, find the workflows, and parse them. With Macaron,
you don't need to do any of that and can simply write your security check by using the parsed shell scripts that are
triggered in the CI.

Another important aspect of our design is that all the check results are automatically mapped and stored in a local database.
By performing this mapping, we make it possible to enforce use case-specific policies on the results of the checks. While storing
the check results in the database happens automatically in Macaron's backend, the developer needs to add a brief specification
to make that possible, as we will see later.

Once you are familiar with writing a basic check, you can explore the check dependency feature in Macaron. The checks
in our framework can be customized to only run if another check has run and returned a specific
:class:`result type <macaron.slsa_analyzer.checks.check_result.CheckResultType>`. This feature can be used when checks
have an ordering and a parent-child relationship, i.e., one check implements a weaker or stronger version of a
security property in a parent check. In such cases, it might make sense to skip running the child check and report a
:class:`result type <macaron.slsa_analyzer.checks.check_result.CheckResultType>` based on the result of the parent check.

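
The dependency behaviour described above can be sketched roughly as follows. This is a minimal illustration and not Macaron's implementation: the ``CheckResultType`` enum is a stand-in with assumed member names, and ``run_with_dependency``, ``required`` and ``result_on_skip`` are hypothetical names introduced only for this sketch.

```python
from enum import Enum
from typing import Callable


class CheckResultType(Enum):
    """Stand-in for Macaron's result type enum (member names are assumptions)."""

    PASSED = "PASSED"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"


def run_with_dependency(
    parent_result: CheckResultType,
    required: CheckResultType,
    result_on_skip: CheckResultType,
    run_child: Callable[[], CheckResultType],
) -> CheckResultType:
    """Run the child check only if the parent returned the required result type."""
    if parent_result is not required:
        # The child is not run; it reports a result derived from the parent.
        return result_on_skip
    return run_child()


def child_analysis() -> CheckResultType:
    """Hypothetical child check that would fail if it were run."""
    return CheckResultType.FAILED


# The child implements a weaker property: if the parent passed, the child is
# skipped and reported as passed; it only runs when the parent failed.
skipped = run_with_dependency(
    CheckResultType.PASSED, CheckResultType.FAILED, CheckResultType.PASSED, child_analysis
)
executed = run_with_dependency(
    CheckResultType.FAILED, CheckResultType.FAILED, CheckResultType.PASSED, child_analysis
)
```

In this sketch, ``skipped`` holds the preset result because the child never ran, while ``executed`` holds whatever the child analysis returned.
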

+++++++++++++++++++
The Check Interface
+++++++++++++++++++

Each check needs to be implemented as a Python class in a Python module under ``src/macaron/slsa_analyzer/checks``.
A check class should subclass the :class:`BaseCheck <macaron.slsa_analyzer.checks.base_check.BaseCheck>` class.

The main logic of a check should be implemented in the :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` abstract method. It is important to understand the input
parameters and output objects computed by this method.
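
To make the shape of this interface concrete, here is a minimal, self-contained sketch. It does not import Macaron: ``BaseCheck``, ``CheckResultData`` and ``CheckResultType`` below are simplified stand-ins whose constructor arguments are assumptions, and ``AlwaysPassCheck`` and its check ID are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class CheckResultType(Enum):
    """Stand-in result type enum (member names are assumptions)."""

    PASSED = "PASSED"
    FAILED = "FAILED"


@dataclass
class CheckResultData:
    """Stand-in for the object returned by run_check."""

    result_tables: list = field(default_factory=list)
    result_type: CheckResultType = CheckResultType.FAILED


class BaseCheck(ABC):
    """Stand-in base class; the real constructor may take other arguments."""

    def __init__(self, check_id: str, description: str) -> None:
        self.check_id = check_id
        self.description = description

    @abstractmethod
    def run_check(self, ctx: Any) -> CheckResultData:
        """Callback invoked by the framework with the pre-computed context."""


class AlwaysPassCheck(BaseCheck):
    """Hypothetical check that ignores the context and always passes."""

    def __init__(self) -> None:
        super().__init__(check_id="example_always_pass_1", description="Always passes.")

    def run_check(self, ctx: Any) -> CheckResultData:
        return CheckResultData(result_tables=[], result_type=CheckResultType.PASSED)


# The framework, not the check author, would normally supply the context.
result = AlwaysPassCheck().run_check(ctx=None)
```

The point of the sketch is only the division of labour: the framework builds the context and calls ``run_check``, and the check returns its facts and a result type.
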
The :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` method is a callback called by our checker framework. The framework pre-computes a context object,
:class:`ctx: AnalyzeContext <macaron.slsa_analyzer.analyze_context.AnalyzeContext>`, and makes it available as the input
parameter to the function. The ``ctx`` object contains various intermediate representations and models.
Most likely, you will need to use the following properties:

Note that :attr:`component <macaron.slsa_analyzer.analyze_context.AnalyzeContext.component>` will also be stored
in the database and its attributes, such as :attr:`repository <macaron.database.table_definitions.Component.repository>`,
are established as database relationships. You can see the existing tables and their relationships
in our :mod:`data model <macaron.database.table_definitions>`.

The :attr:`dynamic_data <macaron.slsa_analyzer.analyze_context.AnalyzeContext.dynamic_data>` property is particularly useful as it contains
data about the CI service, artifact registry, and build tool used for building the software component.
Note that this object is a shared state among checks. If a check runs before another check, it can
make changes to this object, which will be accessible to the checks that run subsequently.

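
The shared-state behaviour can be illustrated with a small sketch. ``FakeContext`` is a hypothetical stand-in for the real context object, and the ``"notes"`` key is invented for this example; the actual keys in ``dynamic_data`` are not shown here.

```python
class FakeContext:
    """Hypothetical stand-in for AnalyzeContext, holding only shared state."""

    def __init__(self) -> None:
        # One dictionary instance is shared by every check that receives this context.
        self.dynamic_data: dict = {"notes": []}


def first_check(ctx: FakeContext) -> None:
    """A check that runs earlier and records something in the shared state."""
    ctx.dynamic_data["notes"].append("first_check found a workflow")


def second_check(ctx: FakeContext) -> list:
    """A check that runs later and reads what earlier checks recorded."""
    return list(ctx.dynamic_data["notes"])


ctx = FakeContext()
first_check(ctx)
seen = second_check(ctx)
```

Because both checks receive the same object, the order in which they run determines what the second check observes, which is worth keeping in mind when a check mutates this state.
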
''''''
Output
''''''

The :func:`run_check <macaron.slsa_analyzer.checks.base_check.BaseCheck.run_check>` method returns a :class:`CheckResultData <macaron.slsa_analyzer.checks.check_result.CheckResultData>` object.
This object consists of :attr:`result_tables <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_tables>` and
:attr:`result_type <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_type>`.

The :attr:`result_tables <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_tables>` object is the list of facts generated from the check. The :attr:`result_type <macaron.slsa_analyzer.checks.check_result.CheckResultData.result_type>`
value shows the final result type of the check.

+++++++
Example
+++++++

In this example, we show how to add a check to determine if a software component has a source-code repository.
Note that this is a simple example that only demonstrates how to add a check from scratch.
Feel free to explore other existing checks under ``src/macaron/slsa_analyzer/checks`` for more examples.

As discussed earlier, each check needs to be implemented as a Python class in a Python module under ``src/macaron/slsa_analyzer/checks``.
A check class should subclass the :class:`BaseCheck <macaron.slsa_analyzer.checks.base_check.BaseCheck>` class.

'''''''''''''''
Create a module
'''''''''''''''

First, create a module called ``repo_check.py`` under ``src/macaron/slsa_analyzer/checks``.


''''''''''''''''''''''''''''
Add a class for the database
''''''''''''''''''''''''''''

* Add a class that subclasses :class:`CheckFacts <macaron.database.table_definitions.CheckFacts>` to map your outputs to a table in the database. The class name should follow the ``<MyCheck>Facts`` pattern.
* Specify the table name in the ``__tablename__`` class variable. Note that the table name should start with ``_`` and must not already be used by another check.
* Add the ``id`` column as the primary key where the foreign key is ``_check_facts.id``.
* Add columns for the check outputs that you would like to store in the database. If a column needs to appear as a justification in the HTML/JSON report, pass ``info={"justification": JustificationType.<TEXT or HREF>}`` to the column mapper.
* Add the ``__mapper_args__`` class variable and set its ``"polymorphic_identity"`` key to the table name.

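
The idea of tagging a column so that it is picked up as a justification can be sketched without a database. The example below is an analogy only: it uses dataclass field ``metadata`` in place of the column mapper's ``info`` argument, ``JustificationType`` is a stand-in enum, and ``RepoFacts``, its fields, and ``curate_justification`` (including the way hrefs are represented as dictionaries) are hypothetical and are not Macaron's implementation.

```python
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import Optional


class JustificationType(Enum):
    """Stand-in for the justification type enum."""

    TEXT = "text"
    HREF = "href"


@dataclass
class RepoFacts:
    """Hypothetical facts record; `metadata` plays the role of `info` on a column."""

    git_repo: str = field(metadata={"justification": JustificationType.HREF})
    note: str = field(metadata={"justification": JustificationType.TEXT})
    internal_id: int = 0  # Not tagged, so it never appears in the justification.


def curate_justification(facts: Optional[RepoFacts]) -> list:
    """Build the justification from tagged fields instead of writing it by hand."""
    if facts is None:
        # No facts, e.g. the check failed or was skipped: use the default text.
        return ["Not Available."]
    rendered: list = []
    for column in fields(facts):
        kind = column.metadata.get("justification")
        value = getattr(facts, column.name)
        if kind is JustificationType.TEXT:
            rendered.append(f"{column.name}: {value}")
        elif kind is JustificationType.HREF:
            # Assumption: a link is represented as a {label: url} mapping.
            rendered.append({column.name: value})
    return rendered


facts = RepoFacts(git_repo="https://github.com/oracle/macaron", note="Repository found")
justification = curate_justification(facts)
```

Since the justification is derived from the stored values and their tags, the same report content can be regenerated later from the stored record alone.
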
.. code-block:: python

    # Add this line at the top of the file to create the logger object if you plan to use it.
As you can see, the result of the check is returned via the :class:`CheckResultData <macaron.slsa_analyzer.checks.check_result.CheckResultData>` object.
You should specify a confidence score by choosing one of the :class:`Confidence <macaron.slsa_analyzer.checks.check_result.Confidence>` enum values,
e.g., :class:`Confidence.HIGH <macaron.slsa_analyzer.checks.check_result.Confidence.HIGH>`, and pass it via the keyword
argument :attr:`confidence <macaron.database.table_definitions.CheckFacts.confidence>`. You should choose a suitable
confidence score based on the accuracy of your check analysis.

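
When a check produces several candidate facts, each with its own confidence, the one with the highest confidence is the one shown in the report. The sketch below illustrates that selection under stated assumptions: the ``Confidence`` enum is a stand-in whose numeric values are chosen only for illustration, and ``CandidateFact`` and ``fact_for_report`` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(float, Enum):
    """Stand-in confidence enum; the numeric values are assumptions."""

    HIGH = 1.0
    MEDIUM = 0.7
    LOW = 0.4


@dataclass
class CandidateFact:
    """Hypothetical fact produced by a check, tagged with a confidence score."""

    source: str
    confidence: Confidence


def fact_for_report(facts: list) -> CandidateFact:
    """Pick the fact with the highest confidence for display in the report."""
    return max(facts, key=lambda fact: fact.confidence.value)


candidates = [
    CandidateFact(source="heuristic match", confidence=Confidence.LOW),
    CandidateFact(source="provenance document", confidence=Confidence.HIGH),
    CandidateFact(source="package metadata", confidence=Confidence.MEDIUM),
]
reported = fact_for_report(candidates)
```

All candidates would still be stored; only the selection for the report depends on the score, which is why an honest confidence value matters.
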

'''''''''''''''''''
Register your check
'''''''''''''''''''

Next, register your check by adding it to the :mod:`registry module <macaron.slsa_analyzer.registry>` at the end of your check module:

.. code-block:: python

    registry.register(RepoCheck())


'''''''''''''''
Test your check
'''''''''''''''

Finally, you can add tests for your check by adding the ``tests/slsa_analyzer/checks/test_repo_check.py`` module. Macaron
uses `pytest <https://docs.pytest.org>`_ and `hypothesis <https://hypothesis.readthedocs.io>`_ for testing. Take a look