feat: enable repo finder to support more languages via Open Source Insights #388

Merged on Sep 21, 2023 (31 commits)

Commits
b30ef37
feat: enable repo finder to support more languages via Open Source In…
benmss Jul 28, 2023
39c1279
chore: enabled all supported languages in repo finder
benmss Jul 28, 2023
ede5b58
chore: add configuration option for deps.dev and update docs
benmss Jul 28, 2023
1a171ff
chore: addressed PR feedback
benmss Aug 1, 2023
6ab78f5
chore: add integration test for repo finder
benmss Aug 3, 2023
36f22c5
chore: addressed review feedback
benmss Aug 7, 2023
7e725c2
chore: addressed PR feedback; rebased and refactored
benmss Aug 29, 2023
e56ed6c
chore: updated parser name
benmss Aug 29, 2023
d274dd0
chore: minor fix
benmss Aug 30, 2023
0a7096f
chore: extended docstring of repo finder
benmss Aug 30, 2023
0a8b665
chore: Addressed PR feedback.
benmss Sep 5, 2023
7745019
chore: moved repo finder integration test to new file
benmss Sep 5, 2023
60a5cc7
chore: try to derive the SBOM component type
benmss Sep 5, 2023
64a7472
chore: Add PURL to DependencyInfo; Try to retrieve PURL from SBOM for…
benmss Sep 5, 2023
556514e
chore: renaming of deps.dev files
benmss Sep 5, 2023
3ad9c4a
chore: added integration tests for more languages
benmss Sep 5, 2023
abcd9a0
chore: restored removed test
benmss Sep 5, 2023
011da8b
chore: repo finder interface refactoring
benmss Sep 6, 2023
3ab5664
chore: updated repo finder return values
benmss Sep 6, 2023
83be471
chore: correctly added repo finder integration tests; fixed duplicate…
benmss Sep 6, 2023
8eae174
chore: removed repo finder test in docker integration tests
benmss Sep 6, 2023
8bce886
chore: moved URL validation to within Repo Finders
benmss Sep 7, 2023
4c8cba6
chore: moved url validator to repo finder
benmss Sep 8, 2023
f0ee636
chore: added repo validator
benmss Sep 8, 2023
50137f7
chore: updated docs
benmss Sep 8, 2023
9d4f657
chore: rebase and integrate with config purl change
benmss Sep 14, 2023
b369fe6
chore: addressed review feedback
benmss Sep 18, 2023
11ab3aa
chore: addressed review feedback
benmss Sep 21, 2023
5b600bf
chore: rebase and make use of updated send_get_http_raw
benmss Sep 21, 2023
bca62d5
chore: enable repo finder for sboms
benmss Sep 21, 2023
056b3f3
chore: updated docs
benmss Sep 21, 2023
Expand Up @@ -40,11 +40,3 @@ macaron.dependency\_analyzer.dependency\_resolver module
:members:
:undoc-members:
:show-inheritance:

macaron.dependency\_analyzer.java\_repo\_finder module
------------------------------------------------------

.. automodule:: macaron.dependency_analyzer.java_repo_finder
:members:
:undoc-members:
:show-inheritance:
@@ -0,0 +1,50 @@
macaron.repo\_finder package
============================

.. automodule:: macaron.repo_finder
:members:
:undoc-members:
:show-inheritance:

Submodules
----------

macaron.repo\_finder.repo\_finder module
----------------------------------------

.. automodule:: macaron.repo_finder.repo_finder
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_finder\_base module
----------------------------------------------

.. automodule:: macaron.repo_finder.repo_finder_base
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_finder\_deps\_dev module
---------------------------------------------------

.. automodule:: macaron.repo_finder.repo_finder_deps_dev
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_finder\_java module
----------------------------------------------

.. automodule:: macaron.repo_finder.repo_finder_java
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_validator module
-------------------------------------------

.. automodule:: macaron.repo_finder.repo_validator
:members:
:undoc-members:
:show-inheritance:
1 change: 1 addition & 0 deletions docs/source/pages/developers_guide/apidoc/macaron.rst
Expand Up @@ -19,6 +19,7 @@ Subpackages
macaron.output_reporter
macaron.parsers
macaron.policy_engine
macaron.repo_finder
macaron.slsa_analyzer

Submodules
Expand Up @@ -17,6 +17,14 @@ macaron.slsa\_analyzer.build\_tool.base\_build\_tool module
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.build\_tool.docker module
------------------------------------------------

.. automodule:: macaron.slsa_analyzer.build_tool.docker
:members:
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.build\_tool.gradle module
------------------------------------------------

Expand Down
57 changes: 50 additions & 7 deletions docs/source/pages/using.rst
Expand Up @@ -104,7 +104,7 @@ To simplify the examples, we use the same configurations as above if needed (e.g
The list below shows examples of the corresponding PURL strings for different git repositories:

.. list-table:: Example of PURL strings for git repositories.
.. list-table:: Examples of PURL strings for git repositories.
:widths: 50 50
:header-rows: 1

Expand Down Expand Up @@ -133,6 +133,39 @@ You can also provide the PURL string together with the repository path. In this
.. note:: When providing the PURL and the repository path, both the branch name and commit digest must be provided as well.

''''''''''''''''''''''''''''''''''''''
Providing an artifact as a PURL string
''''''''''''''''''''''''''''''''''''''

The PURL format supports artifacts as well as repositories, and Macaron supports (some of) these too.

.. code-block::

    pkg:<package_type>/<artifact_details>

Where ``artifact_details`` varies based on the provided ``package_type``. Examples for those currently supported by Macaron are as follows:

.. list-table:: Examples of PURL strings for artifacts.
:widths: 50 50
:header-rows: 1

* - Package Type
- PURL String
* - Maven (Java)
- ``pkg:maven/org.apache.xmlgraphics/[email protected]``
* - PyPi (Python)
- ``pkg:pypi/[email protected]``
* - Cargo (Rust)
- ``pkg:cargo/[email protected]``
* - NuGet (.Net)
- ``pkg:nuget/[email protected]``
* - NPM (NodeJS)
- ``pkg:npm/%40angular/[email protected]``

For more detailed information on converting a given artifact into a PURL, see `PURL Specification <https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst>`_ and `PURL Types <https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst>`_

.. note:: If a repository is not also provided, Macaron will try to discover it based on the artifact purl. For this to work, ``find_repos`` in the configuration file **must be enabled**\. See `Analyzing more dependencies <#more-deps>`_ for more information about the configuration options of the Repository Finding feature.
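The ``pkg:<package_type>/<artifact_details>`` layout described above can be illustrated with a minimal sketch. It is not Macaron's implementation (the diff in this PR relies on the ``packageurl`` module for that); ``split_purl`` is a hypothetical helper that ignores qualifiers and subpaths, and the PURLs used with it are made-up examples.

```python
from urllib.parse import unquote


def split_purl(purl: str) -> dict:
    """Split a simple PURL into type, namespace, name and version.

    Illustrative only: qualifiers (?...) and subpaths (#...) are dropped,
    and no type-specific normalisation is applied.
    """
    scheme, _, remainder = purl.partition(":")
    if scheme != "pkg" or not remainder:
        raise ValueError(f"Not a PURL: {purl!r}")

    # Drop the optional subpath and qualifiers, which this sketch ignores.
    remainder = remainder.split("#", 1)[0].split("?", 1)[0]
    package_type, _, path = remainder.strip("/").partition("/")

    # A literal '@' separates the version; an '@' inside a name is encoded as %40.
    version = ""
    if "@" in path:
        path, _, version = path.rpartition("@")

    segments = [unquote(segment) for segment in path.split("/") if segment]
    if not package_type or not segments:
        raise ValueError(f"PURL is missing a type or a name: {purl!r}")

    return {
        "type": package_type.lower(),
        "namespace": "/".join(segments[:-1]) or None,
        "name": segments[-1],
        "version": unquote(version) or None,
    }


if __name__ == "__main__":
    # Hypothetical artifacts, used purely for illustration.
    print(split_purl("pkg:npm/%40example/widget@1.2.3"))
    print(split_purl("pkg:maven/com.example/demo-lib@1.0.0"))
```

For real use, a full PURL parser such as the ``packageurl`` module is the safer choice, since the specification defines per-type rules that this sketch does not cover.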

-------------------------------------------------
Verifying provenance expectations in CUE language
-------------------------------------------------
Expand Down Expand Up @@ -191,6 +224,8 @@ With the example above, the generated output reports can be seen here:
- `micronaut-core.html <../_static/examples/micronaut-projects/micronaut-core/analyze_with_sbom/micronaut-core.html>`__
- `micronaut-core.json <../_static/examples/micronaut-projects/micronaut-core/analyze_with_sbom/micronaut-core.json>`__

.. _more-deps:

'''''''''''''''''''''''''''
Analyzing more dependencies
'''''''''''''''''''''''''''
Expand All @@ -203,30 +238,38 @@ This feature is enabled by default. To disable, or configure its behaviour in ot

See :ref:`dump-defaults <action_dump_defaults>`, the CLI command to dump the default configurations in ``defaults.ini``. After making changes, see :ref:`analyze <analyze-action-cli>` CLI command for the option to pass the modified ``defaults.ini`` file.

Within the configuration file under the ``repofinder.java`` header, five options exist: ``find_repos``, ``artifact_repositories``, ``repo_pom_paths``, ``find_parents``, ``artifact_ignore_list``. These options behave as follows:
Within the configuration file under the ``repofinder.java`` header, three options exist: ``artifact_repositories``, ``repo_pom_paths``, ``find_parents``. These options behave as follows:

- ``find_repos`` (Values: True or False) - Enables or disables the Repository Finding feature.
- ``artifact_repositories`` (Values: List of URLs) - Determines the remote artifact repositories to attempt to retrieve dependency information from.
- ``repo_pom_paths`` (Values: List of POM tags) - Determines where to search for repository information in the POM files. E.g. scm.url.
- ``find_parents`` (Values: True or False) - When enabled, the Repository Finding feature will also search for repository URLs in parents POM files of the current dependency.
- ``artifact_ignore_list`` (Values: List of GAs) - The Repository Finding feature will skip any artifact in this list. Format is "GroupId":"ArtifactId". E.g. org.apache.maven:maven

Under the related header ``repofinder``, two more options exist: ``find_repos``, and ``use_open_source_insights``:

- ``find_repos`` (Values: True or False) - Enables or disables the Repository Finding feature.
- ``use_open_source_insights`` (Values: True or False) - Enables or disables use of Google's Open Source Insights API.

.. note:: Finding repositories requires at least one remote call, adding some additional overhead to an analysis run.

.. note:: Google's Open Source Insights API is currently used to find repositories for: Python, Rust, .Net, NodeJS

An example configuration file for utilising this feature:

.. code-block:: ini
[repofinder.java]
[repofinder]
find_repos = True
use_open_source_insights = True
[repofinder.java]
artifact_repositories = https://repo.maven.apache.org/maven2
repo_pom_paths =
scm.url
scm.connection
scm.developerConnection
find_parents = True
artifact_ignore_list =
org.apache.maven:maven
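As a rough illustration of how the ``repofinder`` and ``repofinder.java`` options could be read, the sketch below parses the new-style configuration with Python's standard ``configparser``. This is an assumption-laden example rather than Macaron's own loading code: ``load_repo_finder_options`` and ``EXAMPLE_CONFIG`` are hypothetical names, and the fallback values are chosen only for the sketch.

```python
import configparser

# The post-change example configuration from the documentation above.
EXAMPLE_CONFIG = """
[repofinder]
find_repos = True
use_open_source_insights = True

[repofinder.java]
artifact_repositories = https://repo.maven.apache.org/maven2
repo_pom_paths =
    scm.url
    scm.connection
    scm.developerConnection
find_parents = True
"""


def load_repo_finder_options(text: str) -> dict:
    """Read the repo finder options from an ini-formatted string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["repofinder"]
    java = parser["repofinder.java"]

    def as_list(raw: str) -> list:
        # Multi-line ini values arrive as one string; split and drop empty lines.
        return [line.strip() for line in raw.splitlines() if line.strip()]

    return {
        "find_repos": general.getboolean("find_repos", fallback=True),
        "use_open_source_insights": general.getboolean("use_open_source_insights", fallback=True),
        "artifact_repositories": as_list(java.get("artifact_repositories", fallback="")),
        "repo_pom_paths": as_list(java.get("repo_pom_paths", fallback="")),
        "find_parents": java.getboolean("find_parents", fallback=True),
    }


if __name__ == "__main__":
    for option, value in load_repo_finder_options(EXAMPLE_CONFIG).items():
        print(f"{option} = {value}")
```

Note that ``find_repos`` now lives under the general ``repofinder`` header, so it applies to every supported language rather than to Java alone.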
-------------------------------------
Analyzing a locally cloned repository
Expand Down
13 changes: 13 additions & 0 deletions scripts/dev_scripts/integration_tests.sh
Expand Up @@ -9,6 +9,7 @@ HOMEDIR=$2
RESOURCES=$WORKSPACE/src/macaron/resources
COMPARE_DEPS=$WORKSPACE/tests/dependency_analyzer/compare_dependencies.py
COMPARE_JSON_OUT=$WORKSPACE/tests/e2e/compare_e2e_result.py
TEST_REPO_FINDER=$WORKSPACE/tests/e2e/repo_finder/repo_finder.py
RUN_MACARON="python -m macaron -o $WORKSPACE/output"
RESULT_CODE=0

Expand Down Expand Up @@ -532,3 +533,15 @@ then
echo -e "Expected zero status code but got $RESULT_CODE."
exit 1
fi

# Testing the Repo Finder's remote calls.
# This requires the 'packageurl' Python module
echo -e "\n----------------------------------------------------------------------------------"
echo "Testing Repo Finder functionality."
echo -e "----------------------------------------------------------------------------------\n"
python $TEST_REPO_FINDER || log_fail
if [ $? -ne 0 ];
then
echo -e "Expect zero status code but got $?."
log_fail
fi
6 changes: 3 additions & 3 deletions src/macaron/__main__.py
Expand Up @@ -31,9 +31,9 @@ def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None
# We don't mention --config-path as a possible option in this log message as it is going to be moved soon.
# See: https://github.com/oracle/macaron/issues/417
logger.error(
"Analysis target missing. Please provide a package url (PURL) and/or repo path. "
+ "Examples of a PURL can be seen at https://github.com/package-url/purl-spec: "
+ "pkg:github/micronaut-projects/micronaut-core."
"""Analysis target missing. Please provide a package url (PURL) and/or repo path.
Examples of a PURL can be seen at https://github.com/package-url/purl-spec:
pkg:github/micronaut-projects/micronaut-core."""
)
sys.exit(os.EX_USAGE)

Expand Down
8 changes: 4 additions & 4 deletions src/macaron/config/defaults.ini
Expand Up @@ -44,19 +44,19 @@ timeout = 2400
recursive = False

# This is the repo finder script.
[repofinder]
find_repos = True
use_open_source_insights = True

[repofinder.java]
# The list of maven-like repositories to attempt to retrieve artifact POMs from.
artifact_repositories = https://repo.maven.apache.org/maven2
find_repos = True
repo_pom_paths =
scm.url
scm.connection
scm.developerConnection
find_parents = True
parent_limit = 10
# Disables repo finding for specific artifacts based on their group and artifact IDs. Format: {groupId}:{artifactId}
# E.g. com.oracle.coherence.ce:coherence
artifact_ignore_list =

# Git services that Macaron has access to clone repositories.
# For security purposes, Macaron will only clone repositories from the hostnames specified.
Expand Down
1 change: 0 additions & 1 deletion src/macaron/config/global_config.py
Expand Up @@ -21,7 +21,6 @@ class GlobalConfig:
gh_token: str = ""
debug_level: int = logging.DEBUG
resources_path: str = ""
find_repos: bool = True

def load(
self,
Expand Down
34 changes: 24 additions & 10 deletions src/macaron/dependency_analyzer/cyclonedx.py
Expand Up @@ -9,11 +9,14 @@
from collections.abc import Iterable
from pathlib import Path

from packageurl import PackageURL

from macaron.config.defaults import defaults
from macaron.config.global_config import global_config
from macaron.dependency_analyzer.dependency_resolver import DependencyAnalyzer, DependencyInfo
from macaron.errors import MacaronError
from macaron.output_reporter.scm import SCMStatus
from macaron.repo_finder.repo_validator import find_valid_repository_url

logger: logging.Logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -160,21 +163,32 @@ def convert_components_to_artifacts(
Returns
-------
dict
A dictionary where dependency artifacts are grouped based on "artifactId:groupId".
A dictionary where dependency artifacts are grouped based on "groupId:artifactId".
"""
all_versions: dict[str, list[DependencyInfo]] = {} # Stores all the versions of dependencies for debugging.
latest_deps: dict[str, DependencyInfo] = {} # Stores the latest version of dependencies.
url_to_artifact: dict[str, set] = {} # Used to detect artifacts that have similar repos.
for component in components:
try:
# TODO make this function language agnostic when CycloneDX SBOM processing also is.
# See https://github.com/oracle/macaron/issues/464
key = f"{component.get('group')}:{component.get('name')}"
if component.get("purl"):
purl = PackageURL.from_string(str(component.get("purl")))
else:
# TODO remove maven assumption when optional non-existence of the component's purl is handled
# See https://github.com/oracle/macaron/issues/464
purl = PackageURL(
type="maven",
namespace=component.get("group"),
name=component.get("name"),
version=component.get("version") or None,
)

# According to PEP-0589 all keys must be present in a TypedDict.
# See https://peps.python.org/pep-0589/#totality
item = DependencyInfo(
version=component.get("version") or "",
group=component.get("group") or "",
name=component.get("name") or "",
purl=component.get("purl") or "",
purl=purl,
url="",
note="",
available=SCMStatus.AVAILABLE,
Expand All @@ -187,10 +201,10 @@ def convert_components_to_artifacts(
# In case of a build error, we use this as a heuristic to avoid analyzing
# submodules that produce development artifacts in the same repo.
if (
"snapshot"
in (item.get("version") or "").lower() # or "" is not necessary but mypy produces a FP otherwise.
"snapshot" in (purl.version or "").lower()
# or "" is not necessary but mypy produces a FP otherwise.
and root_component
and item.get("group") == root_component.get("group")
and purl.namespace == root_component.get("group")
):
continue
logger.debug(
Expand All @@ -199,7 +213,7 @@ def convert_components_to_artifacts(
)
else:
# Find a valid URL.
item["url"] = DependencyAnalyzer.find_valid_url(
item["url"] = find_valid_repository_url(
link.get("url") for link in component.get("externalReferences") # type: ignore
)

Expand Down Expand Up @@ -228,7 +242,7 @@ def get_deps_from_sbom(sbom_path: str | Path) -> dict[str, DependencyInfo]:
Returns
-------
A dictionary where dependency artifacts are grouped based on "artifactId:groupId".
A dictionary where dependency artifacts are grouped based on "groupId:artifactId".
"""
return convert_components_to_artifacts(
get_dep_components(
Expand Down
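The PURL fallback in ``convert_components_to_artifacts`` (use the component's own ``purl`` when present, otherwise assume Maven) can be sketched without third-party dependencies as follows. This is a simplified stand-in, not the code from the PR, which builds a ``PackageURL`` object: ``component_to_purl`` is a hypothetical helper that returns a plain string, and the components passed to it are invented.

```python
from urllib.parse import quote


def component_to_purl(component: dict) -> str:
    """Return the component's PURL, assuming Maven when none is recorded."""
    existing = component.get("purl")
    if existing:
        # Prefer the PURL recorded in the SBOM.
        return str(existing)

    name = component.get("name")
    if not name:
        raise ValueError("Cannot build a PURL for a component without a name.")

    # Fallback: treat the component as a Maven artifact (group -> namespace).
    parts = ["pkg:maven"]
    group = component.get("group")
    if group:
        parts.append(quote(group, safe=""))
    parts.append(quote(name, safe=""))
    purl = "/".join(parts)

    version = component.get("version")
    if version:
        purl += "@" + quote(version, safe="")
    return purl


if __name__ == "__main__":
    # Hypothetical SBOM components, used purely for illustration.
    print(component_to_purl({"group": "com.example", "name": "demo", "version": "1.0.0"}))
    print(component_to_purl({"purl": "pkg:pypi/example-package@2.0", "name": "example-package"}))
```

As the TODO in the diff notes, the Maven assumption is a stopgap until components without a ``purl`` are handled in a language-agnostic way.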