Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarifying questions re: cosign vulnerability attestation spec #1381

Closed
luhring opened this issue Jan 31, 2022 · 9 comments
Closed

Clarifying questions re: cosign vulnerability attestation spec #1381

luhring opened this issue Jan 31, 2022 · 9 comments
Labels
no-issue-activity question Further information is requested

Comments

@luhring
Copy link
Contributor

luhring commented Jan 31, 2022

Hi! I saw @developer-guy open anchore/grype#614. I think vulnerability scan attestations are a great idea, and I've been catching up on what exists already via these places so far:

Before starting to apply this spec in a real implementation, I have a few questions that would help me better understanding how to use it.

(ALSO: I should apologize for just now asking these questions and not weighing in earlier in the issue and PR discussions. Sorry about that! 🙇 )

  1. The invocation object is shown in the spec but its fields aren't documented below like the other parts of the spec. I don't understand the role that invocation is intended to play in vuln scanning workflows. I believe @dlorenc suggested that we drop this object. Can it be removed from the spec? And if not, could we document the intent of this data in the spec?

  2. I was following the conversation here, and it looks like there's an intention to use the statement's predicateType field to specify not only this Cosign spec but also the type of the "wrapped" data that goes into the result object. Is that true? If so, could the spec example be updated to show how this relationship should work? (i.e. can we include a bit more of the in-toto statement than just the predicate itself?)

  3. It looks like this specification features Trivy consistently throughout the document. (And it looks like the example attestation isn't using a wrapped standard, it's using Trivy's own JSON format.) Is this intentional?

  4. Is result the right name for a field that wraps an entire scan output from another spec? I'm thinking this term might be overloaded. Those specs tend to include data that's a superset of "results", and then result tends to be an actual field within the specification (i.e. to specify the vulnerability matches).

  5. The data contained in scanner seems to always be included in other specifications (e.g. see SARIF, CycloneDX). Is this redundant?

  6. The metadata section shows the start and finish times for the scan. Two thoughts on this: a) this tends to be provided in the vulnerability standard spec (e.g. see SARIF, CycloneDX), and b) (which may be moot), usually the time of the scan is relatively less important than other time-based factors. What matters most in a vulnerability scan are the inputs that are required to reproduce the same results: the vulnerability dataset, the list of discovered software (ideally an SBOM), and the scan tool itself. It can be misleading to promote "scan time" (especially when coupled with a "finish time") to consumers making a policy decision, because a later scan run doesn't necessarily imply that the scan was done in light of more recent vulnerability data, as some would assume.

Thanks for the help!

@luhring luhring added the question Further information is requested label Jan 31, 2022
@luhring luhring changed the title Clarying questions re: cosign vulnerability attestation spec Clarifying questions re: cosign vulnerability attestation spec Jan 31, 2022
@dlorenc
Copy link
Member

dlorenc commented Feb 2, 2022

  • The invocation object is shown in the spec but its fields aren't documented below like the other parts of the spec. I don't understand the role that invocation is intended to play in vuln scanning workflows. I believe @dlorenc suggested that we drop this object. Can it be removed from the spec? And if not, could we document the intent of this data in the spec?

+1 to removing this one!

  • It looks like this specification features Trivy consistently throughout the document. (And it looks like the example attestation isn't using a wrapped standard, it's using Trivy's own JSON format.) Is this intentional?

I think the idea was to allow embedding anything here. The only real "standard" we saw around this was SARIF, which isn't widely supported enough for me to feel comfortable requiring.

  • Is result the right name for a field that wraps an entire scan output from another spec? I'm thinking this term might be overloaded. Those specs tend to include data that's a superset of "results", and then result tends to be an actual field within the specification (i.e. to specify the vulnerability matches).

Naming is hard! Any other suggestions? I'm not picky :)

  • The data contained in scanner seems to always be included in other specifications (e.g. see SARIF, CycloneDX). Is this redundant?

I kind of like having this at the top level since it would make it easier to write policy against consistently, without needing to know where to look in different places. But I'm not too opinionated here.

@dlorenc
Copy link
Member

dlorenc commented Feb 2, 2022

cc @developer-guy any other thoughts here?

@Dentrax
Copy link
Member

Dentrax commented Mar 1, 2022

Missed this issue. Thanks for the questions. @luhring


I don't understand the role that invocation is intended to play in vuln scanning workflows.

Actually, we were thinking that field to use as a source of scan, for example:

    "invocation": { 
      "parameters": null,
      "uri": "https://github.com/developer-guy/alpine/actions/runs/1071875574",
      "event_id": "1071875574",
      "builder.id": "github actions"
    },

I'm saying here: "This vuln spec generated in GitHub Action at URI". If this makes sense to you, we can consider documenting it.

it looks like there's an intention to use the statement's predicateType field to specify not only this Cosign spec but also the type of the "wrapped" data that goes into the result object.

Correct! We should update the documentation. But here is an example:

"predicateType": "https://sigstore.dev/v1/VulnerabilityScanResult+sarif"
"predicateType": "https://sigstore.dev/v1/VulnerabilityScanResult+json"

We are supporting to change the predicateType according to the given result format. That's because we want to provide a generic solution in the first place for all scanners. You can put an .csv if you wish.

Those specs tend to include data that's a superset of "results", and then result tends to be an actual field within the specification

I didn't think from this perspective. :D Since result is an arbitrary interface{} type, i'm not sure how results fits in this case, it sounds like list of interfaces []interface{} to me.

What matters most in a vulnerability scan are the inputs that are required to reproduce the same results: the vulnerability dataset, the list of discovered software (ideally an SBOM), and the scan tool itself. It can be misleading to promote "scan time" (especially when coupled with a "finish time") to consumers making a policy decision, because a later scan run doesn't necessarily imply that the scan was done in light of more recent vulnerability data, as some would assume.

Strongly agree with you. We thought that times might be useful for some tracing and estimating scan durations. It would be great to extend and add some more important fields in metadata spec. What kind of information would be much useful to see in this case? Any spec model you want to propose?

@luhring
Copy link
Contributor Author

luhring commented Mar 6, 2022

Thanks for the comments! I'll try to keep my thoughts organized by topic...

Including an invocation section (question 1)

I'm saying here: "This vuln spec generated in GitHub Action at URI". If this makes sense to you, we can consider documenting it.

I still don't think I'm following yet... Why does this information need to be captured in the attestation spec for all vulnerability scans? This data structure seems very specific to GitHub Actions, and possibly other similar CI services, but it doesn't seem generally applicable to vulnerability scans. And I agree with @dlorenc's comment about this invocation data that the "scanning service should encapsulate all of that."

The consensus in the original issue for this spec seemed to be to remove invocation, and I'm guessing that just got missed in the final review?

Documenting the predicateType (question 2)

it looks like there's an intention to use the statement's predicateType field to specify not only this Cosign spec but also the type of the "wrapped" data that goes into the result object.

Correct! We should update the documentation. But here is an example:

"predicateType": "https://sigstore.dev/v1/VulnerabilityScanResult+sarif"
"predicateType": "https://sigstore.dev/v1/VulnerabilityScanResult+json"

+1 to documenting how to use this field for wrapping vulnerability scan data in this particular attestation spec. 👏

But the examples of ...+sarif and ...+json don't quite clear it up for me. From what I understand, all in-toto predicate data should ideally be JSON, and so we shouldn't need to specify "JSON" in predicateType URIs.

Addressing this would also obviate my next question, which is "what data structure is actually described by "https://sigstore.dev/v1/VulnerabilityScanResult+json"?".

Demonstrating the spec w/ a Trivy-specific format (question 3)

The only real "standard" we saw around this was SARIF, which isn't widely supported enough for me to feel comfortable requiring.

I think it's important that we use an open standard for the wrapped data format for this spec (i.e. not a format specific to a single vulnerability scanning tool). I'd suggest using either SARIF or CycloneDX for the examples in this spec. If neither of these meet the requirements for an open standard, let's follow up so I can understand the reqs a bit better!

Naming the field for scanner output (question 4)

Is result the right name for a field that wraps an entire scan output from another spec? I'm thinking this term might be overloaded. Those specs tend to include data that's a superset of "results", and then result tends to be an actual field within the specification (i.e. to specify the vulnerability matches).

Naming is hard! Any other suggestions? I'm not picky :)

Agreed 😄 If this field is contextualized by its parent field scanner, maybe something like "output", resulting in .scanner.output? This name is unlikely to be used in the spec that the scanner is outputting.

Since result is an arbitrary interface{} type, i'm not sure how results fits in this case, it sounds like list of interfaces []interface{} to me.

I don't think we should make this field have a slice type. I think we leave it as a completely arbitrary type, which can be strongly inferred using the "wrapped type" described in predicateType.

Redundancy of scanner section (question 5)

The data contained in scanner seems to always be included in other specifications (e.g. see SARIF, CycloneDX). Is this redundant?

I kind of like having this at the top level since it would make it easier to write policy against consistently, without needing to know where to look in different places. But I'm not too opinionated here.

After thinking about this more I agree, @dlorenc. I like the idea of policy (and queries) for attestations being achievable for certain situations without needing to delve into the wrapped data.

The db section might be an exception to this, since these data values seem to require an understanding of the underlying tool — but certainly up for discussion.

The metadata section (question 6)

What matters most in a vulnerability scan are the inputs that are required to reproduce the same results: the vulnerability dataset, the list of discovered software (ideally an SBOM), and the scan tool itself. It can be misleading to promote "scan time" (especially when coupled with a "finish time") to consumers making a policy decision, because a later scan run doesn't necessarily imply that the scan was done in light of more recent vulnerability data, as some would assume.

Strongly agree with you. We thought that times might be useful for some tracing and estimating scan durations. It would be great to extend and add some more important fields in metadata spec. What kind of information would be much useful to see in this case? Any spec model you want to propose?

Why is it important for consumers of this data to trace and estimate scan duration? What impact would this have on downstream policy?

Re: adding more fields, I'd lean toward @dlorenc's thinking on question 5. If we want to have a section of this spec specifically motivated by high-level policy evaluation outside of the wrapped scan data, perhaps we just have one section for this (not two: scanner and metadata). I could see a case for "scan time" being one of the fields, but like I mention above, we'd need to be careful when describing the intended use of this field. If we're considering more fields to pull out from existing vulnerability scan specifications, I'd suggestion perusing the SARIF and CycloneDX vulnerability scan specs.

Overall thinking

A recurring suggestion I saw in in-toto/attestation#58 that I never saw applied was to map out a realistic story of how consumers might use these vulnerability attestations in a real security workflow. I think if we did this, that'd really help to motivate which fields are generally useful and which ones aren't needed after all.

Thanks for everyone's thoughts so far! I'm looking forward to the end state of this discussion. ❤️

@jessesanford
Copy link

I am interested in if the predicateType field was ever overloaded to include wrapped type?

It looks like it was agreed that using the "+" syntax was not necessary and that "/" would be used instead here: in-toto/attestation#58 (comment) however in the above comment by @Dentrax #1381 (comment) "+" is mentioned again.

Also the current version of the spec no wrapped type is mentioned in the predicate type URI at all: "cosign.sigstore.dev/attestation/vuln/v1" https://github.com/sigstore/cosign/blob/main/specs/COSIGN_VULN_ATTESTATION_SPEC.md

Is this decision still in flight?

@github-actions
Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link

github-actions bot commented Sep 2, 2022

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions
Copy link

github-actions bot commented Nov 2, 2022

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link

github-actions bot commented Nov 8, 2022

This issue was closed because it has been stalled for 5 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-issue-activity question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants