No new DB published since 21-11-2024 #2282

Closed
tomersein opened this issue Nov 25, 2024 · 7 comments

@tomersein
Contributor

What happened:
Hello! I've just noticed that no new DB has been released since 21-11-2024.
Is this the expected behavior?
What you expected to happen:
A new DB every day or two.

    "5": [
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:6d86fc3fcf8f131791bcf3fe0d5f402c1bb363954cae02cc98314e82e91ae177",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732422136.tar.gz",
        "version": 5
      },
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:f046fa2aeb77df72cd815b24c625ff69acb14f295ec14ed12b8bdb284af0a326",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732335731.tar.gz",
        "version": 5
      },
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:aaf5dd62d19d0d9cfd9b8ee81f5a3476b73812202c6a5ed83e485d404bcbe4f7",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732249415.tar.gz",
        "version": 5
      }
    ]

How to reproduce it (as minimally and precisely as possible):
You can check by running grype db list and looking at the most recent entries.

Anything else we need to know?:

Environment:

  • Output of grype version: 0.85.0
  • OS (e.g: cat /etc/os-release or similar): mac
@tomersein added the "bug" label on Nov 25, 2024
@popey
Contributor

popey commented Nov 25, 2024

Hi @tomersein - thanks for the issue. Yes, it looks like there were some failures in the github actions that publish the database. We'll get on that.

https://github.com/anchore/grype-db/actions?query=is%3Afailure

@tomersein
Contributor Author

Hi, thanks!
I'd suggest having some monitoring for issues like this :)
Maybe send an alert if a new DB has not been published in x days.

@popey
Contributor

popey commented Nov 25, 2024

Yeah, we do get alerts, but like many places on the planet, it was the weekend :)

@wagoodman
Contributor

wagoodman commented Nov 25, 2024

@tomersein wanted to clarify the current state of things. We are building a new DB nightly with the latest data for all providers excluding NVD: their API does not appear to be functioning at the moment:

curl -v -H 'Accept: application/json' 'https://services.nvd.nist.gov/rest/json/cves/2.0?cveId=CVE-2019-1010218'
* Host services.nvd.nist.gov:443 was resolved.
* IPv6: (none)
* IPv4: 54.85.30.225
*   Trying 54.85.30.225:443...
* Connected to services.nvd.nist.gov (54.85.30.225) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 / [blank] / UNDEF
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=*.nvd.nist.gov
*  start date: Sep 28 01:12:50 2024 GMT
*  expire date: Dec 27 01:12:49 2024 GMT
*  subjectAltName: host "services.nvd.nist.gov" matched cert's "*.nvd.nist.gov"
*  issuer: C=US; O=Let's Encrypt; CN=R11
*  SSL certificate verify ok.
* using HTTP/1.x
> GET /rest/json/cves/2.0?cveId=CVE-2019-1010218 HTTP/1.1
> Host: services.nvd.nist.gov
> User-Agent: curl/8.7.1
> Accept: application/json
> 
* Request completely sent off
< HTTP/1.1 503 Service Unavailable
< content-length: 107
< cache-control: no-cache
< content-type: text/html
< 
<html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
* Connection #0 to host services.nvd.nist.gov left intact

Today we mark the DB age based on the data in the DB, not when it was built. What is not obvious is that we use that same date (the data pull timestamp) in both the DB archive name and the built field in the metadata. I understand this is unclear, and it is already being adjusted in the v6 schema. In the meantime, there are a couple of ways to tell whether the DB workflow has been running:

  1. From the grype db list output, looking at the suffix of each URL:
    "5": [
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:6d86fc3fcf8f131791bcf3fe0d5f402c1bb363954cae02cc98314e82e91ae177",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732422136.tar.gz",
        "version": 5
      },
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:f046fa2aeb77df72cd815b24c625ff69acb14f295ec14ed12b8bdb284af0a326",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732335731.tar.gz",
        "version": 5
      },
      {
        "built": "2024-11-21T01:33:24Z",
        "checksum": "sha256:aaf5dd62d19d0d9cfd9b8ee81f5a3476b73812202c6a5ed83e485d404bcbe4f7",
        "url": "https://grype.anchore.io/databases/vulnerability-db_v5_2024-11-21T01:33:24Z_1732249415.tar.gz",
        "version": 5
      }
    ]

Note that the suffixes are Unix epoch timestamps (shown here in UTC):

  • 1732422136 -> Sunday, November 24, 2024 4:22:16 AM
  • 1732335731 -> Saturday, November 23, 2024 4:22:11 AM
  • 1732249415 -> Friday, November 22, 2024 4:23:35 AM
  2. From GitHub Actions:

The notable thing is that the data sync has been failing https://github.com/anchore/grype-db/actions/workflows/daily-data-sync.yaml :

[Screenshot, 2024-11-25 at 9:55:58 AM]

This is the job that feeds the DB build job, persisting the latest vulnerability data as OCI images.
The failure mode is as follows: when a provider fails, the last functioning provider cache is used as the data to build the DB.

So we've been building a DB with all of the latest available vulnerability data, but the metadata we're providing could be a lot more clear here (again, will be adjusted in v6 soon).

All that being said, we do have monitoring and alerts for these failures. We've been watching them since last week but took no action, since upstream provider API outages tend to resolve within a day or two. However, we will be taking operational action today: the NVD API has been down long enough, and the DB data is old enough, that we'll soon trip the maximum allowable data age for the DB (which automatically fails builds).

We'll rebuild today's DB with different metadata shortly -- stay tuned.

@willmurphyscode willmurphyscode moved this to In Progress in OSS Nov 25, 2024
@wagoodman added the "changelog-ignore" label on Nov 25, 2024
@wagoodman changed the title from "no new db from the 21-11-2024" to "No new DB published since 21-11-2024" on Nov 25, 2024
@luhring
Contributor

luhring commented Nov 25, 2024

Thanks for the detailed walkthrough! I have a few lingering questions, and sorry in advance if I'm not grasping the obvious...

> Today we mark the DB age based on the data in the DB, not when it was built.

Since there are multiple data providers, how does the age of each provider's data factor into the final built metadata value for the Grype DB?

Like for this case specifically, are you saying that the built value is 2024-11-21 01:33:24 +0000 UTC because that's the last data the Grype DB build process received from NVD before its API issues began? If so, for the other providers that have continued providing newer data in the past few days, do those "last updated" timestamps not affect the Grype DB built value?

> the metadata we're providing could be a lot more clear here (again, will be adjusted in v6 soon).

This makes sense, and I'm excited for v6. The metadata here also impacts whether Grype clients will update their local data, is that right? As in, when Grype goes to do a DB update, it won't grab new data currently (and hasn't since Nov 22).

@wagoodman
Contributor

wagoodman commented Nov 25, 2024

I'll answer the questions a couple different ways:

For v5 (today)

> how does the age of each provider's data factor into the final built metadata value for the Grype DB?

We read through all vunnel provider workspaces to get basic information like when the data was compiled by vunnel, provider name, vunnel results expected digest, etc. This looks something like this:

cat data/vunnel/nvd/metadata.json
{
  "provider": "nvd",
  "urls": [
    "https://services.nvd.nist.gov/rest/json/cves/2.0",
    "https://github.com/anchore/nvd-data-overrides/archive/refs/heads/main.tar.gz"
  ],
  "store": "sqlite",
  "timestamp": "2024-11-14T01:33:39.311909+00:00",
  "version": 2,
  "distribution_version": 1,
  "listing": {
    "digest": "a04fb96e1730afc3",
    "path": "checksums",
    "algorithm": "xxh64"
  },
  "schema": {
    "version": "1.0.2",
    "url": "https://raw.githubusercontent.com/anchore/vunnel/main/schema/provider-workspace-state/schema-1.0.2.json"
  },
  "stale": false
}

In grype-db we loop through all of these metadata files, gathering the timestamp from each. The DB timestamp is the oldest one in the whole collection of timestamps across all providers.

Answer: there is no other provider information in the DB

> for the other providers that have continued providing newer data in the past few days, do those "last updated" timestamps not affect the Grype DB built value?

correct 👍

For v6

We will have a providers table with the following information:

// Provider is the upstream data processor (usually Vunnel) that is responsible for vulnerability records. Each provider
// should be scoped to a specific vulnerability dataset, for instance, the "ubuntu" provider for all records from
// Canonical's Ubuntu Security Notices (for all Ubuntu distro versions).
type Provider struct {
	// Name of the Vunnel provider (or sub processor responsible for data records from a single specific source, e.g. "ubuntu")
	ID string `gorm:"column:id;primaryKey"`

	// Version of the Vunnel provider (or sub processor equivalent)
	Version string `gorm:"column:version"`

	// Processor is the name of the application that processed the data (e.g. "vunnel")
	Processor string `gorm:"column:processor"`

	// DateCaptured is the timestamp which the upstream data was pulled and processed
	DateCaptured *time.Time `gorm:"column:date_captured"`

	// InputDigest is a self describing hash (e.g. sha256:123... not 123...) of all data used by the provider to generate the vulnerability records
	InputDigest string `gorm:"column:input_digest"`
}

There will not be a metadata.json with this information; retrieving it will require either a grype command or a sqlite query.

edit: hit enter too early :)

@wagoodman
Contributor

An update: the DB has been rebuilt so that it ignores the NVD data age. When running grype db update you should see the new DB.

[Screenshots, 2024-11-25 at 1:52:42 PM and 1:52:55 PM]

@github-project-automation github-project-automation bot moved this from In Progress to Done in OSS Nov 25, 2024