Let's Encrypt may be down for maintenance or `directoryUrl` may be wrong #1

alex996 · 2021-04-28T15:55:39Z

Earlier this week (Monday, Apr 26 around 12:30 ET) Let's Encrypt was undergoing maintenance and its ACME v2 URL https://acme-v02.api.letsencrypt.org/directory was returning an error. I have greenlock-express set up with a valid cert (issued in March, expiring in June). I needed to restart Node but I got the following error:

Listening on 0.0.0.0:80 for ACME challenges, and redirecting to HTTPS
Listening on 0.0.0.0:443 for secure traffic
Ready to Serve:
	 demo.example.com
ACME Directory URL: https://acme-v02.api.letsencrypt.org/directory
[debug] Let's Encrypt may be down for maintenance or `directoryUrl` may be wrong
set greenlockOptions.notify to override the default logger
Error cert_order:
Cannot read property 'termsOfService' of undefined
TypeError: Cannot read property 'termsOfService' of undefined
    at fin (/path/node_modules/@root/acme/acme.js:74:23)
    at /path/node_modules/@root/acme/acme.js:95:12
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at Object.greenlock._acme (/path/node_modules/@root/greenlock/greenlock.js:393:9)
    at Object.greenlock._order (/path/node_modules/@root/greenlock/greenlock.js:421:20)
    at Object.greenlock._renew (/path/node_modules/@root/greenlock/greenlock.js:335:9)
    at Object.greenlock.get (/path/node_modules/@root/greenlock/greenlock.js:212:23)

It seems that greenlock pings the ACME endpoint every 1 hour, is that correct? From @root/greenlock/greenlock.js:387:

var dir = caches[dirUrl];
// don't cache more than an hour
if (dir && Date.now() - dir.ts < 1 * 60 * 60 * 1000) {
    return dir.promise;
}

await acme.init(dirUrl).catch(function(err) {
    // TODO this is a special kind of failure mode. What should we do?
    console.error(
        "[debug] Let's Encrypt may be down for maintenance or `directoryUrl` may be wrong"
    );
    throw err;
});

I don't fully understand the intent here, but my question is: if the cert is still valid (in my case, it expires in June), (a) why is it necessary to ping the ACME endpoint, and (b) why does this ping prevent the Node server from starting (again, despite a valid cert)?

Expected: given a valid cert, greenlock should start the Node server.
Actual: given a valid cert, greenlock fails to start because the ACME v2 endpoint is unavailable.

Packages:

@root/greenlock v4.0.5
@root/acme v3.1.0
@root/greenlock-express v4.0.4

Thank you.

coolaj86 · 2021-04-28T16:52:06Z

> Why is it necessary to ping the ACME endpoint?

Fail early. If someone is starting the server with incorrect settings, we want them to know right away.

> It seems that greenlock pings the ACME endpoint every 1 hour, is that correct?

No. It caches the directory response so that it isn't fetched again for at least an hour (as opposed to every time it's needed).
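
A minimal sketch of that caching pattern, for illustration only. `createDirectoryCache`, `fetchDirectory`, and the injectable `now` clock are hypothetical names, not part of Greenlock. Dropping failed lookups from the cache is an assumption added here; the quoted snippet does not show how failures are handled.

```javascript
// Hypothetical helper: keep one directory promise per URL for up to ttlMs.
function createDirectoryCache(fetchDirectory, ttlMs, now) {
  now = now || Date.now;
  var caches = {};

  return function getDirectory(dirUrl) {
    var entry = caches[dirUrl];
    // Reuse the cached promise while it is younger than the TTL
    if (entry && now() - entry.ts < ttlMs) {
      return entry.promise;
    }
    var fresh = {
      ts: now(),
      promise: Promise.resolve()
        .then(function () {
          return fetchDirectory(dirUrl);
        })
        .catch(function (err) {
          // Assumption: forget failed lookups so the next call retries
          if (caches[dirUrl] === fresh) {
            delete caches[dirUrl];
          }
          throw err;
        })
    };
    caches[dirUrl] = fresh;
    return fresh.promise;
  };
}
```

With `ttlMs` set to `1 * 60 * 60 * 1000`, repeated calls within the hour share one lookup, which is different from polling the endpoint every hour.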

> Why does this ping prevent the Node server from starting (again, despite a valid cert)?

> // TODO this is a special kind of failure mode. What should we do?

"In the face of ambiguity, refuse the temptation to guess."

Now that the use case is better understood, I think it would be reasonable for the default behavior to be logging the error and continuing, rather than throwing.
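
A rough sketch of what "log and continue" could look like, not the project's actual fix. `initOrWarn` and `warn` are hypothetical names; the only thing assumed about `acme` is that it has the `init(dirUrl)` method seen in the quoted snippet. Resolving to `null` on failure is an assumption so that callers can tell the directory is not loaded yet.

```javascript
// Hypothetical wrapper: log an init failure instead of rethrowing it.
function initOrWarn(acme, dirUrl, warn) {
  warn = warn || console.error;
  return Promise.resolve()
    .then(function () {
      return acme.init(dirUrl);
    })
    .catch(function (err) {
      warn(
        "[debug] Let's Encrypt may be down for maintenance or `directoryUrl` may be wrong"
      );
      warn(err && err.message);
      // Assumption: null means "no directory yet", so startup can proceed
      return null;
    });
}
```

A caller would still need to check for `null` before ordering or renewing a certificate, since nothing has been initialized in that case.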

alex996 · 2021-04-28T17:45:03Z

Thanks. IIUC, if we remove this throw statement:

// @root/greenlock/greenlock.js:393
await acme.init(dirUrl).catch(function(err) {
    // TODO this is a special kind of failure mode. What should we do?
    console.error(
        "[debug] Let's Encrypt may be down for maintenance or `directoryUrl` may be wrong"
    );
    // throw err; // <--- this
});

and the call to ACME v2 does fail, then the metadata won't be initialized:

// @root/acme.js:69
me.init = function (opts) {
// ...
    function fin(dir) {
      me._directoryUrls = dir; // <--- this won't run
      me._tos = dir.meta.termsOfService; // <--- and this
      return dir;
    }

Which means acme._orderCert will need to call init again:

// @root/acme.js:1145
ACME._orderCert = function (me, options, kid) {
// ...
    return U._jwsRequest(me, {
        url: me._directoryUrls.newOrder, // <--- this will be missing

Alternatively, we can ping ACME v2 periodically (every 1 hour?) until it is back up. That said, I'm not sure if me._directoryUrls and me._tos are used elsewhere as well.
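
Both options can be sketched roughly as follows. This is an illustration under assumptions, not the library's code: `ensureDirectory` and `retryInit` are hypothetical helpers, and `me` is any object that exposes `init(dirUrl)` and sets `_directoryUrls` on success, as in the snippets above.

```javascript
// Hypothetical guard: initialize lazily if the directory was never loaded.
function ensureDirectory(me, dirUrl) {
  if (me._directoryUrls) {
    return Promise.resolve(me._directoryUrls);
  }
  return Promise.resolve()
    .then(function () {
      return me.init(dirUrl);
    })
    .then(function () {
      if (!me._directoryUrls) {
        throw new Error("ACME directory is still unavailable");
      }
      return me._directoryUrls;
    });
}

// Hypothetical retry loop: try again after delayMs, up to maxAttempts times.
function retryInit(me, dirUrl, delayMs, maxAttempts) {
  var attempt = 0;
  function tryOnce() {
    attempt += 1;
    return ensureDirectory(me, dirUrl).catch(function (err) {
      if (attempt >= maxAttempts) {
        throw err;
      }
      return new Promise(function (resolve) {
        setTimeout(resolve, delayMs);
      }).then(tryOnce);
    });
  }
  return tryOnce();
}
```

Calling `ensureDirectory` right before an order would cover the missing `me._directoryUrls.newOrder` case, while `retryInit` covers the periodic ping.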

I think I get the general idea, so I can write up a PR if this makes sense.

mikealeonetti · 2021-05-10T17:55:53Z

Is there a setting that would allow the server to start even though the Let's Encrypt API is down for maintenance? I also had a valid cert and had to restart Node, and now the server is just down. Would love to prevent this in the future.

eloquence · 2021-10-11T23:24:39Z

This appears to be biting me today during an LE outage - had to restart Node for unrelated reasons and now the site is just down. Definitely would be nice for this module to handle such situations more gracefully.

eloquence · 2021-10-11T23:44:52Z

While the main API endpoint is down, I was able to bring my server back up by temporarily switching to the staging API directory endpoint (since the cert is still valid, this did not appear to have any unintended side effects, for now).
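
For reference, a sketch of that workaround as a small helper. The two URLs are Let's Encrypt's published ACME v2 directories. `pickDirectoryUrl` and the `LE_USE_STAGING` environment variable are hypothetical names, and where `directoryUrl` is actually set depends on how Greenlock is configured in a given deployment.

```javascript
var LE_PRODUCTION = "https://acme-v02.api.letsencrypt.org/directory";
var LE_STAGING = "https://acme-staging-v02.api.letsencrypt.org/directory";

// Hypothetical helper: choose the ACME directory for this process.
function pickDirectoryUrl(useStaging) {
  return useStaging ? LE_STAGING : LE_PRODUCTION;
}

// Hypothetical switch: LE_USE_STAGING=1 selects the staging directory.
var greenlockOptions = {
  directoryUrl: pickDirectoryUrl(process.env.LE_USE_STAGING === "1")
};
```

Staging issues certificates that browsers do not trust, so this only makes sense while the existing certificate is not due for renewal, and the setting should be switched back once the production endpoint recovers.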

coolaj86 · 2021-10-12T18:58:31Z

I'm convinced that this is a problem that needs to be solved. Would someone like to make a PR, test it, and ping me?

alex996 mentioned this issue Oct 15, 2021

Unknown desc = failed to select one blockedKeys: commands out of sync #3

Open
