-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accept certificates with invalid serial number encodings #232
Comments
cc @cjntaylor @tdyas |
I think what is happening here is: the serial number is incorrectly encoded, and so is negative. This is illegal -- RFC5280:
In the case of the test given here, the serial number is -0x3FC1DAC87904ACD759C456F2816694AC. It's worth noting that if this certificate were issued by a real CA, it wouldn't be allowed by the baseline requirements and would be a misissuance:
Though naturally a private CA is not subject to these requirements. |
Related: golang relaxed this in golang/go@a0ea93d to deal with the same issue -- see eg https://stackoverflow.com/questions/25526870/x509-parsing-error-negative-serial-number-while-pulling-repository and golang/go#8265 |
Thanks for the quick response! Makes total sense. Are you amenable to the argument from the golang case that "This buggy software, sadly, appears to be common enough that we should let these errors pass"? :) |
@cjntaylor @benjyw The most correct solution is to fix the broken certificate, assuming the serial number is malformed as was explained to me. Previously people have reported bugs against the webpki project because webpki's parser is really strict and we've been successful in convincing people, e.g. the AWS certificate authority team, to fix the bug on their end. Can you say which buggy software created the malformed certificate? |
According to the PR, the system is Focepoint NGFW Enterprise Firewall (https://www.forcepoint.com/product/ngfw-next-generation-firewall). Let's find a contact with them. |
I agree that we should definitely report this bug to them. The problem is
that this is enterprise software. So even if they agree that this needs
fixing, we'd be waiting on their release, which could take months, followed
by the enterprise IT department agreeing to and then executing on the
upgrade, which could take months or even years longer.
…On Fri, May 7, 2021 at 1:03 PM Brian Smith ***@***.***> wrote:
According to the PR, the system is Focepoint NGFW Enterprise Firewall (
https://www.forcepoint.com/product/ngfw-next-generation-firewall). Let's
find a contact with them.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#232 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAD5F7HF5PPEB2IV37XMASLTMRBR7ANCNFSM44KT2EBQ>
.
|
My 2 cents: tell users that encounter this problem run with system TLS on their own fork. It will be less uniform across platforms, less reproducible, and subject to a variety of issues that Rustls/webpki doesn't have. This is the line to change: You'll get whatever version of OpenSSL is on a Linux system, for instance. |
@benjyw beat me to it. Forcepoint is managed at an enterprise scale. It's not a tool I *choose* to use - it's forced upon me anytime I make ANY ssl connection inside my company's intranet. Optimistically, if you can get Forcepoint to agree to fix this issue (if they even agree) in 6 months, and I can get a whole different department I'm not even a part of with a whole different release cycle to redeploy what amounts to a critical piece of security infrastructure for more than 8000 staff in 6 months (both of those time periods are incredibly generous, especially the latter with all of the testing and verification that would absolutely have to happen), we're looking at a year, at best, before I could even realistically hope to have this fixed. In the meantime, I'm dead in the water. I can't do anything that requires using I guess I just don't understand why there can't be any flexibility here. I perfectly understand that the RFC makes the negative serial number invalid. I also know that Forcepoint isn't the only tool that generates negative serial numbers, it's incredibly common, especially amongst Microsoft SSL tooling. There is a point at which following the RFC is detrimental to the practical usage of a tool. As it stands, I'll have to put out a warning to our entire staff not to use All this said, I don't disagree that we should fix this the right way. All I'm asking for is a bit of consideration. An option? A flag? Some way to run in "slightly less strict mode"? At least during the transition period while we wait what will be years for the right fix to be put in place. |
"tell users that encounter this problem to build, manage and distribute an entire ecosystem of the tool they want to use to fix a non-obvious mostly invented problem of a downstream package that is unique to a second-order transitive dependency of the tool" FTFY Telling users to build from scratch and manage a complex ecosystem of packages to fix your own issues isn't a solution. Way to kick the can. Thanks. I hate it. |
Hey, I think our messages overlapped, and I didn't realize you were so frustrated. That said, all Cargo packages are built from scratch, and I think you would just need to maintain a patch file (either in Cargo or external to it) to remove the "rustls-tls" feature from reqwest. That approach has the downsides I mentioned, but I believe it would work. |
@cjntaylor In case it isn't clear, I didn't close this issue. It's still open. So, implementing a workaround hasn't been rejected. |
Looking at the golang PR golang/go#8265, they note that RFC 5280 states in Section 4.1.2.2:
The last paragraph suggests providing a way to deal with the gracefully would be a good option to have. To that point, would
|
Apologies, they did a bit, and I'll rein it in a bit. It's a little more complicated than that with |
As relevant background, Pants uses Rust internally by embedding a shared object file into a Python wheel. This Python wheel is then installed into a Python virtual environment that is then used to run Pants as the build tool for a user's repository. Given this distribution model, it is a hard ask to have users build custom versions of Pants. |
That said, Pants has had to fork certain crates in the past and could do so here, but I would rather find an alternative before taking on that maintenance burden. |
We don't need more advocacy discussion for or against here. |
I don't think it makes sense to add a "strict mode" vs "non-strict" mode flag for this. It's more work, more code, but the strict mode doesn't seem like more security, so it seems like a net loss. If the default were strict mode then every user that would run into such issues would have to do work to cope with it. So I think the main choice in the short term is to be lenient always or strict always. I am open to the possibility of matching what Chromium and Firefox do, since generally when we compromise on strictness that's the upper limit on our compromise. I am working on some testing infrastructure that will make it easier to accept a PR that would implement a workaround for this. That will need to be done first. |
@briansmith Thanks! Appreciate the effort and the flexibility. |
Just to confirm this: I cleared the sign bit in the serial number and webpki was able to parse it. This is likely the only problem with this cert rather than the tip of an unpleasant iceberg. |
That's a relief! |
Hi, just checking in to see what it would take to get this resolved, and how I/we might be able to help. There was mention of needing some testing infrastructure , not sure what the status is on that. Thanks! |
Hi, gentle reminder that we'd love to see this resolved and are happy to help make it happen. Thanks! |
An ASN.1 integer that has its highest bit set is a negative value. Serial numbers cannot be negative. Serial numbers are used at least for revocation (CRL, OCSP) and duplicate detection and maybe other things. We don't implement revocation checks yet but that is planned. With that in mind, we need to have a clear understanding of what to do when the serial number appears to be negative. Here's my proposal:
The above proposal assumes that its serial number comparison rules are sufficient to make it impossible for an attacker to confuse us, so that there would be no security value in a "strict" mode that rejects serial numbers that are, according to standard ASN.1 encoding rules, negative. In order to implement this, we would start by copy-pasting ring's WDYT? |
@briansmith I've read your proposal and I generally like the idea of supporting negative serials, even though the spec forbids them. But I'm a bit critical of the proposal to strip the sign bit. How does your proposal try to issue OCSP requests? Will it now issue two OCSP requests, one with a stripped leading 0 and one with a leading 0 added? If the webpki internal representation of the serial has all leading 0s stripped, does this mean that webpki will have to always issue two OCSP requests, even when the certificate has originally been encoded validly, because webpki can't distinguish between validly and invalidly encoded certificates? If, as hypothesized in your proposal, webpki ever gains an accessor and exposes the serial with the sign bit stripped, tools that use this data might create inconsistent output. E.g. think of a CA that uses two different systems for issuance and CRL creation of certificates. The issuance side has the bug that it creates negative serial certificates. This appears to happen in the real world. The CRL creation side then uses webpki to parse an incoming list of certificates the module should integrate into the CRL. Then it uses that parsed result to integrate the serial into the CRL. The CRL would now contain a different serial than originally intended. I think your proposal tries to handle this case, but it can also partake in creating this case. Personally I feel that webpki should preserve the raw bytes used to encode the serial, including whether it contains a leading zero or not. The case where CAs output different signs for their CRLs and for their certificates should be rare to non existent, and especially I think one can't blame webpki if something goes wrong if it opaquely handles the serial. If one wants to be safe, one can have OCSP handling, CRL checking etc all fail if it's enabled and a negative serial is encountered. |
@est31 wrote:
I'm not proposing the strip the sign bit. Instead, I'm proposing to, effectively, insert the missing 0x00 and continue on as if it were there. For example, if the value is encoded |
@briansmith yeah I understand, and I read up extra on how integer DER/BER encoding works (two's complement) before I wrote the comment you replied to. My argument still holds if you substitute "stripping the sign bit" with "adding a leading 0x00", this was mainly just a figure of speech. I think webpki shouln't try to improve the data here, but instead just either fully support negative serials, or fully reject them. Not try to do a smart trick during parsing. For transparency: just recently I got a bug repoort in rcgen where it couldn't (validly) sign some certs... as it turns out, rcgen's internal data representation is not powerful enough (yet) to fully preserve the subject name. It turned a printable into a utf8 string, invalidating the subject name: rustls/rcgen#59 . The same issue applies here. |
If an entry in a CRL (a revoked one) is encoded correctly with the leading zero, but the certificate isn't, or vice versa, then we should consider the certificate to be revoked. If we required the encoding in the CRL to also be a negative number then we'd be potentially forcing the CRL software maker to have to add buggy behavior in order to revoke a certificate created by a buggy serializer, which seems like a non-starter to me, especially because certificates are revoked in emergency situations. In other words, I think it is a hard requirement that a syntactically-correct CRL should be able to revoke a certificate with a syntactically-incorrect serial number, for any form of serial number we support. |
The rcgen issue is quite different, I think, for reasons that are specific to how name encodings need to be handled in certificates. |
OK, I see now why you are concerned about my proposal. Yes, I guess we'd have to request both if we were to ever implement OCSP fetching. I expect that we'll likely only have built-in support for OCSP stapling, not fetching, though. |
Oh right yeah we've talked past each other 🤣 Indeed it would work with OCSP stapling. crlite might run into problems tho, as it was only designed to support serials from a pool of known ones, and it might regard a serial as revoked if it's not known to the bloom filter cascade. Querying the crlite structure twice might thus create false negatives. Btw, openssl shows the serial as negative:
go does the same (using certinfo to print it):
|
Yes, great points. We might have to rethink this when we actually do revocation checking then. |
Hey folks: thanks a lot for the thoughtful design work! If someone is able to mentor this ticket and add some instructions to help implement it, we'd be able to get somebody to tackle it... without that though I think we won't know where to start. |
OK, I think I'm ready to make a decision. Consider that nobody really wants to do OCSP fetching, maintaining and using CRLSet/crlite/etc. infrastructure isn't practical for many people, and we don't have any revocation checking implemented now. Also Rustls doesn't provide an interface for customizing certificate verification that could be used to provide a lenient/strict switch that would be useful for most Rustls users. The most likely next step for revocation checking would be OCSP stapling, and that's the revocation checking mechanism I would add first. We could add support for OCSP stapling using the lenient approach I suggested above. We could defer the decision of what to do in the conflict between the mis-encoded certificates and revocation checking for quite a while. In the meantime products that mis-encode certificates can hopefully be fixed and have their fixed versions deployed, and/or we can provide better configuration interfaces in users of webpki (Rustls, etc.). When we get to the point when the lenient approach would be practically troublesome, we can decide what to do. That might mean releasing a new major version of webpki that stops being lenient or provides an option to be strict or lenient, or something else. So, for now, let's go with the approach I outlined above. |
@briansmith any update on the attempts to contact the vendor of that firewall? I think this issue is a blocker for rustup adoption of rustls. The maintainer asked me to use the builtin certificate store to support precisely such firewalls. Right now rustls can be turned on optionally, but is not enabled by default. |
I've decided to morph this issue from being specific to Forcepoint NGFW to being about accepting certificates with serial numbers that are missing the leading zero that would indicate that the serial number is non-negative. Regarding the above discussion about the fact that it would be hard for every possible revocation mechanism to properly handle the ambiguity that arises from accepting such certificates: Rather than storing the certificate serial number as a slice of bytes with no other information, instead we can model it like so: enum SerialNumber<'a> {
// The
SyntacticallyValid((&'a [u8]),
/// The certificate serial number is syntactically a negative number, which is problematic
/// and ambiguous but unfortunately common.
///
/// More details about how revocation logic should deal with this, based on the
/// discussion above.
SyntacticallyValidInvalid(&'a [u8])
} Then revocation checking logic, when added, would accept a WDYT? |
Further, whether to accept invalid serial numbers at all is probably a function of the trust anchor. So we should probably have per-trust-anchor configuration for whether to accept a bad serial number for that trust anchor. Then we'd accept the invalid serial number during parsing, but then path building would only succeed if a chain to a trust anchor with that "allow bad serial numbers" bit is found. |
@briansmith your suggestion sounds good. A per trust anchor setting is probably the best. |
Nothing sensible to add, just wanted to say thanks for your ongoing work on this issue! |
Hi folks, I just wanted to check in on this issue. As @stuhood mentioned above, given some guidance we could help tackle this if that would help resolve it sooner. |
Hi folks, checking in on this issue again. Thanks! |
We've noticed an issue which we think is due to webpki failing to parse a cert that was rewritten by an SSL inspection system.
I wrote up a PR that exposes the issue, and provides more details: #231
See pantsbuild/pants#12000 for context. Note that none of the participants on that issue, including yours truly, are experts at TLS (or Rust). So this is about as far as we were able to get.
Thanks!
The text was updated successfully, but these errors were encountered: