-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(validations): add validations yaml and set up repo docs #2
feat(validations): add validations yaml and set up repo docs #2
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good for our initial cut!
validations.yaml
Outdated
lei: | ||
description: must be 20 characters and only contain A-Z, and 0-9 (no special characters) | ||
examples: | ||
- 123456789TESTBANK123 | ||
- 123456789TESTBANK456 | ||
- 123456TESTSUBBANK456 | ||
link: https://regex101.com/r/5AaT18/6 | ||
regex: ^[\dA-Z]{20}$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above examples are not actually valid LEIs.
- The 5th and 6th characters are supposed to aways be
00
.
I think we can also tune up the regex to find other potentially invalid LEIs.
- The last two digits are a checksum, so they should always be numeric.
- The first four digits (the LOU code) may be only numeric. All the examples I see have it as numeric, but none of the references I've found are explicit about it. I also can't find a list of all valid LOUs, which you'd think would be something you could download off of GLEIF. 🤷
Anyway, I think we could safely do something like:
^[A-Z0-9]{4}00[A-Z0-9]{16}\d{2}$
That'd at least catch the reserved 00
and the checksum at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good @hkeeler! We're going to need an update of the seed / DB data then too since those are the examples currently in the DB. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe that regex above isn't quite right @hkeeler, since it's only supposed to be 20 characters? So maybe it's:
^[A-Z0-9]{4}00[A-Z0-9]{12}\d{2}$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha wow it took some digging and looking at the ISO format but yeah GLEIF has it at https://www.gleif.org/en/about-lei/iso-17442-the-lei-code-structure also. Thanks Hans!
I wrote cfpb/sbl-project#187.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe that regex above isn't quite right @hkeeler, since it's only supposed to be 20 characters? So maybe it's:
^[A-Z0-9]{4}00[A-Z0-9]{12}\d{2}$
Yes. Yours is correct. Somehow the LEI I was testing with was 24 characters long. I must have done something weird. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not 100% sure if the LOU number that prefaces the LEI will always be just digits? Seems like it is, but I assume they'd hop to alphanumeric if they needed to account for more LOUs that can issue LEIs. I can't find something on the GLIEF website that says digits only? The spec ISO 17442-1:2020
at least doesn't confirm only digits for that part of the LEI?
I don't know. I'm happy to start with assuming each LEI will start with a 4 digit LOU and see what the data team says? (I did get the original alphanumeric 20 char restriction after talking with the data team)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha I was using the same lib. So yeah for the sbl-project json updates we can add the correct LEIs, just the question of, along with the regex check do we want the backend to validate the checksum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's sounding like the may be legit reasons for an LEI to break the rules. There's certainly data in our current dataset that does. I think for this PR, let's keep it simple and just make it the original ^[\dA-Z]{20}$
, and we can circle back soon with @nongarak and @Kibrael to find out more and decide if we want to do anything more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, what a ride! 🎢 I might make an issue to revisit this later. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to keep the 2 digit checksum at the end at least? It's actually part of the ISO 17442-1:2020 standard so I'm hoping it's actually something that all the LEI data has? Mind checking @hkeeler?
validations.yaml
Outdated
@@ -0,0 +1,36 @@ | |||
email: | |||
description: must conform to common email conventions using the W3C method of email validation which is only a subset of full RFC 5322 compliance as per https://html.spec.whatwg.org/multipage/input.html#email-state-(type=email) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to have a
references:
- https://html.spec.whatwg.org/multipage/input.html#email-state-(type=email)
...for links to any outside sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Should I then pop them out of the description and only put these links in the references section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Alright @hkeeler and @jcadam14! I've...
|
Heyo! I updated the PR description to get everything ready, so I think this is ready for final review @hkeeler @jcadam14. 🥳 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall look great. Just the one bit about references
.
- '9999' | ||
- '1' | ||
link: https://regex101.com/r/l3SyQi/3 | ||
references: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize an empty value was a valid null in YAML, though I do see in the spec that it's valid. It does feel weird, though. You normally just don't include a given attribute if it is null, though you do occasionally see explicit NULL
s. I also just leard ~
is also null in YAMLese. 🤷
Anyway, my vote would be to just exclude references
for a given entry if we don't have any references for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to have some kind of linter/checklist someday to check to make sure people don't forget fields, so that was my thought behind the null values.
Happy to just exclude it though, so put up a commit for it.
Hurray! 🎉 |
Sets up the
regtech-regex
repository with an initial validations file, sets up the Readme, and cleans up the boilerplate files. Gets this repo ready to become public! 🥳Changes
package.json
(making this a real NPM package in the future may be helpfulCONTRIBUTING.md
file to not include browser support informationINSTALL.md
fileNotes
Here's the completed open source checklist with
~
next to ones that aren't applicable:TERMS.md
included?CHANGELOG.md
present and does it contain structured, consistently formatted recent history?CONTRIBUTING.md
)?README
and tested on a clean machine?README
,requirements.txt
, and/orbuildout.cfg
?README
, if applicable?Todos
post-mvp wishlist
Checklist