Commit

Fix typo in article
PrinsFrank committed Feb 12, 2024
1 parent 25c40d8 commit c7352e4
Showing 1 changed file with 1 addition and 1 deletion.
@@ -42,7 +42,7 @@ Another character looking almost identical to the character we're looking for, m

Normally when debugging regexes I use tools like regex101, but it gives the same result and not any info on why. I have been looking for official documentation about PCRE2 regexes [as PHP uses PCRE (PCRE2 starting from 7.3)](https://www.php.net/manual/en/book.pcre.php){:target="_blank" rel="noreferrer noopener"}, but have not stumbled on any good in-depth documentation before. When looking at [pcre.org](http://pcre.org){:target="_blank" rel="noreferrer noopener"}, I did find a link to [Philip Hazels' PCRE2 repository](https://github.com/PhilipHazel/pcre2){:target="_blank" rel="noreferrer noopener"}, which is apparently the official source now. At an astonishing star count of 72, I reckon this has to be one of the least starred repositories for the amount of developers counting on it.

- Looking through the few issues I found a very interesting one: ["Caseless ASCII matching"](https://github.com/PhilipHazel/pcre2/issues/11){:target="_blank" rel="noreferrer noopener"}, talking about a particular interesting feature in unicode: "Case folding", with a [link to a list af characters that can be 'folded'](http://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt){:target="_blank" rel="noreferrer noopener"}. Including a line saying `00B5; C; 03BC; # MICRO SIGN`. Our mystery character!
+ Looking through the few issues I found a very interesting one: ["Caseless ASCII matching"](https://github.com/PhilipHazel/pcre2/issues/11){:target="_blank" rel="noreferrer noopener"}, talking about a particular interesting feature in unicode: "Case folding", with a [link to a list of characters that can be 'folded'](http://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt){:target="_blank" rel="noreferrer noopener"}. Including a line saying `00B5; C; 03BC; # MICRO SIGN`. Our mystery character!

With the underlying issue now found, we can now fix the problem. When we get the code point using the mb_ord function and convert it to hex, we can see the difference in the characters:
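
As a minimal sketch of that comparison, written in Python rather than PHP: `ord` stands in for `mb_ord` here, and the helper name `code_point_hex` is hypothetical, not something from the article. It prints the hex code point of the micro sign and of the Greek small letter mu, then shows that the two only compare equal after case folding, which is the mapping the `00B5; C; 03BC` line in CaseFolding.txt describes.

```python
# Two characters that look nearly identical but are distinct code points.
MICRO_SIGN = "\u00b5"       # µ  (MICRO SIGN)
GREEK_SMALL_MU = "\u03bc"   # μ  (GREEK SMALL LETTER MU)


def code_point_hex(char: str) -> str:
    """Return the code point of a single character as 4-digit uppercase hex."""
    return format(ord(char), "04X")


if __name__ == "__main__":
    print(code_point_hex(MICRO_SIGN))        # 00B5
    print(code_point_hex(GREEK_SMALL_MU))    # 03BC

    # A plain comparison sees two different characters.
    print(MICRO_SIGN == GREEK_SMALL_MU)      # False

    # Unicode case folding maps U+00B5 to U+03BC, so the folded forms match.
    print(MICRO_SIGN.casefold() == GREEK_SMALL_MU.casefold())  # True
```

Printing the code points rather than the glyphs is the useful part: the two characters are hard to tell apart on screen, but `00B5` and `03BC` are not.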
