A collection of Elixir libraries that bring the Unicode standard to the BEAM.
The libraries here implement parts of the Unicode Character Database, CLDR and several Unicode Technical Standards. They are designed to compose: lower-level packages expose the data and primitives, higher-level packages build locale-aware string operations on top.

| Library | Description |
|---|---|
| `unicode` | Introspection of the Unicode character database with fast codepoint lookups and guards. |
| `unicode_guards` | Unicode Set-based guards for matching codepoints in function clauses. |
| `unicode_set` | Unicode Sets and regexes for use in guards, compiled patterns, nimble_parsec combinators and regexes. |
| `unicode_string` | Locale-aware case folding and mapping, case-insensitive equality, and word, line, grapheme and sentence breaking with streaming. |
| `unicode_transform` | Script transliteration, normalization, case mapping and arbitrary CLDR transforms. |
| `unicode_idna` | Pure-Elixir UTS #46 (IDNA 2008) with Punycode (RFC 3492), bidi (RFC 5893) and CONTEXTJ joiner rules. |
| `unicode_unihan` | Introspection of the Unicode Unihan character database. |

- Looking up properties of a codepoint: start with `unicode`.
- Matching codepoints in function guards: `unicode_guards` on top of `unicode_set`.
- Case mapping, folding, or segmenting strings: `unicode_string`.
- Transliterating between scripts or running CLDR transforms: `unicode_transform`.
- Encoding internationalized domain names: `unicode_idna`.
- Working with CJK ideographs: `unicode_unihan`.
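
As a rough illustration of how the lower-level and higher-level packages sit side by side, the script below pulls in `unicode` and `unicode_string` and calls one function from each. It is a minimal sketch, not a reference: the `~> 1.0` version requirements are assumptions, and the exact function names and return values should be checked against each package's documentation on Hex.

```elixir
# Assumption: both packages publish 1.x releases on Hex; adjust as needed.
Mix.install([
  {:unicode, "~> 1.0"},
  {:unicode_string, "~> 1.0"}
])

# Lower level (unicode): properties of a single codepoint.
category = Unicode.category(?a)
script = Unicode.script(?a)

# Higher level (unicode_string): case-insensitive equality via case folding.
same = Unicode.String.equals_ignoring_case?("ABC", "abc")

IO.inspect({category, script, same})
```

In a Mix project the same two entries would go in the `deps` list of `mix.exs` instead of `Mix.install/1`.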