Welcome to T13N, a transliteration library for Cyrillic languages written for JavaScript applications.
Transliteration serves different purposes, such as:
- Friendly URLs for Content Management Systems that publish content in Cyrillic languages;
- Longer SMS;
- Passport names;
- Text interpretations for languages that have dedicated alternative Latin alphabets, etc.
IN DEVELOPMENT / PHASE 0 PREVIEW
Implementation is split into so-called "phases" for better prioritization.
- Define a basic set of rules for each letter;
- Define a set of flags calculated for each letter for better context;
- Define alternative variations for some letters that require it (like 'г');
- Support the most basic in-between-words separators (dash, underscore) for URL creation, and resolve "similar" symbols ("’" into "'");
- Ignore Latin symbols and digits that are already present;
- Extend configurations via settings;
- Pack everything as v0.1
- Switch to TypeScript;
- Schematize a language JSON;
- Reorganize code to support other variations of one language;
- Add Belarusian Latin alphabet ("Łacinka");
- Add ICAO standard;
- Add ISO 9 standard;
- Reorganize code to support multiple languages;
- Add Ukrainian alphabet and transliteration rules;
- Add Russian alphabet and transliteration rules.
(Other languages to be supported later on)
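To make the first phase more concrete, here is a minimal sketch of per-letter rules, separator handling, "similar" symbol resolution and Latin/digit pass-through working together to build a URL slug. It is not the T13N API: `LETTER_RULES`, `SIMILAR_SYMBOLS` and `toSlug` are hypothetical names, and the tiny one-to-one letter table is an assumption for illustration that does not follow any particular standard.

```javascript
// Hypothetical per-letter rules; a real rule set would cover a whole alphabet.
const LETTER_RULES = new Map([
  ["а", "a"], ["в", "v"], ["к", "k"], ["м", "m"],
  ["н", "n"], ["о", "o"], ["р", "r"], ["т", "t"],
]);

// "Similar" symbols that are resolved to one canonical form.
const SIMILAR_SYMBOLS = new Map([["’", "'"]]);

function toSlug(text) {
  let out = "";
  for (const ch of text.toLowerCase()) {
    if (LETTER_RULES.has(ch)) {
      out += LETTER_RULES.get(ch);            // Cyrillic letter with a rule
    } else if (SIMILAR_SYMBOLS.has(ch)) {
      out += SIMILAR_SYMBOLS.get(ch);         // normalize look-alike symbols
    } else if (/[a-z0-9_-]/.test(ch)) {
      out += ch;                              // Latin, digits, dash, underscore pass through
    } else if (/\s/.test(ch) && out !== "" && !out.endsWith("-")) {
      out += "-";                             // whitespace becomes a single dash
    }
    // Any other character is dropped in this sketch.
  }
  return out.replace(/-+$/, "");              // no trailing dash
}

console.log(toSlug("Нова карта v2"));
```

In this sketch the choice to turn whitespace into a dash and to drop unknown characters is arbitrary; a configurable setting would be a natural place for both decisions.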
Every transformation rule is explicit and described in a so-called Ruleset.
A Ruleset is a compilation of rules that explains the transliteration behavior of the script. It can be compact and descriptive at the same time, depending on needs.
The result of Ruleset compilation is a Dictionary, which is used for pre-processing analysis and later transliteration.
Three types of Rules can be used:
Rule Type | Description |
---|---|
L | Describes a rule for a letter that should be rendered in a Latin manner |
S | Any special symbol that should be kept as-is or transformed / corrected |
R | Common sets of characters (like Latin letters or digits) that follow one after another and should be labeled in the same way |
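
The sketch below shows one way the three Rule types could be written down and compiled into a Dictionary. The object shape (`type`, `from`, `to`, `first`, `last`, `label`), the reading of `R` as a contiguous character range, and the functions `compileRuleset` and `transliterate` are all assumptions made for illustration; the actual Ruleset format may look different.

```javascript
// Hypothetical Ruleset; field names are assumptions, not the library's schema.
const ruleset = [
  { type: "L", from: "м", to: "m" },
  { type: "L", from: "о", to: "o" },
  { type: "L", from: "в", to: "v" },
  { type: "L", from: "а", to: "a" },
  { type: "S", from: "’", to: "'" },                      // corrected symbol
  { type: "S", from: "-", to: "-" },                      // symbol kept as-is
  { type: "R", first: "a", last: "z", label: "latin" },   // same label for the whole set
  { type: "R", first: "0", last: "9", label: "digit" },
];

// Compile the Ruleset into a Dictionary: one entry per character.
function compileRuleset(rules) {
  const dictionary = new Map();
  for (const rule of rules) {
    if (rule.type === "R") {
      const start = rule.first.codePointAt(0);
      const end = rule.last.codePointAt(0);
      for (let cp = start; cp <= end; cp++) {
        const ch = String.fromCodePoint(cp);
        dictionary.set(ch, { type: "R", label: rule.label, output: ch });
      }
    } else {
      dictionary.set(rule.from, { type: rule.type, output: rule.to });
    }
  }
  return dictionary;
}

// Look up every character; characters without an entry are left untouched.
function transliterate(text, dictionary) {
  let out = "";
  for (const ch of text) {
    const entry = dictionary.get(ch);
    out += entry ? entry.output : ch;
  }
  return out;
}

const dictionary = compileRuleset(ruleset);
console.log(transliterate("мова-2’x", dictionary));
```

Expanding `R` rules at compile time keeps the lookup a single `Map.get` per character, at the cost of a larger Dictionary; this trade-off is a design choice of the sketch only.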