Use v to represent ü by default & add passportName method #191
overtrue committed Apr 27, 2023
1 parent 1c5e073 commit 4f2a6b7
Showing 9 changed files with 133 additions and 120 deletions.
49 changes: 0 additions & 49 deletions .php-cs-fixer.dist.php

This file was deleted.

48 changes: 40 additions & 8 deletions README.md
@@ -66,11 +66,11 @@ json_encode($pinyin); // '["nǐ","hǎo","shì","jiè"]'
use Overtrue\Pinyin\Pinyin;

echo Pinyin::sentence('带着希望去旅行,比到达终点更美好');
// dài zhe xī wàng qù lyu xíng , bǐ dào dá zhōng diǎn gèng měi hǎo
// dài zhe xī wàng qù lǚ xíng , bǐ dào dá zhōng diǎn gèng měi hǎo

// Remove tone marks
echo Pinyin::sentence('带着希望去旅行,比到达终点更美好', 'none');
// dai zhe xi wang qu lyu xing , bi dao da zhong dian geng mei hao
// dai zhe xi wang qu lv xing , bi dao da zhong dian geng mei hao
```

### Generating pinyin strings for links
@@ -87,9 +87,8 @@ echo Pinyin::permalink('带着希望去旅行', '.'); // dai.zhe.xi.wang.qu.lyu.
Commonly used to build search indexes; use the `abbr` method to convert:

```php
echo Pinyin::abbr('带着希望去旅行'); // d z x w q l x
Pinyin::abbr('带着希望去旅行'); // ['d', 'z', 'x', 'w', 'q', 'l', 'x']
echo Pinyin::abbr('带着希望去旅行')->join('-'); // d-z-x-w-q-l-x

echo Pinyin::abbr('你好2018!')->join(''); // nh2018
echo Pinyin::abbr('Happy New Year! 2018!')->join(''); // HNY2018
```
@@ -99,7 +98,7 @@ echo Pinyin::abbr('Happy New Year! 2018!')->join(''); // HNY2018
The first character is converted as a surname, and the rest as ordinary words:

```php
echo Pinyin::nameAbbr('欧阳'); // o y
Pinyin::nameAbbr('欧阳'); // ['o', 'y']
echo Pinyin::nameAbbr('单单单')->join('-'); // s-d-d
```

@@ -109,11 +108,21 @@ echo Pinyin::nameAbbr('单单单')->join('-'); // s-d-d
Some surnames are read differently from the same character in ordinary use. For example, '单' is usually read `dan`, but as a surname it is read `shan`.

```php
echo Pinyin::name('单某某'); // shàn mǒu mǒu
echo Pinyin::name('单某某', 'none'); // shan mou mou
echo Pinyin::name('单某某', 'none')->join('-'); // shan-mou-mou
Pinyin::name('单某某'); // ['shàn', 'mǒu', 'mǒu']
Pinyin::name('单某某', 'none'); // ['shan', 'mou', 'mou']
Pinyin::name('单某某', 'none')->join('-'); // shan-mou-mou
```

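### Passport name conversion

Following the national rule [Reminder that the pinyin ü in names on Chinese passports and travel documents (吕, 律, 闾, 绿, 女, etc.) is uniformly spelled YU](http://sg.china-embassy.gov.cn/lsfw/zghz1/hzzxdt/201501/t20150122_2022198.htm), `ü` is converted to `yu`: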

```php
Pinyin::passportName('吕小布'); // ['lyu', 'xiao', 'bu']
Pinyin::passportName('女小花'); // ['nyu', 'xiao', 'hua']
Pinyin::passportName('律师'); // ['lyu', 'shi']
```

### Polyphonic characters

For polyphonic characters, the return value is a collection of associative arrays:
@@ -154,6 +163,29 @@ $pinyin->toArray();
For more usage, see the [test cases](https://github.com/overtrue/pinyin/blob/master/tests/PinyinTest.php)

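## The nv/lv/lyu/lǚ question

According to the rules of the State Language Commission, `lv`, `lyu`, and `lǚ` are all correct, but `lv` is the most widely used, so `lv` is the default. If you need one of the others, you can choose it when initializing: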

```php
echo Pinyin::sentence('旅行');
// lǚ xíng

echo Pinyin::sentence('旅行', 'none');
// lv xing

echo Pinyin::yuToYu()->sentence('旅行', 'none');
// lyu xing

echo Pinyin::yuToU()->sentence('旅行', 'none');
// lu xing

echo Pinyin::yuToV()->sentence('旅行', 'none');
// lv xing
```

> {Warning} Only effective when the pinyin tone style is not `none`.
## Command-line tool

You can convert pinyin from the command line:
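
As a rough illustration of the three spellings shown in the examples above, the following standalone sketch replaces `ü` in a toneless syllable. The function `spell_umlaut` and its `$yuTo` parameter are hypothetical names made up for this example. It does not call the library and is not how `Converter` is implemented.

```php
<?php
// Hypothetical helper for illustration only; not part of overtrue/pinyin.
function spell_umlaut(string $syllable, string $yuTo = 'v'): string
{
    // Only the three targets named in the README section are accepted.
    if (!in_array($yuTo, ['v', 'yu', 'u'], true)) {
        throw new InvalidArgumentException("Unsupported spelling: $yuTo");
    }

    // Syllables without ü pass through unchanged.
    return str_replace('ü', $yuTo, $syllable);
}

foreach (['v', 'yu', 'u'] as $yuTo) {
    echo $yuTo, ' => ', spell_umlaut('lü', $yuTo), ' ', spell_umlaut('xing', $yuTo), PHP_EOL;
}
```

Passing `'yu'` gives the passport-style spelling (`lyu`, `nyu`), while the default `'v'` mirrors the new default introduced by this commit.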
7 changes: 4 additions & 3 deletions bin/utils.php
@@ -5,6 +5,7 @@
* <pre>
* // U+4E2D: zhōng,zhòng # 中
* </pre>
*
* @throws Exception
*/
function parse_chars(string $path, callable $fn = null): Generator
@@ -14,9 +15,9 @@ function parse_chars(string $path, callable $fn = null): Generator
foreach (file($path) as $line) {
preg_match('/^U\+(?<code>[0-9A-Z]+):\s+(?<pinyin>\S+)\s+#\s*(?<char>\S+)/', $line, $matched);

if ($matched && !empty($matched['pinyin'])) {
if ($matched && ! empty($matched['pinyin'])) {
yield $matched['char'] => $fn(explode(',', $matched['pinyin']));
} elseif (!str_starts_with($line, '#')) {
} elseif (! str_starts_with($line, '#')) {
throw new Exception("行解析错误:$line");
}
}
@@ -37,7 +38,7 @@ function parse_words(string $path, callable $fn = null): Generator
foreach (file($path) as $line) {
preg_match('/^(?<word>[^#\s]+):\s+(?<pinyin>[\p{L} ]+)#?/u', $line, $matched);

if ($matched && !empty($matched['pinyin'])) {
if ($matched && ! empty($matched['pinyin'])) {
yield $matched['word'] => $fn(explode(' ', trim($matched['pinyin'])));
}
}
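
To see what the pattern in `parse_chars` extracts, here is a minimal sketch that applies the same regular expression to a single line. `parse_char_line` is a hypothetical helper written for this illustration. The real function iterates over a file and passes the readings through a callback.

```php
<?php
// Hypothetical single-line variant of parse_chars(); illustration only.
function parse_char_line(string $line): ?array
{
    // Same pattern as in bin/utils.php: code point, comma-separated readings, character.
    $pattern = '/^U\+(?<code>[0-9A-Z]+):\s+(?<pinyin>\S+)\s+#\s*(?<char>\S+)/';

    if (preg_match($pattern, $line, $matched) !== 1) {
        return null; // not a data line (for example a comment)
    }

    return [
        'code' => $matched['code'],
        'char' => $matched['char'],
        'pinyin' => explode(',', $matched['pinyin']),
    ];
}

$parsed = parse_char_line('U+4E2D: zhōng,zhòng  # 中');
print_r($parsed);
```

Returning `null` for non-matching lines is a choice made for this sketch. The function in the diff instead throws an exception unless the line starts with `#`.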
16 changes: 8 additions & 8 deletions composer.json
@@ -32,17 +32,18 @@
"phpunit/phpunit": "^9.5",
"brainmaestro/composer-git-hooks": "^2.7",
"friendsofphp/php-cs-fixer": "^3.2",
"nunomaduro/termwind": "^1.13"
"nunomaduro/termwind": "^1.13",
"laravel/pint": "^1.10"
},
"extra": {
"hooks": {
"pre-commit": [
"composer test",
"composer fix-style"
"composer pint",
"composer test"
],
"pre-push": [
"composer test",
"composer check-style"
"composer pint",
"composer test"
]
}
},
@@ -56,14 +57,13 @@
"cghooks update"
],
"cghooks": "vendor/bin/cghooks",
"check-style": "php-cs-fixer fix --using-cache=no --diff --dry-run --ansi",
"fix-style": "php-cs-fixer fix --using-cache=no --ansi",
"pint": "vendor/bin/pint",
"fix-style": "vendor/bin/pint ./src ./tests",
"test": "vendor/bin/phpunit --colors=always",
"build": "php ./bin/build"
},
"scripts-descriptions": {
"test": "Run all tests.",
"check-style": "Run style checks (only dry run - no fixing!).",
"fix-style": "Run style checks and fix violations."
}
}
41 changes: 28 additions & 13 deletions src/Converter.php
@@ -5,25 +5,33 @@
class Converter
{
private const SEGMENTS_COUNT = 10;

private const WORDS_PATH = __DIR__.'/../data/words-%s.php';

private const CHARS_PATH = __DIR__.'/../data/chars.php';

private const SURNAMES_PATH = __DIR__.'/../data/surnames.php';

public const TONE_STYLE_SYMBOL = 'symbol';

public const TONE_STYLE_NUMBER = 'number';

public const TONE_STYLE_NONE = 'none';

protected bool $polyphonic = false;

protected bool $asSurname = false;

protected bool $noWords = false;

protected string $yuTo = 'yu';
protected string $yuTo = 'v';

protected string $toneStyle = self::TONE_STYLE_SYMBOL;

protected array $regexps = [
'separator' => '\p{Z}',
'mark' => '\p{M}',
'tab' => "\t"
'tab' => "\t",
];

public const REGEXPS = [
@@ -116,6 +124,13 @@ public function useNumberTone(): static
return $this;
}

public function yuToYu(): static
{
$this->yuTo = 'yu';

return $this;
}

public function yuToV(): static
{
$this->yuTo = 'v';
@@ -143,7 +158,7 @@ public function convert(string $string, callable $beforeSplit = null): Collectio
{
// Separate existing digits from the Chinese characters so they are not mistaken for tone numbers during conversion
$string = preg_replace_callback('~[a-z0-9_-]+~i', function ($matches) {
return "\t" . $matches[0];
return "\t".$matches[0];
}, $string);

// Filter out characters that are not to be kept
@@ -198,7 +213,7 @@ protected function convertSurname(string $name): string

foreach ($surnames as $surname => $pinyin) {
if (\str_starts_with($name, $surname)) {
return $pinyin . \mb_substr($name, \mb_strlen($surname));
return $pinyin.\mb_substr($name, \mb_strlen($surname));
}
}

@@ -218,24 +233,24 @@ protected function split(string $item): Collection

protected function formatTone(string $pinyin, string $style): string
{
if ($style === self::TONE_STYLE_SYMBOL) {
return $pinyin;
}

$replacements = [
'üē' => ['ue', 1], 'üé' => ['ue', 2], 'üě' => ['ue', 3], 'üè' => ['ue', 4],
'ā' => ['a', 1], 'ē' => ['e', 1], 'ī' => ['i', 1], 'ō' => ['o', 1], 'ū' => ['u', 1], 'ǖ' => ['yu', 1],
'á' => ['a', 2], 'é' => ['e', 2], 'í' => ['i', 2], 'ó' => ['o', 2], 'ú' => ['u', 2], 'ǘ' => ['yu', 2],
'ǎ' => ['a', 3], 'ě' => ['e', 3], 'ǐ' => ['i', 3], 'ǒ' => ['o', 3], 'ǔ' => ['u', 3], 'ǚ' => ['yu', 3],
'à' => ['a', 4], 'è' => ['e', 4], 'ì' => ['i', 4], 'ò' => ['o', 4], 'ù' => ['u', 4], 'ǜ' => ['yu', 4],
'ā' => ['a', 1], 'ē' => ['e', 1], 'ī' => ['i', 1], 'ō' => ['o', 1], 'ū' => ['u', 1], 'ǖ' => ['v', 1],
'á' => ['a', 2], 'é' => ['e', 2], 'í' => ['i', 2], 'ó' => ['o', 2], 'ú' => ['u', 2], 'ǘ' => ['v', 2],
'ǎ' => ['a', 3], 'ě' => ['e', 3], 'ǐ' => ['i', 3], 'ǒ' => ['o', 3], 'ǔ' => ['u', 3], 'ǚ' => ['v', 3],
'à' => ['a', 4], 'è' => ['e', 4], 'ì' => ['i', 4], 'ò' => ['o', 4], 'ù' => ['u', 4], 'ǜ' => ['v', 4],
];

foreach ($replacements as $unicode => $replacement) {
if (\str_contains($pinyin, $unicode)) {
$umlaut = $replacement[0];

if ($umlaut !== 'yu' && $style === self::TONE_STYLE_SYMBOL) {
continue;
}

// https://zh.wikipedia.org/wiki/%C3%9C
if ($this->yuTo !== 'yu') {
if ($this->yuTo !== 'v' && $umlaut === 'v') {
$umlaut = $this->yuTo;
}

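
The effect of the `formatTone` change can be sketched in isolation. The function below is a simplified, hypothetical restatement: it takes the early return for the `symbol` style and the `v` placeholder from the hunk above, but the final replacement and tone-number step lies outside the shown lines, so that part is an assumption. Only a handful of vowels are listed to keep it short.

```php
<?php
// Simplified, hypothetical restatement of the formatTone() logic; not the library code.
function format_tone_sketch(string $pinyin, string $style, string $yuTo = 'v'): string
{
    if ($style === 'symbol') {
        return $pinyin; // tone marks, including ǚ, are left untouched
    }

    // Marked vowel => [plain letter, tone number]; 'v' stands in for ü.
    $replacements = [
        'ǖ' => ['v', 1], 'ǘ' => ['v', 2], 'ǚ' => ['v', 3], 'ǜ' => ['v', 4],
        'ā' => ['a', 1], 'í' => ['i', 2], 'ǎ' => ['a', 3], 'ù' => ['u', 4],
    ];

    foreach ($replacements as $unicode => [$letter, $tone]) {
        if (!str_contains($pinyin, $unicode)) {
            continue;
        }

        // Swap the 'v' placeholder for the configured spelling (yu or u).
        if ($letter === 'v' && $yuTo !== 'v') {
            $letter = $yuTo;
        }

        // Assumed step: replace the marked vowel and append the tone digit in number style.
        $pinyin = str_replace($unicode, $letter, $pinyin);
        if ($style === 'number') {
            $pinyin .= $tone;
        }
    }

    return $pinyin;
}

echo format_tone_sketch('lǚ', 'symbol'), PHP_EOL;
echo format_tone_sketch('lǚ', 'none', 'yu'), PHP_EOL;
```

In this sketch the `$yuTo` choice only becomes visible for the `none` and `number` styles, because the `symbol` style returns before any replacement happens.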
7 changes: 6 additions & 1 deletion src/Pinyin.php
@@ -26,6 +26,11 @@ public static function name(string $name, string $toneStyle = Converter::TONE_ST
return self::surname()->withToneStyle($toneStyle)->convert($name);
}

public static function passportName(string $name, string $toneStyle = Converter::TONE_STYLE_NONE): Collection
{
return self::surname()->yuToYu()->withToneStyle($toneStyle)->convert($name);
}

public static function phrase(string $string, string $toneStyle = Converter::TONE_STYLE_SYMBOL): Collection
{
return self::noPunctuation()->withToneStyle($toneStyle)->convert($string);
@@ -48,7 +53,7 @@ public static function chars(string $string, string $toneStyle = Converter::TONE

public static function permalink(string $string, string $delimiter = '-'): string
{
if (!in_array($delimiter, ['_', '-', '.', ''], true)) {
if (! in_array($delimiter, ['_', '-', '.', ''], true)) {
throw new InvalidArgumentException("Delimiter must be one of: '_', '-', '', '.'.");
}

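
The new `passportName` method is a chain of fluent setters ending in `convert`. The chaining itself can be sketched with a stand-in class. `SketchConverter` and its `settings()` method are hypothetical and model only the configuration state, not the conversion.

```php
<?php
// Hypothetical stand-in for Converter; only the fluent configuration is modelled.
class SketchConverter
{
    private string $yuTo = 'v';           // default spelling after this commit
    private string $toneStyle = 'symbol'; // default tone style

    public function yuToYu(): static
    {
        $this->yuTo = 'yu';

        return $this; // returning the instance is what allows chaining
    }

    public function withToneStyle(string $toneStyle): static
    {
        $this->toneStyle = $toneStyle;

        return $this;
    }

    public function settings(): array
    {
        return ['yuTo' => $this->yuTo, 'toneStyle' => $this->toneStyle];
    }
}

// Mirrors the order of calls in passportName(): yuToYu(), then withToneStyle('none').
$settings = (new SketchConverter())->yuToYu()->withToneStyle('none')->settings();
print_r($settings);
```

Because each setter returns the same instance, a preset such as `passportName` can be expressed as one expression without temporary variables.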
23 changes: 12 additions & 11 deletions tests/ConverterTest.php
@@ -39,26 +39,26 @@ public function test_chars()

public function test_onlyHans()
{
$this->assertPinyin('dài zhe xī wàng qù lyu xíng , bǐ dào dá zhōng diǎn gèng měi hǎo', Converter::make()->convert('带着希望去旅行,比到达终点更美好'));
$this->assertPinyin('dài zhe xī wàng qù lyu xíng bǐ dào dá zhōng diǎn gèng měi hǎo', Converter::make()->onlyHans()->convert('带着希望去旅行,比到达终点更美好'));
$this->assertPinyin('dài zhe xī wàng qù lǚ xíng , bǐ dào dá zhōng diǎn gèng měi hǎo', Converter::make()->convert('带着希望去旅行,比到达终点更美好'));
$this->assertPinyin('dài zhe xī wàng qù lǚ xíng bǐ dào dá zhōng diǎn gèng měi hǎo', Converter::make()->onlyHans()->convert('带着希望去旅行,比到达终点更美好'));
}

public function test_noAlpha()
{
$this->assertPinyin('abc dài zhe xī def wàng qù lyu xíng jkl', Converter::make()->convert('abc带着希def望去旅行jkl'));
$this->assertPinyin('dài zhe xī wàng qù lyu xíng', Converter::make()->noAlpha()->convert('abc带着希def望去旅行jkl'));
$this->assertPinyin('abc dài zhe xī def wàng qù lǚ xíng jkl', Converter::make()->convert('abc带着希def望去旅行jkl'));
$this->assertPinyin('dài zhe xī wàng qù lǚ xíng', Converter::make()->noAlpha()->convert('abc带着希def望去旅行jkl'));
}

public function test_noNumber()
{
$this->assertPinyin('123 dài zhe xī 456 wàng qù lyu xíng 789', Converter::make()->convert('123带着希456望去旅行789'));
$this->assertPinyin('dài zhe xī wàng qù lyu xíng', Converter::make()->noNumber()->convert('123带着希456望去旅行789'));
$this->assertPinyin('123 dài zhe xī 456 wàng qù lǚ xíng 789', Converter::make()->convert('123带着希456望去旅行789'));
$this->assertPinyin('dài zhe xī wàng qù lǚ xíng', Converter::make()->noNumber()->convert('123带着希456望去旅行789'));
}

public function test_noPunctuation()
{
$this->assertPinyin('123 dài , zhe " xī wàng " qù lyu xíng 789?', Converter::make()->convert('123带,着"希望"去旅行789?'));
$this->assertPinyin('123 dài zhe xī 456 wàng qù lyu xíng 789', Converter::make()->noPunctuation()->convert('123带着希456望去旅行789'));
$this->assertPinyin('123 dài , zhe " xī wàng " qù lǚ xíng 789?', Converter::make()->convert('123带,着"希望"去旅行789?'));
$this->assertPinyin('123 dài zhe xī 456 wàng qù lǚ xíng 789', Converter::make()->noPunctuation()->convert('123带着希456望去旅行789'));
}

public function test_tone_style()
@@ -74,9 +74,10 @@ public function test_tone_style()

public function test_yu()
{
$this->assertPinyin('lyu xíng', Converter::make()->convert('旅行'));
$this->assertPinyin('lv xíng', Converter::make()->yuToV()->convert('旅行'));
$this->assertPinyin('lu xíng', Converter::make()->yuToU()->convert('旅行'));
$this->assertPinyin('lǚ xíng', Converter::make()->convert('旅行'));
$this->assertPinyin('lyu xing', Converter::make()->yuToYu()->noTone()->convert('旅行'));
$this->assertPinyin('lv xing', Converter::make()->yuToV()->noTone()->convert('旅行'));
$this->assertPinyin('lu xing', Converter::make()->yuToU()->noTone()->convert('旅行'));
}

public function test_when()