Commit 4d383f4

Add TrimFormatter for configurable string edge trimming
Allows precise control over trimming operations with support for left, right, or both sides and custom character masks, using UTF-8-aware regex operations for proper international text handling.

The formatter automatically escapes special regex characters in the custom mask and handles complex multi-byte characters including CJK spaces, emoji, and combining diacritics, which are essential for global applications.

Includes comprehensive tests covering all trim modes, custom masks, Unicode characters (CJK, emoji), special characters, multi-byte strings, and edge cases like empty strings and strings shorter than the mask.

Assisted-by: OpenCode (GLM-4.7)
1 parent 801a389 commit 4d383f4

6 files changed: +476 -8 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ echo f::create()
 | [PatternFormatter](docs/PatternFormatter.md) | Pattern-based string filtering with placeholders |
 | [PlaceholderFormatter](docs/PlaceholderFormatter.md) | Template interpolation with placeholder replacement |
 | [TimeFormatter](docs/TimeFormatter.md) | Time promotion (mil, c, dec, y, mo, w, d, h, min, s, ms, us, ns) |
+| [TrimFormatter](docs/TrimFormatter.md) | Remove whitespace from string edges |

 ## Contributing

docs/TrimFormatter.md

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
+<!--
+SPDX-FileCopyrightText: (c) Respect Project Contributors
+SPDX-License-Identifier: ISC
+SPDX-FileContributor: Henrique Moody <henriquemoody@gmail.com>
+-->
+
+# TrimFormatter
+
+The `TrimFormatter` removes characters from the edges of strings with configurable masking and side selection, fully supporting UTF-8 Unicode characters.
+
+## Usage
+
+### Basic Usage
+
+```php
+use Respect\StringFormatter\TrimFormatter;
+
+$formatter = new TrimFormatter();
+
+echo $formatter->format(' hello world ');
+// Outputs: "hello world"
+```
+
+### Trim Specific Side
+
+```php
+use Respect\StringFormatter\TrimFormatter;
+
+$formatter = new TrimFormatter('left', ' ');
+
+echo $formatter->format(' hello ');
+// Outputs: "hello "
+
+$formatterRight = new TrimFormatter('right', ' ');
+
+echo $formatterRight->format(' hello ');
+// Outputs: " hello"
+```
+
+### Custom Mask
+
+```php
+use Respect\StringFormatter\TrimFormatter;
+
+$formatter = new TrimFormatter('both', '-._');
+
+echo $formatter->format('---hello---');
+// Outputs: "hello"
+
+echo $formatter->format('._hello_._');
+// Outputs: "hello"
+```
+
+### Unicode Characters
+
+```php
+use Respect\StringFormatter\TrimFormatter;
+
+// Trim CJK full-width spaces (U+3000)
+$formatter = new TrimFormatter('both', "\u{3000}");
+
+echo $formatter->format("\u{3000}hello世界\u{3000}");
+// Outputs: "hello世界"
+
+// Trim emoji
+$formatterEmoji = new TrimFormatter('both', '😊');
+
+echo $formatterEmoji->format('😊hello😊');
+// Outputs: "hello"
+```
+
+## API
+
+### `TrimFormatter::__construct`
+
+- `__construct(string $side = "both", string $mask = " \t\n\r\0\x0B")`
+
+Creates a new trim formatter instance.
+
+**Parameters:**
+
+- `$side`: Which side(s) to trim: "left", "right", or "both" (default: "both")
+- `$mask`: The characters to trim from the string edges (default: whitespace characters)
+
+**Throws:** `InvalidFormatterException` when `$side` is not "left", "right", or "both"
+
+### `format`
+
+- `format(string $input): string`
+
+Removes characters from the specified side(s) of the input string.
+
+**Parameters:**
+
+- `$input`: The string to trim
+
+**Returns:** The trimmed string
+
+## Examples
+
+| Configuration            | Input             | Output       | Description                     |
+| ------------------------ | ----------------- | ------------ | ------------------------------- |
+| default                  | `" hello "`       | `"hello"`    | Trim spaces from both sides     |
+| `"left"`                 | `" hello "`       | `"hello "`   | Trim spaces from left only      |
+| `"right"`                | `" hello "`       | `" hello"`   | Trim spaces from right only     |
+| `"-"`                    | `"---hello---"`   | `"hello"`    | Trim hyphens from both sides    |
+| `"-._"`                  | `"-._hello_.-"`   | `"hello"`    | Trim multiple custom characters |
+| `":"` (`"left"`)         | `":::hello:::"`   | `"hello:::"` | Trim colons from left only      |
+| `"\u{3000}"` (CJK space) | `"\u{3000}hello"` | `"hello"`    | Trim CJK full-width space       |
+| `"😊"`                   | `"😊hello😊"`     | `"hello"`    | Trim emoji                      |
+
+## Notes
+
+- Fully UTF-8 aware - handles all Unicode scripts including CJK, emoji, and complex characters
+- Special regex characters in the mask (e.g., `.`, `*`, `?`, `+`) are automatically escaped
+- Empty strings return empty strings
+- If the mask is empty or none of its characters appear at the edges of the input, the string is returned unchanged
+- Trimming operations are character-oriented, not byte-oriented
+- Matching works per Unicode code point: a base character and its combining marks are separate code points, so each is trimmed only if it appears in the mask
+
+### Default Mask
+
+The default mask includes standard whitespace characters:
+
+- ` `: ASCII SP character 0x20, an ordinary space.
+- `\t`: ASCII HT character 0x09, a tab.
+- `\n`: ASCII LF character 0x0A, a new line (line feed).
+- `\r`: ASCII CR character 0x0D, a carriage return.
+- `\0`: ASCII NUL character 0x00, the NUL-byte.
+- `\x0B`: ASCII VT character 0x0B, a vertical tab.
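
The claim that trimming is character-oriented rather than byte-oriented can be illustrated by contrasting PHP's built-in `trim()` with a character class under the `/u` modifier. The snippet below is a minimal sketch and not part of the commit; the variable names are illustrative, and the pattern is assembled in the same way as the "both" branch of `format()`.

```php
<?php

declare(strict_types=1);

// "ä" is U+00E4 (bytes C3 A4) and "ö" is U+00F6 (bytes C3 B6): both share the lead byte C3.
$input = "\u{E4}b";
$mask = "\u{F6}";

// Built-in trim() treats the mask as a set of bytes, so it strips the shared
// C3 byte and leaves a dangling continuation byte (A4) behind.
$byteTrimmed = trim($input, $mask);

// A character class under the /u modifier compares whole code points instead,
// so "ä" is left alone when only "ö" is in the mask.
$class = '[' . preg_quote($mask, '/') . ']+';
$codePointTrimmed = preg_replace('/^' . $class . '|' . $class . '$/u', '', $input) ?? $input;

var_dump($byteTrimmed === "\xA4b");      // bool(true): the string is no longer valid UTF-8
var_dump($codePointTrimmed === $input);  // bool(true): the string is unchanged
```

This is the main reason to prefer a regex-based approach over `trim()` when the mask may contain multi-byte characters.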

src/Mixin/Builder.php

Lines changed: 7 additions & 4 deletions
@@ -18,30 +18,33 @@ interface Builder
 {
     public function area(string $unit): FormatterBuilder;

+    public function date(string $format = 'Y-m-d H:i:s'): FormatterBuilder;
+
     public function imperialArea(string $unit): FormatterBuilder;

     public function imperialLength(string $unit): FormatterBuilder;

     public function imperialMass(string $unit): FormatterBuilder;

-    public function date(string $format = 'Y-m-d H:i:s'): FormatterBuilder;
-
     public function mask(string $range, string $replacement = '*'): FormatterBuilder;

     public function metric(string $unit): FormatterBuilder;

+    public function metricMass(string $unit): FormatterBuilder;
+
     public function number(
         int $decimals = 0,
         string $decimalSeparator = '.',
         string $thousandsSeparator = ',',
     ): FormatterBuilder;

-    public function metricMass(string $unit): FormatterBuilder;
-
     public function pattern(string $pattern): FormatterBuilder;

     /** @param array<string, mixed> $parameters */
     public function placeholder(array $parameters): FormatterBuilder;

     public function time(string $unit): FormatterBuilder;
+
+    /** @param 'both'|'left'|'right' $side */
+    public function trim(string $side = 'both', string $mask = " \t\n\r\0\x0B"): FormatterBuilder;
 }

src/Mixin/Chain.php

Lines changed: 7 additions & 4 deletions
@@ -18,30 +18,33 @@ interface Chain extends Formatter
 {
     public function area(string $unit): FormatterBuilder;

+    public function date(string $format = 'Y-m-d H:i:s'): FormatterBuilder;
+
     public function imperialArea(string $unit): FormatterBuilder;

     public function imperialLength(string $unit): FormatterBuilder;

     public function imperialMass(string $unit): FormatterBuilder;

-    public function date(string $format = 'Y-m-d H:i:s'): FormatterBuilder;
-
     public function mask(string $range, string $replacement = '*'): FormatterBuilder;

     public function metric(string $unit): FormatterBuilder;

+    public function metricMass(string $unit): FormatterBuilder;
+
     public function number(
         int $decimals = 0,
         string $decimalSeparator = '.',
         string $thousandsSeparator = ',',
     ): FormatterBuilder;

-    public function metricMass(string $unit): FormatterBuilder;
-
     public function pattern(string $pattern): FormatterBuilder;

     /** @param array<string, mixed> $parameters */
     public function placeholder(array $parameters): FormatterBuilder;

     public function time(string $unit): FormatterBuilder;
+
+    /** @param 'both'|'left'|'right' $side */
+    public function trim(string $side = 'both', string $mask = " \t\n\r\0\x0B"): FormatterBuilder;
 }

src/TrimFormatter.php

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+<?php
+
+/*
+ * SPDX-FileCopyrightText: (c) Respect Project Contributors
+ * SPDX-License-Identifier: ISC
+ * SPDX-FileContributor: Henrique Moody <henriquemoody@gmail.com>
+ */
+
+declare(strict_types=1);
+
+namespace Respect\StringFormatter;
+
+use function in_array;
+use function preg_quote;
+use function preg_replace;
+use function sprintf;
+
+final readonly class TrimFormatter implements Formatter
+{
+    /** @param 'both'|'left'|'right' $side */
+    public function __construct(
+        private string $side = 'both',
+        private string $mask = " \t\n\r\0\x0B",
+    ) {
+        if (!in_array($this->side, ['left', 'right', 'both'], true)) {
+            throw new InvalidFormatterException(
+                sprintf('Invalid side "%s". Must be "left", "right", or "both".', $this->side),
+            );
+        }
+    }
+
+    public function format(string $input): string
+    {
+        $pattern = preg_quote($this->mask, '/');
+
+        // phpcs:disable Squiz.Strings.DoubleQuoteUsage.ContainsVar
+        return match ($this->side) {
+            'left' => preg_replace("/^[{$pattern}]+/u", '', $input) ?? $input,
+            'right' => preg_replace("/[{$pattern}]+$/u", '', $input) ?? $input,
+            default => preg_replace("/^[{$pattern}]+|[{$pattern}]+$/u", '', $input) ?? $input,
+        };
+    }
+}
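
For experimenting with the trimming logic outside the library, the following is a minimal standalone sketch of the same approach. The function name `trimEdges` is hypothetical, and side validation is omitted. It also deviates from the commit in two ways, both assumptions of this sketch rather than the author's code: an early return for an empty mask (which would otherwise produce `[]`, a sequence PCRE does not read as an empty class), and the `D` modifier so that `$` matches only at the very end of the string instead of also before a final newline.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical standalone helper that mirrors the regex approach of
 * TrimFormatter::format(). Not part of the commit.
 */
function trimEdges(string $input, string $side = 'both', string $mask = " \t\n\r\0\x0B"): string
{
    // Assumption of this sketch: treat an empty mask as "nothing to trim".
    if ($mask === '') {
        return $input;
    }

    // preg_quote() escapes regex metacharacters (including "-") and the "/" delimiter.
    $class = '[' . preg_quote($mask, '/') . ']+';

    // "u" enables UTF-8 mode; "D" (an addition in this sketch) makes "$"
    // match only at the very end of the subject.
    $pattern = match ($side) {
        'left' => '/^' . $class . '/uD',
        'right' => '/' . $class . '$/uD',
        default => '/^' . $class . '|' . $class . '$/uD',
    };

    // preg_replace() returns null on failure (for example, invalid UTF-8 input),
    // in which case the input is returned untouched.
    return preg_replace($pattern, '', $input) ?? $input;
}

echo trimEdges(':::hello:::', 'left', ':'), PHP_EOL;   // hello:::
echo trimEdges('-._hello_.-', 'both', '-._'), PHP_EOL; // hello
```

Without the `D` modifier, right-trimming `"hello--\n"` with the mask `"-"` would remove the hyphens and keep the newline, because `$` also matches just before a trailing newline.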
