Revert "Test width = sum(grapheme cluster widths)"
This reverts commit a7a1056.
Jules-Bertholet committed May 13, 2024
1 parent 6edfc60 commit ded852c
Showing 4 changed files with 5 additions and 1,291 deletions.
3 changes: 1 addition & 2 deletions scripts/unicode.py
@@ -754,9 +754,8 @@ def main(module_path: str):
 {EffectiveWidth.NARROW, EffectiveWidth.AMBIGUOUS},
 )

-# Download files for use by tests
+# Download normalization test file for use by tests
 fetch_open("NormalizationTest.txt", "../tests/")
-fetch_open("auxiliary/GraphemeBreakTest.txt", "../tests/")

 print("------------------------")
 total_size = 0
23 changes: 4 additions & 19 deletions src/lib.rs
@@ -30,8 +30,7 @@
 //! # Rules for determining width
 //!
 //! This crate currently uses the following rules to determine the width of a
-//! character or string, in order of decreasing precedence. These may be tweaked in the future;
-//! however see [guarantees](#guarantees) below.
+//! character or string, in order of decreasing precedence. These may be tweaked in the future.
 //!
 //! 1. [Emoji presentation sequences] have width 2.
 //! 2. Outside of an East Asian context, [text presentation sequences] have width 1
@@ -77,16 +76,10 @@
 //!
 //! [Enclosed Ideographic Supplement]: https://unicode.org/charts/PDF/U1F200.pdf
 //!
-//! ## Guarantees
+//! ## Canonical equivalence
 //!
-//! - Any two canonically equivalent strings have the same non-CJK width.
-//! This will not change in any future semver-compatible version.
-//! (This guarantee does not currently hold for the CJK width variants.)
-//! - The width of any string equals the sum of the widths of its [extended grapheme clusters].
-//! This is unlikely to change in any future semver-compatible version.
-//! (This guarantee holds for both CJK and non-CJK width.)
-//!
-//! [extended grapheme clusters]: https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
+//! The non-CJK width methods guarantee that canonically equivalent strings are assigned the same width.
+//! However, this guarantee does not currently hold for the CJK width variants.

 #![forbid(unsafe_code)]
 #![deny(missing_docs)]
@@ -102,14 +95,6 @@ pub use tables::UNICODE_VERSION;
 mod tables;

 /// Methods for determining displayed width of Unicode characters.
-///
-/// **NB:** the width of a string may differ from the sum of the widths of its characters;
-/// see the [crate-level documentation](crate#rules-for-determining-width) for more.
-/// Instead of working with individual characters, consider using [extended grapheme clusters],
-/// perhaps with the [`unicode-segmentation`] crate.
-///
-/// [extended grapheme clusters]: https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
-/// [`unicode-segmentation`]: https://docs.rs/unicode-segmentation/latest/unicode_segmentation/trait.UnicodeSegmentation.html#tymethod.graphemes
 pub trait UnicodeWidthChar {
 /// Returns the character's displayed width in columns, or `None` if the
 /// character is a control character.
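
Note on the guarantee that remains after this revert: canonically equivalent strings are assigned the same non-CJK width. NormalizationTest.txt, which the script still downloads, suits that kind of check because the first three columns of each data line (source, NFC, NFD) are canonically equivalent. The following is a minimal Python sketch of such a check, not the crate's actual test. The names `parse_normalization_line`, `toy_width`, and `canonical_columns_agree` are hypothetical, and `toy_width` is a deliberately crude stand-in that does not implement the crate's width rules.

```python
import unicodedata
from typing import Callable, List

# A sample data line in the NormalizationTest.txt format:
# five ';'-separated columns of hex code points, then a '#' comment.
SAMPLE_LINE = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"


def parse_normalization_line(line: str) -> List[str]:
    """Return the five columns of a data line as strings, or [] for non-data lines."""
    data = line.split("#", 1)[0].strip()
    if not data or data.startswith("@"):  # comment-only lines and "@PartN" headers
        return []
    fields = data.split(";")[:5]
    return ["".join(chr(int(cp, 16)) for cp in field.split()) for field in fields]


def toy_width(s: str) -> int:
    """Hypothetical width function (NOT the crate's algorithm).

    Combining marks count as 0 columns, East Asian Wide/Fullwidth as 2,
    everything else as 1.
    """
    total = 0
    for ch in s:
        if unicodedata.combining(ch) != 0 or unicodedata.category(ch) in ("Mn", "Me"):
            continue
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


def canonical_columns_agree(line: str, width: Callable[[str], int]) -> bool:
    """Check that the source, NFC, and NFD columns of one line get the same width."""
    cols = parse_normalization_line(line)
    if not cols:
        return True  # nothing to check on header or comment lines
    source, nfc, nfd = cols[0], cols[1], cols[2]
    return width(source) == width(nfc) == width(nfd)


if __name__ == "__main__":
    print(canonical_columns_agree(SAMPLE_LINE, toy_width))
```

Passing `len` instead of `toy_width` makes the check fail for the sample line, because the composed form is one code point and the decomposed form is two. That is the kind of discrepancy a canonical-equivalence guarantee rules out.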