The documentation of the `text::cluster::Token` module does not explain what a code unit is. From the example code in the `shape` module, it seems that the `offset` property is the index of the character in the text and `len` is its length when encoded as UTF-8, but is that right?
In my code I don't use UTF-8 strings because I carry extra information alongside the text, so I keep an array of "chars" like this:
(char 'A') (char 'B') (kern -0.5pt) (char '🙃')
I suppose this is three tokens but what values for offset and len should one use?
Should the offset of the third token be 2 (logical index into the characters) or 3 (index into my array)?
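To make the options concrete, here is a small sketch in plain Rust that computes the candidate values for each character in my array. It does not use the library at all: `Item`, `Candidate`, and `candidates` are hypothetical names made up for this illustration, and which of these values `Token` actually expects is exactly what I am asking.

```rust
// Hypothetical stand-in for my array of "chars"; not a library type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Item {
    Char(char),
    Kern(f32), // kerning amount in points
}

// Hypothetical record holding every candidate value for one character.
#[derive(Debug, PartialEq)]
struct Candidate {
    ch: char,
    utf8_offset: u32, // byte offset if the chars were concatenated as UTF-8
    utf8_len: u8,     // number of UTF-8 bytes for this char
    char_index: u32,  // logical index, counting only chars
    array_index: u32, // position in the mixed array, counting kerns too
}

fn candidates(items: &[Item]) -> Vec<Candidate> {
    let mut out = Vec::new();
    let mut utf8_offset = 0u32;
    let mut char_index = 0u32;
    for (array_index, item) in items.iter().enumerate() {
        // Kerns produce no token; they only advance the array index.
        if let Item::Char(ch) = *item {
            let utf8_len = ch.len_utf8() as u8;
            out.push(Candidate {
                ch,
                utf8_offset,
                utf8_len,
                char_index,
                array_index: array_index as u32,
            });
            utf8_offset += utf8_len as u32;
            char_index += 1;
        }
    }
    out
}

fn main() {
    let items = [
        Item::Char('A'),
        Item::Char('B'),
        Item::Kern(-0.5),
        Item::Char('🙃'),
    ];
    for c in candidates(&items) {
        println!("{:?}", c);
    }
}
```

For the third token ('🙃') this gives a UTF-8 offset of 2 with a length of 4, a logical character index of 2, and an array index of 3. The UTF-8 offset and the character index only coincide here because 'A' and 'B' are single-byte characters; they would diverge after any non-ASCII character.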