Handle invalid UTF-8 in source code #393

Porges · 2024-08-05T01:26:43Z

At the moment, if the source code contains invalid UTF-8, the Report panics when it is rendered. This PR changes rendering to properly decode UTF-8 and handle invalid data in the usual way (i.e. each invalid sequence is replaced with �).

This can require an extra allocation when rendering Report, but only for invalid UTF-8, and only for the "context" area, which is usually small (several lines).
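For readers following along, here is a minimal sketch of the allocation behaviour described above. It assumes the decoding behaves like the standard library's `String::from_utf8_lossy`; the actual one-line change is not shown in this thread. The helper `decode_context` and the `0xFF` byte are hypothetical, chosen only for illustration.

```rust
use std::borrow::Cow;

/// Decode a slice of source bytes for display.
/// Hypothetical helper, not the function changed in this PR.
fn decode_context(bytes: &[u8]) -> Cow<'_, str> {
    // Valid UTF-8 is borrowed without copying. Invalid sequences are
    // replaced with U+FFFD, which requires an owned (allocated) String.
    String::from_utf8_lossy(bytes)
}

fn main() {
    // Well-formed input: no allocation, the original bytes are borrowed.
    let good = decode_context(b"well-formed");
    assert!(matches!(good, Cow::Borrowed(_)));

    // Input with one invalid byte (0xFF is an assumption for this sketch):
    // the result is owned and the bad byte becomes the replacement character.
    let bad = decode_context(b"malformed h\xFFXYZ");
    assert!(matches!(bad, Cow::Owned(_)));
    assert_eq!(bad, "malformed h\u{FFFD}XYZ");
    println!("{bad}");
}
```

Because the `Cow` is only owned on the invalid path, well-formed sources would pay nothing extra under this assumption.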

This is simply a one-line change, but there are tests to double-check the behaviour of offsets into invalid source. I added some tests for labels into "good" Unicode source and also into "bad" Unicode source. Both seem to behave as expected.

Porges · 2024-08-05T01:28:27Z

tests/graphical.rs

+  × decoding error
+   ╭────
+ 1 │ malformed h�XYZ
+   ·             ┬


This is correctly pointing at the X, although it is hard to tell from the GitHub preview.
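One way to sanity-check a label position like this is to compare the byte offset in the raw source with the column in the decoded text. The sketch below is an illustration only: `rendered_column` is a hypothetical helper (not part of the PR), the `0xFF` byte is an assumed stand-in for whatever invalid bytes the test actually uses, and columns are counted in `char`s rather than display width.

```rust
/// Column (counted in chars) at which the byte at `byte_offset` appears
/// once the bytes before it have been lossily decoded.
/// Hypothetical helper for illustration; assumes `byte_offset` does not
/// fall in the middle of a valid multi-byte sequence.
fn rendered_column(source: &[u8], byte_offset: usize) -> usize {
    String::from_utf8_lossy(&source[..byte_offset]).chars().count()
}

fn main() {
    // "Bad" source: a single invalid byte becomes a single U+FFFD,
    // so the byte offset and the column of 'X' happen to agree.
    let bad = b"malformed h\xFFXYZ";
    let x_offset = bad.iter().position(|&b| b == b'X').unwrap();
    assert_eq!(x_offset, 12);
    assert_eq!(rendered_column(bad, x_offset), 12);

    // "Good" source: 'é' is two bytes but one char, so the byte offset
    // of 'X' (3) is larger than its column (2).
    let good = "h\u{e9}X".as_bytes();
    let x_offset = good.iter().position(|&b| b == b'X').unwrap();
    assert_eq!(x_offset, 3);
    assert_eq!(rendered_column(good, x_offset), 2);
}
```

In the first case the column is 12, which matches the position of the `┬` under the X in the rendered output above.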

zkat · 2024-08-05T18:15:34Z

what's the deal with clippy. sigh.

waywardmonkeys · 2024-08-06T02:22:13Z

@zkat I've got clippy fixed in PR #395.

zkat · 2024-08-06T17:12:15Z

awesome thanks. You can rebase now.

Porges · 2024-08-08T09:40:36Z

@zkat awesome thanks. You can rebase now.

Fixed the 1.70 build as well. Should be good to go.

Handle invalid unicode in source

a525f52

Porges force-pushed the invalid_unicode branch from 52ac31b to a525f52 Compare August 7, 2024 08:42

Fix build on 1.70

280bd6a

zkat approved these changes Aug 8, 2024

zkat merged commit d6b4558 into zkat:main Aug 8, 2024
15 checks passed

Porges deleted the invalid_unicode branch August 8, 2024 23:25
