Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UTF-8 view optimizations + Regression fix for #275 #276

Merged
merged 13 commits into from
Aug 15, 2024

Conversation

aehlke
Copy link
Collaborator

@aehlke aehlke commented Jul 27, 2024

I haven't benchmarked yet, but in theory this should be faster

Haven't fully tested yet either

Note that this increases the minimum platform versions

@aehlke
Copy link
Collaborator Author

aehlke commented Jul 28, 2024

Room for improvement, but it seems MUCH faster now that it operates on utf8view and utf8 indexes

@@ -1,263 +1,187 @@
//
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops not intentional to remove this part

@aehlke
Copy link
Collaborator Author

aehlke commented Aug 2, 2024

@scinfu pardon for the interruption, I understand you're very busy - I added a commit here that resolves the regressions I introduced in #275 but those fixes could be split out from here as well

@aehlke aehlke changed the title UTF-8 view optimizations UTF-8 view optimizations + Regression fix for #275 Aug 2, 2024
@aehlke
Copy link
Collaborator Author

aehlke commented Aug 2, 2024

Sorry for entangling the improvement from the regression fix, but that can be fixed / the regression changes can be extracted from it

I need to do more profiling again and find more room for improvement, but I believe this change to utf8view is a massive improvement to SwiftSoup speed. So I hope we can get this merged too.

I think the next biggest win would likely be to change all functions from operating on String to operating on a view or slice (I forget the correct type names for this) that reflects string data from the source without needing to ever copy that data or rarely, or even arrays of them as needed, which can be turned into string at a price and at the library user's choice.

But maybe this doesn't matter as much if this utf8 change is complete (not sure it does this yet actually) enough to have String copying be fast now because of copying byte data

@aehlke
Copy link
Collaborator Author

aehlke commented Aug 4, 2024

@scinfu may I suggest you add more maintainers to the project to be able to review/merge issues with less trouble :)

@scinfu
Copy link
Owner

scinfu commented Aug 5, 2024

@aehlke yes, good idea.

@scinfu
Copy link
Owner

scinfu commented Aug 5, 2024

@aehlke Could you please add a test to prevent this bug in the future?

@aehlke
Copy link
Collaborator Author

aehlke commented Aug 15, 2024

@scinfu I added testing for the hang (by adding a timeout waiter) which repros without my fix in this PR. I also fixed the test suite (unrelated broken test)

I will merge once tests pass as a hotfix

@aehlke
Copy link
Collaborator Author

aehlke commented Aug 15, 2024

Ignoring the 5.7 failure, all other tests passed

5.7 failure issue tracked here: #282 It's unrelated, an apparently recent Xcode 15 regression

@aehlke aehlke merged commit 32f5aea into scinfu:master Aug 15, 2024
3 of 4 checks passed
@aehlke
Copy link
Collaborator Author

aehlke commented Aug 15, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants