Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly handle lengths of serialized strings #39

Draft
wants to merge 5 commits into
base: trunk
Choose a base branch
from

Conversation

mjangda
Copy link
Member

@mjangda mjangda commented Mar 7, 2024

This introduces an alternate approach to handling serialized strings.

Previously, we were running a search-replace per line and then attempting to fix the lengths of any serialized strings we found. This could be problematic in various contexts including when the serialized string contains an closing string delimiter (see #33).

In this PR, we handle each serialized string chunk separately by extracting the string based on the serialized string length and then running search-replace within that chunk, updating the length, and then replacing in place.

The approach is slower but addresses several bugs and edge cases.

Next:

  • Clean up the PR (remove debug statements)
  • Handle edge cases like out-of-boundary errors
  • Test, benchmark, and share the changes
  • Package and ship

mjangda and others added 5 commits February 7, 2024 21:43
e.g. If you have CSS in an option with `\";`, that can confuse the current way we parse stuff and lead to incorrect replacements.
To help catch various edge cases
Use the length of the serialized content to determine the scope of the search-replace action. This is to avoid problems when serialized content has a closing marker (\";), e.g. in CSS content.

This is much slower and needs to be optimized.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant