Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove regex use from gleam/uri #734

Open
lpil opened this issue Nov 12, 2024 · 2 comments · May be fixed by #739
Open

Remove regex use from gleam/uri #734

lpil opened this issue Nov 12, 2024 · 2 comments · May be fixed by #739
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@lpil
Copy link
Member

lpil commented Nov 12, 2024

Convert it to manually parse the URI instead.

@lpil lpil added good first issue Good for newcomers help wanted Extra attention is needed labels Nov 12, 2024
@giacomocavalieri
Copy link
Member

I started looking into this and noticed a couple of things with the js impl: it's using a variation of this regex ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? provided in the Uri rfc. However this will behave differently from the implementation for the Erlang target (done by uri_string:parse); for example :wibble is rejected on the Erlang target but is parsed as a uri with path ":wibble" on the js one

I'm pretty sure it's the Erlang implementation that is correct here since that regex is only meant to "break down a well-formed URI reference into its components". In any way it would be best if both targets behaved the same here.

  • If the plan is to get rid of the regex dependency to v1 stdlib as soon as possible I can go ahead and implement a parser equivalent to that regex, it would take very little time but the target differences would still be there!
  • Or since we're making this change we could try and copy Erlang's implementation but that would be a big item of work (the erlang implementation is thousands of lines of code from what I can tell and understanding it and porting it to Gleam could require a long time)

What do you think is best here?

@lpil
Copy link
Member Author

lpil commented Nov 15, 2024

Removing the regex and making an issue for making it consistent between both targets sounds find to me.

@giacomocavalieri giacomocavalieri linked a pull request Nov 18, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants