Stream fails to parse if XML file contains a leading BOM character #117

Open
edds opened this issue Mar 1, 2023 · 1 comment

Comments

edds commented Mar 1, 2023

If an XML file contains a leading BOM, Saxy fails to parse it.

iex(1)> xml = "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"><foo bar='value'></foo>"
iex(2)> Saxy.parse_string(xml, MyEventHandler, [])
"Start parsing document"
{:error,
%Saxy.ParseError{
  reason: {:token, :lt},
  binary: "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"><foo bar='value'></foo>",
  position: 0
}}

I'm seeing this when using ExCmd to stream a gzipped file into Saxy, and I can't see any obvious way of stripping the BOM out before passing the stream to Saxy.

@lucacorti
Contributor

Yes, this happens. You can use String.replace_leading(xml, "\uFEFF", "") to strip the BOM.
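
For the streaming case described above, String.replace_leading only helps once the whole document is in memory. One possible workaround is to wrap the stream so the BOM is dropped from the first bytes before they reach Saxy. This is a minimal sketch, not something Saxy provides: the BomStrip module and its function names are hypothetical, and it assumes the stream yields UTF-8 binary chunks. It buffers until at least three bytes have arrived, since a chunk boundary could in principle fall inside the BOM.

```elixir
defmodule BomStrip do
  # Hypothetical helper, not part of Saxy or ExCmd.

  # The UTF-8 encoding of U+FEFF is the three bytes EF BB BF.
  @bom_size 3

  # Drops a leading BOM from a binary that is already in memory.
  def strip_leading_bom(<<0xEF, 0xBB, 0xBF, rest::binary>>), do: rest
  def strip_leading_bom(binary) when is_binary(binary), do: binary

  # Wraps an enumerable of binary chunks and drops a leading BOM,
  # even if the BOM is split across the first few chunks.
  def strip_stream_bom(chunks) do
    chunks
    # The :eof sentinel lets us flush a short buffer at the end of input.
    |> Stream.concat([:eof])
    |> Stream.transform("", fn
      :eof, :done ->
        {[], :done}

      # Input ended before three bytes arrived, so it cannot hold a full BOM.
      :eof, buffer ->
        {non_empty(buffer), :done}

      # The start of the stream was already handled; pass chunks through.
      chunk, :done ->
        {[chunk], :done}

      # Still collecting the first bytes of the stream.
      chunk, buffer ->
        buffered = buffer <> chunk

        if byte_size(buffered) >= @bom_size do
          {non_empty(strip_leading_bom(buffered)), :done}
        else
          {[], buffered}
        end
    end)
  end

  # Avoids emitting empty chunks downstream.
  defp non_empty(""), do: []
  defp non_empty(binary), do: [binary]
end
```

With something like this in place, the wrapped stream could be handed to Saxy.parse_stream in place of the raw one, for example `gzip_stream |> BomStrip.strip_stream_bom() |> Saxy.parse_stream(MyEventHandler, [])`, where `gzip_stream` stands in for whatever ExCmd produces in your setup.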
