Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

submissionXmlToFieldStream: handle stream end properly #1256

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

alxndrsn
Copy link
Contributor

Previously if poorly-structured or undefined/null XML was passed to this function, the iterator this function returns would hang.

alxndrsn added 4 commits October 30, 2024 08:24
Previously if poorly-structured or undefined/null XML was passed to this function, it would hang.
alxndrsn pushed a commit to alxndrsn/odk-central-backend that referenced this pull request Oct 30, 2024
Copy link
Member

@matthew-white matthew-white left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes a lot of sense to me to add onend and have it call outStream.destroy() if need be. I have a couple of questions about other changes to the control flow though.

@@ -58,18 +66,15 @@ const submissionXmlToFieldStream = (fields, xml, includeStructuralAttrs = false,
textBuffer += text;
},
onclosetag: () => {
if (stack.hasExited()) throw new Error('Stack has already exited');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel totally sure about throwing here. Is it impossible for there to be XML after the last tag for the fields in the SchemaStack? Could this function (as well as onopentag and ontext) just early-return if the stack has exited?

If it is the case that we should throw, I feel like the throw should happen in onopentag. Rather than waiting until the close tag, if we can tell at the open tag that there will be a problem, I think it'd be better to throw sooner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel totally sure about throwing here. Is it impossible for there to be XML after the last tag for the fields in the SchemaStack?

I'm not sure, but I expect having a close tag after the stack is empty represents something very unexpected happening.

If it is the case that we should throw, I feel like the throw should happen in onopentag.

I don't think there's any guarantee there will be an opentag after a dangling closetag, but maybe both cases should throw?

if (!stack.hasExited()) {
outStream.destroy(new Error('Stack has not exited yet'));
} else {
parser.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel quite sure about what parser.reset() is doing here. I don't see reset() mentioned in the docs for htmlparser2 (though I know we're using an old version). If the parser has ended, and we're not reusing the parser, why do we need to reset it? Part of the reason I'm asking is that I wonder whether reset() had more of an effect before. Could reset() have had more of an effect when it was called in onclosetag, like causing the parser to stop parsing early?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel quite sure about what parser.reset() is doing here.

Copied from current line 70. I'm also not sure why it's required, but my guess is similar to yours. Would you rather remove it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants