-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a to_markdown method and simplify fix for sections being repeated #108
base: main
Are you sure you want to change the base?
Conversation
@ansukla let me know what do you think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge
Thank you so much @jpbalarini. This is a great addition to the library. |
@ansukla @udai-kiran is there anything else need from my end to get this PR merged? |
|
||
# MacOS | ||
*.DS_Store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this doesn't belong in a project .gitignore, as this is specific to your environment - instead, you should set up a global core.excludesfile
. See the relevant GitHub docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say this is specific to my environment since it's already here in the project and shouldn't be.
*.DS_Store are temporary MacOS files that are created automatically, so anyone contributing to the project could inadvertently add them. So I think it's a good idea to try to avoid that in the future.
I like the introduction of this method, but this PR seems to introduce a lot of changes that (a) don't belong here or (b) are a matter of personal style, where you shouldn't overwrite project defaults. An example for (a) is highlighted above (inclusion of I think you have better chances getting the PR merged if you cleanup the PR to forego these personal changes and instead focus on what's essential |
|
||
html_str = html_str + "</html>" | ||
for child in self.root_node.children: | ||
html_str += child.to_html(include_children=True, recurse=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this fix, I was trying to figure out why I was seeing duplicate sections in my HTML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're welcome!
Thanks for your comments, @hellerphilipp . I replied to the DS_Store file comment above. |
Agree that unifying the quotes is probably a good idea. However, it should be its own PR or at least its own commit to separate concerns (and make reviewing less troublesome) |
@hellerphilipp I made the changes you requested and will create a new PR with the style changes after this one gets merged. Thanks! |
Description
to_text()
returns emtpy text when the document doesn't have sections #73 (comment). The fix for that issue was introduced in 89e4bc7#diff-f5675ab91d217e33d82e4c00035b574f60ccfa0dbd9678da3bd2ae3d18f8a92cR543, but it's not needed to get the top level sections to fix that issue; you could get the root element and iterate the children recursively.Also, I cannot think of a use case where you want duplicate elements in your document.to_html(), to_text() or to_markdown() methods, so the logic gets simplified a lot by removing the
_get_top_sections
method and avoid querying on every method if you want to include duplicates or not.