
Testing config along with minimal changes to resolve failing tests #5


Open: wants to merge 9 commits into base: master


shaunhegarty

Not the most elegant changes.
There are a fair few type coercions to make sure that bytes go with bytes and unicode with unicode.
The rest of the fixes were mostly swapping out StringIO for six.BytesIO, updating for BeautifulSoup API changes, or fiddling with newlines.

I didn't update the setup.py requirements or URL.

The test config uses tox to configure the dependencies and run the tests against both Python 2.7 and 3.6.
The Dockerfile base image comes with tox and a bunch of Python versions installed, so it's relatively easy to include others. The test run starts failing once you get to Python 3.8.

- Update README with command to run tests with tox and docker.

Not all tests are passing. Failures (that aren't XFAILs) are mainly due to line endings: the expected results contain \r\n but the output only has \n. Guessing the previous tests were all run on Windows?
- Tried to apply smallest possible change.
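
One way to make such comparisons independent of the platform is to normalize line endings on both sides before asserting. This is only a sketch and not part of this PR; `normalize_newlines` and `same_ignoring_newlines` are hypothetical helper names, and it assumes the tests do not care about the exact line terminator.

```python
def normalize_newlines(data):
    """Return data with CRLF and lone CR converted to LF (bytes or text)."""
    if isinstance(data, bytes):
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return data.replace(u"\r\n", u"\n").replace(u"\r", u"\n")


def same_ignoring_newlines(expected, actual):
    """Compare two outputs while ignoring differences in line terminators."""
    return normalize_newlines(expected) == normalize_newlines(actual)
```

A test could then assert `same_ignoring_newlines(expected, output)` instead of comparing the raw values.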

I will eventually do something about the formatting.
Main fixes:
- Requires lxml to pass these tests
- The rest was all changes to the BeautifulSoup API
- convertEntities is no longer a valid parameter, and the value passed to it is unnecessary. The functionality is replaced by the features kwarg, which in this case I set to xml.
- Silenced some deprecation warnings by changing fromEncoding to from_encoding
-- from_encoding is effectively ignored in Python 3, but probably still required in Python 2
- The NavigableString class lives in the beautifulsoup4 package (imported as bs4), so the reference was updated.
- Added latex tests to tox.ini
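
For reference, the BeautifulSoup changes listed above look roughly like the following. This is a minimal sketch rather than the code in this PR: `parse_xml` and `is_navigable_string` are hypothetical names, and it assumes beautifulsoup4 and lxml are installed.

```python
def parse_xml(markup, encoding=None):
    """Parse markup with Beautiful Soup 4 using the lxml-backed XML parser."""
    # Imported lazily so bs4 and lxml are only needed when this is called.
    from bs4 import BeautifulSoup

    # features="xml" is used where the old code passed convertEntities;
    # from_encoding is the bs4 spelling of fromEncoding and only matters
    # when the markup is bytes.
    return BeautifulSoup(markup, features="xml", from_encoding=encoding)


def is_navigable_string(node):
    """Check a node against the NavigableString class exported by bs4."""
    from bs4 import NavigableString

    return isinstance(node, NavigableString)
```

Calling `parse_xml(data, encoding="utf-8")` with bytes input uses the given encoding; with text input bs4 ignores it, which matches the note about Python 3 above.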

Most of these were bytes vs unicode issues. Tried using the unicode_literals import where possible, but it turns out there are some peculiar interactions between it and raw strings. Ended up not using the import in the LaTeX writer.
- Install pdftohtml in test Dockerfile
- Add pdf tests to tox.ini
- Use six.BytesIO instead of StringIO
- Coerce some strings to bytes where needed
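
The coercions mentioned above follow a pattern along these lines. It is a sketch under assumptions: `to_bytes` and `to_text` are hypothetical helpers, UTF-8 is an assumed default encoding, and `io.BytesIO` stands in for `six.BytesIO` (which is the same class on Python 3) so the example needs only the standard library.

```python
import io


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# A bytes buffer only accepts bytes, so every write is coerced first.
buffer = io.BytesIO()
buffer.write(to_bytes(u"caf\xe9 "))
buffer.write(to_bytes(b"ok"))
result = buffer.getvalue()
```

Keeping the coercion at the point where data enters the buffer means the rest of the code never has to mix the two types.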