Feature/ Add xls support #169

Open
wants to merge 5 commits into
base: main

@yeungadrian yeungadrian commented Dec 20, 2024

PR for Issue #137

  • Adds a new XLS converter that uses the same pandas -> html -> md flow, but with the xlrd engine instead of openpyxl
  • Adds a new test file, created by converting the existing xlsx test file to xls
  • Explicitly uses openpyxl for xlsx files
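
For readers following along, here is a minimal sketch of the engine selection described in the list above. It is not the code from this PR: the names `ENGINES`, `pick_engine`, and `sheets_to_html` are hypothetical, and the final html -> md step is left out because it depends on the project's own converter. It assumes pandas is installed along with xlrd (for .xls) or openpyxl (for .xlsx).

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical mapping for illustration; the PR's converter classes may be organized differently.
ENGINES = {".xls": "xlrd", ".xlsx": "openpyxl"}


def pick_engine(path: str) -> str:
    """Return the pandas Excel engine name for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINES:
        raise ValueError(f"unsupported spreadsheet extension: {suffix!r}")
    return ENGINES[suffix]


def sheets_to_html(path: str) -> dict[str, str]:
    """Read every sheet and render each one as an HTML table (the pandas -> html step)."""
    # Imported lazily so pick_engine can be used without pandas installed.
    import pandas as pd

    # sheet_name=None makes read_excel return a dict of {sheet name: DataFrame}.
    sheets = pd.read_excel(path, sheet_name=None, engine=pick_engine(path))
    return {name: df.to_html(index=False) for name, df in sheets.items()}
```

Passing the engine explicitly, rather than letting pandas infer it, makes the dependency for each file type visible at the call site.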

@yeungadrian
Author

@microsoft-github-policy-service agree

Member

@afourney afourney left a comment

Looks clean and simple enough. Will run tests and merge if successful.

@afourney
Member

Looks like some formatting errors. Can you please run pre-commit checks. Alternatively, I can run them later tonight.

@yeungadrian
Author

Linting errors should be resolved now

@scruel

scruel commented Dec 21, 2024

Thanks for creating this PR, which relates to my issue #137!
I can already parse most .xls files with the xlrd library, but it loses all images and some extra information in them, and some .xls files cannot be parsed with xlrd at all and just raise errors. (The xlrd repo is archived and won't be updated anymore.)
To work around these limitations and errors, I currently have to convert .xls to .xlsx first, then parse the result.

FYI, I have already tried many libraries, but only Aspose's library meets my requirements, which are to process all the .xls files I have without losing anything or raising errors:
https://products.aspose.com/cells/python-net/conversion/xls-to-xlsx/
However, it is not an open-source library, and we are still looking for other solutions.

I just thought that, since this is a Microsoft project, maybe the developers who built it could provide something better than xlrd :)

3 participants