IndexError when last page is blank #15

Open
jonathan-s opened this issue Apr 1, 2014 · 2 comments

Comments

@jonathan-s

If the last page of the PDF is completely blank, process_page raises an IndexError:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "scrape_pdf2.py", line 159, in test
    pdflist = get_table_pages(pages)
  File "scrape_pdf2.py", line 95, in get_table_pages
    cells = [pdf.process_page("example.pdf",p) for p in pages]
  File "/Users/jonathan/.virtualenvs/elance/lib/python2.7/site-packages/pdf_table_extract-0.1-py2.7.egg/pdftableextract/core.py", line 211, in process_page
    if vd[i+1]-vd[i] > maxdiv :
IndexError: index out of bounds
@eelsirhc
Contributor

eelsirhc commented Apr 3, 2014

I'm not really sure what you want to happen. The page exists, so an IOError seems wrong. The code should raise an explicit error rather than fail by assuming that vd exists, but if there is no table on that page, what behavior are you expecting?
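
For illustration, the "explicit error" idea could look roughly like the sketch below. It assumes the failure comes from a divider list (vd in the traceback) that is too short to index at i+1; that cause is not confirmed in this thread. NoTableFoundError and require_dividers are hypothetical names and not part of pdftableextract.

```python
class NoTableFoundError(ValueError):
    """Hypothetical exception for a page with no detectable table."""


def require_dividers(dividers, page):
    """Return dividers unchanged, or raise if there are too few to compare."""
    # Comparing dividers[i + 1] - dividers[i] needs at least two entries.
    # Assumption: a blank page yields fewer than two dividers.
    if len(dividers) < 2:
        raise NoTableFoundError("no table dividers found on page %s" % page)
    return dividers
```

A caller could then catch NoTableFoundError specifically, rather than a bare IndexError that might hide unrelated bugs.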

@jonathan-s
Author

I think an option to fail silently would be nice: if processing a page fails, that page would just be skipped.
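
Until such an option exists, a caller-side workaround might look like the sketch below. process_pages and its skip_errors flag are hypothetical and not part of pdftableextract. The sketch assumes a blank page always surfaces as an IndexError, as in the traceback above.

```python
def process_pages(process, pdf_path, pages, skip_errors=False):
    """Call process(pdf_path, page) for each page, optionally skipping failures.

    Returns (results, skipped), where skipped lists the pages that raised
    IndexError when skip_errors is True.
    """
    results = []
    skipped = []
    for page in pages:
        try:
            results.append(process(pdf_path, page))
        except IndexError:
            if not skip_errors:
                raise  # default behavior: propagate the error unchanged
            skipped.append(page)  # remember which pages were dropped
    return results, skipped
```

With the call from the traceback, this would be used as process_pages(pdf.process_page, "example.pdf", pages, skip_errors=True). Returning the skipped pages keeps the failure visible to the caller instead of discarding it entirely.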
