broken chapters in an ebook #289
Comments
Okay, a small update: I fixed the table of contents with calibre, but the underlying problem remains. Fixing it by hand takes a lot of time, and it would be nice if chapter text could be merged automatically according to the TOC. I am writing to add one more thing, though: in another book I adjusted the TOC to add subchapters, and this produced an error in ebooklib, which I am sharing.

Piece of code:

    from ebooklib import epub

    book = epub.read_epub("book.epub")
    for a in book.toc:
        print(type(a))
        print(type(a.title), a.title)

Output with error: you can see that an entry that used to be an ebooklib.epub.Link object becomes a tuple, which has no title attribute.
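For anyone hitting the same tuple error, here is a minimal sketch of walking a nested TOC without assuming every entry has a title. It assumes, as the error above suggests, that an entry with subchapters is a (section, children) pair while a leaf entry is a Link-like object with title and href attributes. The names walk_toc and print_toc are hypothetical helpers, not part of ebooklib.

```python
def walk_toc(entries, depth=0):
    """Yield (depth, title, href) for every entry of a possibly nested TOC."""
    for entry in entries:
        if isinstance(entry, (tuple, list)):
            # Assumed nested form: (section, [children]).
            section, children = entry
            yield depth, section.title, getattr(section, "href", "")
            yield from walk_toc(children, depth + 1)
        else:
            # Leaf entry: a Link-like object with title and href.
            yield depth, entry.title, getattr(entry, "href", "")


def print_toc(path):
    """Print an indented outline of the TOC of the EPUB at `path`."""
    # Imported lazily so walk_toc itself has no third-party dependency.
    from ebooklib import epub

    book = epub.read_epub(path)
    for depth, title, href in walk_toc(book.toc):
        print("  " * depth + f"{title} -> {href}")
```

Calling print_toc("book.epub") should then print the outline instead of failing on the first tuple, provided the TOC really has the shape assumed above.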
If it is not a problem, could you send a sample of your input and output files to my email [email protected] so I can double-check what is going on there? I want to check how the TOC is organised in the original book.
OK, I re-sent you an email :)
@aerkalov I have a related issue/question (but let me know if I should open a separate issue): I am trying to map the chapter/section info from the TOC to text elements for partitioning epub books. I've noticed a few things...
Looking at the
Hi, for the past couple of days I've been trying to write a small script that takes the text of an ebook one chapter at a time and reworks it (I'm writing a dictation script).
The first version of my script works, but I realized that the number of entries in the TOC doesn't match the number of HTML files inside book.epub.
Specifically, there are 50 chapters in the TOC and 78 HTML pages.
When I open the book in an ebook reader (Librera) there is no noticeable problem: the TOC works, and the program even tells me how many (reflowed) pages remain until the end of the chapter, counting the text of chapters that are split across several HTML files in book.epub.
However, if I fetch the text with ebooklib by iterating over the items of the EpubBook and calling get_body_content(), I only get part of the text for any chapter that has been split across multiple HTML files.
I don't know much about how an EPUB is supposed to be structured, and I assume my ebook was created incorrectly (I think calibre was used). But since the TOC has the right references, how can I merge the chapters that were fragmented into multiple HTML files?
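One possible approach, sketched below under several assumptions: treat each file that a TOC entry points to as the start of a chapter, and attach every following spine file to that chapter until the next TOC target appears. The helpers group_spine_by_toc, toc_hrefs and chapter_bodies are hypothetical names, not ebooklib functions. The sketch also assumes that TOC hrefs and item file names use the same relative paths, that every spine item is an HTML document, and that no two chapters start inside the same file.

```python
def group_spine_by_toc(spine_files, toc_hrefs):
    """Group spine file names into chapters that start at TOC targets.

    Returns [(start_file_or_None, [file, ...]), ...] in reading order.
    Files that come before the first TOC target are grouped under None.
    """
    # Drop "#fragment" parts so "ch1.xhtml#top" matches "ch1.xhtml".
    starts = {href.split("#", 1)[0] for href in toc_hrefs}
    groups = []
    current = None
    for name in spine_files:
        if name in starts or current is None:
            current = (name if name in starts else None, [])
            groups.append(current)
        current[1].append(name)
    return groups


def toc_hrefs(entries):
    """Collect hrefs from a possibly nested TOC, in reading order."""
    for entry in entries:
        if isinstance(entry, (tuple, list)):
            # Assumed nested form: (section, [children]).
            section, children = entry
            if getattr(section, "href", ""):
                yield section.href
            yield from toc_hrefs(children)
        else:
            yield entry.href


def chapter_bodies(path):
    """Return [(start_file, [body, ...]), ...] for the EPUB at `path`."""
    from ebooklib import epub  # third-party, imported lazily

    book = epub.read_epub(path)
    # After read_epub, each spine entry is an (idref, linear) tuple.
    items = [book.get_item_with_id(idref) for idref, _linear in book.spine]
    docs = {item.get_name(): item for item in items if item is not None}
    groups = group_spine_by_toc(list(docs), list(toc_hrefs(book.toc)))
    return [
        (start, [docs[name].get_body_content() for name in files])
        for start, files in groups
    ]
```

With 50 TOC entries and 78 HTML files, this would yield roughly 50 groups (plus one for any front matter), each holding the body fragments of one chapter in spine order. The grouping step is kept separate from the ebooklib calls so it can be checked with plain lists of file names.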