Truncated ( ... ) download still happening even after #133 fix #150

mukuntharajaa · 2019-11-03T13:49:01Z

I am on the master branch, updated to the Oct 14 2019 commit, and I am still seeing truncated chapter downloads.

Book id: 9781491908419

Chapter 2: Item 5: second page shows "..." and Item 6 is altogether missing.

Please let me know if any further information is required.

brookscl · 2019-11-03T15:10:55Z

Same here.

vavdoshka · 2019-11-04T13:27:34Z

+1

spac-valentin · 2019-11-06T12:52:33Z

+1

phamhoangtuan · 2019-11-06T15:08:06Z

+1

sorinescu · 2019-11-07T09:42:31Z

+1

manfredlotz · 2019-11-09T20:56:55Z

I have the same issue

Azarakhsh · 2019-11-10T12:58:15Z

+1

ghistes · 2019-11-10T19:11:40Z

I have the same problem.

It seems to me that the problem is the login. Even though you get a 200 response code when logging in, you never get a sessionid cookie, so when you request the chapters you are treated as if you are not logged in, which results in the truncation. At least that's how it looked to me when I was trying to understand what is going on (not sure if it helps...).
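
The diagnosis above can be checked mechanically: after logging in, compare the cookie names the session actually holds against the ones the site needs. This is only a minimal sketch, assuming the cookies are available as a plain dict. The helper name `missing_cookies` and the default required name `sessionid` are illustrative and are not taken from safaribooks.py.

```python
def missing_cookies(cookies, required=("sessionid",)):
    """Return the required cookie names that are absent or empty."""
    return [name for name in required if not cookies.get(name)]


# A 200 response alone proves nothing; what matters is which cookies came back.
after_login = {"csrftoken": "abc123"}  # made-up example data, not a real session
print(missing_cookies(after_login))
```

With the requests library, `session.cookies.get_dict()` should give a dict of this shape to pass in.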

elrob · 2019-11-11T10:12:22Z

The above PR fixes this issue for me

manfredlotz · 2019-11-11T15:14:40Z

Your fix worked fine for me too. Thanks a lot for your work!

manfredlotz · 2019-11-11T16:31:17Z

I tried a couple of downloads, and mostly the epubs are not really usable. @elrob: However, this is not the fault of your fix.

milktea02 · 2019-11-11T20:25:16Z

Still having issues :( even with #152

elrob · 2019-11-12T06:20:01Z

@manfredlotz This issue is about the truncation of output as if you're not logged in. The PR I created doesn't change anything in the epub creation. I have had no issues with three books I've since tested with. Definitely usable for me. Can you give me an example of a book you've had issues with? And what those issues are?

elrob · 2019-11-12T06:21:23Z

@milktea02 What issues are you having? Are they related to truncation (this github issue tracks the truncation problem)?

mukuntharajaa · 2019-11-12T09:37:17Z

I have tried the same book again (9781491908419). I am now able to see the contents without ellipses. But when I click chapter 6, it correctly takes me to the last page of chapter 6, yet it shows chapter 5 as highlighted in the left-hand side layout.

Guess this is some minor stuff.

elrob · 2019-11-12T12:08:33Z

@mukuntharajaa Thanks for the response. If it is an issue you would like to raise and get fixed, then I recommend creating a new GitHub issue for it. This GitHub issue is about the truncation of chapters due to authentication issues. So for now, if/when @lorenzodifuccia accepts #152, this GitHub issue will be fixed.

varta2014 · 2019-11-12T12:37:50Z

Can we try this code, please?
Thank you.

elrob · 2019-11-12T12:50:53Z

@varta2014 If you want to try my change before it is merged into this repository then you can just pull it from https://github.com/elrob/safaribooks

manfredlotz · 2019-11-12T18:13:55Z

@elrob Unfortunately, I don't remember which book download I tried. I know that FBReader crashed when opening the epub. The last downloads I did were ok.

milktea02 · 2019-11-12T22:28:39Z

@elrob Tried Clean Code (9780136083238) and still get truncation. I'm logging in via SSO if that might be the issue.

AsimShakour · 2019-11-12T23:12:02Z

I am having truncation with book: 9781119449270 in this area: https://learning.oreilly.com/library/view/professional-c-7/9781119449270/fintro.xhtml

Thanks

elrob · 2019-11-13T04:57:40Z

@milktea02 I have updated my change to restore the code that I thought was unnecessary. It was unnecessary for me but I'm not using SSO. Maybe you can try the latest version of my branch and see if it works for you now. I don't have SSO so I can't test it myself.

@AsimShakour Are you using SSO too? Maybe that's the problem. Can you also try with the latest change I have made (updated just now).

varta2014 · 2019-11-13T06:19:28Z

@elrob Thank you, the code works perfectly!

vikdean · 2019-11-13T08:37:08Z

@elrob Just tested it with 9780135262047; SSO works, but it still downloads the book only partially.

brookscl · 2019-11-13T12:51:36Z

For those of you still having trouble: delete the Books directory that is created for the downloads. Then retry your download. I found that the tool will not re-download chapters it thinks are already there. I was able to download book 9781119558439 without any problems. I'm not familiar with the book, but it seemed complete.
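
The cleanup step described above can be sketched as a small helper. This is only an illustration: the function name `clear_cached_book` is made up, and the assumption that the downloads live in a directory called `Books` next to the script comes from the comment, not from inspecting the tool.

```python
import shutil
from pathlib import Path


def clear_cached_book(books_dir):
    """Delete the download directory so every chapter is fetched again.

    Returns True if something was removed, False if the path did not exist.
    """
    path = Path(books_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)  # removes the directory and everything inside it
    return True
```

Calling `clear_cached_book("Books")` from the safaribooks directory before retrying would force a full re-download. Deleting only the sub-directory of the affected book would be a gentler variant.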

vikdean · 2019-11-13T13:03:44Z

I tried that 3 times in a row; the issue is still the same for 9780135262047.

mukuntharajaa · 2019-11-14T09:18:03Z

I have also tried downloading this ebook (9780135262047) and accessed random pages; @elrob's fix is working fine.

vikdean · 2019-11-14T09:35:40Z

Check the chapter beginnings: it only captures a couple of lines, and the rest is truncated.
Also, what's the EPUB size for you? Mine is 3 MB.

munish259272 · 2020-06-02T08:55:40Z

Another way is:
https://stackoverflow.com/questions/1324421/how-to-get-past-the-login-page-with-wget#answer-37780143

azmatsiddique · 2020-06-28T15:54:51Z

Please provide images or a video showing how to get the cookies.json file from the inspector on a Mac.

obar1 · 2020-07-07T13:10:43Z

@azmatsiddique ahhaha are you serious

darshanmnyk · 2020-09-03T11:01:16Z

Thanks a lot guys! Works splendidly.

MuhammedElGanzory · 2021-01-15T01:12:04Z

@darshanmnyk Can you help me? I'm trying to download too.

MuhammedElGanzory · 2021-01-15T02:08:30Z

How do I solve this?

Traceback (most recent call last):
  File "C:\Users\LuckyMoon\Downloads\safarinew\safaribooks.py", line 10, in <module>
    import requests
ModuleNotFoundError: No module named 'requests'
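
A traceback like this means the interpreter running the script cannot find the `requests` package. The sketch below shows one way to check for missing modules before launching the tool. The helper name `missing_modules` is illustrative, and only `requests` is checked because it is the one module named in the traceback.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["requests"])
if missing:
    # Install with the same interpreter that will run safaribooks.py.
    print("Missing:", ", ".join(missing))
    print("Try: python -m pip install " + " ".join(missing))
```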

dan-r95 · 2021-02-09T17:19:40Z

I think the only cookie you really need is orm-jwt.
In Chrome, navigate to the cookies tab and search for orm-jwt.

EmanuelMtzV · 2021-02-15T21:05:01Z

I can't get the cookies.json file either. Any clue where to get it?

akriaueno · 2021-02-16T08:25:35Z

The script below works for me.
Paste it into the Chrome DevTools console to get the cookies.

console.log(JSON.stringify(document.cookie.split(';').map(c => c.split('=')).map(i => [i[0].trim(), i[1].trim()]).reduce((r, i) => {r[i[0]] = i[1]; return r;}, {})))
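
The one-liner above splits each pair on every "=", so a cookie value that itself contains "=" would be cut short. As a hedged alternative, the same conversion can be done offline in Python from a copied `name=value; name=value` string, splitting on the first "=" only. The function name `cookie_string_to_dict` and the sample values are made up for illustration.

```python
import json


def cookie_string_to_dict(raw):
    """Parse 'a=1; b=2' into {'a': '1', 'b': '2'}.

    Each pair is split on the first '=' only, so values that themselves
    contain '=' (for example base64 padding) are kept whole.
    """
    cookies = {}
    for pair in raw.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name.strip():  # skip empty fragments and pairs without '='
            cookies[name.strip()] = value.strip()
    return cookies


raw = "orm-jwt=aaa.bbb.ccc; token=dGVzdA==; logged_in=y"  # made-up values
print(json.dumps(cookie_string_to_dict(raw)))
```

The printed JSON object has the flat name-to-value shape that the comments in this thread describe for cookies.json.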

albertocavalcante · 2021-05-08T21:52:10Z

> @vikdean
> I think I've found the problem.
> Using document.cookie from the console does not include the HttpOnly cookies and they are definitely required.
> I can't work out how to access these via the console but I was able to find a way to get them that isn't too painful.
>
> 1. Login as usual to https://learning.oreilly.com/
> 2. Open the developer tools with F12
> 3. Go to Network tab in the developer tools
> 4. Access the profile page in the browser: https://learning.oreilly.com/profile/
> 5. In the Network tab, click on the request to /profile/ (it should be the first one)
> 6. Click on the Cookies tab in the request information
> 7. Right-click on the Request cookies text and choose Copy All
> 8. Paste this into the cookies.json file and then remove the outer section of the JSON document
> 9. Run the script without passing credentials: python3 safaribooks.py 9780135262047
>
> p.s. sudo is not necessary.

I was unable to right-click Request cookies and find a Copy All option.
Instead, I went to the Headers tab, scrolled to Request Headers, right-clicked cookie, and clicked Copy value.

Then, I executed sso_cookies.py passing the clipboard content as argument, wrapped in double quotes.
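
For readers who do get the Copy All option, the "remove the outer section of the JSON document" step from the quoted instructions can be sketched as follows. This assumes the copied document is a single wrapper key around the actual cookies. The key name `Request cookies` is a guess based on the label in the steps, and `unwrap_cookies` is an illustrative name.

```python
import json


def unwrap_cookies(document):
    """Strip a single outer wrapper such as {"Request cookies": {...}}.

    A document that is already a flat name -> value mapping is returned
    unchanged.
    """
    if isinstance(document, dict) and len(document) == 1:
        (inner,) = document.values()
        if isinstance(inner, dict):
            return inner
    return document


copied = '{"Request cookies": {"orm-jwt": "aaa", "orm-rt": "bbb"}}'  # made-up values
print(json.dumps(unwrap_cookies(json.loads(copied))))
```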

munish259272 · 2021-06-21T20:19:57Z

>>> I think something is not right or just changed. I tried both this repo's master and yours, @elrob, with no success. The Developer Tools Network tab inside the cookies section (profile page) won’t show any httpOnly cookie. Don’t know if this is just me.
>
>> @villancikos
>> I have just tested again with my fork of the repo. It is working fine for me when I download the cookies following the instructions above. httpOnly doesn't refer to the name of a cookie. Two of the cookies, groot_sessionid and orm-rt, are set to httpOnly=true, which means some other methods of downloading the cookies don't work. The method above does work for me in Firefox. If you're still having an issue, can you explain where it is going wrong and I might be able to help.
>
> Hi @elrob. Thanks for your answer. First, I know httpOnly is a type of cookie. It is strange that my Chrome does not "tick" them in the dev tools.
> I tried adding the common cookies using the JavaScript output and, manually, these two cookies: groot_sessionid and orm-rt, but I am still getting a truncated epub.
> BTW I am using your repo on the master branch.
> Just for reference, the book id is 9781491973783.

This does not work anymore

domrany64 · 2022-02-15T02:39:10Z

I could use it perfectly fine once I got past the trouble of obtaining the cookies in the right structure.
To get the cookies, I'm using a Chrome extension called Cookie-Editor.

1. Open the O'Reilly website in Chrome and log in using SSO.
2. Open Cookie-Editor.
3. Click EXPORT, which is the icon on the right at the bottom of Cookie-Editor's window.
4. Paste the cookies (which are now in the clipboard) into an editor.
5. Find "name": "orm-jwt" in the text and copy the value from that section.
6. Create the cookies.json file like this: {"orm-jwt": "XXX"}, where XXX is the value copied in step 5.
7. Run python3 safaribooks.py XXXXXXXXXXX and enjoy the EPUB book.
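
Steps 4 to 6 can also be done with a few lines of Python instead of by hand. The sketch assumes the export is a JSON array of records with "name" and "value" fields, which matches the "name": "orm-jwt" text mentioned in step 5 but is otherwise an assumption about the extension's format. `extract_cookie` is an illustrative name.

```python
import json


def extract_cookie(exported, name="orm-jwt"):
    """Return {name: value} from a list of {"name": ..., "value": ...} records."""
    for record in exported:
        if record.get("name") == name:
            return {name: record["value"]}
    raise KeyError(f"cookie {name!r} not found in export")


# Made-up export with a placeholder value, shaped like the text described in step 5.
exported = json.loads('[{"name": "orm-jwt", "value": "XXX", "domain": ".oreilly.com"}]')
print(json.dumps(extract_cookie(exported)))
```

Writing the returned dict to cookies.json with `json.dump` would replace the manual copy and paste. If the cookie is absent from the export, the `KeyError` makes that visible immediately.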

Marakai · 2022-10-14T04:22:39Z

I'm using @akriaueno's console script with Chrome in mid 2022 and it seems to be the easiest approach by far. I copy-pasted the output into cookies.json in the script directory and it works like a charm!

MrDandas · 2022-11-07T12:49:11Z

I might add a PR later, but for now I've put together my own solution based on browser_cookie3. I just log in through the browser and let the library grab the cookies from it:

    import browser_cookie3  # third-party package; must be installed and imported first

    # self.session.cookies.update(json.load(open(COOKIES_FILE)))
    self.session.cookies = browser_cookie3.firefox(domain_name='oreilly.com')

Metal-Milonga · 2023-07-22T14:32:08Z

@akriaueno's console script worked again with Chrome in July 2023. I copied the returned JSON to cookies.json, and the download was successful.

KonScanner · 2023-10-20T22:35:48Z

@domrany64's answer still works!

eurubkov · 2024-01-08T01:47:31Z

I am not finding the "name": "orm-jwt", or just "orm-jwt" in the cookies at all. Is it still working for others?

eurubkov · 2024-01-08T01:53:47Z

Looks like it's the Cookie-Editor extension that didn't get all the cookies.
Here's how I got it instead:
Right-click -> Inspect Element -> Application -> Cookies -> select the O'Reilly website -> orm-jwt, and copy the value from there.
Then created the cookies.json as mentioned above and it worked.

yuletide · 2024-05-30T20:06:47Z

This method generally worked for me per @albertocavalcante's post

This tool is a lifesaver for anyone who gets access to this site via their library (like SFPL) since the institutional login UI is completely useless and doesn't save your reading progress, lists, or anything else. Not sure how anyone uses the site at all without a tool like this one

henryheim · 2024-08-05T23:33:34Z

Like some others, I have an SSO login to O'Reilly and had trouble with the script as it is in the master branch. @albertocavalcante's method, combined with the dev tools one-liner by @akriaueno, fixed everything and the tool works easily now. Just save the output of the dev tools script into cookies.json in the safaribooks directory and run the script without any auth parameters.

Thank you both for your debugging and guides.

elrob mentioned this issue Nov 11, 2019

Handle orm-rt cookie issue #152

Closed

McPatate mentioned this issue Nov 13, 2019

Handled orm tokens #153

Closed

vinayakg mentioned this issue May 6, 2020

epub generated is partial #215

Closed

danielctrl mentioned this issue Jun 17, 2020

sso_cookies.py not working when "=" equals symbol is present #220

Closed

lorenzodifuccia removed work in progress bug labels Jul 21, 2020

lorenzodifuccia mentioned this issue Oct 15, 2020

book download using company mail id #228

Closed

0Ky mentioned this issue Nov 17, 2021

unable to perform auth login to Safari Books Online #301

Closed

lorenzodifuccia mentioned this issue Jan 21, 2022

Add info to readme about ORLY_BASE_HOST for SSO #304

Closed

lorenzodifuccia mentioned this issue Jan 5, 2023

SSO, Company, University, etc., Login Problems: *READ BEFORE NEW ISSUE* #334

Open

pnhuy mentioned this issue Aug 20, 2023

Is it normal normal that the program can't login after 10 minutes? #346

Open

bgaprogrammer mentioned this issue Sep 6, 2024

Downloaded epub files corrupted kirinnee/oreilly-downloader#9

Open
