-
-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update notebook header #95
Conversation
Looks nice, thanks. Not sure what's up with CI. Do you have time to look into it? |
Hmm I'm able to reproduce the slow behavior when running |
Running the corresponding notebook on the example's binder also works fine, so it's not an issue with https://archive.analytics.mybinder.org/ |
Downgrading the |
Hmm, which snippet is failing? (cc @martindurant) |
OK, so in --- a/fsspec/implementations/http.py
+++ b/fsspec/implementations/http.py
@@ -234,11 +234,12 @@ class HTTPFile(AbstractBufferedFile):
This is only called when position is still at zero,
and read() is called without a byte-count.
"""
- r = self.session.get(self.url, **self.kwargs)
- r.raise_for_status()
- out = r.content
- self.cache = AllBytes(out)
- self.size = len(out)
+ if not isinstance(self.cache, AllBytes):
+ r = self.session.get(self.url, **self.kwargs)
+ r.raise_for_status()
+ out = r.content
+ self.cache = AllBytes(out)
+ self.size = len(out) I am open to suggestions for something more general. I am thinking that the current BytesCache should just be simplified to be read-ahead only, instead of the complex logic that has caused so many problems and situations like this. |
Without fully understanding the situation, +1 to simplifying BytesCache :)
…On Thu, Sep 5, 2019 at 2:05 PM Martin Durant ***@***.***> wrote:
OK, so in readuntil() (used in readline/iter), the code is extending the
current block out to a full block, which is then always a line beyond the
end of the current block, so we are requesting a full 5MB for every line.
The following code greatly improves this for the case that the data is less
than a blocksize (this case):
--- a/fsspec/implementations/http.py+++ b/fsspec/implementations/http.py@@ -234,11 +234,12 @@ class HTTPFile(AbstractBufferedFile):
This is only called when position is still at zero,
and read() is called without a byte-count.
"""- r = self.session.get(self.url, **self.kwargs)- r.raise_for_status()- out = r.content- self.cache = AllBytes(out)- self.size = len(out)+ if not isinstance(self.cache, AllBytes):+ r = self.session.get(self.url, **self.kwargs)+ r.raise_for_status()+ out = r.content+ self.cache = AllBytes(out)+ self.size = len(out)
I am open to suggestions for something more general. I am thinking that
the current BytesCache should just be simplified to be read-ahead only,
instead of the complex logic that has caused so many problems and
situations like this.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#95?email_source=notifications&email_token=AAKAOIRKBGLSRNMY3UPJTIDQIFJ53A5CNFSM4ITYQOY2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6ALLSI#issuecomment-528528841>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIVY7GT5R656RVQQPYDQIFJ53ANCNFSM4ITYQOYQ>
.
|
@TomAugspurger , can you please have a careful look at the code in fsspec/filesystem_spec#128 ? |
@TomAugspurger and I were chatting and we came to the conclusion that perhaps ignoring the CI failure for now would be ok. The improvement is really great -- downgrading seems like the wrong course given other recent improvements to Note: @martindurant has already added a PR with the suggested code changes: fsspec/filesystem_spec#128 |
I'm not sure we can ignore the CI failure since that's where the sphinx docs are built / deployed. Is there a way to tell I don't have a full understanding of the changes in fsspec/filesystem_spec#128, but FWIW I can confirm it fixes the problem we're seeing in |
You can ignore errors https://nbsphinx.readthedocs.io/en/0.2.7/allow-errors.html but that might not be want you want. |
We would like to have the rendered notebook. And we might not want to
share the example at all if it doesn't work.
…On Thu, Sep 5, 2019 at 4:17 PM Benjamin Zaitlen ***@***.***> wrote:
You can ignore errors
https://nbsphinx.readthedocs.io/en/0.2.7/allow-errors.html but that might
not be want you want.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#95?email_source=notifications&email_token=AACKZTCSIHQKD36KYRN6OITQIFSO3A5CNFSM4ITYQOY2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6AT5GY#issuecomment-528563867>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AACKZTCNPDDK7ZMVG3SJAE3QIFSO3ANCNFSM4ITYQOYQ>
.
|
@jrbourbeau , @quasiben , @mrocklin all very welcome to look at fsspec/filesystem_spec#128 - I have not yet decided if the simpler caching should be the default (it makes no difference for this HTTP case, because the data is less than one blocksize big) |
If you are installing fsspec from master, then this can be closed. |
All checks passed! Hooray. @jrbourbeau should we merge ? |
If the merge happens we should open up another issue to replace the dev fsspec with the released version |
Great idea : ) xref #96 |
Also, thanks for fixing upstream @martindurant! |
This PR highlights our existing header (with binder and GitHub links) using the sphinx
.. admonition
directive.While we're thinking about this, we could also try:
Further customizing the title of the blue box (I went with "Live Notebook", but am totally open to other suggestions)
Using a different binder button that makes more sense if you don't know what binder is (e.g. something like )
xref #94