Bulk download satellite data concurrently via asyncio #2
get_cryosat.py is very experimental and buggy at the moment. It relies heavily on the aiohttp framework, following https://hackernoon.com/asyncio-for-the-working-python-developer-5c468e6e2e8e. TODO: port to aioftp instead. Temporarily rolled back to Atom stable 1.18.0 from the 1.19 beta because of the Hydrogen truncated output bubble issue nteract/hydrogen#898.
To be a bit nicer to people's server infrastructure, and to prevent FTP error 421 "Too many connections", use semaphores to limit the number of simultaneous FTP connections. Helpful examples of Python 3 implementations of semaphores in asyncio:
Official Python3 API docs on the implementation: Note use of |
…emaphore limit
Better code execution workflow using asyncio.new_event_loop(). This creates a new async event loop so we don't have to shut down Hydrogen kernels every time we run the script, i.e. you can re-run the script over and over again in the same IPython console.
Change from the Python os module to the pathlib module for more high-level and elegant filename parsing (may break Windows compatibility).
Migrate the asynchronous function's previous dependency on ftplib to aioftp, so that the synchronous and asynchronous code are now fully independent. The intention is to deprecate the synchronous code block in the next commit; both are left in here for record and benchmark-comparison purposes.
Implement asyncio.Semaphore to prevent the FTP "421 Too many connections ..." error from being raised. Possibility to softcode the current '7' connection limit using a smart check loop in the future?
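For reference, here is a minimal sketch of the semaphore idea described above. It is not the actual get_cryosat.py code: `fake_download`, `download_all` and the file names are hypothetical, and the FTP transfer is replaced by `asyncio.sleep` so the snippet runs without a server. The limit of 7 is the value mentioned in this thread. It uses `asyncio.run`, which assumes Python 3.7 or newer.

```python
import asyncio

MAX_CONNECTIONS = 7  # the '7' from this thread; tune it per server


async def fake_download(name, semaphore, stats):
    """Hypothetical stand-in for one FTP download (no network involved)."""
    async with semaphore:  # waits here once MAX_CONNECTIONS tasks are inside
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # pretend to transfer the file
        stats["active"] -= 1
    return name


async def download_all(names, limit=MAX_CONNECTIONS):
    """Start one task per file, but let at most `limit` run the body at once."""
    semaphore = asyncio.Semaphore(limit)  # created inside the running loop
    stats = {"active": 0, "peak": 0}
    tasks = [fake_download(name, semaphore, stats) for name in names]
    results = await asyncio.gather(*tasks)
    return results, stats["peak"]


# 20 hypothetical file names; the peak concurrency never exceeds the limit
results, peak = asyncio.run(download_all(["file_%d.nc" % i for i in range(20)]))
print(len(results), "files done, peak concurrency:", peak)
```

In a real script the body of the `async with` block would hold the aioftp (or aiohttp) call, so the semaphore caps open connections rather than sleeping tasks.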
Get a working copy of the new asyncio-based get_cryosat.py, and an up-to-date copy of the atom-hydrogen-beta Dockerfile (with the fallback atom-stable code lines commented out)
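The re-runnable workflow from the commit notes above (a fresh loop from asyncio.new_event_loop() plus pathlib for filename parsing) could look roughly like the sketch below. Everything named here is an assumption for illustration: `fetch`, `main`, `run_batch` and the remote paths are hypothetical, and the download is simulated. `PurePosixPath` is used on the assumption that remote FTP paths use forward slashes regardless of the local operating system.

```python
import asyncio
from pathlib import PurePosixPath


async def fetch(remote_path):
    """Hypothetical download: just returns the local file name it would write."""
    await asyncio.sleep(0)  # placeholder for the real transfer
    return PurePosixPath(remote_path).name  # "/a/b/x.nc" -> "x.nc"


async def main(remote_paths):
    # gather is awaited inside a coroutine so the tasks bind to the running loop
    return await asyncio.gather(*(fetch(p) for p in remote_paths))


def run_batch(remote_paths):
    """Run one batch on a brand-new event loop, then close it."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main(remote_paths))
    finally:
        loop.close()  # a closed loop is never reused; the next call makes a new one


# Hypothetical remote paths; calling run_batch repeatedly works in one session
paths = ["/hypothetical/2018/01/x.nc", "/hypothetical/2018/02/y.nc"]
print(run_batch(paths))
print(run_batch(paths))
```

Because each call builds and closes its own loop, a previously closed or stopped loop in the same console does not get in the way of the next run.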
Use Python 3.5's built-in asyncio module to concurrently bulk download satellite data from HTTP/FTP servers.
See:
Hackernoon blog post
asyncio
aioftp docs
aiohttp docs