#keepgrabbing.py#

This is a transcription of the Python script that Aaron Swartz used to download a large number of documents from the JSTOR archive between 2010 and 2011.

I'm not sure what Aaron would have wanted us to do with this code, but my instinct is that he'd want it freely available, and it's worth having in an executable, machine-readable format under version control rather than on a hard drive somewhere that has long since stopped spinning. I guess this is a memorial of sorts.

Rest in peace.

##Todo##

Line 5 contains a redacted hostname/domain. Does anyone know what it was?
sprky0#1 @speedplane points out that there was a second version of the script (keepgrabbing2.py), which is referenced in the indictment. If anyone has a copy of it, please get in touch or submit a pull request.
