pdfSpider

Spiders PDFs from a given website :)

It's kinda like a lib that takes an initial entry point for a website and

- crawls every website it finds that fits a given regex-1
- writes to file all the websites that fit a given regex-2
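
To make the two-regex idea concrete, here is a rough sketch in Python. It is not pdfSpider's actual code: the function `crawl`, the `get_links` callback and the in-memory `SITE` dict are all made up for illustration, and the language was picked only for the example. Links fitting regex-1 are queued for crawling, links fitting regex-2 are recorded as "to be written".

```python
import re
from collections import deque
from typing import Callable, Iterable


def crawl(entry_point: str,
          get_links: Callable[[str], Iterable[str]],
          follow_pattern: str,
          save_pattern: str) -> tuple[list[str], list[str]]:
    """Breadth-first crawl. Returns (visited URLs, URLs to save)."""
    follow_re = re.compile(follow_pattern)  # plays the role of regex-1
    save_re = re.compile(save_pattern)      # plays the role of regex-2
    queue = deque([entry_point])
    seen = {entry_point}
    visited: list[str] = []
    saved: list[str] = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link in seen:
                continue  # never handle the same URL twice
            seen.add(link)
            if save_re.search(link):
                saved.append(link)   # fits regex-2: hand over to a writer
            if follow_re.search(link):
                queue.append(link)   # fits regex-1: crawl it later


    return visited, saved


# Hypothetical site held in memory, so the sketch runs without network access.
SITE = {
    "http://example.org/": [
        "http://example.org/docs/",
        "http://other.net/",
        "http://example.org/a.pdf",
    ],
    "http://example.org/docs/": [
        "http://example.org/docs/b.pdf",
        "http://example.org/",
    ],
}

visited, saved = crawl(
    "http://example.org/",
    lambda url: SITE.get(url, []),
    follow_pattern=r"^http://example\.org/[^.]*$",  # stay on the site, skip files
    save_pattern=r"\.pdf$",                         # collect PDFs
)
```

In a real run `get_links` would download the page and extract its links; passing it in as a callback keeps the crawl loop independent of how pages are fetched.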

Since the design is as modular as possible, writers/readers/converters can be exchanged, so you can crawl anything you like as long as you swap out a few files.
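
One way such an exchangeable reader/converter/writer setup could look is sketched below in Python. The interface names (`Reader`, `Converter`, `Writer`, `Pipeline`) and all implementations are assumptions invented for this example, not pdfSpider's real classes; the point is only that each part can be replaced without touching the others.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Reader(Protocol):
    def read(self, url: str) -> bytes: ...


class Converter(Protocol):
    def convert(self, raw: bytes) -> str: ...


class Writer(Protocol):
    def write(self, url: str, text: str) -> None: ...


@dataclass
class DictReader:
    """Hypothetical reader that serves content from memory instead of the web."""
    pages: dict[str, bytes]

    def read(self, url: str) -> bytes:
        return self.pages[url]


class Utf8Converter:
    """Hypothetical converter: decodes raw bytes as UTF-8 text."""
    def convert(self, raw: bytes) -> str:
        return raw.decode("utf-8")


class UpperCaseConverter:
    """Drop-in alternative converter, to show that the part is swappable."""
    def convert(self, raw: bytes) -> str:
        return raw.decode("utf-8").upper()


@dataclass
class ListWriter:
    """Hypothetical writer that collects results in a list instead of a file."""
    out: list[tuple[str, str]] = field(default_factory=list)

    def write(self, url: str, text: str) -> None:
        self.out.append((url, text))


@dataclass
class Pipeline:
    reader: Reader
    converter: Converter
    writer: Writer

    def process(self, url: str) -> None:
        # read -> convert -> write; each stage only knows its own interface
        self.writer.write(url, self.converter.convert(self.reader.read(url)))


reader = DictReader({"http://example.org/a.pdf": b"hello"})

plain = ListWriter()
Pipeline(reader, Utf8Converter(), plain).process("http://example.org/a.pdf")

shouting = ListWriter()
Pipeline(reader, UpperCaseConverter(), shouting).process("http://example.org/a.pdf")
```

Swapping `Utf8Converter` for `UpperCaseConverter` changes the output without any change to the reader, the writer or the pipeline itself, which is the kind of exchange the paragraph above describes.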

e2bady/pdfSpider

Folders and files

Latest commit

History

Repository files navigation

pdfSpider

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages