Skip to content

Latest commit

 

History

History
60 lines (32 loc) · 2.6 KB

README.markdown

File metadata and controls

60 lines (32 loc) · 2.6 KB

Scrapybara

Ruby library providing DSL for describing custom Web scrapers.

Dependencies

PhantomJS — for installation please refer PhantomJS download page.

Installation

git clone https://github.com/OPLZZ/scrapybara.git
cd scrapybara

... install the required rubygems:

bundle install

Usage

You can take a look into examples directory for few annotated examples:

Each definition can be executed by scrapybara binary from bin directory

For example:

bundle exec ./bin/scrapybara examples/job-it.rb

If you run definition in interactive mode, code execution is paused for each #fetch, #extract part of the definition in Pry session. When code is paused, you can debug everything you can do in Pry session.

To continue, press CTRL+D.

To exit, type exit! and press enter.

bundle exec ./bin/scrapybara examples/job-it.rb --interactive

You can run each definition in browser by overriding Capybara driver from command line.

bundle exec ./bin/scrapybara examples/job-it.rb --driver selenium

You can combine previous options, so you can have interactive mode with selenium driver.

bundle exec ./bin/scrapybara examples/job-it.rb --driver selenium --interactive

You can enable debug for interactive mode to add even more breakpoints when definition is executed

bundle exec ./bin/scrapybara examples/job-it.rb --debug --interactive

##Funding Project of Operational Programme Human Resources and Employment No. CZ.1.04/5.1.01/77.00440. The project No. CZ.1.04/5.1.01/77.00440 was funded from the European Social Fund through the Operational Programme Human Resources and Employment and the state budget of Czech Republic.