Skip to content

Latest commit

 

History

History
130 lines (70 loc) · 2.69 KB

CHANGELOG.rdoc

File metadata and controls

130 lines (70 loc) · 2.69 KB

0.7.2 / 2012-05-30

  • Bug fixes

    • Fix bug causing anchor links to have ‘#’ converted to ‘%23’

0.7.1 / 2012-01-20

  • Minor enhancements

    • Switch from robots gem (which people reported problems with) to new robotex gem

  • Bug fixes

    • Fix incorrect default file extension for KyotoCabinet

0.7.0 / 2012-01-19

  • Major enhancements

    • Added support for SQLite3 and Kyoto Cabinet storage

  • Minor enhancements

    • Added Page#base to use base HTML element

    • Use bundler for development dependencies

  • Bug fixes

    • Encode characters in URLs

    • Fix specs to run under rake

    • Fix handling of redirect_to in storage adapters

0.6.1 / 2011-02-24

  • Bug fixes

    • Fix a bug preventing SSL connections from working

0.6.0 / 2011-02-17

  • Major enhancements

    • Added support for HTTP Basic Auth with URLs containing a username and password

    • Added support for anonymous HTTP proxies

  • Minor enhancements

    • Added read_timeout option to set the HTTP request timeout in seconds

  • Bug fixes

    • Don’t fatal error if a page request times out

    • Fix double encoding of links containing %20

0.5.0 / 2010-09-01

  • Major enhancements

    • Added page storage engines for MongoDB and Redis

  • Minor enhancements

    • Use xpath for link parsing instead of CSS (faster) (Marc Seeger)

    • Added skip_query_strings option to skip links with query strings (Joost Baaij)

  • Bug fixes

    • Only consider status code 300..307 a redirect (Marc Seeger)

    • Canonicalize redirect links (Marc Seeger)

0.4.0 / 2010-04-08

  • Major enchancements

    • Cookies can be accepted and sent with each HTTP request.

0.3.2 / 2010-02-04

  • Bug fixes

    • Fixed issue that allowed following redirects off the original domain

0.3.1 / 2010-01-22

  • Minor enhancements

    • Added an attr_accessor to Page for the HTTP response body

  • Bug fixes

    • Fixed incorrect method calls in CLI scripts

0.3.0 / 2009-12-15

  • Major enchancements

    • Option for persistent storage of pages during crawl with TokyoCabinet or PStore

  • Minor enhancements

    • Options can be set via methods on the Core object in the crawl block

0.2.3 / 2009-11-01

  • Minor enhancements

    • Options are now applied per-crawl, rather than module-wide.

  • Bug fixes

    • Fixed a bug which caused deadlock if an exception occurred when crawling the last page in the queue.

0.2.2 / 2009-10-26

  • Minor enhancements

    • When the :verbose option is set to true, exception backtraces are printed to aid debugging.

0.2.1 / 2009-10-24

  • Major enhancements

    • Added HTTPS support.

    • CLI program ‘anemone’, which is a frontend for several tasks.

  • Minor enhancements

    • HTTP request response time recorded in Page.

    • Use of persistent HTTP connections.