FastCrawler

Mohammad Sadegh Majidi Kadkani edited this page Jul 28, 2023 · 1 revision

FastCrawler is a modern, fast (high-performance) web crawling framework for building scrapers with Python 3.11+, based on standard Python type hints.

Documentation: https://github.com/fast-crawler/
Source Code: https://github.com/fast-crawler/

Key Features

  • Fast: Very high performance, on par with Node.js and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available.
  • Fast to code: Increase the speed of feature development by about 200% to 300%. *
  • Fewer bugs: Reduce human (developer) induced errors by about 40%. *
  • Intuitive: Great editor support. Completion everywhere. Less time debugging.
  • Easy: Designed to be easy to use and learn. Less time reading docs.
  • Short: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs.
  • Robust: Get production-ready code, with automatic interactive documentation.
  • Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI (previously known as Swagger) and JSON Schema.
* Estimate based on tests by an internal development team building production applications.

Requirements

  • Python 3.11+
  • pydantic
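
A quick way to confirm the interpreter requirement before installing is sketched below. It uses only the standard library; the helper name meets_minimum and the constant MINIMUM_PYTHON are illustrative and not part of FastCrawler.

```python
import sys

# Minimum interpreter version stated in the requirements above.
MINIMUM_PYTHON = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```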

Installation

pip install fastcrawler

Example

  • Create a file main.py with:
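# Assumption: `parser` is importable from the package root; the original
# snippet calls parser.JsonField without importing it.
from fastcrawler import Spider, Crawler, Parser, parser

# Extracts the category URLs that the next spider should follow.
class CategoryUrlParser(Parser):
    class Config:
        urls_resolver = parser.JsonField("category.url")

# Maps fields of the product JSON onto typed attributes.
class ProductParser(Parser):
    name: str = parser.JsonField("product.name")
    price: int = parser.JsonField("product.price")

# Entry spider: fetches the listing and resolves category URLs.
class DigikalaList(Spider):
    url = "https://google.com"
    parser = CategoryUrlParser

# Assumption: the detail spider was missing from the original snippet; it is
# presumed to receive its URLs from DigikalaList and parse them with ProductParser.
class DigikalaDetail(Spider):
    parser = ProductParser

# Chain the spiders: results of DigikalaList feed DigikalaDetail.
Crawler(
    DigikalaList >> DigikalaDetail
)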
  • Run the crawler with:
python main.py
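
The dotted strings passed to JsonField in the example, such as "product.name", presumably address nested keys in a JSON response, with the type annotation giving the target type. The sketch below illustrates that idea with the standard library only. It is not FastCrawler's implementation: resolve_json_path, FIELD_PATHS, extract, and the sample payload are all hypothetical names chosen for illustration.

```python
from typing import Any


def resolve_json_path(data: dict[str, Any], path: str) -> Any:
    """Walk a nested dict following a dotted path such as "product.name"."""
    current: Any = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"path {path!r} not found at key {key!r}")
        current = current[key]
    return current


# Hypothetical field table mirroring ProductParser: field name -> (path, type).
FIELD_PATHS = {
    "name": ("product.name", str),
    "price": ("product.price", int),
}


def extract(payload: dict[str, Any]) -> dict[str, Any]:
    """Resolve every configured path and coerce the value to its declared type."""
    return {
        field: cast(resolve_json_path(payload, path))
        for field, (path, cast) in FIELD_PATHS.items()
    }


if __name__ == "__main__":
    sample = {"product": {"name": "Phone", "price": "1200"}}
    print(extract(sample))  # {'name': 'Phone', 'price': 1200}
```

Keeping the path lookup separate from the type coercion means a missing key surfaces as a KeyError naming the path, while a malformed value surfaces as a ValueError from the cast.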