BBC Most Popular News Scraper

Build a console application that scrapes the BBC News homepage and outputs a JSON array describing the articles in the most popular "Shared" table.
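As a rough illustration of the link-extraction step, the sketch below collects the `href` and link text of every anchor inside a single container element, using only Python's standard `html.parser`. The container id `shared` and the sample markup are hypothetical; the real BBC page structure would need to be inspected, and the class name `SharedLinkParser` is invented for this example.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SharedLinkParser(HTMLParser):
    """Collect href/title pairs for links inside one container element.

    Assumption: the container is identified by an id attribute. The id
    passed in is hypothetical and not taken from the real BBC markup.
    """

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0            # > 0 while inside the container
        self.current_href = None  # set while inside an <a> element
        self.current_text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth > 0:
            self.depth += 1
            if tag == "a" and "href" in attrs:
                self.current_href = attrs["href"]
                self.current_text = []
        elif attrs.get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        if tag == "a" and self.current_href is not None:
            # Collapse runs of whitespace in the link text.
            title = " ".join("".join(self.current_text).split())
            self.links.append({"href": self.current_href, "title": title})
            self.current_href = None
        self.depth -= 1

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)


def extract_links(html, container_id):
    parser = SharedLinkParser(container_id)
    parser.feed(html)
    return parser.links
```

With a real page, the HTML string would come from an HTTP request to the homepage, and relative links would need to be resolved against the page URL before being followed.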

Only the Shared section is to be scraped. Additionally, follow each link and record the size of the linked HTML (excluding assets) and the word from the article with the highest number of usages (excluding the, a, is, and or I).
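The two per-article values could be computed along these lines. This is a minimal sketch under stated assumptions: the exclusion list is read as the six words "the", "a", "is", "and", "or" and "I"; a word is a run of letters with an optional internal apostrophe; and 1 kb is taken as 1024 bytes, with the size being the byte length of the downloaded HTML. The function names are illustrative only.

```python
import re
from collections import Counter

# Assumption: the brief's exclusion list is interpreted as these six words.
EXCLUDED_WORDS = {"the", "a", "is", "and", "or", "i"}


def most_used_word(text):
    """Return the most frequent non-excluded word, or None if there is none."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in EXCLUDED_WORDS)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def format_size(num_bytes):
    """Format a byte count like the example output, e.g. '1.5kb' or '87kb'.

    Assumption: 1 kb = 1024 bytes, rounded to one decimal place, with a
    trailing '.0' dropped.
    """
    kb = num_bytes / 1024
    return f"{kb:.1f}".rstrip("0").rstrip(".") + "kb"
```

In a full program, `num_bytes` would be `len(body)` for the raw response bytes of each article page, and `text` would be the article body with markup stripped.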

Each element in the JSON array should contain "href", "title", "size" and "most_used_word" keys, with one element per article in the most popular shared list.

Example JSON:

{
    "results": [
        {
            "title":"Thousands join central London protest",
            "href":"http://www.bbc.co.uk/news/uk-england-london-29919083",
            "size": "90.6kb",
            "most_used_word": "dog"
        },
        {
            "title":"Doughnut burger busts day's calories",
            "href":"http://www.bbc.co.uk/news/health-30000934",
            "size": "87kb",
            "most_used_word": "burger"
        }
    ]
}
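One way to produce output in this shape is sketched below, assuming each article has already been reduced to a dictionary with the four required keys. The function name `build_results` is hypothetical, and wrapping the array in a `"results"` object simply mirrors the example above.

```python
import json


def build_results(articles):
    """Serialise article dictionaries into the example's JSON layout."""
    results = [
        {
            "title": article["title"],
            "href": article["href"],
            "size": article["size"],
            "most_used_word": article["most_used_word"],
        }
        for article in articles
    ]
    return json.dumps({"results": results}, indent=4)


if __name__ == "__main__":
    # Sample values copied from the example JSON in this document.
    print(build_results([
        {
            "title": "Doughnut burger busts day's calories",
            "href": "http://www.bbc.co.uk/news/health-30000934",
            "size": "87kb",
            "most_used_word": "burger",
        }
    ]))
```

Selecting the keys explicitly, rather than dumping each dictionary as-is, keeps any extra intermediate fields out of the final output.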