The New York Times (NYT) is widely considered the nation's newspaper of record. It is both well regarded and popular: it has won more Pulitzer Prizes than any other newspaper, and it was the 30th most visited website in the U.S. as of October 2017.
We explore some patterns in what the NYT published between 1987 and 2007 using the annotated New York Times Corpus.
Not News
Has the proportion of news stories about topics unrelated to politics or the economy, such as cooking, travel, fashion, and music, gone up over time? We measure the kind of news story using the news.desk and online.section fields. (See the script for other ideas on how to measure the kind of news.)
- Proportion of Apolitical News Over Time: Script and Figure: Entire Newspaper (Using News Desk), Figure: Section A1 (Using News Desk), and Figure: Entire Newspaper (Using News Desk and Online Section)
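
To make the measure above concrete, here is a minimal sketch of how a yearly share of apolitical stories could be computed. It is not the project's script: the record keys (`year`, `news_desk`) and the set of desk labels treated as apolitical are assumptions made for illustration, and the real news.desk values in the corpus will differ.

```python
from collections import defaultdict

# Hypothetical desk labels treated as apolitical; the actual news.desk
# values in the corpus are different and would need to be hand-coded.
APOLITICAL_DESKS = {"dining", "travel", "style", "arts"}


def apolitical_share_by_year(articles):
    """Return {year: share of articles whose desk is in APOLITICAL_DESKS}."""
    totals = defaultdict(int)
    apolitical = defaultdict(int)
    for art in articles:
        year = art["year"]
        # Normalize the desk label; a missing desk counts as not apolitical.
        desk = (art.get("news_desk") or "").strip().lower()
        totals[year] += 1
        if desk in APOLITICAL_DESKS:
            apolitical[year] += 1
    return {y: apolitical[y] / totals[y] for y in sorted(totals)}


# Toy records, made up for illustration only.
sample = [
    {"year": 1987, "news_desk": "Foreign"},
    {"year": 1987, "news_desk": "Travel"},
    {"year": 2007, "news_desk": "Dining"},
    {"year": 2007, "news_desk": "Style"},
    {"year": 2007, "news_desk": "Metro"},
]
shares = apolitical_share_by_year(sample)
```

Restricting `articles` to front-page stories before calling the function would give the Section A1 variant of the same measure.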
Urban Vs. Rural
We use the locations (hand-indexed), online.locations (algorithmically generated), and dateline fields to estimate rural vs. urban coverage within the US.
- Script and Figure
National Vs. International
We use the Foreign News value of the news.desk field to estimate coverage of foreign news. We can also use the locations (hand-indexed), online.locations (algorithmically generated), and dateline fields to estimate national vs. international coverage.
Corrections
We use the correction.date and correction.text fields to estimate the rate of corrections over time, and what is being corrected (later).
Length of Articles
We use the word.count field to estimate average length of articles and how it has changed over time.
- Article Word Count Over Time: Script, Figure: Average Word Count, and Figure: Median Word Count.
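
As a rough illustration of the two summaries in the figures above, the sketch below computes the mean and median word count per year. The record keys (`year`, `word_count`) are assumed names, and articles with a missing word count are simply skipped, which may not match what the project's script does.

```python
from collections import defaultdict
from statistics import mean, median


def word_count_summary(articles):
    """Return {year: {"mean": ..., "median": ...}} of article word counts."""
    by_year = defaultdict(list)
    for art in articles:
        wc = art.get("word_count")
        if wc is None:  # assumption: drop articles without a word count
            continue
        by_year[art["year"]].append(wc)
    return {
        year: {"mean": mean(counts), "median": median(counts)}
        for year, counts in sorted(by_year.items())
    }


# Toy records, made up for illustration only.
sample = [
    {"year": 1987, "word_count": 400},
    {"year": 1987, "word_count": 600},
    {"year": 1987, "word_count": 2000},
    {"year": 2007, "word_count": 800},
    {"year": 2007, "word_count": None},
]
summary = word_count_summary(sample)
```

Reporting both statistics is useful because a few very long articles pull the mean up while leaving the median unchanged, as the 1987 toy records show.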
Number of Authors per Article
We use the normalized.byline field to estimate the number of authors per article and how that has changed over time.
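
One way to count authors is to split the byline on a separator. The sketch below assumes, purely for illustration, that co-authors in the normalized byline are separated by semicolons and written as "Last, First"; the actual format in the corpus should be checked before relying on this. The record keys are also assumed names.

```python
from collections import defaultdict
from statistics import mean


def count_authors(normalized_byline, sep=";"):
    """Count non-empty names in a byline; `sep` is an assumed separator."""
    if not normalized_byline:
        return 0
    return sum(1 for part in normalized_byline.split(sep) if part.strip())


def mean_authors_by_year(articles):
    """Return {year: mean number of authors among articles with a byline}."""
    by_year = defaultdict(list)
    for art in articles:
        n = count_authors(art.get("normalized_byline"))
        if n > 0:  # unsigned articles are left out of the average
            by_year[art["year"]].append(n)
    return {year: mean(ns) for year, ns in sorted(by_year.items())}


# Toy records with made-up names, for illustration only.
sample = [
    {"year": 1987, "normalized_byline": "Doe, Jane"},
    {"year": 1987, "normalized_byline": "Doe, Jane; Roe, Richard"},
    {"year": 1987, "normalized_byline": None},
    {"year": 2007, "normalized_byline": "Doe, Jane; Roe, Richard; Poe, Ann"},
]
authors_per_article = mean_authors_by_year(sample)
```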
No. of Articles per Author per Year
One common conjecture is that reporters are producing more than they used to. Is that true? We use the normalized.byline field to estimate the average number of articles per author per year and how that metric has evolved over time.
- No. of articles per author per year over time: Script and Figure.
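
The metric above can be sketched as follows: count each author's articles within a year, then average those counts across the authors active in that year. The sketch assumes the bylines have already been split into a list of names stored under a hypothetical `authors` key, and it credits a co-authored article fully to each co-author, which is one of several defensible choices.

```python
from collections import Counter, defaultdict
from statistics import mean


def articles_per_author_by_year(articles):
    """Return {year: mean number of articles per author active that year}."""
    counts = defaultdict(Counter)  # year -> Counter mapping author to articles
    for art in articles:
        # set() guards against the same name being listed twice on one article.
        for author in set(art["authors"]):
            counts[art["year"]][author] += 1
    return {
        year: mean(list(per_author.values()))
        for year, per_author in sorted(counts.items())
    }


# Toy records with made-up names, for illustration only.
sample = [
    {"year": 1987, "authors": ["Doe, Jane"]},
    {"year": 1987, "authors": ["Doe, Jane", "Roe, Richard"]},
    {"year": 2007, "authors": ["Doe, Jane"]},
    {"year": 2007, "authors": ["Doe, Jane"]},
    {"year": 2007, "authors": ["Doe, Jane"]},
    {"year": 2007, "authors": ["Roe, Richard"]},
]
productivity = articles_per_author_by_year(sample)
```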
Proportion of Wire Stories
We use the byline field to identify wire stories.
- Proportion of AP and Reuters Stories Over Time: Script and Figure
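
A simple way to flag wire stories is to match agency names in the byline. The pattern below is an assumption about how AP and Reuters appear in bylines, not a description of the corpus, and the record keys are assumed names. Articles without a byline stay in the denominator and count as non-wire.

```python
import re
from collections import defaultdict

# Assumed spellings of the two agencies; matched against the upper-cased byline.
WIRE_PATTERN = re.compile(r"\b(AP|ASSOCIATED PRESS|REUTERS)\b")


def is_wire(byline):
    """Return True if the byline mentions AP or Reuters as a whole word."""
    return bool(byline) and WIRE_PATTERN.search(byline.upper()) is not None


def wire_share_by_year(articles):
    """Return {year: share of articles whose byline names AP or Reuters}."""
    totals = defaultdict(int)
    wire = defaultdict(int)
    for art in articles:
        totals[art["year"]] += 1
        if is_wire(art.get("byline")):
            wire[art["year"]] += 1
    return {year: wire[year] / totals[year] for year in sorted(totals)}


# Toy records with made-up bylines, for illustration only.
sample = [
    {"year": 1987, "byline": "By THE ASSOCIATED PRESS"},
    {"year": 1987, "byline": "By JANE DOE"},
    {"year": 2007, "byline": "By REUTERS"},
    {"year": 2007, "byline": "By JANE DOE"},
    {"year": 2007, "byline": "By RICHARD ROE"},
    {"year": 2007, "byline": None},
]
wire_shares = wire_share_by_year(sample)
```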
Race and Gender of Reporters
We use the normalized.byline field to get the names of the reporters, and the gender and ethnicolr packages to impute their gender and race.
Gaurav Sood
Released under CC BY 2.0.