Media Attention to Science


In this work, we try to answer three main questions: (i) Has media attention to science increased over the years? (ii) Does media attention mean scientific attention? (iii) Can we predict media attention from simple properties of a paper, such as its number of downloads, title, abstract, authors, and affiliations?


We collected data for 44,000 papers published in the journal PNAS over 13 years (2004 to 2016).

Aggregated and processed statistics from the dataset are available on Dropbox. The complete raw HTML files (containing the paper text and other metadata) are also available on request (a couple of gigabytes compressed).

Description of the files: Each paper has a unique identifier that indicates the year and week in which it was published. Years are coded as 101 = 2004, 102 = 2005, ..., 113 = 2016, and weeks as 1 = week 1, ..., 52 = week 52.

  • paperid_media_metrics.txt contains all the media metrics, including mentions in news and social media.
  • paperid_metadata.txt contains the metadata: title, authors, author affiliations, abstract, and field.
  • paperid_num_citations.txt contains the number of citations.
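
The year and week coding described above can be decoded with a small helper. This is a minimal sketch, not part of the released code: the function names are hypothetical, and it assumes the year code and week number have already been extracted from the identifier as two integers, since the exact layout of the identifier string is not specified here.

```python
# Hypothetical helpers for the coding described above:
# year codes 101..113 map to 2004..2016, weeks run from 1 to 52.
FIRST_CODE = 101
LAST_CODE = 113
FIRST_YEAR = 2004


def decode_year(year_code: int) -> int:
    """Map a year code (101..113) to a calendar year (2004..2016)."""
    if not FIRST_CODE <= year_code <= LAST_CODE:
        raise ValueError(f"year code out of range: {year_code}")
    return FIRST_YEAR + (year_code - FIRST_CODE)


def decode_week(week: int) -> int:
    """Check that a week number lies in 1..52 and return it unchanged."""
    if not 1 <= week <= 52:
        raise ValueError(f"week out of range: {week}")
    return week


def decode(year_code: int, week: int) -> tuple[int, int]:
    """Return (calendar year, week) for an already-split identifier."""
    return decode_year(year_code), decode_week(week)


if __name__ == "__main__":
    print(decode(102, 7))
```

Keeping the range checks explicit makes it easier to spot identifiers that fall outside the 2004 to 2016 window when joining the three files on paper identifier.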


The code used in the paper is on GitHub.