Motivation 
In the past few years, voices accusing established media outlets of being biased have grown louder. Perhaps the loudest of these belongs to Donald J. Trump, who often claims that he is treated unfairly by the "mainstream" media. 
This project takes a look at one of the outlets that Trump regularly criticises: The New York Times. Specifically, I wanted to explore how the New York Times uses its power as a gatekeeper of opinions. With this in mind, I analysed the part of the paper that does not aim to be neutral, the "Opinion" section. 
To find out what kind of opinions are welcome and how the readership reacts to them, I examined the sentiments expressed in over 700 articles about the President and the roughly 350'000 corresponding comments. Furthermore, I examined whether readers treat female and male authors differently, as an ever-growing body of research would suggest.
Method
As described, I am dealing with text data that voices people's opinions. Thus, I chose to perform sentiment analysis on the text, or, as it may be better known to some, opinion mining. Sentiment analysis aims to uncover the attitude of the author towards a particular topic and is therefore an ideal method for this project. 
I used the "Lexicoder Sentiment Dictionary" by Young & Soroka, which was originally developed to analyse news content and is thus a good fit for this project. Before I applied the method, I ran the pre-processing script that is recommended by Lexicoder's creators. I also removed all punctuation and stop words, stemmed the words and set them to lowercase. Further, I removed all words that did not appear in at least two documents and weighted the document-feature matrix by proportion.
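The pre-processing steps above can be illustrated with a minimal sketch. This is not the code used in this project: the stop-word list and the example documents are made up, stemming is left out to keep the example dependency-free, and the function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real analysis
# would use a complete list.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text, drop punctuation and digits, remove stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def document_feature_matrix(docs, min_docfreq=2):
    """Return (vocabulary, rows); each row holds term proportions per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    # Document frequency: in how many documents does each term occur?
    docfreq = Counter(term for c in counts for term in c)
    # Keep only terms that appear in at least `min_docfreq` documents.
    vocab = sorted(t for t, n in docfreq.items() if n >= min_docfreq)
    rows = []
    for c in counts:
        total = sum(c[t] for t in vocab)
        # Weight by proportion so that long and short documents are comparable.
        rows.append([c[t] / total if total else 0.0 for t in vocab])
    return vocab, rows


# Made-up example documents.
docs = [
    "The president is a disaster.",
    "The president praised the deal.",
    "A disaster of a deal!",
]
vocab, rows = document_feature_matrix(docs)
```

In this toy example, "praised" is dropped because it occurs in only one document, and every remaining row sums to 1.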
Data 
I scraped all NYTimes Op-Eds published in 2017 and 2018 and used the NYTimes API to get the corresponding comments. This resulted in a dataset consisting of 1'528 Op-Eds and about 550'000 corresponding comments.
Opinion articles can be written either by one of the New York Times staff writers or by outsiders. To determine the authors' gender (which I needed in the second part of the analysis), I used a "genderizer" module to derive the gender from their first names and manually coded those that could not be classified with a sufficiently high probability. About one-quarter of the articles about Trump were written by a female author.
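The logic of "classify automatically, fall back to manual coding" can be sketched as follows. The lookup table, its probabilities, the author names and the 0.9 threshold are all assumptions for illustration; they are neither the output of the module used here nor the cut-off applied in this project.

```python
# Hypothetical first name -> (gender, probability) table standing in for the
# output of a name-based classifier. The values are made up.
NAME_TABLE = {
    "anna": ("female", 0.99),
    "peter": ("male", 0.99),
    "jamie": ("female", 0.55),
}


def classify_author(full_name, threshold=0.9):
    """Return 'female', 'male', or 'manual' if the name needs manual coding."""
    parts = full_name.split()
    if not parts:
        return "manual"
    gender, prob = NAME_TABLE.get(parts[0].lower(), (None, 0.0))
    # Unknown or ambiguous first names are flagged for manual coding.
    if gender is None or prob < threshold:
        return "manual"
    return gender


# Made-up author names.
labels = {name: classify_author(name)
          for name in ["Anna Example", "Peter Sample", "Jamie Doe", "Xyz Unknown"]}
```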
Articles
Just about half of the articles (739 to be exact) were about Donald Trump, with about 350'000 corresponding comments. The following word cloud shows the prevalence of the topics covered in these articles: 
The next figure shows that there has been a shift in topics over the years: In 2017, the discussion was dominated by the Republicans’ attempt to overturn the Affordable Care Act, while 2018 was mainly dominated by the investigation of Special Counsel Robert Mueller. 
Comments
I found that articles about Donald Trump consistently get more comments than articles that are not about him. Averaged over both years, the articles that mention Trump get 475 comments each, while articles that do not mention him get only 258. 
This is not very surprising: among other things, it is well known that the "Trump" brand has a high media value and that anything "Trump" attracts a lot of attention.
What do New York Times commentators think about Trump?
In a first step, I examined the sentiment in the 739 opinion articles that dealt with Donald Trump. A positive sentiment score implies a positive opinion of the author and vice versa. 
The general sentiment of an article might well be negative simply because of the topics that were most frequently addressed. To make sure that I was not just measuring this general sentiment, I also calculated a targeted sentiment score by looking only at the words in proximity to mentions of President Trump.
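The difference between a general and a targeted score can be shown with a small sketch. The word lists below are a toy lexicon, not the Lexicoder Sentiment Dictionary, and both the scoring formula (positive minus negative matches as a percentage of tokens) and the window size are assumptions rather than the settings used in this project.

```python
# Toy lexicon (an assumption); NOT the Lexicoder Sentiment Dictionary.
POSITIVE = {"great", "good", "win"}
NEGATIVE = {"bad", "chaos", "lie"}


def sentiment(tokens):
    """Positive minus negative matches, as a percentage of all tokens."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 100.0 * (pos - neg) / len(tokens)


def targeted_sentiment(tokens, target="trump", window=2):
    """Score only the tokens within `window` positions of a target mention."""
    keep = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            keep.update(range(max(0, i - window), min(len(tokens), i + window + 1)))
    # The target word itself carries no sentiment, so it is excluded.
    nearby = [tokens[i] for i in sorted(keep) if tokens[i] != target]
    return sentiment(nearby)


# Made-up, already pre-processed example sentence.
tokens = ["markets", "had", "a", "good", "year", "but",
          "trump", "spread", "chaos", "again"]
general_score = sentiment(tokens)            # one positive, one negative word
targeted_score = targeted_sentiment(tokens)  # only "chaos" is near "trump"
```

Here the positive word sits far from the mention, so the general score is neutral while the targeted score is negative.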
Opinions of all Authors
I found that the general sentiment of the articles is negative, with a mean sentiment score of -0.99. Again, this is not too surprising, as we have seen above that many articles deal with topics that might be negative in themselves, e.g. the Russia investigation. But what about the targeted sentiment around Mr Trump? Interestingly, it is even more negative: the average targeted sentiment score is -2.5, which is statistically significantly different from the general sentiment score. 
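Which test was used for this comparison is not stated here. Since every article has both a general and a targeted score, one common option would be a paired t-test; the following sketch computes its test statistic on made-up scores and is an assumption, not a reproduction of the analysis.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic for the mean of the pairwise differences (second - first)."""
    diffs = [b - a for a, b in zip(first, second)]
    # Standard error of the mean difference uses the sample standard deviation.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Made-up scores for four articles (not the project's data).
general = [-1.0, -0.5, -1.5, 0.0]
targeted = [-2.0, -2.5, -3.0, -1.0]
t_stat = paired_t(general, targeted)
```

A negative statistic means the targeted scores are lower on average; turning it into a p-value requires the t distribution with n - 1 degrees of freedom, which the standard library does not provide.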
Opinions of Staff Writers
The 739 articles of interest were written by 196 different authors. However, most of these articles were written by a small number of people: Only 15 people – all New York Times staff writers – wrote over 10 articles. These 15 writers wrote 513, or roughly 70% of the articles.
A closer look at the 15 staff writers paints a gloomy picture for President Trump: on average, none of them uses more positive than negative words when talking about him.
The left plot shows their general sentiment in articles about Trump, while the right plot shows the targeted (local) sentiment. The plots show that the articles themselves tend to be rather negative, as the average article sentiment score is negative for nearly all authors. Furthermore, we learn that the sentiment around the President is even more negative. This indicates that the negativity of the articles might be driven by opinions about Trump.
What about the readers?
Apparently Trump is indeed not portrayed in a positive way by the opinion authors. But how do the readers react to such a negative description?
The table above shows that readers' opinions about Trump are the main driver of the negativity in the NYTimes comment section: comments referring to Trump were much more negative than the ones that did not, regardless of whether the article itself was about the President.
To validate this result, I looked at the relative frequency of keywords (or keyness) by sentiment.
A high keyness value for a term means that this word is particularly important (or key) for a document to fall into one of the two categories, in this case, negative and positive comments. Remarkably, the term “Trump” has the second-highest keyness value for the negative category. This confirms the finding that negative comments are driven by the President.
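Keyness can be computed in several ways, and the measure used here is not stated. A common choice is a signed chi-squared statistic on a 2x2 table of term counts, which the following sketch assumes; the counts are made up.

```python
def chi2_keyness(target_count, target_total, ref_count, ref_total):
    """Signed chi-squared keyness of a term in a target vs. a reference group.

    Positive values mean the term is over-represented in the target group.
    """
    a = target_count                 # term in target group
    b = ref_count                    # term in reference group
    c = target_total - target_count  # other tokens in target group
    d = ref_total - ref_count        # other tokens in reference group
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Expected count of the term in the target group under independence.
    expected_a = (a + b) * (a + c) / n
    return chi2 if a >= expected_a else -chi2


# Made-up counts: a term occurs 30 times in 1'000 tokens of negative comments
# and 10 times in 1'000 tokens of positive comments.
key_negative = chi2_keyness(30, 1000, 10, 1000)
key_positive = chi2_keyness(10, 1000, 30, 1000)
```

The sign makes the two categories comparable on one axis: the same term gets a positive value in the group where it is over-represented and a negative value in the other.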
Is sentiment contagious?
Other studies found that emotions can be contagious outside of in-person interactions, for example in online markets. Might this be happening here too?
The graph above indeed shows a clear positive correlation between article and comment sentiment: Negative articles get more negative comments and vice versa. Does that mean that we are observing emotion contagion? Possibly.
However, the correlation could also simply be due to the topics of articles: Imagine an article about a "negative" topic, like Trump's relationship with North Korea, compared to an article about maternity leave. The two topics will probably, regardless of the sentiment that the article is written in, attract different kinds of reader comments.
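The topic problem can be made concrete with a small sketch on made-up data. It compares the pooled correlation with the correlation after subtracting each topic's mean from both scores, which is one simple way to control for topic. The topic labels, the scores and the function names are assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def demean_by_group(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


# Made-up (article sentiment, mean comment sentiment) pairs for two topics.
topics = ["north korea"] * 4 + ["maternity leave"] * 4
article = [-3, -3, -2, -2, 1, 1, 2, 2]
comment = [-3, -2, -3, -2, 1, 2, 1, 2]

pooled_r = pearson(article, comment)
within_r = pearson(demean_by_group(article, topics),
                   demean_by_group(comment, topics))
```

In this constructed example the pooled correlation is strong, but it disappears entirely within topics, which is exactly the pattern one would expect if topic, rather than contagion, were driving the result.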
Attitudes Towards Female Authors 
Of course, the Times can be a gatekeeper in different ways, one of which is its choice of whose opinions get published. About 25% of the articles were written by a female author.
The figure above shows a clear gender divide when it comes to article topics: Apparently, women tend to write about education, children and birth control. On the other hand, men tend to voice their opinion about the economy, healthcare and politics. 
Interestingly, the sentiment of comments on female-authored Op-Eds about Trump is slightly more positive (-0.21) than that of comments on male-authored Op-Eds (-0.53). However, the sentiment of articles written by women is, in general, less negative: -0.64 for women versus -1.10 for men (local sentiment -1.9 vs. -2.7), which could indicate that I am mostly measuring the sentiment around the topics they address. 
However, previous studies have found that women (and people of colour) are much more likely to be attacked by readers. Thus, I examined the comments that explicitly mention the author of the Op-Ed. 
The table above indicates that male authors are more likely to be mentioned or addressed by readers. Women are more frequently addressed by their first name, while men are more likely to be addressed by their surname. 
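How a comment refers to an author can be detected with simple whole-word matching. The sketch below is a hypothetical illustration with made-up names and comments, not the procedure used in this project, and it ignores complications such as nicknames or names that are also common words.

```python
import re


def mention_type(comment, first_name, surname):
    """Classify how a comment refers to the author, if at all."""
    text = comment.lower()

    def has_word(word):
        # \b ensures whole-word matches; re.escape guards against odd characters.
        return re.search(rf"\b{re.escape(word.lower())}\b", text) is not None

    has_first, has_last = has_word(first_name), has_word(surname)
    if has_first and has_last:
        return "full name"
    if has_last:
        return "surname"
    if has_first:
        return "first name"
    return "none"


# Made-up comments about a hypothetical author called Jane Doe.
comments = [
    "Thank you, Jane, for this column!",
    "Ms. Doe misses the point entirely.",
    "Jane Doe is right as usual.",
    "I disagree with the whole premise.",
]
mention_types = [mention_type(c, "Jane", "Doe") for c in comments]
```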
Next, I looked into whether the comments that addressed the authors directly differed in sentiment.
Surprisingly, the opposite of what I would have expected is happening: comments that mention the author are mostly positive (keep in mind that the comments on Trump articles are generally negative, with a sentiment score of -0.9). Most interestingly, this effect is much bigger for female authors.
Conclusion
The results of the analysis suggest that President Trump is mainly portrayed in a negative light, both by the authors and by the readers. Does this confirm that Trump is indeed treated unfairly by the Times? That is a bit complicated: some people might think it would be fair if people who defend Trump's actions had their opinions published in the New York Times just as often as those who criticise them. However, there might be a practical problem with that: in some cases, actions could be so hard to defend that the proposed arguments would not be factually sound enough for the standards of a newspaper. 
Furthermore, there is tentative evidence of emotional contagion between writers and commenters. However, this result needs to be examined further. Specifically, one would need to control for article topic to be able to make a definite statement about it.
This project found that, if a comment mentions the author of the Op-Ed, the comment is more positive if the article was written by a female author. This result is somewhat surprising, as previous studies have found that women are treated worse than men by online communities and evoke strong reactions. One possible explanation for this finding might be the rigorous moderation rules set up by the NYTimes. It might even be that the female authors get just as many or even more negative comments than male authors, but that comments about female authors are phrased in a manner that prevents them from being published.
_____________

Methods Used
Data Scraping, Data Manipulation, Text Mining, Natural Language Processing (NLP), Data Visualisation