Sie sind auf Seite 1von 32

9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

VARIANCE EXPLAINED

menu

Text analysis of Trump's tweets con rms he writes only the (angrier)
Android half
I dont normally post about politics (Im not particularly savvy about polling, which is where data science has had
the largest impact on politics). But this weekend I saw a hypothesis about Donald Trumps twitter account that
simply begged to be investigated with data:

Todd Vaziri Follow


@tvaziri

Every non-hyperbolic tweet is from iPhone (his staff).

http://varianceexplained.org/r/trump-tweets/ 1/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Every hyperbolic tweet is from Android (from him).


12:20 PM - Aug 6, 2016

264 10,331 14,479

When Trump wishes the Olympic team good luck, hes tweeting from his iPhone. When hes insulting a rival, hes
usually tweeting from an Android. Is this an artifact showing which tweets are Trumps own and which are by some
handler?
Others have explored Trumps timeline and noticed this tends to hold up- and Trump himself does indeed
tweet from a Samsung Galaxy. But how could we examine it quantitatively? Ive been writing about text mining
and sentiment analysis recently, particularly during my development of the tidytext R package with Julia Silge, and
this is a great opportunity to apply it again.
My analysis, shown below, concludes that the Android and iPhone tweets are clearly from different
people, posting during different times of day and using hashtags, links, and retweets in distinct ways. Whats
more, we can see that the Android tweets are angrier and more negative, while the iPhone tweets tend to be
benign announcements and pictures. Overall Id agree with @tvaziris analysis: this lets us tell the difference
between the campaigns tweets (iPhone) and Trumps own (Android).

The dataset
First well retrieve the content of Donald Trumps timeline using the userTimeline function in the twitteR
package:1

library(dplyr)
library(purrr)
library(twitteR)

# You'd need to set global options with an authenticated app


setup_twitter_oauth(getOption("twitter_consumer_key"),
getOption("twitter_consumer_secret"),
getOption("twitter_access_token"),
getOption("twitter_access_token_secret"))

http://varianceexplained.org/r/trump-tweets/ 2/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
# We can request only 3200 tweets at a time; it will return fewer
# depending on the API
trump_tweets <- userTimeline("realDonaldTrump", n = 3200)
trump_tweets_df <- tbl_df(map_df(trump_tweets, as.data.frame))

# if you want to follow along without setting up Twitter authentication,


# just use my dataset:
load(url("http://varianceexplained.org/files/trump_tweets_df.rda"))

We clean this data a bit, extracting the source application. (Were looking only at the iPhone and Android tweets- a
much smaller number are from the web client or iPad).

library(tidyr)

tweets <- trump_tweets_df %>%


select(id, statusSource, text, created) %>%
extract(statusSource, "source", "Twitter for (.*?)<") %>%
filter(source %in% c("iPhone", "Android"))

Overall, this includes 628 tweets from iPhone, and 762 tweets from Android.
One consideration is what time of day the tweets occur, which wed expect to be a signature of their user.
Here we can certainly spot a difference:

library(lubridate)
library(scales)

tweets %>%
count(source, hour = hour(with_tz(created, "EST"))) %>%
mutate(percent = n / sum(n)) %>%
ggplot(aes(hour, percent, color = source)) +
geom_line() +
scale_y_continuous(labels = percent_format()) +
labs(x = "Hour of day (EST)",
y = "% of tweets",
color = "")
http://varianceexplained.org/r/trump-tweets/ 3/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

http://varianceexplained.org/r/trump-tweets/ 4/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Trump on the Android does a lot more tweeting in the morning, while the campaign posts from the iPhone
more in the afternoon and early evening.
Another place we can spot a difference is in Trumps anachronistic behavior of manually retweeting people
by copy-pasting their tweets, then surrounding them with quotation marks:

Donald J. Trump Follow


@realDonaldTrump

"@trumplican2016: @realDonaldTrump @DavidWohl stay the


course mr trump your message is resonating with the PEOPLE"
9:00 PM - Jul 27, 2016

4,304 6,544 23,603

Almost all of these quoted tweets are posted from the Android:

http://varianceexplained.org/r/trump-tweets/ 5/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

In the remaining by-word analyses in this text, Ill lter these quoted tweets out (since they contain text from
followers that may not be representative of Trumps own tweets).
Somewhere else we can see a difference involves sharing links or pictures in tweets.
http://varianceexplained.org/r/trump-tweets/ 6/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

tweet_picture_counts <- tweets %>%


filter(!str_detect(text, '^"')) %>%
count(source,
picture = ifelse(str_detect(text, "t.co"),
"Picture/link", "No picture/link"))

ggplot(tweet_picture_counts, aes(source, n, fill = picture)) +


geom_bar(stat = "identity", position = "dodge") +
labs(x = "", y = "Number of tweets", fill = "")

http://varianceexplained.org/r/trump-tweets/ 7/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

It turns out tweets from the iPhone were 38 times as likely to contain either a picture or a link. This also
makes sense with our narrative: the iPhone (presumably run by the campaign) tends to write announcement
tweets about events, like this:
http://varianceexplained.org/r/trump-tweets/ 8/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Donald J. Trump Follow


@realDonaldTrump

Thank you Windham, New Hampshire! #TrumpPence16 #MAGA


7:19 PM - Aug 6, 2016 Windham, NH
2,337 6,044 19,779

While Android (Trump himself) tends to write picture-less tweets like:

Donald J. Trump Follow


@realDonaldTrump

The media is going crazy. They totally distort so many things on


purpose. Crimea, nuclear, "the baby" and so much more. Very
dishonest!
2:31 PM - Aug 7, 2016
10,860 13,336 39,398

Comparison of words
http://varianceexplained.org/r/trump-tweets/ 9/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Now that were sure theres a difference between these two accounts, what can we say about the difference in the
content? Well use the tidytext package that Julia Silge and I developed.
We start by dividing into individual words using the unnest_tokens function (see this vignette for more), and
removing some common stopwords2:

library(tidytext)

reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"


tweet_words <- tweets %>%
filter(!str_detect(text, '^"')) %>%
mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
unnest_tokens(word, text, token = "regex", pattern = reg) %>%
filter(!word %in% stop_words$word,
str_detect(word, "[a-z]"))

tweet_words

## # A tibble: 8,753 x 4
## id source created word
## <chr> <chr> <time> <chr>
## 1 676494179216805888 iPhone 2015-12-14 20:09:15 record
## 2 676494179216805888 iPhone 2015-12-14 20:09:15 health
## 3 676494179216805888 iPhone 2015-12-14 20:09:15 #makeamericagreatagain
## 4 676494179216805888 iPhone 2015-12-14 20:09:15 #trump2016
## 5 676509769562251264 iPhone 2015-12-14 21:11:12 accolade
## 6 676509769562251264 iPhone 2015-12-14 21:11:12 @trumpgolf
## 7 676509769562251264 iPhone 2015-12-14 21:11:12 highly
## 8 676509769562251264 iPhone 2015-12-14 21:11:12 respected
## 9 676509769562251264 iPhone 2015-12-14 21:11:12 golf
## 10 676509769562251264 iPhone 2015-12-14 21:11:12 odyssey
## # ... with 8,743 more rows

What were the most common words in Trumps tweets overall?

http://varianceexplained.org/r/trump-tweets/ 10/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

These should look familiar for anyone who has seen the feed. Now lets consider which words are most common
from the Android relative to the iPhone, and vice versa. Well use the simple measure of log odds ratio, calculated
for each word as:3

http://varianceexplained.org/r/trump-tweets/ # in Android 11/32


9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

# in Android+1

Total Android+1
log 2 ( )
# in iPhone+1

Total iPhone+1

android_iphone_ratios <- tweet_words %>%


count(word, source) %>%
filter(sum(n) >= 5) %>%
spread(source, n, fill = 0) %>%
ungroup() %>%
mutate_each(funs((. + 1) / sum(. + 1)), -word) %>%
mutate(logratio = log2(Android / iPhone)) %>%
arrange(desc(logratio))

Which are the words most likely to be from Android and most likely from iPhone?

http://varianceexplained.org/r/trump-tweets/ 12/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

A few observations:

Most hashtags come from the iPhone. Indeed, almost no tweets from Trumps Android contained
hashtags, with some rare exceptions like this one. (This is true only because we ltered out the quoted
retweets, as Trump does sometimes quote tweets like this that contain hashtags).

http://varianceexplained.org/r/trump-tweets/ 13/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Words like join and tomorrow, and times like 7pm, also came only from the iPhone. The iPhone
is clearly responsible for event announcements like this one (Join me in Houston, Texas tomorrow night at
7pm!)

A lot of emotionally charged words, like badly, crazy, weak, and dumb, were
overwhelmingly more common on Android. This supports the original hypothesis that this is the
angrier or more hyperbolic account.

Sentiment analysis: Trumps tweets are much more negative than his campaigns
Since weve observed a difference in sentiment between the Android and iPhone tweets, lets try quantifying it.
Well work with the NRC Word-Emotion Association lexicon, available from the tidytext package, which associates
words with 10 sentiments: positive, negative, anger, anticipation, disgust, fear, joy, sadness, surprise, and
trust.

nrc <- sentiments %>%


filter(lexicon == "nrc") %>%
dplyr::select(word, sentiment)

nrc

## # A tibble: 13,901 x 2
## word sentiment
## <chr> <chr>
## 1 abacus trust
## 2 abandon fear
## 3 abandon negative
## 4 abandon sadness
## 5 abandoned anger
## 6 abandoned fear
## 7 abandoned negative
## 8 abandoned sadness
## 9 abandonment anger

http://varianceexplained.org/r/trump-tweets/ 14/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

## 10 abandonment fear
## # ... with 13,891 more rows

To measure the sentiment of the Android and iPhone tweets, we can count the number of words in each category:

sources <- tweet_words %>%


group_by(source) %>%
mutate(total_words = n()) %>%
ungroup() %>%
distinct(id, source, total_words)

by_source_sentiment <- tweet_words %>%


inner_join(nrc, by = "word") %>%
count(sentiment, id) %>%
ungroup() %>%
complete(sentiment, id, fill = list(n = 0)) %>%
inner_join(sources) %>%
group_by(source, sentiment, total_words) %>%
summarize(words = sum(n)) %>%
ungroup()

head(by_source_sentiment)

## # A tibble: 6 x 4
## source sentiment total_words words
## <chr> <chr> <int> <dbl>
## 1 Android anger 4901 321
## 2 Android anticipation 4901 256
## 3 Android disgust 4901 207
## 4 Android fear 4901 268
## 5 Android joy 4901 199
## 6 Android negative 4901 560

(For example, we see that 321 of the 4901 words in the Android tweets were associated with anger). We then
want to measure how much more likely the Android account is to use an emotionally-charged term relative to the
iPhone account. Since this is count data, we can use a Poisson test to measure the difference:
http://varianceexplained.org/r/trump-tweets/ 15/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

library(broom)

sentiment_differences <- by_source_sentiment %>%


group_by(sentiment) %>%
do(tidy(poisson.test(.$words, .$total_words)))

sentiment_differences

## Source: local data frame [10 x 9]


## Groups: sentiment [10]
##
## sentiment estimate statistic p.value parameter conf.low
## (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1 anger 1.492863 321 2.193242e-05 274.3619 1.2353162
## 2 anticipation 1.169804 256 1.191668e-01 239.6467 0.9604950
## 3 disgust 1.677259 207 1.777434e-05 170.2164 1.3116238
## 4 fear 1.560280 268 1.886129e-05 225.6487 1.2640494
## 5 joy 1.002605 199 1.000000e+00 198.7724 0.8089357
## 6 negative 1.692841 560 7.094486e-13 459.1363 1.4586926
## 7 positive 1.058760 555 3.820571e-01 541.4449 0.9303732
## 8 sadness 1.620044 303 1.150493e-06 251.9650 1.3260252
## 9 surprise 1.167925 159 2.174483e-01 148.9393 0.9083517
## 10 trust 1.128482 369 1.471929e-01 350.5114 0.9597478
## Variables not shown: conf.high (dbl), method (fctr), alternative (fctr)

And we can visualize it with a 95% con dence interval:

http://varianceexplained.org/r/trump-tweets/ 16/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

http://varianceexplained.org/r/trump-tweets/ 17/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Thus, Trumps Android account uses about 40-80% more words related to disgust, sadness, fear, anger, and
other negative sentiments than the iPhone account does. (The positive emotions werent different to a
statistically signi cant extent).
Were especially interested in which words drove this different in sentiment. Lets consider the words with the
largest changes within each category:

http://varianceexplained.org/r/trump-tweets/ 18/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

This con rms that lots of words annotated as negative sentiments (with a few exceptions like crime and
terrorist) are more common in Trumps Android tweets than the campaigns iPhone tweets.

Conclusion: the ghost in the political machine


I was fascinated by the recent New Yorker article about Tony Schwartz, Trumps ghostwriter for The Art of the
Deal. Of particular interest was how Schwartz imitated Trumps voice and philosophy:

In his journal, Schwartz describes the process of trying to make Trumps voice palatable in the book. It was kind of
a trick, he writes, to mimic Trumps blunt, staccato, no-apologies delivery while making him seem almost
boyishly appealing. Looking back at the text now, Schwartz says, I created a character far more winning than
Trump actually is.

Like any journalism, data journalism is ultimately about human interest, and theres one human Im interested in:
who is writing these iPhone tweets?
The majority of the tweets from the iPhone are fairly benign declarations. But consider cases like these, both
posted from an iPhone:

Donald J. Trump Follow


@realDonaldTrump

Like the worthless @NYDailyNews, looks like @politico will be


going out of business. Bad reporting- no money, no cred!
6:00 AM - Feb 10, 2016

628 2,070 6,568

Donald J. Trump Follow


@realDonaldTrump

Failing @NYTimes will always take a good story about me and


make it bad. Every article is unfair and biased. Very sad!
http://varianceexplained.org/r/trump-tweets/ 19/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

9:11 AM - May 20, 2016 United States


1,681 3,640 11,693

These tweets certainly sound like the Trump we all know. Maybe our above analysis isnt complete: maybe Trump
has sometimes, however rarely, tweeted from an iPhone (perhaps dictating, or just using it when his own battery
ran out). But what if our hypothesis is right, and these werent authored by the candidate- just someone trying
their best to sound like him?
Or what about tweets like this (also iPhone), which defend Trumps slogan- but doesnt really sound like
something hed write?

Donald J. Trump Follow


@realDonaldTrump

Our country does not feel 'great already' to the millions of


wonderful people living in poverty, violence and despair.
7:42 PM - Jul 27, 2016

9,496 16,330 48,130

A lot has been written about Trumps mental state. But Id really rather get inside the head of this anonymous
staffer, whose job is to imitate Trumps unique cadence (Very sad!), or to put a positive spin on it, to millions of
followers. Are they a true believer, or just a cog in a political machine, mixing whatever mainstream appeal they
can into the @realDonaldTrump concoction? Like Tony Schwartz, will they one day regret their involvement?

1. To keep the post concise I dont show all of the code, especially code that generates gures. But you can nd the full code
here.
2. We had to use a custom regular expression for Twitter, since typical tokenizers would split the # off of hashtags and @ off of
usernames. We also removed links and ampersands ( &amp; ) from the text.
3. The plus ones, called Laplace smoothing are to avoid dividing by zero and to put more trust in common words.

David Robinson
Data Scientist at Stack Over ow, works in R and Python.

http://varianceexplained.org/r/trump-tweets/ 20/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

Email Twitter Github Stack Over ow

Subscribe

Your email
Subscribe to this blog

Recommended Blogs

DataCamp
R Bloggers
RStudio Blog
R4Stats
Simply Statistics

Text analysis of Trump's tweets con rms he writes only the (angrier) Android half was published on August 09, 2016.

188 Comments Variance Explained


1 Login

Sort by Best
Recommend 88 Share

Join the discussion

LOG IN WITH
OR SIGN UP WITH DISQUS ?

Name

Adam Zeller a year ago


Would be interested to know if the Trump tweets get more engagement (i.e. RTs and Favorites) than staff
authored tweets. Would say a lot for authenticity especially since his team is using social marketing best practices
and he is not.
41 Reply Share
http://varianceexplained.org/r/trump-tweets/ 21/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
41 Reply Share

David Robinson Mod > Adam Zeller a year ago


Here's a breakdown. The biggest difference wasn't between iPhone and Android, but regarding manual RTs
and pictures with pictures/links, which got fewer. The typical "just text" tweet got about the same
engagement between iPhone and Android.

(Error bars are 95% confidence intervals on a geometric scale)


13 Reply Share

IstiakGani > David Robinson a year ago


Any chance you could show the code for this? It would be super helpful in making me better at R!
2 Reply Share

mimiUSA > David Robinson a year ago


Fascinating. Confirms one track mind of vindictive narcissist.
2 Reply Share

disqus_xEFHDaxwEP > David Robinson a year ago


How to compare your Sisyphean with the Jewish mysticists, searching the covert second Bible
sense? Your primitive politically correct brains does not comprehend the Trump`s understanding of
the disasters in American life. Dig your tweets poor chap!
Reply Share

David Robinson Mod > disqus_xEFHDaxwEP a year ago


What.
29 Reply Share

Gabriel Jones > David Robinson a year ago


Day 203, still don't understand message. Please send help
http://varianceexplained.org/r/trump-tweets/ 11 Reply Share 22/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
11 Reply Share

jimshaw54 > disqus_xEFHDaxwEP a year ago


But a lot of people do comprehend that Trump could very well be a disaster if allowed to sit
in the Oval office.
10 Reply Share

tonewilliamsVERIFIED > jimshaw54 8 months ago


#guesswhathappened :(
1 Reply Share

Kristi Johnson > disqus_xEFHDaxwEP a year ago


Seriously, you should take your meds.Trump could care less about anything disastrous in a
life in America. He's a racist and wants to shut down and hurt and slow down small business
so that they will be bought out by bigger corporations which he backs. The only reason that
he's not completely bankrupt is because he makes these deals with the banks where he has
borrowed millions and millions of dollars and they just kind of let him create these deals as
leverage in order not to lose more money than they already are. You see, they feel better
getting something then absolutely nothing. Which is exactly what would happen if they
refused to allow him more money or called in his debt. Please get educated before you call
people politically correct. I am extremely not politically correct but I can see the obvious bad
nature and dangerous ideas and hatred he spreads throughout the country to hate other
people, currently just putting the ideas of the people needing to make a people's military. In
order to overthrow anything political that he is not agreeing with. First he wants to kick out
the Mexicans, now he wants to kick out the Muslims, it's radical, and who's next the Italians?
Because they got a better bid on a job and he did? His real estate company was a successful
business that he inherited from his father, and then within a few years time he caused it to
be bankrupted. And then jumped into something he has no idea about and began to create
casinos which, he created to entice New York to allow casinos because he tried and they told
him no so he pouted and then went to New Jersey. So he kept creating bigger and better and
see more

8 Reply Share

TXDem1945 > disqus_xEFHDaxwEP a year ago


http://varianceexplained.org/r/trump-tweets/ RWNJ. 23/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

RWNJ.
ciao baby
1 Reply Share

JT Scott > Adam Zeller a year ago


I too would be very interested to see which tweets get more public engagement. Very Interested.
13 Reply Share

Avatar
This comment was deleted.

Bill the Lizard > Guest a year ago


If you read his @replies (or any national politician's, for that matter), you'll note that most of them
are from 'bot accounts. If you were able to filter those out, you might get better results.
4 Reply Share

Avatar
This comment was deleted.

jaime_arg > Guest a year ago


What about the iPad data? It seems to have a lot more interaction with users.
1 Reply Share

Nicole Radziwill a year ago


We came to a related conclusion from analyzing the interarrival times of Trump's tweets! It looks like there are
two processes going on: one "regular" managed process, and then very human-like patterns of tweeting off the
cuff. Check out (and cite! yay) our paper at https://arxiv.org/abs/1605....
40 Reply Share

pittipat a year ago


Fascinating analysis. However, as a Samsung Galaxy user, I just want to assure you all that we are not *all* angry
lunatics.
36 Reply Share

Snorre Milde > pittipat a year ago


http://varianceexplained.org/r/trump-tweets/ 24/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
Snorre Milde > pittipat a year ago
Very sad!
11 Reply Share

Durand Sinclair > pittipat 6 months ago


Not with that attitude!
1 Reply Share

samhammer a year ago


So now for the million-dollar question: does the Android or the iPhone retweet white supremacists?

http://fortune.com/donald-t...
55 Reply Share

Hlynur > samhammer a year ago


Super interesting question. Answer: Android

https://twitter.com/search?...
37 Reply Share

Hlynur > Hlynur a year ago


Another example: https://twitter.com/search?...
9 Reply Share

Hilary Parker > samhammer a year ago


yeah those are often the weird manual quoted RTs, which are more likely to be android
3 Reply Share

Frosty dog a year ago


The daytime I Phone tweets that match Trump are dictation he " shouts out to the "girls" in the office to tweet for
him" That was his answer on a town hall meeting in the spring. I think it was the family one with Anderson
Cooper.
23 Reply Share

Michelle B. a year ago


http://varianceexplained.org/r/trump-tweets/ 25/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

"Thus, Trumps Android account uses about 40-80% more words related to disgust, sadness, fear, anger, " - Well 4
of the 5 characters from Inside Out are there. Just missing Joy. Missing Joy is not suprising in this campaign tho.
18 Reply Share

Melanie Moore > Michelle B. a year ago


Michelle B, you made me very Joyful just now.
6 Reply Share

Jamie Gray a year ago


It'd be cool if you built a constantly updating web site that tracked his and other campaigns / politicians lexical
tendencies... Just to give people a better understanding of their representatives
18 Reply Share

Systems Analyst > Jamie Gray 6 months ago


Agreed. I'm reading this for the first time and I'm dying for an update.
1 Reply Share

urig a year ago


Great work. very interesting indeed!!

Further suggestions:
Compare vocabulary (/grade level) between staffer and Trump (I bet Trump uses 4th grade words, and repeats
them again and again. Very sad...)

Compare specific vocabulary (e.g. foreign policy) --> I bet the staffer is more qualified to handle an international
crises than DJT is ;)
14 Reply Share

Kristi Johnson > urig a year ago


I would most certainly love to see the data on that as well.
1 Reply Share

jmdeleon a year ago


I guess this really is a case of Apples vs Oranges.
16 Reply Share
http://varianceexplained.org/r/trump-tweets/ 26/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
16 Reply Share

Michael Cohen a year ago


Extremely interesting analysis. I am a professor at George Washington University's Graduate School of Political
Management and would love to talk with you. I'm having trouble getting R to get through some of the code,
however. It get stuck ... Error: could not find function "str_detect"


11 Reply Share

David Robinson Mod > Michael Cohen a year ago


That's my fault- it should have library(stringr) in an earlier step! Not sure how I missed that; I'll fix in the
post.
7 Reply Share

Ellis Valentiner > David Robinson a year ago


Would you mind also posting which package versions you're using? There are some important
difference between dplyr 0.4.3 and 0.5.
3 Reply Share

Michael Cohen > David Robinson a year ago


Thanks, David! Again, very, very interesting work.
2 Reply Share

Michael Cohen > Michael Cohen a year ago


Going through this, I was able to spot a few other places where the code can be easily fixed.
I'll send you a copy if you'd like. Also there is missing R code for the last four charts. Please
share.
Reply Share

David Robinson Mod > Michael Cohen a year ago


As noted in footnote 1, all of the code can be found in this Rmd file:
http://varianceexplained.org/r/trump-tweets/ 27/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

https://github.com/dgrtwo/d...

I removed the code for most of the graphs to keep the post relatively concise.
4 Reply Share

Michael Cohen > David Robinson a year ago


Thanks, David. I missed the footnote.
Reply Share

Michael Cohen > David Robinson a year ago


Hmmm ... I'm getting an error ...

> tweets %>%


+ count(source, hour = hour(with_tz(created, "EST"))) %>%
+ mutate(percent = n / sum(n)) %>%
+ ggplot(aes(hour, percent, color = source)) +
+ geom_line() +
+ scale_y_continuous(labels = percent_format()) +
+ labs(x = "Hour of day (EST)",
+ y = "% of tweets",
+ color = "")

Error in count(., source, hour = hour(with_tz(created, "EST"))) :


unused argument (hour = hour(with_tz(created, "EST")))
Reply Share

David Robinson Mod > Michael Cohen a year ago


Only explanation I can think of is that you've loaded another package after dplyr that has a
"count" function (you can just type "count" in the console to see which namespace it comes
from). Try again with a fresh session, or change count to dplyr::count
1 Reply Share

Michael Cohen > David Robinson a year ago


Yep, you nailed it. Hey, I'm going to do a similar analysis on Clinton's tweets. Interested in
collaborating?
http://varianceexplained.org/r/trump-tweets/ 28/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
collaborating?
Reply Share

Tom Steele > Michael Cohen 8 months ago


Did you ever do this? I would be interested in seeing it. Off to google it now.
Reply Share

David Robinson Mod > Michael Cohen a year ago


Sure. I looked into it a little last week. Ping me an email please and I'll send you what I have
so far
Reply Share

mouthybint a year ago


This is fascinating stuff. I also love the comment below about Trump dictating his tweets -- this would fit with
something that has puzzled me. We know Trump is thin skinned, and cannot resist firing back when insulted,
right? And he is probably insulted on Twitter more than anywhere else. So why, when you look at his tweets and
replies, are there plenty of the former, and none of the latter? He NEVER responds. Not one reply, that I could see.
Given his personality, and propensity to rise to a slight, I would expect to see THOUSANDS of tweets -- "loser",
"sad", etc. But not a one. He has zero impulse control, so it's not very likely that he manages to rise above being
insulted by the hoi polloi. I can only think he's either too stupid to realize he can read everything everyone tweets
about him, or he isn't actually physically tweeting at all. I'd guess the latter.
11 Reply Share

David N > mouthybint a year ago


Alternatively, he doesn't spend that much time on Twitter and just uses it to talk and relies on his staff (or
doesn't care) to listen.
14 Reply Share

mouthybint > David N a year ago


You are probably right, although I do remember Penn Gilette saying (post-Apprentice) that Trump
reads every little thing about himself, even the most inconsequential blog post.
2 Reply Share

Jen Beaven > mouthybint a year ago


Hm. Lots more people have to write long to read and inconsequential stuff about him. Sort
http://varianceexplained.org/r/trump-tweets/ 29/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained
Hm. Lots more people have to write long to read and inconsequential stuff about him. Sort
of a human DDOS attack :-)
3 Reply Share

Joe_Winfield_IL > mouthybint a year ago


He's always had a strangely high volume of press coverage, but nothing like this last year. It
would be an exhausting task to still be reading every little thing today!
1 Reply Share

jimshaw54 > Joe_Winfield_IL a year ago


You might be interested in reading the Harvard study of the Republican primaries and how
(and why) Trump got so much publicity then (to the tune of almost 2 BILLION dollars worth
(translate that into thousands of millions) and it boggles the mind how the press fawned
over Trump then. No so much now.
My take is that many in the press are lazy, and Trump was an easy write at the time. Now
they are used to him and his antics so he is no longer "newsworthy". Good example of how
the press can "elect" our president for us.
5 Reply Share

Load more comments

ALSO ON VARIANCE EXPLAINED

Analysis of the #7FavPackages hashtag useR and JSM 2016 conferences: a story in tweets
3 comments a year ago 3 comments a year ago
David Robinson You are right! I foolishly was running David Robinson Thanks! I write the posts in RStudio
with a non-clean session. Fixing now, thank directly into an Rmd file, in a format pretty

Examining the arc of 100,000 stories: a tidy Tidying computational biology models with
analysis biobroom: a case
5 comments 4 months ago 13 comments a year ago
David Robinson Sure, you could do something like klmr Correct, which is why Im less enthusiastic than
that with:plot_words_numbered <- most people about the sheer quantity of CRAN

Subscribe d Add Disqus to your siteAdd DisqusAdd Privacy


http://varianceexplained.org/r/trump-tweets/ 30/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

http://varianceexplained.org/r/trump-tweets/ 31/32
9/7/2017 Text analysis of Trump's tweets confirms he writes only the (angrier) Android half Variance Explained

YOU MIGHT ALSO ENJOY (VIEW ALL POSTS)

2017 David Robinson. Powered by Jekyll using the Minimal Mistakes theme.

http://varianceexplained.org/r/trump-tweets/ 32/32

Das könnte Ihnen auch gefallen