9.4 9.3 Sentiment Analysis
Question 1: How often did positive or negative words appear in the Usenet data?
Question 1a: Which words contributed the most within each newsgroup?
Question 2: What were the most positive and most negative messages?
9.4.1 Question 2:
# Average AFINN score per message, keeping messages with at least 5 scored words
sentiment_messages <- usenet_words %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  group_by(newsgroup, id) %>%
  summarize(sentiment = mean(value),
            words = n()) %>%
  ungroup() %>%
  filter(words >= 5)
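To make the per-message calculation concrete, here is a minimal sketch on made-up data. The names `toy_words` and `toy_lexicon` and all scores in them are assumptions for illustration only: they are not the real Usenet data or the real AFINN lexicon. The sketch uses the same join, grouping, and filter as the pipeline above.

```r
library(dplyr)

# Hypothetical stand-in for usenet_words: one row per (message, word)
toy_words <- tibble(
  newsgroup = "rec.sport.hockey",
  id = c(rep("msg1", 6), rep("msg2", 6), rep("msg3", 2)),
  word = c("great", "win", "love", "good", "happy", "game",
           "bad", "awful", "loss", "hate", "ugly", "game",
           "great", "win")
)

# Hypothetical stand-in for get_sentiments("afinn"); the values are invented
toy_lexicon <- tibble(
  word  = c("great", "win", "love", "good", "happy",
            "bad", "awful", "loss", "hate", "ugly"),
  value = c(3, 4, 3, 3, 3, -3, -3, -3, -3, -3)
)

toy_sentiment <- toy_words %>%
  inner_join(toy_lexicon, by = "word") %>%  # "game" has no score, so it is dropped
  group_by(newsgroup, id) %>%
  summarize(sentiment = mean(value),
            words = n(),                    # counts scored words only
            .groups = "drop") %>%
  filter(words >= 5)                        # msg3 has only 2 scored words

print(toy_sentiment)
```

Note that `words` counts only the words that matched the lexicon, so the `words >= 5` filter removes messages whose average would rest on very few scored words. In this toy data, msg1 averages (3 + 4 + 3 + 3 + 3) / 5 = 3.2, msg2 averages -3, and msg3 is filtered out.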
# Sort so the most positive messages come first
sentiment_messages %>%
  arrange(desc(sentiment))
Sorting by descending sentiment shows that message 53560 (in rec.sport.hockey) was the most positive in the whole dataset. What did it say?
# Print the non-empty lines of one message, identified by newsgroup and id
print_message <- function(group, message_id) {
  result <- cleaned_text %>%
    filter(newsgroup == group, id == message_id, text != "")
  cat(result$text, sep = "\n")
}

print_message("rec.sport.hockey", 53560)
What about the most negative?
print_message("rec.sport.hockey", 53907)
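The notes do not show how the most negative message id was located. One plausible approach, sketched here on made-up scores, is to sort the per-message table in ascending order. The `toy_scores` table and its values are assumptions for illustration; on the real data the same calls would be applied to `sentiment_messages`.

```r
library(dplyr)

# Hypothetical per-message scores, shaped like sentiment_messages
toy_scores <- tibble(
  newsgroup = c("rec.sport.hockey", "rec.sport.hockey", "sci.space"),
  id        = c("a", "b", "c"),
  sentiment = c(3.2, -3.0, 0.5),
  words     = c(5L, 5L, 7L)
)

# Ascending sort puts the most negative message in the first row
most_negative <- toy_scores %>%
  arrange(sentiment) %>%
  head(1)

# slice_max() is an equivalent shortcut for the most positive message
most_positive <- toy_scores %>%
  slice_max(sentiment, n = 1)

print(most_negative)
print(most_positive)
```

The `newsgroup` and `id` values from the top row can then be passed to `print_message()` to read the message itself.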