9.4 Sentiment Analysis
Question 1: How often did positive or negative words appear in the Usenet data?
Question 1a: Which words contributed the most within each newsgroup?
Question 2: What were the most positive and most negative messages?
9.4.1 Question 2: the most positive and negative messages
# Average AFINN score per message, keeping only messages
# with at least five scored words
sentiment_messages <- usenet_words %>%
  inner_join(get_sentiments("afinn"), by = "word") %>%
  group_by(newsgroup, id) %>%
  summarize(sentiment = mean(value),
            words = n()) %>%
  ungroup() %>%
  filter(words >= 5)
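
To see why the `filter(words >= 5)` step matters, here is a minimal sketch on made-up data. `toy_lexicon` and `toy_words` are hypothetical stand-ins for `get_sentiments("afinn")` and `usenet_words`; the words, scores, and ids are invented for illustration and are not taken from the Usenet data. A message with a single scored word can get an extreme average, so the filter drops it.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for get_sentiments("afinn"): a word and its score
toy_lexicon <- tribble(
  ~word,    ~value,
  "great",   3,
  "win",     4,
  "good",    3,
  "happy",   3,
  "nice",    3,
  "bad",    -3
)

# Hypothetical stand-in for usenet_words: one row per word per message
toy_words <- tribble(
  ~newsgroup, ~id, ~word,
  "rec.a",    1,   "great",
  "rec.a",    1,   "win",
  "rec.a",    1,   "good",
  "rec.a",    1,   "happy",
  "rec.a",    1,   "nice",
  "rec.a",    1,   "bad",
  "rec.a",    2,   "win",   # the only scored word in message 2
  "rec.a",    2,   "the"    # not in the lexicon, so inner_join() drops it
)

# Same pipeline shape as above: join, group, average, count
toy_sentiment <- toy_words %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(newsgroup, id) %>%
  summarize(sentiment = mean(value), words = n(), .groups = "drop")

# Message 2 has the higher average but rests on one word; the filter removes it
toy_filtered <- toy_sentiment %>%
  filter(words >= 5)
```

Without the filter, message 2 would rank as the most positive in this toy data on the strength of one word. The threshold of five is the one used in the code above; whether it is the right cutoff for other data is a judgment call.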

sentiment_messages %>%
  arrange(desc(sentiment))

Clearly, message id 53560 was the most positive in the whole dataset. What was it?

# Print the non-empty lines of one message from one newsgroup
print_message <- function(group, message_id) {
  result <- cleaned_text %>%
    filter(newsgroup == group, id == message_id, text != "")

  cat(result$text, sep = "\n")
}

print_message("rec.sport.hockey", 53560)

What about the most negative?

print_message("rec.sport.hockey", 53907)
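
The notes do not show how the most negative message id was found. One plausible way is to sort the scores in ascending rather than descending order. The sketch below shows this on a small made-up table, `toy_messages`, which is a hypothetical stand-in shaped like `sentiment_messages`; its ids and scores are invented.

```r
library(dplyr)
library(tibble)

# Hypothetical scores shaped like sentiment_messages (all values are made up)
toy_messages <- tribble(
  ~newsgroup, ~id, ~sentiment, ~words,
  "group.a",  101,  2.5,        6,
  "group.a",  102, -1.0,        8,
  "group.b",  201, -3.2,        5,
  "group.b",  202,  0.4,       12
)

# Ascending sort puts the most negative message first
most_negative <- toy_messages %>%
  arrange(sentiment) %>%
  head(1)

# Descending sort, as in the notes, puts the most positive message first
most_positive <- toy_messages %>%
  arrange(desc(sentiment)) %>%
  head(1)
```

On the real data, the `newsgroup` and `id` of the first row would then be passed to `print_message()`.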