Chapter 40 Phoenix City Analysis
40.1 Top Ten Businesses in Phoenix
We list the top ten businesses in Phoenix, ranking first by the number of reviews and then by the number of stars the business has received.
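# Phoenix businesses, ordered by review count, with stars breaking ties
city_biz = business %>%
  filter(city == "Phoenix") %>%
  arrange(desc(review_count), desc(stars)) %>%
  select(name,neighborhood,address,review_count,stars) %>%
  head(10)
# Show the result as a paged, horizontally scrollable table
datatable(city_biz, style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))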
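The ordering rule (review count first, stars only as a tie-breaker) can be checked on a small table. The following is a minimal sketch on made-up data; the `toy_biz` table and its values are hypothetical and are not taken from the Yelp data set.

```r
library(dplyr)

# Toy data (hypothetical values, not from the Yelp data set)
toy_biz <- data.frame(
  name         = c("A", "B", "C", "D"),
  review_count = c(120, 300, 300, 50),
  stars        = c(4.5, 3.5, 4.0, 5.0),
  stringsAsFactors = FALSE
)

# Sort by review_count first; stars only decides between equal review counts.
# Each sort key needs its own desc() call, because desc() takes one vector.
top_biz <- toy_biz %>%
  arrange(desc(review_count), desc(stars)) %>%
  head(3)

print(top_biz)
```

In this toy table, B and C tie on review count, so C is listed before B because it has more stars. D has the highest rating but the fewest reviews, so it falls outside the top three.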
40.2 Topic Modelling for Phoenix City
We perform topic modelling on a random sample of ten thousand reviews of businesses in Phoenix.
CityCoords = business %>%
filter(city == "Phoenix")
city_words = inner_join(CityCoords,reviews) %>% select(date,text,review_id) %>% sample_n(10000)
custom_stop_words <- tibble(word = c("restaurant","food"))
create_LDA_topics(city_words,custom_stop_words)
We observe that the themes of service and time are very dominant. Among food items, the word chicken also appears.
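The helper `create_LDA_topics` is defined elsewhere, so its exact steps are not shown in this section. As a rough sketch of what an LDA pipeline of this kind usually involves, the example below tokenizes a few made-up reviews, removes stop words, builds a document-term matrix and fits a two-topic model. It assumes the `tidytext` and `topicmodels` packages are installed; the toy reviews, the choice of `k = 2` and the names `toy_reviews`, `extra_stop_words` and `top_terms` are all illustrative assumptions, not the author's implementation.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Toy reviews (hypothetical text standing in for city_words)
toy_reviews <- data.frame(
  review_id = c("r1", "r2", "r3", "r4"),
  text = c("The service was slow and the wait time was long",
           "Great chicken and friendly service",
           "Waited a long time for our chicken order",
           "Friendly staff and quick service every time"),
  stringsAsFactors = FALSE
)

# Extra stop words, mirroring custom_stop_words in the text
extra_stop_words <- data.frame(word = c("restaurant", "food"),
                               stringsAsFactors = FALSE)

# One row per (review, word) with its count, after removing stop words
word_counts <- toy_reviews %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  anti_join(extra_stop_words, by = "word") %>%
  count(review_id, word)

# Cast to a document-term matrix and fit LDA (k = 2 is an arbitrary choice)
dtm <- cast_dtm(word_counts, review_id, word, n)
lda_model <- LDA(dtm, k = 2, control = list(seed = 1234))

# Per-topic word probabilities (beta), keeping the five top terms per topic
top_terms <- tidy(lda_model, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 5, with_ties = FALSE) %>%
  ungroup()

print(top_terms)
```

With only four short reviews the topics themselves are not meaningful. The point is the shape of the pipeline, which ends in a table of `topic`, `term` and `beta` that can be plotted per topic.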
40.3 Word Cloud of Phoenix City
createWordCloud(city_words)
40.4 Top Ten Most Common Words for Phoenix City
We examine the Top Ten Most Common words and show them in a bar graph.
city_words %>%
unnest_tokens(word, text) %>%
filter(!word %in% stop_words$word) %>%
filter(!word %in% c('food','restaurant')) %>%
count(word,sort = TRUE) %>%
ungroup() %>%
mutate(word = factor(word, levels = rev(unique(word)))) %>%
head(10) %>%
ggplot(aes(x = word,y = n)) +
geom_bar(stat='identity',colour="white", fill =fillColor) +
geom_text(aes(x = word, y = 1, label = paste0("(",n,")")),
hjust=0, vjust=.5, size = 4, colour = 'black',
fontface = 'bold') +
labs(x = 'Word', y = 'Word Count',
title = 'Word Count') +
coord_flip() +
theme_bw()
40.5 Sentiment Analysis - Positive and Not So Positive Words of Phoenix City
We display the Positive and Not So Positive words used by reviewers for Phoenix City. We use the AFINN sentiment lexicon, which provides a numeric positivity score for each word, and visualize the scores with a bar plot.
positiveWordsBarGraph(city_words)
40.6 Calculate Sentiment for the reviews
We calculate the sentiment scores for all the reviews using the AFINN sentiment lexicon. We display the first six rows of the result here.
sentiment_lines = calculate_sentiment(city_words)
head(sentiment_lines)
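The helper `calculate_sentiment` is defined elsewhere. A minimal sketch of a lexicon-based score per review is shown below, assuming the score is the mean of the word scores in each review (the helper may aggregate differently). To keep the example runnable without downloading the lexicon, it uses a small hand-made table in the AFINN style; the words and scores in `toy_lexicon` are illustrative assumptions, not the real AFINN values, and `toy_reviews` is made-up text.

```r
library(dplyr)
library(tidytext)

# Toy reviews (hypothetical text standing in for city_words)
toy_reviews <- data.frame(
  review_id = c("r1", "r2", "r3"),
  text = c("Great food and amazing service",
           "Terrible wait and rude staff",
           "The chicken was good but the service was bad"),
  stringsAsFactors = FALSE
)

# Toy lexicon in the AFINN style: one word, one integer score.
# These scores are illustrative assumptions, not the real AFINN values.
toy_lexicon <- data.frame(
  word  = c("great", "amazing", "good", "terrible", "rude", "bad"),
  value = c(3, 4, 2, -3, -2, -2),
  stringsAsFactors = FALSE
)

# Tokenize, keep only words found in the lexicon, then average per review
toy_sentiment <- toy_reviews %>%
  unnest_tokens(word, text) %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(review_id) %>%
  summarise(sentiment = mean(value), words = n()) %>%
  arrange(desc(sentiment))

print(toy_sentiment)
```

Words that are not in the lexicon do not contribute, so a review with no scored words drops out of the result after the `inner_join`. Averaging rather than summing keeps long reviews from dominating purely because of their length.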
40.7 Negative Reviews
We examine the Top Ten most negative reviews.
display_neg_sentiments(sentiment_lines,city_words)
40.8 Positive Reviews
We examine the Top Ten most positive reviews.
display_pos_sentiments(sentiment_lines,city_words)