Chapter 40 Phoenix City Analysis

40.1 Top Ten Businesses in Phoenix

We list the top ten businesses in Phoenix, ranked first by the number of reviews and then by the number of stars the business has received.

city_biz = business %>%
  filter(city == "Phoenix") %>%
  # desc() accepts a single vector, so each sort key gets its own call
  arrange(desc(review_count), desc(stars)) %>%
  select(name,neighborhood,address,review_count,stars) %>%
  head(10)

datatable(city_biz, style="bootstrap", class="table-condensed", options = list(dom = 'tp',scrollX = TRUE))
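The two-key ranking described above can be checked on a small table. The sketch below uses a made-up `toy_biz` tibble (a stand-in, not the real `business` data) to show that ties on `review_count` are broken by `stars`.

```r
library(dplyr)

# Hypothetical toy data standing in for the `business` table.
toy_biz <- tibble(
  name         = c("A", "B", "C", "D"),
  review_count = c(120, 300, 300, 50),
  stars        = c(4.5, 3.5, 4.0, 5.0)
)

# Sort by review_count (descending), then break ties by stars (descending).
ranked <- toy_biz %>%
  arrange(desc(review_count), desc(stars))

print(ranked$name)
```

Businesses B and C tie on review count, so C, with the higher star rating, is listed first.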

40.2 Topic Modelling for Phoenix City

We run topic modelling on a random sample of ten thousand reviews of Phoenix businesses.

CityCoords = business %>%
  filter(city == "Phoenix")

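# Join on the business key only (assumes both tables carry a business_id column);
# without `by`, every shared column name would be used as a join key.
city_words = inner_join(CityCoords,reviews, by = "business_id") %>% select(date,text,review_id) %>% sample_n(10000)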

custom_stop_words <- tibble(word = c("restaurant","food"))

create_LDA_topics(city_words,custom_stop_words)

We observe that the themes of service and time are dominant. Among food items, the word chicken stands out.
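The helper `create_LDA_topics` is not defined in this chapter. As a rough sketch of what such a helper might do, the following fits a two-topic LDA model with `tidytext` and `topicmodels` on a few made-up reviews. The toy data, the choice of `k = 2`, and the seed are all assumptions for illustration, not the settings used for the results above.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Hypothetical toy reviews; the real input would be `city_words`.
toy_reviews <- tibble(
  review_id = c("r1", "r2", "r3", "r4"),
  text = c("great service and friendly staff",
           "long wait time and slow service",
           "fried chicken was crispy and tasty",
           "chicken tacos and tasty salsa")
)

# Tokenise, drop stop words, and count words per review.
word_counts <- toy_reviews %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(review_id, word)

# Cast to a document-term matrix and fit a two-topic LDA model.
dtm <- cast_dtm(word_counts, review_id, word, n)
lda_fit <- LDA(dtm, k = 2, control = list(seed = 1234))

# Per-topic word probabilities (beta); keep the five top terms per topic.
top_terms <- tidy(lda_fit, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 5, with_ties = FALSE) %>%
  ungroup()

print(top_terms)
```

Reading the highest-beta terms of each topic is how themes such as service or time would be identified.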

40.3 Word Cloud of Phoenix City

createWordCloud(city_words)

40.4 Top Ten Most Common Words in Phoenix City Reviews

We examine the Top Ten Most Common words and show them in a bar graph.

city_words %>%
  unnest_tokens(word, text) %>%
  filter(!word %in% stop_words$word) %>%
  filter(!word %in% c('food','restaurant')) %>%
  count(word,sort = TRUE) %>%
  ungroup() %>%
  mutate(word = factor(word, levels = rev(unique(word)))) %>%
  head(10) %>%
  
  ggplot(aes(x = word,y = n)) +
  geom_bar(stat='identity',colour="white", fill =fillColor) +
  geom_text(aes(x = word, y = 1, label = paste0("(",n,")")),
            hjust=0, vjust=.5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Word', y = 'Word Count', 
       title = 'Word Count') +
  coord_flip() + 
  theme_bw()

40.5 Sentiment Analysis - Positive and Not So Positive Words of Phoenix City

We display the positive and not so positive words used by reviewers for Phoenix City. We use the AFINN sentiment lexicon, which provides a numeric positivity score for each word, and visualize the result with a bar plot.

positiveWordsBarGraph(city_words)
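The body of `positiveWordsBarGraph` is not shown here. One common way to rank words with a scored lexicon is to multiply each word's frequency by its score. The sketch below does that with a hypothetical four-word lexicon laid out like AFINN (`word`, `value`), so that it runs without downloading the real lexicon. The toy reviews are also made up.

```r
library(dplyr)
library(tidytext)

# Hypothetical mini-lexicon in the AFINN layout (word, numeric score).
toy_lexicon <- tibble(
  word  = c("great", "love", "bad", "terrible"),
  value = c(3, 3, -3, -3)
)

# Hypothetical reviews standing in for `city_words`.
toy_reviews <- tibble(
  review_id = c("r1", "r2", "r3"),
  text = c("great tacos, love the salsa",
           "bad service and terrible parking",
           "great patio but bad coffee")
)

# Contribution of a word = number of occurrences * lexicon score.
contributions <- toy_reviews %>%
  unnest_tokens(word, text) %>%
  inner_join(toy_lexicon, by = "word") %>%
  count(word, value) %>%
  mutate(contribution = n * value) %>%
  arrange(desc(contribution))

print(contributions)
```

Plotting `contribution` against `word` with `geom_col()` would give a bar chart with positive words on one side and negative words on the other.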

40.6 Calculate Sentiment for the reviews

We calculate a sentiment score for every review using the AFINN sentiment lexicon. We display the first six rows of the result here.

sentiment_lines = calculate_sentiment(city_words)

head(sentiment_lines)
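Since `calculate_sentiment` is defined elsewhere, the following is only a guess at its general shape: score each review as the mean lexicon value of its matched words, then pick out the extremes. The lexicon and reviews are hypothetical stand-ins, and averaging (rather than summing) is an assumption.

```r
library(dplyr)
library(tidytext)

# Hypothetical mini-lexicon in the AFINN layout (word, numeric score).
toy_lexicon <- tibble(
  word  = c("great", "good", "bad", "terrible"),
  value = c(3, 2, -3, -3)
)

# Hypothetical reviews standing in for `city_words`.
toy_reviews <- tibble(
  review_id = c("r1", "r2", "r3"),
  text = c("great food and great service",
           "terrible wait, bad service",
           "good tacos but bad parking")
)

# One row per review: mean score and number of matched words.
review_scores <- toy_reviews %>%
  unnest_tokens(word, text) %>%
  inner_join(toy_lexicon, by = "word") %>%
  group_by(review_id) %>%
  summarise(sentiment = mean(value), words = n())

# The most negative and most positive review by mean score.
most_negative <- slice_min(review_scores, sentiment, n = 1)
most_positive <- slice_max(review_scores, sentiment, n = 1)

print(review_scores)
```

Joining `most_negative` or `most_positive` back to the review table on `review_id` recovers the review text for display.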

40.7 Negative Reviews

We examine the Top Ten most negative reviews.

display_neg_sentiments(sentiment_lines,city_words)

40.8 Positive Reviews

We examine the top ten most positive reviews.

display_pos_sentiments(sentiment_lines,city_words)