Chapter 14 Which Notebooks Got the Highest Votes after the Hidden Gem Declaration

The notebooks that received the most votes after being declared Hidden Gems are shown below.

The top five are Tutorial on reading large datasets, Dive into dplyr (tutorial #1), Writing Hamilton Lyrics with Tensorflow/R, Petfinder Pawpularity EDA & fastai starter, and Recommendation engine with networkx.

Kaggle, 2020, and Structured Data are the most frequent terms in the reviews of the notebooks that received the most upvotes after being declared Hidden Gems.

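# Packages used in this chapter: dplyr, ggplot2, ggthemes, tidytext, wordcloud,
# RColorBrewer, widyr, igraph, ggraph, gt.
# Attach every version of each Hidden Gem notebook.
kernels_gems_versions = left_join(kernels_gems, kernel_versions, by = c("KernelId" = "ScriptId"))

# Rename the version id (presumably to match the key used in kernel_votes).
kernels_gems_versions = kernels_gems_versions %>%
  rename(KernelVersionId = Id)

# Attach the votes; with no `by`, the join uses all shared column names.
kernel_gems_votes = left_join(kernels_gems_versions, kernel_votes)

# Parse the vote date and the declaration date (month/day/year strings).
kernel_gems_votes$VoteDate = as.Date(kernel_gems_votes$VoteDate, format = "%m/%d/%Y")
kernel_gems_votes$date = as.Date(kernel_gems_votes$date, format = "%m/%d/%Y")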

# Count the votes cast after the declaration date and plot the top 15 notebooks.
kernel_gems_votes %>%
  filter(VoteDate > date) %>%
  group_by(CurrentUrlSlug) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  head(15) %>%
  ungroup() %>%
  mutate(CurrentUrlSlug = reorder(CurrentUrlSlug, Count)) %>%
  ggplot(aes(x = CurrentUrlSlug, y = Count)) +
  geom_bar(stat = 'identity', colour = "white", fill = fillColor2) +
  geom_text(aes(x = CurrentUrlSlug, y = 1, label = paste0("(", Count, ")")),
            hjust = 0, vjust = .5, size = 4, colour = 'black',
            fontface = 'bold') +
  labs(x = 'Notebook',
       y = 'Count',
       title = 'Highest UpVotes') +
  coord_flip() +
  theme_fivethirtyeight(base_size = 15)
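
The filter-and-count step above can be tried on a small example. The following is a minimal sketch in base R, not the chapter's own pipeline: the notebook slugs and dates are made up, and only the column names VoteDate, date and CurrentUrlSlug are taken from the code above.

```r
# Toy data: every slug and date here is invented for illustration.
votes <- data.frame(
  CurrentUrlSlug = c("nb-a", "nb-a", "nb-a", "nb-b", "nb-b"),
  VoteDate       = c("01/05/2021", "02/10/2021", "03/01/2021",
                     "01/20/2021", "02/01/2021"),
  date           = rep("02/01/2021", 5),   # declaration date of each notebook
  stringsAsFactors = FALSE
)

# Parse both columns with the month/day/year format used in the chapter.
votes$VoteDate <- as.Date(votes$VoteDate, format = "%m/%d/%Y")
votes$date     <- as.Date(votes$date,     format = "%m/%d/%Y")

# Keep only votes cast strictly after the declaration date.
after <- votes[votes$VoteDate > votes$date, ]

# Count the remaining votes per notebook, largest first.
counts <- sort(table(after$CurrentUrlSlug), decreasing = TRUE)
print(counts)
```

Because the comparison is strict, a vote cast on the declaration day itself is not counted. A date string that does not match the format parses to NA, and dplyr's filter() drops rows whose condition is NA, so such votes would silently fall out of the count.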

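# Votes received after the declaration, per notebook.
s = kernel_gems_votes %>%
  filter(VoteDate > date) %>%
  group_by(CurrentUrlSlug) %>%
  summarise(Count = n())

# Add the counts to the Hidden Gems table (joins on the shared column names).
s = left_join(gems, s)

# Keep the 15 notebooks with the most votes after the declaration.
a = s %>%
  arrange(desc(Count)) %>%
  head(15) %>%
  select(notebook, title, review)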


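# Split the reviews into single words and drop common stop words.
gem_review <- a %>%
  select(notebook, review) %>%
  unnest_tokens(word, review) %>%
  anti_join(stop_words)

# Word cloud of the 30 most frequent review words.
gem_review %>%
  count(word, sort = TRUE) %>%
  ungroup() %>%
  head(30) %>%
  with(wordcloud(word, n, max.words = 30, colors = brewer.pal(8, "Dark2")))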
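
The word counts behind the word cloud can be illustrated without any packages. This is a rough base R sketch of the same idea (tokenise, drop stop words, count); the review snippets are made up, and stop_list is a tiny hypothetical stand-in for tidytext's stop_words.

```r
# Invented review snippets, not taken from the Hidden Gems data.
reviews <- c(
  "A clean and concise guide to the data",
  "A clean guide with vivid visuals",
  "The data and the visuals are great"
)

# Hypothetical, very short stop-word list for this example only.
stop_list <- c("a", "and", "to", "the", "with", "are")

# Lowercase the text, then split on anything that is not a letter or digit.
tokens <- unlist(strsplit(tolower(reviews), "[^a-z0-9]+"))

# Drop empty strings and stop words.
tokens <- tokens[tokens != "" & !(tokens %in% stop_list)]

# Frequency table, most common word first.
word_counts <- sort(table(tokens), decreasing = TRUE)
print(word_counts)
```

The resulting table plays the role of the word and n columns that are passed to wordcloud() above.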

# Count how often two words appear in reviews of the same notebook,
# then drop pairs made of URL fragments.
review_word_pairs <- gem_review %>%
  pairwise_count(word, notebook, sort = TRUE, upper = FALSE) %>%
  filter(item1 != "www.kaggle.com") %>%
  filter(item2 != "www.kaggle.com") %>%
  filter(item1 != "https")

# Network of word pairs that co-occur at least `occur` times.
occur = 2
set.seed(1234)   # fixes the random "fr" layout
review_word_pairs %>%
  filter(n >= occur) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "fr") +
  geom_edge_link(aes(edge_alpha = n, edge_width = n), edge_colour = "darkred") +
  geom_node_point(size = 5) +
  geom_node_text(aes(label = name), repel = TRUE,
                 point.padding = unit(0.2, "lines")) +
  theme_void(base_size = 15)
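
To make the co-occurrence counts concrete, here is a minimal base R sketch of the idea behind the pair counting: for each notebook, list every unordered pair of distinct words, then count in how many notebooks each pair appears. It is an illustration under assumptions, not widyr's implementation; the notebook ids and words are made up.

```r
# Invented tokens per notebook ("nb1".."nb3" are hypothetical ids).
tokens <- data.frame(
  notebook = c("nb1", "nb1", "nb1", "nb2", "nb2", "nb3", "nb3", "nb3"),
  word     = c("clean", "guide", "data", "clean", "guide",
               "data", "guide", "vivid"),
  stringsAsFactors = FALSE
)

# For each notebook, build every unordered pair of its distinct words.
pairs_per_nb <- lapply(split(tokens$word, tokens$notebook), function(w) {
  w <- sort(unique(w))
  if (length(w) < 2) return(NULL)   # a single word forms no pair
  m <- t(combn(w, 2))               # one row per pair
  paste(m[, 1], m[, 2], sep = " + ")
})

# Count in how many notebooks each pair appears.
pair_counts <- sort(table(unlist(pairs_per_nb)), decreasing = TRUE)
print(pair_counts)

# Keep only pairs that co-occur at least `occur` times, as in the chapter.
occur <- 2
frequent <- pair_counts[pair_counts >= occur]
print(frequent)
```

In this toy data only "clean + guide" and "data + guide" pass the threshold, so only those two pairs would become edges in the network plot.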

# Render the top 15 notebooks as a table.
a %>%
  gt() %>%
  tab_header(
    title = "Highest Votes after the Hidden Gem Declaration")

Highest Votes after the Hidden Gem Declaration

| notebook | title | review |
|---|---|---|
| https://www.kaggle.com/rohanrao/tutorial-on-reading-large-datasets | Tutorial on reading large datasets | An impressively clean and accessible primer on Python tools to read, and formats to store, large datasets. Brief and to the point; featuring Pandas, Dask, Datable, and Rapids cudf. |
| https://www.kaggle.com/jessemostipak/dive-into-dplyr-tutorial-1 | Dive into dplyr (tutorial #1) | A well-structured guide to R dplyr functionality showcased on the [palmer penguin dataset](https://www.kaggle.com/parulpandey/palmer-archipelago-antarctica-penguin-data). Does a great job in elucidating the tidyverse approach and philosophy. |
| https://www.kaggle.com/anasofiauzsoy/writing-hamilton-lyrics-with-tensorflow-r | Writing Hamilton Lyrics with Tensorflow/R | History has its eyes on this R Keras tutorial showcasing the ten NLP commandments to predict the non-stop lyrics of the musical Hamilton. And we don't even have to wait for it. |
| https://www.kaggle.com/tanlikesmath/petfinder-pawpularity-eda-fastai-starter | Petfinder Pawpularity EDA & fastai starter | A clean and concise FastAI starting point for the recently launched [Pawpularity competition](https://www.kaggle.com/c/petfinder-pawpularity-score); complete with the popular Swin Transformer model in a classification approach. |
| https://www.kaggle.com/yclaudel/recommendation-engine-with-networkx | Recommendation engine with networkx | Step-by-step guide to building a Netflix recommendation engine; resulting in insightful and visually pleasing output graphs. |
| https://www.kaggle.com/parulpandey/breathe-india-covid-19-effect-on-pollution | Breathe India: COVID-19 effect on Pollution | A detailed work studying the interaction between the big topics of COVID-19 and air pollution in past and recent data from India. |
| https://www.kaggle.com/michau96/education-level-affects-data-analysis | Education level affects data analysis? | This work provides an elegant, visual R/tidyverse investigation of the impact of formal education levels on the characteristics of our community based on the [2020 Kaggle Survey](https://www.kaggle.com/c/kaggle-survey-2020). |
| https://www.kaggle.com/snanilim/video-games-sales-analysis-and-visualization | Video games sales analysis and visualization | Fun and engaging narration provides the scaffolding for gaming insights on genres and regional levels. |
| https://www.kaggle.com/andradaolteanu/treasure-hunt-what-gives-to-be-really-good | Treasure Hunt - what gives to be REALLY good? | A delightfully creative exploration of the characteristics of successful DS/ML practitioners in the [2020 Kaggle Survey](https://www.kaggle.com/c/kaggle-survey-2020). The fun visuals are great and the treasure map is a stroke of genius. |
| https://www.kaggle.com/karnikakapoor/music-generation-lstm | Music Generation: LSTM | Exceptionally well structured and narrated, this musical experimentation learns from Chopin to write its own piano melodies using LSTMs. Note the stylish gifs and playable audio files. |
| https://www.kaggle.com/datafan07/mechanisms-of-action-what-do-we-have-here | Mechanisms of Action: What Do We Have Here? | A comprehensive and well-structured EDA that provides a big-picture overview of the dataset, while also highlighting and commenting on important individual features. |
| https://www.kaggle.com/ryanholbrook/the-convolutional-classifier | The Convolutional Classifier | The opening Notebook in Kaggle's new [Computer Vision course](https://www.kaggle.com/learn/computer-vision) starts the lessons with hands-on transfer learning on CNNs and Tensorflow/Keras. Also check out the rest of the course. |
| https://www.kaggle.com/janiobachmann/melbourne-comprehensive-housing-market-analysis | Melbourne \|\| Comprehensive Housing Market Analysis | This EDA on the Melbourne housing market presents a number of interesting dataviz approaches via plotly. Also note the consistent use of explanations that interpret the visuals. |
| https://www.kaggle.com/janiobachmann/german-credit-analysis-a-risk-perspective | German Credit Analysis \|\| A Risk Perspective | This work presents a thorough exploration of risky business in Germany. It deserves much credit for its well-organised structure and expertly designed visuals. |
| https://www.kaggle.com/andradaolteanu/unbiased-look-on-brazil-wildfires | Unbiased Look on Brazil Wildfires | A well-focused analysis on a globally important topic. Enriched by vivid maps, context info, and narration. |