Chapter 13 Recommended Notebooks for 2021 June to 2021 December
We recommend the following Notebooks made public between June 2021 and December 2021. [This window is chosen only to reduce the dataset for analysis purposes.]
We use the following criteria:
Medal - Silver
The Kernel is NOT a Competition Notebook
The Performance Tier of the author is Expert or Master
The Kernel has more than 40 Total Votes, more than 10 Total Comments and more than 3100 Total Views
The Kernel's URL slug does not mention a common data source such as Titanic, Breast Cancer, Heart, Diabetes or House
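# Packages used below; `kernels`, `kernel_version_competition` and `users`
# are assumed to be data frames loaded earlier
library(dplyr)
library(stringr)
library(gt)

# Parse the publication date, then keep kernels made public in the
# June to December 2021 window that clear all three popularity thresholds
kernels$MadePublicDate = as.Date(kernels$MadePublicDate, format = "%m/%d/%Y")
kernels_subset = kernels %>%
  filter(between(MadePublicDate, as.Date("2021-06-01"), as.Date("2021-12-31"))) %>%
  filter(TotalVotes > 40, TotalComments > 10, TotalViews > 3100)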
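# Keep medal codes of 2 and above (the results table shows one row with Medal 3)
kernels_subset$Medal = as.integer(kernels_subset$Medal)
kernels_subset_silver = kernels_subset %>%
  filter(Medal >= 2)
# Attach competition info; kernels with no match get NA in SourceCompetitionId
kvcs_silver <- kernels_subset_silver %>%
  left_join(kernel_version_competition,
            by = c("CurrentKernelVersionId" = "KernelVersionId"))
# Keep only non-competition notebooks, so the CompNoteBook flag is 0 for every row
kvcs_silver = kvcs_silver %>%
  filter(is.na(SourceCompetitionId)) %>%
  mutate(CompNoteBook = ifelse(is.na(SourceCompetitionId), 0, 1))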
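# Add author details, renaming the users columns to match the kernel side
kvcs_silver_users = kvcs_silver %>%
  left_join(users %>% select(AuthorUserId = Id,
                             author_kaggle = UserName,
                             DisplayName,
                             RegisterDate,
                             PerformanceTier), by = "AuthorUserId")
# Keep authors in tiers 2 and 3 (Expert and Master, per the criteria above)
kvcs_silver_users_experts = kvcs_silver_users %>%
  filter(PerformanceTier %in% c(2, 3))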
# Drop kernels whose slug mentions an over-used data source, then build the URL
kvcs_silver_users_experts = kvcs_silver_users_experts %>%
  filter(!str_detect(CurrentUrlSlug, "titanic|diabetes|house|heart|breast")) %>%
  mutate(URL = paste0("https://www.kaggle.com/code/", author_kaggle, "/", CurrentUrlSlug))
# Keep the display columns and sort by votes, highest first
kvcs_versions_info_reduced = kvcs_silver_users_experts %>%
  select(URL, Medal, TotalViews, TotalComments, TotalVotes) %>%
  arrange(desc(TotalVotes))
# Render the result as a gt table
kvcs_versions_info_reduced %>%
  gt() %>%
  tab_header(
    title = "Recommended Notebooks for 2021 June to December")

Recommended Notebooks for 2021 June to December

| URL | Medal | TotalViews | TotalComments | TotalVotes |
|---|---|---|---|---|
| https://www.kaggle.com/code/ankitkalauni/tokyo-olympic-2021-starter-clean-eda | 2 | 4747 | 48 | 96 | 
| https://www.kaggle.com/code/sonalisingh1411/eda-on-train-test-dataset-price-prediction | 2 | 4471 | 38 | 86 | 
| https://www.kaggle.com/code/imakash3011/customer-analysis-eda-report-clustering | 2 | 6530 | 46 | 77 | 
| https://www.kaggle.com/code/mysarahmadbhat/types-of-transformations-for-better-distribution | 2 | 3551 | 64 | 73 | 
| https://www.kaggle.com/code/miguelfzzz/store-customers-clustering-analysis | 2 | 4338 | 22 | 72 | 
| https://www.kaggle.com/code/imakash3011/covid-19-india-eda-visualization-report | 2 | 3414 | 60 | 71 | 
| https://www.kaggle.com/code/mysarahmadbhat/python-from-zero-to | 2 | 5180 | 36 | 71 | 
| https://www.kaggle.com/code/kaanboke/beginner-friendly-end-to-end-ml-project-enjoy | 2 | 3211 | 24 | 69 | 
| https://www.kaggle.com/code/jonaspalucibarbosa/chest-x-ray-pneumonia-cnn-transfer-learning | 2 | 5180 | 27 | 66 | 
| https://www.kaggle.com/code/kartik2khandelwal/bitcoin-crash-prediction | 2 | 3565 | 49 | 66 | 
| https://www.kaggle.com/code/miguelfzzz/olympics-tokyo-2020-cool-eda | 2 | 4085 | 39 | 65 | 
| https://www.kaggle.com/code/maricinnamon/store-sales-time-series-forecast-visualization | 2 | 4671 | 33 | 65 | 
| https://www.kaggle.com/code/kslarwtf/eda-clustering-updated | 2 | 4404 | 45 | 64 | 
| https://www.kaggle.com/code/miguelfzzz/bellabeat-data-analysis-discovering-trends | 2 | 3110 | 12 | 63 | 
| https://www.kaggle.com/code/ankitkalauni/covid-19-india-statewise-clean-eda-deaths-pred | 2 | 3308 | 49 | 62 | 
| https://www.kaggle.com/code/gaganmaahi224/eda-detailed-explanation-of-knn-algorithm | 2 | 3351 | 39 | 62 | 
| https://www.kaggle.com/code/yuyougnchan/look-at-this-note-numeric-variable-is-easy | 2 | 3662 | 42 | 60 | 
| https://www.kaggle.com/code/zwartfreak/easiest-price-prediction-full-explanation | 2 | 4773 | 36 | 59 | 
| https://www.kaggle.com/code/thomaskonstantin/exploring-and-predicting-drinking-water-potability | 2 | 4285 | 39 | 58 | 
| https://www.kaggle.com/code/victoriamiller19/hypothesis-testing-explanation | 2 | 3107 | 27 | 58 | 
| https://www.kaggle.com/code/vardhansiramdasu/summer-olympics-eda | 2 | 3364 | 41 | 58 | 
| https://www.kaggle.com/code/mostafaalaa123/customer-personality | 2 | 4615 | 22 | 57 | 
| https://www.kaggle.com/code/ludovicocuoghi/twitter-sentiment-analysis-with-bert-roberta | 2 | 3346 | 39 | 57 | 
| https://www.kaggle.com/code/aryantiwari123/hotel-booking-eda-models | 2 | 4616 | 56 | 56 | 
| https://www.kaggle.com/code/tensorchoko/g-research-crypto-forecasting-eda | 2 | 3979 | 18 | 56 | 
| https://www.kaggle.com/code/frankmollard/a-story-about-unsupervised-learning | 2 | 5311 | 14 | 53 | 
| https://www.kaggle.com/code/aditimulye/adult-income-dataset-from-scratch | 2 | 3215 | 25 | 52 | 
| https://www.kaggle.com/code/mostafaalaa123/finished-quick-analysis-of-each-q | 2 | 5383 | 38 | 51 | 
| https://www.kaggle.com/code/anandhuh/image-classification-using-cnn-for-beginners | 2 | 5234 | 24 | 50 | 
| https://www.kaggle.com/code/frankmollard/nlp-a-gentle-introduction-lstm-word2vec-bert | 2 | 3633 | 30 | 50 | 
| https://www.kaggle.com/code/ankitkalauni/customer-personality-clean-eda-k-means | 2 | 3348 | 25 | 50 | 
| https://www.kaggle.com/code/prena0808/tokyo-olympics-data-analysis | 3 | 3331 | 20 | 50 | 
| https://www.kaggle.com/code/paulrohan2020/ml-algorithms-from-scratch-with-pure-python | 2 | 3717 | 37 | 48 | 
| https://www.kaggle.com/code/rankirsh/predicting-attrition-from-a-to-z | 2 | 3895 | 33 | 47 | 
| https://www.kaggle.com/code/atasaygin/hotel-booking-demand-eda-and-of-guest-prediction | 2 | 3213 | 20 | 46 | 
| https://www.kaggle.com/code/hijest/text-generation-for-beginners-thorough-tutorial | 2 | 6054 | 11 | 45 | 
| https://www.kaggle.com/code/aryantiwari123/handwriting-recognition-deep-learning-tensorflow | 2 | 3349 | 32 | 44 | 
| https://www.kaggle.com/code/imakash3011/water-quality-prediction-7-model | 2 | 4013 | 45 | 43 | 
| https://www.kaggle.com/code/yogidsba/personal-loan-logistic-regression-decision-tree | 2 | 7634 | 24 | 42 | 
| https://www.kaggle.com/code/jonaspalucibarbosa/default-of-credit-card-eda-catboost-w-ft-eng | 2 | 3164 | 26 | 42 | 
| https://www.kaggle.com/code/anoopashware/food-demand-forecasting-predict-orders | 2 | 3642 | 14 | 41 |
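
The two exclusion steps in the code above (dropping competition notebooks and dropping common data sources) can be tried out on a small example. The sketch below is not the chapter's own code: the data frames `toy_kernels` and `toy_competition_versions` and all their values are made up for illustration, and only the column names follow the chapter. It also swaps in `anti_join()` for the `left_join()` plus `is.na()` filter. That is a slightly different technique: it drops every kernel that has any matching row in the competition table, adds no columns, and never duplicates rows.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: column names mirror the chapter, values are made up
toy_kernels <- tibble(
  CurrentKernelVersionId = c(101, 102, 103, 104),
  CurrentUrlSlug = c("tokyo-olympics-eda", "titanic-survival-model",
                     "store-sales-forecast", "heart-disease-eda")
)

# Hypothetical link table: version 103 belongs to a competition
toy_competition_versions <- tibble(
  KernelVersionId = c(103),
  SourceCompetitionId = c(9001)
)

# Slug fragments to exclude, joined into one regular expression
excluded <- c("titanic", "diabetes", "house", "heart", "breast")
excluded_pattern <- str_c(excluded, collapse = "|")

toy_result <- toy_kernels %>%
  # Drop every kernel whose current version is linked to a competition
  anti_join(toy_competition_versions,
            by = c("CurrentKernelVersionId" = "KernelVersionId")) %>%
  # Keep only slugs that match none of the excluded fragments
  filter(!str_detect(CurrentUrlSlug, excluded_pattern))

print(toy_result)
```

Of the four toy rows, only the `tokyo-olympics-eda` kernel survives: one row is linked to a competition and two have excluded fragments in their slugs. Keeping the fragments in a character vector makes it easy to extend the exclusion list without adding another `filter()` call.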