Chapter 4 Grammar of Data Manipulation

dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges:

  • mutate() adds new variables that are functions of existing variables.
  • select() picks variables based on their names.
  • filter() picks cases based on their values.
  • summarise() reduces multiple values down to a single summary.
  • arrange() changes the ordering of the rows.

These all combine naturally with group_by(), which allows you to perform any operation “by group”.
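
As a minimal sketch of how these verbs chain together, the example below uses the built-in mtcars dataset. The dataset, the hp threshold, and the column names wt_kg, mean_mpg and n are illustrative choices, not part of the original text.

```r
library(dplyr)

# Illustrative pipeline on the built-in mtcars data (an assumption for this sketch).
result <- mtcars %>%
  filter(hp > 100) %>%                     # pick cases based on their values
  select(mpg, cyl, wt) %>%                 # pick variables by name
  mutate(wt_kg = wt * 1000 * 0.4536) %>%   # new variable; wt is recorded in 1000 lbs
  group_by(cyl) %>%                        # subsequent verbs operate per cylinder group
  summarise(mean_mpg = mean(mpg), n = n()) %>%  # one summary row per group
  arrange(desc(mean_mpg))                  # order rows by the summary, highest first

result
```

Because summarise() follows group_by(), the result has one row per value of cyl rather than a single overall summary.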

4.1 The reorder() function

We will use the reorder() function widely in our code, so let us unpack it.

The “default” method treats its first argument as a categorical variable, and reorders its levels based on the values of a second variable, usually numeric.
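
A small sketch of this behaviour, using a made-up data frame (the names df, group and value are hypothetical). When no summary function is supplied, reorder() uses the mean of the second variable within each level, and levels are sorted in increasing order of that summary.

```r
# Toy data: three levels, two observations each (illustrative values only).
df <- data.frame(
  group = factor(c("a", "b", "c", "a", "b", "c")),
  value = c(5, 1, 3, 7, 2, 4)
)

levels(df$group)                           # alphabetical: "a" "b" "c"

# Default summary is the mean per level: a = 6, b = 1.5, c = 3.5
reordered <- reorder(df$group, df$value)
levels(reordered)                          # "b" "c" "a"

# Negating the second variable reverses the order.
reordered_desc <- reorder(df$group, -df$value)
levels(reordered_desc)                     # "a" "c" "b"

# A different summary can be passed through the FUN argument.
by_median <- reorder(df$group, df$value, FUN = median)
levels(by_median)
```

Only the level order changes; the underlying values of the factor stay the same. This is why reorder() is handy inside plotting code, where the level order controls the order of bars or axis categories.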