Chapter 4 Grammar of Data Manipulation
dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges:
- mutate() adds new variables that are functions of existing variables.
- select() picks variables based on their names.
- filter() picks cases based on their values.
- summarise() reduces multiple values down to a single summary.
- arrange() changes the ordering of the rows.
These all combine naturally with group_by(), which allows you to perform any operation “by group”.
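To see how the verbs fit together, here is a minimal sketch that chains all of them in one pipeline. It assumes the dplyr package is installed and uses the built-in mtcars dataset purely for illustration; the column names wt_kg, mean_mpg, mean_wt_kg, and n_cars are made up for this example.

```r
library(dplyr)

# mtcars ships with R; wt is recorded in units of 1000 lbs
result <- mtcars %>%
  filter(hp > 100) %>%                  # keep cases with more than 100 horsepower
  select(mpg, cyl, wt) %>%              # keep only the variables we need
  mutate(wt_kg = wt * 453.592) %>%      # new variable derived from an existing one
  group_by(cyl) %>%                     # the following summary is computed per group
  summarise(
    mean_mpg   = mean(mpg),             # one summary value per cylinder group
    mean_wt_kg = mean(wt_kg),
    n_cars     = n()
  ) %>%
  arrange(desc(mean_mpg))               # order rows from highest to lowest mean mpg

print(result)
```

Each verb takes a data frame and returns a data frame, which is why the steps can be piped one into the next.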
4.1 Reorder function
We will use the reorder() function widely in our code, so let us unpack it.
The “default” method treats its first argument as a categorical variable, and reorders its levels based on the values of a second variable, usually numeric.
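The behaviour described above can be sketched with a small example. The data frame below is hypothetical toy data invented for illustration; reorder() itself comes from base R (the stats package), and its summary function FUN defaults to mean.

```r
# Hypothetical toy data: three groups with two values each
df <- data.frame(
  group = c("a", "b", "c", "a", "b", "c"),
  value = c(5, 1, 3, 7, 2, 4)
)

f <- factor(df$group)
levels(f)
# "a" "b" "c"  (alphabetical by default)

# Reorder the levels by the mean of value within each group.
# Group means: a = 6, b = 1.5, c = 3.5, so the increasing order is b, c, a.
reordered <- reorder(f, df$value, FUN = mean)
levels(reordered)
# "b" "c" "a"

# Negating the numeric variable gives decreasing order instead
reordered_desc <- reorder(f, -df$value, FUN = mean)
levels(reordered_desc)
# "a" "c" "b"
```

Only the order of the levels changes, not the data values themselves. This is useful in plots, where the level order determines the order of bars or axis categories.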