Chapter 10: Be the boss of your factors in Jenny Bryan’s STAT545 notes
Skim the tidyverse style guide for inspiration - you don’t have to read the whole guide carefully
By the end of today’s class, you should be able to:
We will continue working with the gapminder dataset, so let’s first load that back in, along with the tidyverse.
library(tidyverse)
library(gapminder)
#install.packages("gapminder")

# For being able to compare plots side by side, I'm also going to use the gridExtra package today
library(gridExtra)
#install.packages("gridExtra")
R uses factors to handle categorical variables, that is, variables that have a fixed and known set of possible values. At first glance a factor looks like character data, but it also stores the set of allowed values (the levels) and the order in which those levels are arranged. Factors are important for modeling and are also helpful for reordering character vectors to improve display.
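To make the difference between a character vector and a factor concrete, here is a minimal sketch in base R. The `sizes_chr` vector is made-up illustration data, not part of gapminder.

```r
# Hypothetical example data (not from the gapminder dataset)
sizes_chr <- c("medium", "small", "large", "small")

# Converting to a factor: by default the levels are the sorted unique values
sizes_fct <- factor(sizes_chr)
levels(sizes_fct)

# Supplying the levels explicitly records a meaningful order instead
sizes_ord <- factor(sizes_chr, levels = c("small", "medium", "large"))
levels(sizes_ord)

# Sorting a character vector is alphabetical; sorting a factor follows its levels
sort(sizes_chr)
sort(sizes_ord)

# Under the hood each value is stored as an integer code pointing at a level
as.integer(sizes_ord)
```

The level order is what plotting and modeling functions pick up, which is why controlling it matters for display.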
We’ll go over Jenny Bryan’s illustration of how a few powerful functions from the forcats package can significantly improve our handling of factor variables and our visualization of data with categorical variables. The code used in class can be found here
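As a rough preview of the kind of thing forcats does, here is a small sketch on made-up data. The vectors `continent` and `life_exp` are hypothetical, and the choice of `fct_infreq()`, `fct_reorder()`, and `fct_rev()` is an assumption about which functions are most relevant; the in-class code may use others.

```r
library(forcats)

# Hypothetical toy data, loosely in the spirit of gapminder
continent <- factor(c("Asia", "Europe", "Asia", "Africa", "Asia", "Europe"))
life_exp  <- c(70, 78, 65, 55, 72, 80)

# Default level order is alphabetical
levels(continent)

# fct_infreq(): order levels by how often they occur, most frequent first
by_freq <- fct_infreq(continent)
levels(by_freq)

# fct_reorder(): order levels by a summary (median by default) of another variable;
# .desc = TRUE puts the level with the highest median first
by_life <- fct_reorder(continent, life_exp, .desc = TRUE)
levels(by_life)

# fct_rev(): reverse whatever the current level order is
levels(fct_rev(by_freq))
```

None of these functions changes the data values themselves; they only change the order of the levels, which in turn controls the order of bars, boxes, or legend entries in a plot.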