Factors with forcats: CHEATSHEET

The forcats package provides tools for working with factors, which are R's data structure for categorical data.

Factors

R represents categorical data with factors. A factor is an integer vector with a levels attribute that stores a set of mappings between integers and categorical values. When you view a factor, R displays not the integers but the values associated with them.

[Diagram: a factor as stored (integer vector plus levels) and as displayed (the level values)]

Create a factor with factor()

factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)
Convert a vector to a factor. Also as_factor.
  f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

Return its levels with levels()

levels(x)
Return/set the levels of a factor.
  levels(f); levels(f) <- c("x", "y", "z")

Use unclass() to see its structure.

Inspect Factors

fct_count(f, sort = FALSE)
Count the number of values with each level.
  fct_count(f)

Change the order of levels

fct_relevel(.f, ..., after = 0L)
Manually reorder factor levels.
  fct_relevel(f, c("b", "c", "a"))

fct_infreq(f, ordered = NA)
Reorder levels by the frequency with which they appear in the data (highest frequency first).
  f3 <- factor(c("c", "c", "a"))
  fct_infreq(f3)

fct_inorder(f, ordered = NA)
Reorder levels by the order in which they first appear in the data.
  fct_inorder(f2)    (f2 is defined under Combine Factors)

fct_rev(f)
Reverse level order.
  f4 <- factor(c("a", "b", "c"))
  fct_rev(f4)

fct_shift(f)
Shift levels to the left or right, wrapping around the end.
  fct_shift(f4)

Change the value of levels

fct_recode(.f, ...)
Manually change levels. Also fct_relabel, which obeys purrr::map syntax to apply a function or expression to each level.
  fct_recode(f, v = "a", x = "b", z = "c")
  fct_relabel(f, ~ paste0("x", .x))

fct_anon(f, prefix = "")
Anonymize levels with random integers.
  fct_anon(f)

fct_collapse(.f, ...)
Collapse levels into manually defined groups.
  fct_collapse(f, x = c("a", "b"))

fct_lump(f, n, prop, w = NULL, other_level = "Other", ties.method = c("min", "average", "first", "last", "random", "max"))
Lump together least/most common levels into a single level. Also fct_lump_min.
  fct_lump(f, n = 1)

fct_other(f, keep, drop, other_level = "Other")
Replace levels with "Other".
  fct_other(f, keep = c("a", "b"))

Inspect Factors (continued)

fct_unique(f)
Return the unique values, removing duplicates.
  fct_unique(f)

Combine Factors

fct_c(...)
Combine factors with different levels.
  f1 <- factor(c("a", "c"))
  f2 <- factor(c("b", "a"))
  fct_c(f1, f2)

fct_unify(fs, levels = lvls_union(fs))
Standardize levels across a list of factors.
  fct_unify(list(f2, f1))

Change the order of levels (continued)

fct_shuffle(f, n = 1L)
Randomly permute the order of factor levels.
  fct_shuffle(f4)

fct_reorder(.f, .x, .fun = median, .desc = FALSE)
Reorder levels by their relationship with another variable.
  boxplot(data = iris, Sepal.Width ~ fct_reorder(Species, Sepal.Width))

fct_reorder2(.f, .x, .y, .fun = last2, .desc = TRUE)
Reorder levels by their final values when plotted with two other variables.
  ggplot(data = iris, aes(Sepal.Width, Sepal.Length, color = fct_reorder2(Species, Sepal.Width, Sepal.Length))) + geom_smooth()

Add or drop levels

fct_drop(f, only)
Drop unused levels.
  f5 <- factor(c("a", "b"), c("a", "b", "x"))
  f6 <- fct_drop(f5)

fct_expand(f, ...)
Add levels to a factor.
  fct_expand(f6, "x")

fct_explicit_na(f, na_level = "(Missing)")
Assign a level to NAs to ensure they appear in plots, etc.
  fct_explicit_na(factor(c("a", "b", NA)))

RStudio® is a trademark of RStudio, Inc.
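The value-changing, combining, and add/drop functions above can be tried together in a short sketch. It assumes the forcats package is installed; the result names (recoded, relabeled, collapsed, and so on) are illustrative and do not appear in the cheatsheet. Each factor is rebuilt from scratch so the block does not depend on earlier examples.

```r
library(forcats)  # assumption: forcats is installed

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

# Rename levels one by one (new name = "old name")
recoded <- fct_recode(f, v = "a", x = "b", z = "c")
levels(recoded)    # "v" "x" "z"

# Apply a function to every level
relabeled <- fct_relabel(f, ~ paste0("x", .x))
levels(relabeled)  # "xa" "xb" "xc"

# Merge "a" and "b" into a single level "x"
collapsed <- fct_collapse(f, x = c("a", "b"))
levels(collapsed)  # "x" "c"

# Keep "a" and "b"; everything else becomes "Other"
othered <- fct_other(f, keep = c("a", "b"))
levels(othered)    # "a" "b" "Other"

# Combine two factors whose level sets differ
f1 <- factor(c("a", "c"))
f2 <- factor(c("b", "a"))
combined <- fct_c(f1, f2)
levels(combined)   # "a" "c" "b"

# Drop an unused level, then add it back
f5 <- factor(c("a", "b"), c("a", "b", "x"))
f6 <- fct_drop(f5)
levels(f6)         # "a" "b"
expanded <- fct_expand(f6, "x")
levels(expanded)   # "a" "b" "x"
```

None of these calls modify f in place; each returns a new factor, so results must be assigned to be kept.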
CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at forcats.tidyverse.org • Diagrams inspired by @LVaudor • forcats 0.3.0 • Updated: 2019-02
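As a closing sketch, the integer-plus-levels structure described under Factors and the level-reordering functions can be checked directly. This assumes the forcats package is installed; names such as codes, moved, and by_freq are illustrative and not part of the cheatsheet.

```r
library(forcats)  # assumption: forcats is installed

f <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))

# What R stores: integer codes plus a levels attribute
codes <- unclass(f)
as.integer(f)          # 1 3 2 1
attr(codes, "levels")  # "a" "b" "c"

# Count the values at each level (returns a table with columns f and n)
counts <- fct_count(f)
counts$n               # 2 1 1

# Move "c" to the front of the level order
moved <- fct_relevel(f, "c")
levels(moved)          # "c" "a" "b"

# Order levels by frequency, most frequent first
f3 <- factor(c("c", "c", "a"))
by_freq <- fct_infreq(f3)
levels(by_freq)        # "c" "a"

# Order levels by first appearance in the data
f2 <- factor(c("b", "a"))
by_first <- fct_inorder(f2)
levels(by_first)       # "b" "a"

# Reverse the level order, or rotate it by one position
f4 <- factor(c("a", "b", "c"))
reversed <- fct_rev(f4)
levels(reversed)       # "c" "b" "a"
shifted <- fct_shift(f4)
levels(shifted)        # "b" "c" "a"
```

Reordering changes only the levels attribute and the integer codes that point into it; the displayed values of each factor stay the same.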