Formats a data frame or vector containing categorical variables and calculates the frequency of each category.
Arguments
- x
Data frame or vector containing categorical variables.
- width
Integer specifying the maximum width for wrapping text.
- collapse
Logical specifying whether to merge categories with identical proportions.
- sort
Logical or character vector. If TRUE, orders categories by frequency. If FALSE, orders by names. If a character vector, renames and orders categories accordingly.
- format
Logical specifying whether to format category names if the input is a vector.
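The two logical settings of sort can be illustrated with a minimal base R sketch. This is only an approximation of the described behaviour, not the function's actual implementation; the names counts, by_freq, and by_name are hypothetical, and the ascending frequency order is assumed from the example output.

```r
# Hypothetical base R sketch; not the package's implementation.
x <- c("a", "a", "a", "b", "c", "c")

# Frequency of each category (table() orders by name by default)
counts <- table(x)

# Analogous to sort = TRUE: order categories by frequency (ascending assumed)
by_freq <- counts[order(as.vector(counts))]

# Analogous to sort = FALSE: order categories by name
by_name <- counts[order(names(counts))]
```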
Value
A tibble with one row per category and the following columns:
- f
Factor specifying the category labels, possibly wrapped to the specified width. When collapse = TRUE, multiple categories with identical frequencies are merged into a single label separated by commas.
- n
Integer specifying the frequency count for each category.
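The merging described for collapse = TRUE can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the function's actual code: it returns a plain data frame rather than a tibble, skips text wrapping, and the names counts, labels, and collapsed are hypothetical.

```r
# Hypothetical base R sketch; not the package's implementation.
x <- c("a", "b", "b", "c", "c", "d", "d", "d")

# Frequency of each category
counts <- table(x)

# Group category names by their frequency and join each group with commas
labels <- tapply(names(counts), as.vector(counts), paste, collapse = ", ")

# One row per distinct frequency: merged label (f) and its count (n)
collapsed <- data.frame(
  f = factor(as.vector(labels), levels = as.vector(labels)),
  n = as.integer(names(labels))
)
```

Here "b" and "c" both occur twice, so they share a single row in collapsed.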
Examples
# Vector of categorical variable
k <- 5
n <- runif(k, 1, 10) %>% round()
x <- paste("Level", seq(k)) %>%
  mapply(function(x, y) rep(x, y), ., n) %>%
  unlist()
count_category(x)
#> # A tibble: 5 × 2
#> f n
#> <fct> <int>
#> 1 Level 1 1
#> 2 Level 4 4
#> 3 Level 3 5
#> 4 Level 2 5
#> 5 Level 5 8
# Data frame of categorical variable
df <- sapply(seq(k), function(x) runif(10) %>% round()) %>% as.data.frame()
colnames(df) <- paste("Level", seq(k))
count_category(df)
#> # A tibble: 5 × 2
#> f n
#> <fct> <int>
#> 1 Level 1 2
#> 2 Level 5 3
#> 3 Level 3 4
#> 4 Level 4 7
#> 5 Level 2 7
count_category(x, sort = FALSE, width = 5)
#> # A tibble: 5 × 2
#> f n
#> <fct> <int>
#> 1 "Level\n1" 1
#> 2 "Level\n2" 5
#> 3 "Level\n3" 5
#> 4 "Level\n4" 4
#> 5 "Level\n5" 8
count_category(x, sort = seq(k), format = FALSE)
#> # A tibble: 5 × 2
#> f n
#> <fct> <int>
#> 1 1 1
#> 2 2 5
#> 3 3 5
#> 4 4 4
#> 5 5 8
x2 <- c(x, rep("Level 6", n[1]))
count_category(x2, collapse = TRUE)
#> # A tibble: 4 × 2
#> f n
#> <fct> <int>
#> 1 "Level 6, Level\n1" 1
#> 2 "Level 4" 4
#> 3 "Level 3, Level\n2" 5
#> 4 "Level 5" 8