library(RastaRocket)
library(dplyr)
library(tidyr)
library(labelled)
library(rlang)
library(gtsummary)
library(forcats)
This vignette demonstrates the different options available for the
desc_var function, accompanied by examples to illustrate
its usage.
We will generate a sample dataset to apply the desc_var
function.
# Set the seed for reproducibility
set.seed(123)
# Create the data frame
data <- data.frame(
  Age = c(rnorm(45, mean = 50, sd = 10), rep(NA, 5)), # Age, with 5 missing values
  sexe = sample(c(0, 1), 50, replace = TRUE, prob = c(0.6, 0.4)), # Sex, coded 0/1
  quatre_modalites = sample(c("A", "B", "C"), 50, replace = TRUE, prob = c(0.2, 0.5, 0.3)), # Levels drawn without "D"
  traitement = sample(c("BRAS-A", "BRAS-B"), 50, replace = TRUE, prob = c(0.55, 0.45)), # Treatment arm
  echelle = sample(0:5, 50, replace = TRUE) # Integer scale from 0 to 5
)
# Add "D" as a level with no observations
data$quatre_modalites <- factor(data$quatre_modalites, levels = c("A", "B", "C", "D"))
# Label the levels of sexe
data$sexe <- factor(data$sexe, levels = c(0, 1), labels = c("Femme", "Homme"))
# Set the variable labels
data <- data %>% labelled::set_variable_labels(Age = "Age",
  sexe = "sexe",
  traitement = "traitement",
  quatre_modalites = "quatres niveaux",
  echelle = "Echelle")

Below, we describe the options used in the example.
The dataset is passed to the desc_var function for
analysis.
- table_title: Title of the descriptive table. Here, it is "test".
- by_group: Logical indicating whether the descriptive table should be stratified by the grouping variable (var_group). If TRUE, the table is grouped by var_group; if FALSE, the grouping variable is ignored and not described in the table.
- var_group: The variable used for grouping the data. Here, it is "traitement".
- group_title: Title of the grouping variable column. Here, it is "Traitement".
- add_total: Whether to add a total column (shown as Overall in the output) when var_group is specified.
- show_n_per_group: Set to TRUE here; the column headers of the resulting table show the size of each group (for example, N = 29 for BRAS-A).
data %>% RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "Traitement",
add_total = TRUE,
show_n_per_group = TRUE)

| Characteristic | Overall N = 50 | Traitement: BRAS-A N = 29 | Traitement: BRAS-B N = 21 |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
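The n (d.m.) rows appear to report the number of non-missing values followed by the number of missing values in parentheses (for Age, 45 observed and 5 missing overall). The sketch below reproduces that bookkeeping in base R on a small made-up vector, not on the vignette's data. The helper name n_dm is hypothetical and is not part of RastaRocket.

```r
# Hypothetical helper (not part of RastaRocket): format "n (missing)" for a vector
n_dm <- function(x) {
  sprintf("%d (%d)", sum(!is.na(x)), sum(is.na(x)))
}

# Toy data: 6 values, 2 of them missing, split into two groups
value <- c(12, NA, 15, 9, NA, 20)
group <- c("A", "A", "A", "B", "B", "B")

overall   <- n_dm(value)                 # counts over the whole vector
per_group <- tapply(value, group, n_dm)  # counts within each group

print(overall)
print(per_group)
```

The per-group counts add up to the overall counts, which is a quick consistency check to run on any stratified table.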
The package lets the user specify whether a feature is treated as
quantitative or qualitative. For instance, you could choose to
describe a quantitative feature as a qualitative one if it takes few
distinct values. Here, we do this for Age after rounding
it, by listing it in the quali argument.
data %>%
dplyr::select(Age, traitement) %>%
dplyr::mutate(Age = round(Age)) %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "traitement",
quali = c("Age"))

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| 30 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 33 | 1 (2.2%) | 0 (0.0%) | 1 (5.3%) |
| 37 | 2 (4.4%) | 1 (3.8%) | 1 (5.3%) |
| 39 | 2 (4.4%) | 1 (3.8%) | 1 (5.3%) |
| 40 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 43 | 3 (6.7%) | 1 (3.8%) | 2 (10.5%) |
| 44 | 3 (6.7%) | 3 (11.5%) | 0 (0.0%) |
| 45 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 46 | 2 (4.4%) | 0 (0.0%) | 2 (10.5%) |
| 47 | 2 (4.4%) | 0 (0.0%) | 2 (10.5%) |
| 48 | 3 (6.7%) | 3 (11.5%) | 0 (0.0%) |
| 49 | 1 (2.2%) | 0 (0.0%) | 1 (5.3%) |
| 51 | 3 (6.7%) | 1 (3.8%) | 2 (10.5%) |
| 52 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 54 | 3 (6.7%) | 1 (3.8%) | 2 (10.5%) |
| 55 | 2 (4.4%) | 2 (7.7%) | 0 (0.0%) |
| 56 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 57 | 2 (4.4%) | 2 (7.7%) | 0 (0.0%) |
| 58 | 2 (4.4%) | 2 (7.7%) | 0 (0.0%) |
| 59 | 2 (4.4%) | 0 (0.0%) | 2 (10.5%) |
| 62 | 2 (4.4%) | 2 (7.7%) | 0 (0.0%) |
| 63 | 1 (2.2%) | 0 (0.0%) | 1 (5.3%) |
| 66 | 1 (2.2%) | 0 (0.0%) | 1 (5.3%) |
| 67 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
| 68 | 1 (2.2%) | 0 (0.0%) | 1 (5.3%) |
| 72 | 1 (2.2%) | 1 (3.8%) | 0 (0.0%) |
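Rounding is what makes a qualitative description of a numeric variable readable: it collapses nearby values into a small number of distinct ones. A minimal base-R sketch of that effect, using made-up ages rather than the vignette's data:

```r
# Toy ages (made-up values, not the vignette's data)
age <- c(30.3, 30.4, 43.2, 43.4, 42.9, 51.0)

# Before rounding every value is distinct; after rounding only a few remain
n_distinct_raw     <- length(unique(age))
n_distinct_rounded <- length(unique(round(age)))

# Treating the rounded values as a factor yields one count per distinct value,
# which is the kind of row a qualitative description displays
counts <- table(factor(round(age)))
print(counts)
```

If the rounded variable still has many distinct values, as in the table above, the quantitative summary is usually the more compact choice.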
The display of missing data is controlled by the
show_missing_data argument in the
RastaRocket::desc_var function. By default, if
anyNA(data1) returns TRUE, missing data will
be displayed. If no missing data is detected, it will be hidden. Users
can override this behavior by explicitly setting
show_missing_data to TRUE or
FALSE.
iris %>% RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "Species",
group_title = "Species",
show_missing_data = TRUE)

| Characteristic | Overall | Species: setosa | Species: versicolor | Species: virginica |
|---|---|---|---|---|
| Sepal.Length n (d.m.) | 150 (0) | 50 (0) | 50 (0) | 50 (0) |
| Mean (SD) | 5.8 (0.8) | 5.0 (0.4) | 5.9 (0.5) | 6.6 (0.6) |
| Median (Q1 ; Q3) | 5.8 (5.1 ; 6.4) | 5.0 (4.8 ; 5.2) | 5.9 (5.6 ; 6.3) | 6.5 (6.2 ; 6.9) |
| Min ; Max | 4.3 ; 7.9 | 4.3 ; 5.8 | 4.9 ; 7.0 | 4.9 ; 7.9 |
| Sepal.Width n (d.m.) | 150 (0) | 50 (0) | 50 (0) | 50 (0) |
| Mean (SD) | 3.1 (0.4) | 3.4 (0.4) | 2.8 (0.3) | 3.0 (0.3) |
| Median (Q1 ; Q3) | 3.0 (2.8 ; 3.3) | 3.4 (3.2 ; 3.7) | 2.8 (2.5 ; 3.0) | 3.0 (2.8 ; 3.2) |
| Min ; Max | 2.0 ; 4.4 | 2.3 ; 4.4 | 2.0 ; 3.4 | 2.2 ; 3.8 |
| Petal.Length n (d.m.) | 150 (0) | 50 (0) | 50 (0) | 50 (0) |
| Mean (SD) | 3.8 (1.8) | 1.5 (0.2) | 4.3 (0.5) | 5.6 (0.6) |
| Median (Q1 ; Q3) | 4.4 (1.6 ; 5.1) | 1.5 (1.4 ; 1.6) | 4.4 (4.0 ; 4.6) | 5.6 (5.1 ; 5.9) |
| Min ; Max | 1.0 ; 6.9 | 1.0 ; 1.9 | 3.0 ; 5.1 | 4.5 ; 6.9 |
| Petal.Width n (d.m.) | 150 (0) | 50 (0) | 50 (0) | 50 (0) |
| Mean (SD) | 1.2 (0.8) | 0.2 (0.1) | 1.3 (0.2) | 2.0 (0.3) |
| Median (Q1 ; Q3) | 1.3 (0.3 ; 1.8) | 0.2 (0.2 ; 0.3) | 1.3 (1.2 ; 1.5) | 2.0 (1.8 ; 2.3) |
| Min ; Max | 0.1 ; 2.5 | 0.1 ; 0.6 | 1.0 ; 1.8 | 1.4 ; 2.5 |
iris %>% RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "Species",
group_title = "Species",
show_missing_data = FALSE)

| Characteristic | Overall | Species: setosa | Species: versicolor | Species: virginica |
|---|---|---|---|---|
| Sepal.Length | 150 | 50 | 50 | 50 |
| Mean (SD) | 5.8 (0.8) | 5.0 (0.4) | 5.9 (0.5) | 6.6 (0.6) |
| Median (Q1 ; Q3) | 5.8 (5.1 ; 6.4) | 5.0 (4.8 ; 5.2) | 5.9 (5.6 ; 6.3) | 6.5 (6.2 ; 6.9) |
| Min ; Max | 4.3 ; 7.9 | 4.3 ; 5.8 | 4.9 ; 7.0 | 4.9 ; 7.9 |
| Sepal.Width | 150 | 50 | 50 | 50 |
| Mean (SD) | 3.1 (0.4) | 3.4 (0.4) | 2.8 (0.3) | 3.0 (0.3) |
| Median (Q1 ; Q3) | 3.0 (2.8 ; 3.3) | 3.4 (3.2 ; 3.7) | 2.8 (2.5 ; 3.0) | 3.0 (2.8 ; 3.2) |
| Min ; Max | 2.0 ; 4.4 | 2.3 ; 4.4 | 2.0 ; 3.4 | 2.2 ; 3.8 |
| Petal.Length | 150 | 50 | 50 | 50 |
| Mean (SD) | 3.8 (1.8) | 1.5 (0.2) | 4.3 (0.5) | 5.6 (0.6) |
| Median (Q1 ; Q3) | 4.4 (1.6 ; 5.1) | 1.5 (1.4 ; 1.6) | 4.4 (4.0 ; 4.6) | 5.6 (5.1 ; 5.9) |
| Min ; Max | 1.0 ; 6.9 | 1.0 ; 1.9 | 3.0 ; 5.1 | 4.5 ; 6.9 |
| Petal.Width | 150 | 50 | 50 | 50 |
| Mean (SD) | 1.2 (0.8) | 0.2 (0.1) | 1.3 (0.2) | 2.0 (0.3) |
| Median (Q1 ; Q3) | 1.3 (0.3 ; 1.8) | 0.2 (0.2 ; 0.3) | 1.3 (1.2 ; 1.5) | 2.0 (1.8 ; 2.3) |
| Min ; Max | 0.1 ; 2.5 | 0.1 ; 0.6 | 1.0 ; 1.8 | 1.4 ; 2.5 |
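The default rule for show_missing_data described above can be sketched in base R. This is only an illustration of the stated behavior, not RastaRocket's actual code: the helper name resolve_show_missing and the use of NULL as the "not set" value are assumptions.

```r
# Hypothetical helper mirroring the rule described above (not RastaRocket's code):
# NULL means "decide from the data", TRUE/FALSE means "the user decides"
resolve_show_missing <- function(df, show_missing_data = NULL) {
  if (is.null(show_missing_data)) anyNA(df) else isTRUE(show_missing_data)
}

resolve_show_missing(iris)               # iris has no missing values
resolve_show_missing(airquality)         # airquality contains NAs
resolve_show_missing(iris, TRUE)         # explicit override: show anyway
resolve_show_missing(airquality, FALSE)  # explicit override: hide anyway
```

The datasets iris and airquality ship with base R, so the sketch runs without any additional package.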
In the previous example, no specific data management operations were applied.
In this example, we add freq_relevel = TRUE, which
orders the categories of categorical variables in descending order based
on their counts.
data %>% desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "traitement",
freq_relevel = TRUE)

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
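Ordering levels by descending count, as freq_relevel = TRUE does for quatres niveaux in the table above, can be illustrated in base R. This is a sketch on a toy factor and makes no claim about how RastaRocket implements the option; forcats::fct_infreq offers a comparable reordering.

```r
# Toy factor whose levels start in alphabetical order
x <- factor(c("A", "B", "B", "B", "C", "C"))

# Count each level, sort the counts in decreasing order, and rebuild the factor
ordered_levels <- names(sort(table(x), decreasing = TRUE))
x_by_freq <- factor(x, levels = ordered_levels)

levels(x)          # original order
levels(x_by_freq)  # most frequent level first
```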
The default order of categorical features is determined by their
levels. If you want to customize this order, you can modify the levels
using a library such as forcats.
data %>%
dplyr::mutate(quatre_modalites = forcats::fct_relevel(quatre_modalites,
"A", "C", "D", "B")) %>%
desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "traitement")

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
By default, levels with a zero count are dropped from the table. Set drop_levels = FALSE to keep them; level D of quatres niveaux then appears with a count of 0.
data %>% desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "traitement",
drop_levels = FALSE)

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| D | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
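The difference between keeping and dropping an empty level can be seen directly on a factor. The base-R sketch below uses a toy factor with an unused level "D", mirroring quatre_modalites; it illustrates the concept only and does not show how desc_var handles it internally.

```r
# Toy factor with a declared but unused level "D"
f <- factor(c("A", "B", "B", "C"), levels = c("A", "B", "C", "D"))

kept    <- table(f)              # "D" is listed with a count of 0
dropped <- table(droplevels(f))  # "D" no longer appears

print(kept)
print(dropped)
```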
Here, we use a per-group description for the variables.
data %>% desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
group_title = "traitement")

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
In this example, we generate a global description of the variables.
data %>% RastaRocket::desc_var(table_title = "test",
by_group = FALSE,
var_group = "traitement",
group_title = "traitement")

| Characteristic | N | Overall |
|---|---|---|
| Age n (d.m.) | 45 (5) | |
| Mean (SD) | 50.7 (9.5) | |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | |
| Min ; Max | 30.3 ; 71.7 | |
| sexe n (d.m.) | 50 (0) | |
| Femme | 28 (56.0%) | |
| Homme | 22 (44.0%) | |
| quatres niveaux n (d.m.) | 50 (0) | |
| A | 7 (14.0%) | |
| B | 28 (56.0%) | |
| C | 15 (30.0%) | |
| Echelle n (d.m.) | 50 (0) | |
| 0 | 13 (26.0%) | |
| 1 | 11 (22.0%) | |
| 2 | 5 (10.0%) | |
| 3 | 3 (6.0%) | |
| 4 | 12 (24.0%) | |
| 5 | 6 (12.0%) | |
To insert intermediate titles, you can use the
intermediate_header function which takes a list of
sub-tables generated by desc_var and a vector of
titles.
tb1 <- data %>%
dplyr::select(Age, sexe) %>%
RastaRocket::desc_var(table_title = "test")
tb2 <- data %>%
dplyr::select(quatre_modalites) %>%
RastaRocket::desc_var(table_title = "test")
RastaRocket::intermediate_header(tbls = list(tb1, tb2),
group_header = c("Title A", "Title B"))

| Characteristic | N | Overall |
|---|---|---|
| Title A | | |
| Age n (d.m.) | 45 (5) | |
| Mean (SD) | 50.7 (9.5) | |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | |
| Min ; Max | 30.3 ; 71.7 | |
| sexe n (d.m.) | 50 (0) | |
| Femme | 28 (56.0%) | |
| Homme | 22 (44.0%) | |
| Title B | | |
| quatres niveaux | 50 | |
| A | 7 (14.0%) | |
| B | 28 (56.0%) | |
| C | 15 (30.0%) | |
You can specify the number of digits for quantitative and qualitative
features using the digits argument.
In the example below, quantitative statistics are rounded to 0 decimal places, while percentages for qualitative features are rounded to 1 decimal place.
data %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
digits = list(mean_sd = 0,
median_q1_q3_min_max = 0,
pct = 1))

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 51 (9) | 51 (10) | 50 (10) |
| Median (Q1 ; Q3) | 51 (44 ; 57) | 51 (44 ; 57) | 49 (43 ; 59) |
| Min ; Max | 30 ; 72 | 30 ; 72 | 33 ; 68 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
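To make the effect of the digits setting concrete, the base-R sketch below formats a mean (SD) pair and a percentage with a chosen number of decimals. The input values 50.67 and 9.49 are the overall Age mean and SD reported in this vignette, and 7 out of 50 is the overall count for level A. The helper names fmt_num, fmt_mean_sd and fmt_pct are hypothetical and not part of RastaRocket.

```r
# Hypothetical formatters (not part of RastaRocket)
fmt_num <- function(x, digits) formatC(x, format = "f", digits = digits)

fmt_mean_sd <- function(m, s, digits) {
  paste0(fmt_num(m, digits), " (", fmt_num(s, digits), ")")
}

fmt_pct <- function(n, total, digits) {
  paste0(fmt_num(100 * n / total, digits), "%")
}

mean_sd_0 <- fmt_mean_sd(50.67, 9.49, digits = 0)  # "51 (9)"
mean_sd_2 <- fmt_mean_sd(50.67, 9.49, digits = 2)  # "50.67 (9.49)"
pct_1     <- fmt_pct(7, 50, digits = 1)            # "14.0%"

print(c(mean_sd_0, mean_sd_2, pct_1))
```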
To have more control over rounding, you can create subtables with
different numbers of digits and combine them into a single table using
gtsummary::tbl_stack.
tb1 <- data %>%
dplyr::select(Age, sexe, traitement) %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
digits = list(mean_sd = 2,
median_q1_q3_min_max = 2,
pct = 2))
tb2 <- data %>%
dplyr::select(quatre_modalites, traitement) %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
digits = list(mean_sd = 0,
median_q1_q3_min_max = 0,
pct = 1))
gtsummary::tbl_stack(list(tb1, tb2))

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.67 (9.49) | 51.03 (9.63) | 50.17 (9.53) |
| Median (Q1 ; Q3) | 50.71 (44.40 ; 57.01) | 51.32 (44.40 ; 57.01) | 49.38 (43.05 ; 58.78) |
| Min ; Max | 30.33 ; 71.69 | 30.33 ; 71.69 | 33.13 ; 67.87 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.00%) | 13 (44.83%) | 15 (71.43%) |
| Homme | 22 (44.00%) | 16 (55.17%) | 6 (28.57%) |
| quatres niveaux | 50 | 29 | 21 |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
You can include statistical tests in your summary table using the
tests = TRUE argument. This automatically applies default
statistical tests for the grouped variables.
The following example adds statistical tests for all features,
grouped by the traitement variable.
data %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
tests = TRUE)

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B | p-value |
|---|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) | 0.7¹ |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) | |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) | |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 | |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.061² |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) | |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) | |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.3³ |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) | |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) | |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) | |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.6³ |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) | |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) | |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) | |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) | |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) | |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) | |
| ¹ Wilcoxon rank sum exact test | | | | |
| ² Pearson’s Chi-squared test | | | | |
| ³ Fisher’s exact test | | | | |
For greater control, you can specify the test to use for each feature by passing a named list to the tests argument. The example below applies a t-test to Age, a chi-squared test to sexe, and Fisher's exact test to echelle.
data %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
tests = list(Age = "t.test",
sexe = "chisq.test",
echelle = "fisher.test"))

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B | p-value |
|---|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) | 0.8¹ |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) | |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) | |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 | |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.11² |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) | |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) | |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.3³ |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) | |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) | |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) | |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.6³ |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) | |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) | |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) | |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) | |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) | |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) | |
| ¹ Welch Two Sample t-test | | | | |
| ² Pearson’s Chi-squared test | | | | |
| ³ Fisher’s exact test | | | | |
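The two p-values reported for sexe (about 0.061 with tests = TRUE and about 0.11 with "chisq.test") can be checked by hand from the counts in the tables. The base-R sketch below runs Pearson's chi-squared test on those counts with and without Yates' continuity correction. Reading the difference between the two tables as a difference in continuity correction is an inference from the numbers, not a statement of the package's documented behavior.

```r
# Counts of sexe by traitement, read from the tables above
counts <- matrix(c(13, 16, 15, 6), nrow = 2,
                 dimnames = list(sexe = c("Femme", "Homme"),
                                 traitement = c("BRAS-A", "BRAS-B")))

# Pearson's chi-squared test without continuity correction
p_uncorrected <- chisq.test(counts, correct = FALSE)$p.value

# Same test with Yates' continuity correction (the base-R default for 2 x 2 tables)
p_corrected <- chisq.test(counts)$p.value

print(round(c(uncorrected = p_uncorrected, corrected = p_corrected), 3))
```

Recomputing a reported p-value from the displayed counts is a cheap way to confirm which variant of a test a table actually used.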
To give the table a nicer appearance, you can customize
it as a gt table. A dedicated function is provided for this:
custom_format.
data %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
tests = list(Age = "t.test",
sexe = "chisq.test",
echelle = "fisher.test")) %>%
custom_format()

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B | p-value |
|---|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) | 0.8¹ |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) | |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) | |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 | |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.11² |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) | |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) | |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.3³ |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) | |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) | |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) | |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) | 0.6³ |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) | |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) | |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) | |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) | |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) | |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) | |
| ¹ Welch Two Sample t-test | | | | |
| ² Pearson’s Chi-squared test | | | | |
| ³ Fisher’s exact test | | | | |
This also works when using stacked tables.
tb1 <- data %>%
dplyr::select(Age, sexe, traitement) %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
digits = list(mean_sd = 0,
median_q1_q3_min_max = 0,
pct = 0))
tb2 <- data %>%
dplyr::select(quatre_modalites, traitement) %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement",
digits = list(mean_sd = 2,
median_q1_q3_min_max = 2,
pct = 2))
gtsummary::tbl_stack(list(tb1, tb2)) %>%
custom_format()

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 51 (9) | 51 (10) | 50 (10) |
| Median (Q1 ; Q3) | 51 (44 ; 57) | 51 (44 ; 57) | 49 (43 ; 59) |
| Min ; Max | 30 ; 72 | 30 ; 72 | 33 ; 68 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56%) | 13 (45%) | 15 (71%) |
| Homme | 22 (44%) | 16 (55%) | 6 (29%) |
| quatres niveaux | 50 | 29 | 21 |
| A | 7 (14.00%) | 2 (6.90%) | 5 (23.81%) |
| B | 28 (56.00%) | 18 (62.07%) | 10 (47.62%) |
| C | 15 (30.00%) | 9 (31.03%) | 6 (28.57%) |
You can customize the format by specifying the column size and the alignment.
data %>%
RastaRocket::desc_var(table_title = "test",
by_group = TRUE,
var_group = "traitement") %>%
custom_format(align = "left",
column_size = list(label ~ gt::pct(50),
gt::starts_with("stat") ~ gt::pct(25)))

| Characteristic | Overall | traitement: BRAS-A | traitement: BRAS-B |
|---|---|---|---|
| Age n (d.m.) | 45 (5) | 26 (3) | 19 (2) |
| Mean (SD) | 50.7 (9.5) | 51.0 (9.6) | 50.2 (9.5) |
| Median (Q1 ; Q3) | 50.7 (44.4 ; 57.0) | 51.3 (44.4 ; 57.0) | 49.4 (43.1 ; 58.8) |
| Min ; Max | 30.3 ; 71.7 | 30.3 ; 71.7 | 33.1 ; 67.9 |
| sexe n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| Femme | 28 (56.0%) | 13 (44.8%) | 15 (71.4%) |
| Homme | 22 (44.0%) | 16 (55.2%) | 6 (28.6%) |
| quatres niveaux n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| A | 7 (14.0%) | 2 (6.9%) | 5 (23.8%) |
| B | 28 (56.0%) | 18 (62.1%) | 10 (47.6%) |
| C | 15 (30.0%) | 9 (31.0%) | 6 (28.6%) |
| Echelle n (d.m.) | 50 (0) | 29 (0) | 21 (0) |
| 0 | 13 (26.0%) | 6 (20.7%) | 7 (33.3%) |
| 1 | 11 (22.0%) | 6 (20.7%) | 5 (23.8%) |
| 2 | 5 (10.0%) | 4 (13.8%) | 1 (4.8%) |
| 3 | 3 (6.0%) | 3 (10.3%) | 0 (0.0%) |
| 4 | 12 (24.0%) | 7 (24.1%) | 5 (23.8%) |
| 5 | 6 (12.0%) | 3 (10.3%) | 3 (14.3%) |
You can switch the output format to French using the
gtsummary::theme_gtsummary_language function. Calling
gtsummary::reset_gtsummary_theme() resets the format to the
default behavior (i.e., English). The theme can be set once at the
beginning of the document; there is no need to set it again for each table.
# reset theme to default
gtsummary::reset_gtsummary_theme()
# switch to French format
gtsummary::theme_gtsummary_language(language = "fr", decimal.mark = ",", big.mark = " ")
iris %>%
RastaRocket::desc_var(table_title = "test")

| Caractéristique | N | Total |
|---|---|---|
| Sepal.Length | 150 | |
| Moyenne (ET) | 5,8 (0,8) | |
| Médiane (Q1 ; Q3) | 5,8 (5,1 ; 6,4) | |
| Min ; Max | 4,3 ; 7,9 | |
| Sepal.Width | 150 | |
| Moyenne (ET) | 3,1 (0,4) | |
| Médiane (Q1 ; Q3) | 3,0 (2,8 ; 3,3) | |
| Min ; Max | 2,0 ; 4,4 | |
| Petal.Length | 150 | |
| Moyenne (ET) | 3,8 (1,8) | |
| Médiane (Q1 ; Q3) | 4,4 (1,6 ; 5,1) | |
| Min ; Max | 1,0 ; 6,9 | |
| Petal.Width | 150 | |
| Moyenne (ET) | 1,2 (0,8) | |
| Médiane (Q1 ; Q3) | 1,3 (0,3 ; 1,8) | |
| Min ; Max | 0,1 ; 2,5 | |
| Species | 150 | |
| setosa | 50 (33,3%) | |
| versicolor | 50 (33,3%) | |
| virginica | 50 (33,3%) | |
# you can put several tables here, it will keep French format
# back to default format
gtsummary::reset_gtsummary_theme()
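The decimal.mark and big.mark settings passed to the theme correspond to standard number-formatting options. As a base-R illustration of what each one changes, independent of gtsummary and using made-up numbers:

```r
# Decimal comma instead of decimal point
dec_fr <- format(5.8, nsmall = 1, decimal.mark = ",")

# Space as thousands separator
big_fr <- format(1234567, big.mark = " ")

# Both settings combined
both_fr <- format(1234.5, nsmall = 1, decimal.mark = ",", big.mark = " ")

print(c(dec_fr, big_fr, both_fr))
```

With samples as small as the ones in this vignette, only the decimal mark is visible in the tables; the thousands separator matters once counts reach four digits.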