Do not coerce to factor in `tbl_svysummary()` #1602

ddsjoberg · 2024-02-09T16:54:10Z

@larmarange see this SO post: https://stackoverflow.com/questions/77957551

Is this something we should address? It seems that the survey method for subset() doesn't remove rows but sets their weights to 0, so users can't remove unobserved levels from `by` variables in tbl_svysummary().
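
The mechanism described here can be imitated in base R without the survey package. The sketch below is only an analogy: the data frame `df`, its columns `race` and `w`, and the toy weights are hypothetical, and `xtabs()` stands in for a weighted tabulation. It does not reproduce the survey package's internals. It assumes, as suggested above, that excluded rows stay in the data with a weight of 0.

```r
# Toy data: one row per respondent with a sampling weight (hypothetical values)
df <- data.frame(
  race = factor(c("Branca", "Preta", "Parda", "Branca")),
  w    = c(10, 5, 7, 3)
)

# Imitate a subset that keeps every row but zeroes the weight of excluded rows
keep <- df$race %in% c("Branca", "Preta")
df$w[!keep] <- 0

# Converting to character does not help: the "Parda" row is still present
df$race <- as.character(df$race)

# Weighted tabulation: the excluded category is still listed, with a total of 0
tab <- xtabs(w ~ race, data = df)
print(tab)
```

In this analogy the excluded category survives because its row is still in the data, so changing the column type alone cannot make it disappear.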


larmarange · 2024-02-09T17:47:46Z

Because the default (t.test) is not implemented for tbl_svysummary(). You should use "smd", cf. https://www.danieldsjoberg.com/gtsummary/reference/tests.html#tbl-svysummary-add-difference-

Currently, add_difference() does not change the default tests when applied to a tbl_svysummary()

ddsjoberg · 2024-02-09T18:05:44Z

I was thinking more about the tbl_svysummary() table itself. The unobserved columns appear in the table, even if we make the underlying column character.

library(gtsummary)
library(PNSIBGE)

pns <- get_pns(year = 2019, labels = TRUE)
pns.2 <- subset(pns, C009 %in% c("Branca", "Preta"))
pns.2$variables$C009 <- as.character(pns.2$variables$C009)

pns.2 |> 
  gtsummary::tbl_svysummary(by = C009, include = c(C006)) |> 
  gtsummary::as_kable()

| Characteristic | Amarela, N = 0 | Branca, N = 91,037,722 | Ignorado, N = 0 | Indígena, N = 0 | Parda, N = 0 | Preta, N = 21,786,515 |
|---|---|---|---|---|---|---|
| C006 | | | | | | |
| Homem | 0 (NA%) | 42,682,905 (47%) | 0 (NA%) | 0 (NA%) | 0 (NA%) | 10,691,164 (49%) |
| Mulher | 0 (NA%) | 48,354,817 (53%) | 0 (NA%) | 0 (NA%) | 0 (NA%) | 11,095,351 (51%) |

But I just tried to tabulate directly with the survey package, and it still shows all levels, even when the column has previously been converted to a character.

So what they are dealing with is a non-standard situation, and they'd just need to write their own method in add_stat() for this, and hide the unobserved columns themselves.

larmarange · 2024-02-09T18:09:44Z

Probably because somewhere the levels are still declared. pns.2$variables$C009 <- as.character(pns.2$variables$C009) did not change metadata stored within the survey object.

It is much safer to use fct_drop() through srvyr::mutate().

larmarange · 2024-02-09T18:10:30Z

But a question remains open: if this is a tbl_svysummary table, should we apply, by default, a relevant test?

ddsjoberg · 2024-02-09T18:17:47Z

Even after dropping the levels with srvyr, the unobserved levels still appear in the output of the survey function.

pns.2 <- 
  srvyr::as_survey_design(pns) |> 
  srvyr::filter(C009 %in% c("Branca", "Preta")) |> 
  srvyr::mutate(C009 = as.character(C009))

survey::svytable(~C009, pns.2)
#> C009
#>  Amarela   Branca Ignorado Indígena    Parda    Preta 
#>        0 91037722        0        0        0 21786515

ddsjoberg · 2024-02-09T18:17:57Z

But, yes, a better default is warranted!

larmarange · 2024-02-09T18:34:41Z

If I remember correctly, as.character() keeps the levels attribute, while forcats::fct_drop() removes unobserved levels.
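
As a point of comparison, the behaviour on a plain factor outside any survey object can be checked in base R. The vector `x` below is a made-up example, and `droplevels()` is used as the base R counterpart of `forcats::fct_drop()`. On a plain factor, `as.character()` returns a vector with no levels attribute and `droplevels()` removes unused levels, which suggests that levels persisting in the survey output come from somewhere other than the column's own attributes.

```r
# A made-up factor with three levels
x <- factor(c("Amarela", "Branca", "Preta"))

# Subsetting a factor keeps all declared levels, observed or not
sub <- x[x %in% c("Branca", "Preta")]
print(levels(sub))

# droplevels() removes the levels that no longer occur
print(levels(droplevels(sub)))

# as.character() returns a plain character vector without a levels attribute
chr <- as.character(sub)
print(is.null(levels(chr)))

# Tabulating the character vector only lists values that are present
print(table(chr))
```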

ddsjoberg · 2024-02-09T19:07:58Z

Same issue with forcats::fct_drop() unfortunately

ddsjoberg closed this as completed Feb 9, 2024

ddsjoberg reopened this Feb 9, 2024
