Calculates summary statistics from outputs of generate() or hypothesize().
Learn more in vignette("infer").
calculate(
  x,
  stat = c("mean", "median", "sum", "sd", "prop", "count", "diff in means",
    "diff in medians", "diff in props", "Chisq", "F", "slope", "correlation",
    "t", "z"),
  order = NULL,
  ...
)
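Argument | Description |
---|---|
x | The output from generate() or hypothesize(). |
stat | A string giving the type of the statistic to calculate. Current options include "mean", "median", "sum", "sd", "prop", "count", "diff in means", "diff in medians", "diff in props", "Chisq", "F", "slope", "correlation", "t", and "z". |
order | A string vector specifying the order in which the levels of the explanatory variable should be ordered for subtraction, where order = c("first", "second") means ("first" - "second"). |
... | To pass options like na.rm = TRUE into functions like mean(), sd(), etc. |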
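
The subtraction order controlled by `order` can be illustrated with a small base R sketch. This is not infer's implementation: the `toy` data frame and the `diff_in_means()` helper are hypothetical names made up for this illustration, and the sketch only assumes that the statistic is the mean of the first listed level minus the mean of the second.

```r
# Toy data; the values are made up for illustration only.
toy <- data.frame(
  age     = c(30, 40, 50, 20, 30, 40),
  college = c("degree", "degree", "degree",
              "no degree", "no degree", "no degree")
)

# Hypothetical helper (not part of infer): difference in group means,
# computed as mean(order[1]) - mean(order[2]).
diff_in_means <- function(data, response, explanatory, order) {
  group_means <- vapply(
    split(data[[response]], data[[explanatory]]),
    mean,
    numeric(1)
  )
  group_means[[order[1]]] - group_means[[order[2]]]
}

# Swapping the two levels flips the sign of the statistic.
d1 <- diff_in_means(toy, "age", "college", order = c("degree", "no degree"))
d2 <- diff_in_means(toy, "age", "college", order = c("no degree", "degree"))
```

Here `d1` is the mean age of the "degree" group minus that of the "no degree" group, and `d2` is the same quantity with the opposite sign, which is why `order` should be stated explicitly for difference statistics.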
A tibble containing a stat column of calculated statistics.
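
To make the shape of the return value concrete, the following base R sketch mimics the idea of one statistic per replicate. It is a conceptual stand-in under stated assumptions, not how infer computes its result: the `hours` vector is made-up data (not the gss data set), `reps` is kept tiny, and the result is a plain data frame rather than a tibble.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

hours <- c(40, 35, 50, 45, 38, 42, 60, 20)  # made-up sample
reps  <- 5

# Stack `reps` bootstrap resamples, each labelled with its replicate number.
resamples <- do.call(rbind, lapply(seq_len(reps), function(i) {
  data.frame(
    replicate = i,
    hours     = sample(hours, size = length(hours), replace = TRUE)
  )
}))

# One mean per replicate, stored in a column named `stat`.
stats <- aggregate(hours ~ replicate, data = resamples, FUN = mean)
names(stats) <- c("replicate", "stat")
```

The resulting `stats` object has one row per replicate and the two columns `replicate` and `stat`, which matches the layout of the printed output in the examples.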
# calculate a null distribution of hours worked per week under
# the null hypothesis that the mean is 40
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")
#> Warning: Removed 1244 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate  stat
#>        <int> <dbl>
#>  1         1  39.9
#>  2         2  39.9
#>  3         3  39.8
#>  4         4  40.1
#>  5         5  40.4
#>  6         6  40.1
#>  7         7  40.2
#>  8         8  40.4
#>  9         9  39.9
#> 10        10  40.3
#> # … with 990 more rows

# calculate a null distribution assuming independence between age
# of respondent and whether they have a college degree
gss %>%
  specify(age ~ college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate("diff in means", order = c("degree", "no degree"))
#> Warning: Removed 22 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate   stat
#>        <int>  <dbl>
#>  1         1 0.814
#>  2         2 0.619
#>  3         3 0.146
#>  4         4 0.972
#>  5         5 0.381
#>  6         6 1.15
#>  7         7 0.0394
#>  8         8 0.651
#>  9         9 0.731
#> 10        10 0.111
#> # … with 990 more rows