Calculate summary statistics

Calculates summary statistics from outputs of generate() or hypothesize().

Learn more in vignette("infer").

calculate(
  x,
  stat = c("mean", "median", "sum", "sd", "prop", "count", "diff in means",
    "diff in medians", "diff in props", "Chisq", "F", "slope", "correlation", "t", "z"),
  order = NULL,
  ...
)

Arguments

x	The output from `generate()` for computation-based inference, or the output from `hypothesize()` piped in for theory-based inference.
stat	A string giving the type of the statistic to calculate. Current options include `"mean"`, `"median"`, `"sum"`, `"sd"`, `"prop"`, `"count"`, `"diff in means"`, `"diff in medians"`, `"diff in props"`, `"Chisq"`, `"F"`, `"t"`, `"z"`, `"slope"`, and `"correlation"`.
order	A string vector specifying the order in which the levels of the explanatory variable should be subtracted, where `order = c("first", "second")` means `("first" - "second")`. Needed for inference on differences in means, medians, or proportions, and for t and z statistics.
...	Options such as `na.rm = TRUE` to pass on to functions like `mean()`, `sd()`, etc.
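
To make the direction of subtraction implied by `order` concrete, here is a minimal base-R sketch. It is not infer's implementation: the data frame `df`, the helper `diff_in_means()`, and the values are hypothetical and exist only to show that swapping the two levels flips the sign, and how an option such as `na.rm = TRUE` might be forwarded through `...`.

```r
# Base-R sketch (NOT infer's implementation) of what `order` controls.
# `df` and `diff_in_means()` are hypothetical, for illustration only.
df <- data.frame(
  age     = c(30, 40, 50, 20, 30, NA),
  college = c("degree", "degree", "degree",
              "no degree", "no degree", "no degree")
)

diff_in_means <- function(data, order, ...) {
  # Mean of the response within each level of the explanatory variable;
  # `...` is forwarded to mean(), e.g. na.rm = TRUE.
  first  <- mean(data$age[data$college == order[1]], ...)
  second <- mean(data$age[data$college == order[2]], ...)
  # order = c("first", "second") means "first" - "second"
  first - second
}

d1 <- diff_in_means(df, order = c("degree", "no degree"), na.rm = TRUE)
d2 <- diff_in_means(df, order = c("no degree", "degree"), na.rm = TRUE)
```

Under these assumptions `d1` and `d2` have the same magnitude and opposite signs, which is why `order` should be stated explicitly whenever the sign of the statistic matters.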

Value

A tibble containing a stat column of calculated statistics.
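
As a rough picture of the shape of that result, the sketch below builds a comparable table by hand in base R, with one row per replicate and the statistic in a `stat` column. It uses a plain `data.frame` rather than a tibble, and the sample `hours`, the seed, and the number of replicates are all hypothetical; it illustrates the layout only and is not how infer computes its output.

```r
# Base-R sketch of the result layout: one row per replicate, one `stat` value.
# The data, seed, and `reps` are hypothetical and chosen only for illustration.
set.seed(1)
hours <- c(40, 35, 50, 45, 38, 42)
reps  <- 5

# One bootstrap mean per replicate: resample with replacement, then average.
boot_means <- vapply(
  seq_len(reps),
  function(i) mean(sample(hours, size = length(hours), replace = TRUE)),
  numeric(1)
)

result <- data.frame(replicate = seq_len(reps), stat = boot_means)
```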

Examples


# calculate a null distribution of hours worked per week under
# the null hypothesis that the mean is 40
gss %>%
 specify(response = hours) %>%
 hypothesize(null = "point", mu = 40) %>%
 generate(reps = 1000, type = "bootstrap") %>%
 calculate(stat = "mean")
#> Warning: Removed 1244 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate  stat
#>        <int> <dbl>
#>  1         1  39.9
#>  2         2  39.9
#>  3         3  39.8
#>  4         4  40.1
#>  5         5  40.4
#>  6         6  40.1
#>  7         7  40.2
#>  8         8  40.4
#>  9         9  39.9
#> 10        10  40.3
#> # … with 990 more rows

# calculate a null distribution assuming independence between age
# of respondent and whether they have a college degree
gss %>%
 specify(age ~ college) %>%
 hypothesize(null = "independence") %>%
 generate(reps = 1000, type = "permute") %>%
 calculate("diff in means", order = c("degree", "no degree"))
#> Warning: Removed 22 rows containing missing values.
#> # A tibble: 1,000 x 2
#>    replicate   stat
#>        <int>  <dbl>
#>  1         1 0.814 
#>  2         2 0.619 
#>  3         3 0.146 
#>  4         4 0.972 
#>  5         5 0.381 
#>  6         6 1.15  
#>  7         7 0.0394
#>  8         8 0.651 
#>  9         9 0.731 
#> 10        10 0.111 
#> # … with 990 more rows

# More in-depth explanation of how to use the infer package
vignette("infer")
#> Warning: vignette ‘infer’ not found
