Questioning lifecycle

Generation creates a null distribution from specify() and (if needed) hypothesize() inputs.

Learn more in vignette("infer").

generate(x, reps = 1, type = NULL, ...)

Arguments

x

A data frame that can be coerced into a tibble.

reps

The number of resamples to generate.

type

Currently either bootstrap, permute, or simulate (see below).

...

Currently ignored.

Value

A tibble containing reps generated datasets, indicated by the replicate column.
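The shape of the return value can be pictured with a small base R sketch. It is not infer's implementation; the `hours` vector and `reps` value are made-up inputs, and `do.call(rbind, ...)` stands in for whatever stacking the package does internally. It only shows that the result has one block of rows per replicate, so the total row count is `reps` times the input sample size (1,756 rows times 1,000 replicates gives the 1,756,000 rows shown in the Examples).

```r
set.seed(2)  # arbitrary seed so the sketch is reproducible

hours <- c(40, 38, 45, 50)  # hypothetical input sample
reps  <- 3                  # hypothetical number of replicates

# Build one resampled dataset per replicate and stack them,
# labelling each block with its replicate number.
stacked <- do.call(rbind, lapply(seq_len(reps), function(i) {
  data.frame(replicate = i, hours = sample(hours, replace = TRUE))
}))

nrow(stacked)             # reps * length(hours)
table(stacked$replicate)  # each replicate has length(hours) rows
```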

Generation Types

The type argument determines the method used to create the null distribution.

  • bootstrap: For each replicate, a bootstrap sample of the same size as the input sample is drawn (with replacement) from the input sample data.

  • permute: For each replicate, the input values are randomly reassigned (without replacement) to new positions in the sample, so every replicate contains the same values in a shuffled order.

  • simulate: A value will be sampled from a theoretical distribution with parameters specified in hypothesize() for each replicate. (This option is currently only applicable for testing point estimates.)
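The three generation types can be sketched for a single replicate in base R. This is an illustration of the sampling ideas only, not the code infer runs: the data frame `df`, the outcome labels, and the probability `p` are hypothetical, and the choice to shuffle only the `partyid` column and to treat the theoretical distribution as a two-outcome draw are assumptions made for the sketch.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical input sample
df <- data.frame(
  partyid = c("dem", "rep", "ind", "dem"),
  age     = c(37, 58, 40, 29)
)
n <- nrow(df)

# bootstrap: draw n rows with replacement from the input sample
boot <- df[sample(n, size = n, replace = TRUE), ]

# permute: shuffle one column without replacement, leaving the other
# in place, which breaks any association between the two
perm <- df
perm$partyid <- sample(df$partyid)

# simulate: draw n values from a theoretical distribution; here a
# two-outcome distribution with hypothetical success probability p
p   <- 0.5
sim <- sample(c("success", "failure"), size = n,
              replace = TRUE, prob = c(p, 1 - p))
```

In the permuted replicate the set of `partyid` values is unchanged and `age` stays in its original order, whereas the bootstrap replicate may repeat some rows and omit others.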

Examples

# Generate a null distribution by taking 1000 bootstrap samples
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 1000, type = "bootstrap")
#> Warning: Removed 1244 rows containing missing values.
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1,756,000 x 2
#> # Groups:   replicate [1,000]
#>    replicate hours
#>        <int> <dbl>
#>  1         1  39.2
#>  2         1  31.2
#>  3         1  39.2
#>  4         1  59.2
#>  5         1  39.2
#>  6         1  24.2
#>  7         1  34.2
#>  8         1  39.2
#>  9         1  49.2
#> 10         1  47.2
#> # … with 1,755,990 more rows

# Generate a null distribution for the independence of
# two variables by permuting their values 1000 times
gss %>%
  specify(partyid ~ age) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute")
#> Warning: Removed 37 rows containing missing values.
#> Response: partyid (factor)
#> Explanatory: age (numeric)
#> Null Hypothesis: independence
#> # A tibble: 2,963,000 x 3
#> # Groups:   replicate [1,000]
#>    partyid   age replicate
#>    <fct>   <dbl>     <int>
#>  1 dem        37         1
#>  2 dem        29         1
#>  3 rep        58         1
#>  4 ind        40         1
#>  5 dem        39         1
#>  6 rep        37         1
#>  7 dem        53         1
#>  8 dem        41         1
#>  9 dem        55         1
#> 10 rep        47         1
#> # … with 2,962,990 more rows

# More in-depth explanation of how to use the infer package
vignette("infer")
#> Warning: vignette ‘infer’ not found