-
Notifications
You must be signed in to change notification settings - Fork 0
/
steak.R
428 lines (313 loc) · 12.7 KB
/
steak.R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
library(fivethirtyeight)
library(tidyverse)
data("steak_survey", package="fivethirtyeight")
sdat <- na.omit(steak_survey) %>%
filter(region=="Mountain")
ggplot(sdat,
aes(x = steak_prep, fill = female)) +
geom_bar(position="dodge") +
ggtitle(paste0("Steak Preparation Preference by Gender for ","Mountain"," Region"))
*Be sure to only copy and paste the sections as they are discussed on screen.*
Module 3 - Lesson: Building a Document Template, part 1
Here is the code for the first R code chunk
```{r setup, include=FALSE}
# define knitr options
knitr::opts_chunk$set(echo = FALSE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)
# set any missing NAs in table to blank
options(knitr.kable.NA='')
# load R packages
library(fivethirtyeight)
library(tidyverse)
library(knitr)
library(kableExtra)
library(ggthemes)
# load steak_survey dataset
data("steak_survey", package="fivethirtyeight")
# create subset, no missing NAs
# pick a specific region
sdat <- na.omit(steak_survey) %>%
filter(region=="Mountain")
```
Module 3 - Lesson: Building a Document Template, part 2
## Background
In May 2014, Walt Hickey at
[538.com](http://fivethirtyeight.com/) published an
article entitled
["How Americans Like Their Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/). This article utilized survey data collected from 550 people which asked questions related to various risky activities, such as whether they speed in traffic, if they smoke or go skydiving, and if they prefer a riskier lottery. The survey also asked if they eat steak and if so, how they prefer their steak.
## Purpose
The "steak survey" dataset that was used for the ["How
Americans Like Their
Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/) article, is available through
the R package [`fivethirtyeight`](https://cran.r-
project.org/web/packages/fivethirtyeight/index.html).
Using this dataset, this article summarizes the steak
preferences for the _**Mountain**_ Region of the
United States by gender.
For our summary presented here, a subset of the original steak survey was extracted to only include responses with no missing data and only included respondents from the _**Mountain**_ region of the US. The data subset used for our summary had `r nrow(sdat)` respondents.
## Demographic Summary Tables
The age categories of the `r nrow(sdat)` survey
respondents were:
```{r}
# create table of the age categories
tb <- sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# convert table to data frame
dt <- as.data.frame(tb)
# use kable from knitr package to make table
# and use kable styling from kableExtra package
knitr::kable(dt, format="html",
col.names=c("Ages","%"),
digits=2,
caption="Ages of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options ="striped",
full_width=FALSE)
```
The education levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for education
tb <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Education","%"),
digits=2,
caption="Education of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
The income levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for household income
tb <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Income","%"),
digits=2,
caption="Income of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
## _OPTIONAL_ Demographics of Survey Respondents in a Merged Table
This section is provided as an example of how to merge the three demographic summary tables for the `r nrow(sdat)` survey respondents above into a single merged table.
```{r}
# create table summary for ages
tb1 <-sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# create table summary for education
tb2 <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
# create table summary for income
tb3 <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
# convert all tables to data frames
tb1df <-as.data.frame(tb1)
tb2df <-as.data.frame(tb2)
tb3df <-as.data.frame(tb3)
# merge 1st 2 data frames together
mtb <-merge(data.frame(tb1df, row.names = NULL),
data.frame(tb2df, row.names = NULL),
by=0, all=TRUE)[-1]
# merge result with 3rd data frame
mtb2 <-merge(data.frame(mtb, row.names = NULL),
data.frame(tb3df, row.names = NULL),
by=0, all=TRUE)[-1]
# use the final data frame
# make into a table with kable
# add styling with kableExtra
# add header with labels spanning 2 columns each
mtb2 %>%
knitr::kable(format="html",
col.names=c("Category","%","Category","%","Category","%"),
digits=2,
caption="Demographics of Survey Respondents") %>%
kableExtra::kable_styling(c("striped","bordered"),
full_width=FALSE) %>%
add_header_above(c("Ages"=2,"Education"=2,"Income"=2))
```
## Steak Preparation Preference by Gender
Finally, here is a breakdown of the `r nrow(sdat)` survey respondents for the _**Mountain**_ region of the US on how they prefer their steak to be prepared by gender.
```{r}
ggplot(sdat, aes(x = steak_prep, fill = female)) +
geom_bar(position="dodge",colour="black") +
ggtitle(paste0("Steak Preference by Gender: ",
"Mountain"," Region")) +
xlab("Steak Preparation Preference") +
ylab("Number of Respondents") +
scale_fill_manual(values=c("skyblue","palevioletred"),
name="Gender",
breaks=c(FALSE,TRUE),
labels=c("Male", "Female")) +
theme_fivethirtyeight()
```
Module 3 - Lesson: Adding Parameters in a Document Template
The text sections and R code chunks have been updated with the params$region substitutions as needed for this lesson.
## Background
In May 2014, Walt Hickey at
[538.com](http://fivethirtyeight.com/) published an
article entitled
["How Americans Like Their Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/). This article utilized survey data collected from 550 people which asked questions related to various risky activities, such as whether they speed in traffic, if they smoke or go skydiving, and if they prefer a riskier lottery. The survey also asked if they eat steak and if so, how they prefer their steak.
## Purpose
The "steak survey" dataset that was used for the ["How
Americans Like Their
Steak"](https://fivethirtyeight.com/features/how-americans-like-their-steak/) article, is available through
the R package [`fivethirtyeight`](https://cran.r-
project.org/web/packages/fivethirtyeight/index.html).
Using this dataset, this article summarizes the steak
preferences for the `r params$region` Region of the
United States by gender.
For our summary presented here, a subset of the original steak survey was extracted to only include responses with no missing data and only included respondents from the `r params$region` region of the US. The data subset used for our summary had `r nrow(sdat)` respondents.
## Demographic Summary Tables
The age categories of the `r nrow(sdat)` survey
respondents were:
```{r}
# create table of the age categories
tb <- sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# convert table to data frame
dt <- as.data.frame(tb)
# use kable from knitr package to make table
# and use kable styling from kableExtra package
knitr::kable(dt, format="html",
col.names=c("Ages","%"),
digits=2,
caption="Ages of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options ="striped",
full_width=FALSE)
```
The education levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for education
tb <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Education","%"),
digits=2,
caption="Education of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
The income levels of the `r nrow(sdat)` survey respondents were:
```{r}
# repeat for household income
tb <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
dt <-as.data.frame(tb)
knitr::kable(dt, format="html",
col.names=c("Income","%"),
digits=2,
caption="Income of Survey Respondents") %>%
kableExtra::kable_styling(bootstrap_options = "striped",
full_width=FALSE)
```
## _OPTIONAL_ Demographics of Survey Respondents in a Merged Table
This section is provided as an example of how to merge the three demographic summary tables for the `r nrow(sdat)` survey respondents above into a single merged table.
```{r}
# create table summary for ages
tb1 <-sdat %>%
select(age) %>%
table() %>%
prop.table()*100
# create table summary for education
tb2 <-sdat %>%
select(educ) %>%
table() %>%
prop.table()*100
# create table summary for income
tb3 <-sdat %>%
select(hhold_income) %>%
table() %>%
prop.table()*100
# convert all tables to data frames
tb1df <-as.data.frame(tb1)
tb2df <-as.data.frame(tb2)
tb3df <-as.data.frame(tb3)
# merge 1st 2 data frames together
mtb <-merge(data.frame(tb1df, row.names = NULL),
data.frame(tb2df, row.names = NULL),
by=0, all=TRUE)[-1]
# merge result with 3rd data frame
mtb2 <-merge(data.frame(mtb, row.names = NULL),
data.frame(tb3df, row.names = NULL),
by=0, all=TRUE)[-1]
# use the final data frame
# make into a table with kable
# add styling with kableExtra
# add header with labels spanning 2 columns each
mtb2 %>%
knitr::kable(format="html",
col.names=c("Category","%","Category","%","Category","%"),
digits=2,
caption="Demographics of Survey Respondents") %>%
kableExtra::kable_styling(c("striped","bordered"),
full_width=FALSE) %>%
add_header_above(c("Ages"=2,"Education"=2,"Income"=2))
```
## Steak Preparation Preference by Gender
Finally, here is a breakdown of the `r nrow(sdat)` survey respondents for the `r params$region` region of the US on how they prefer their steak to be prepared by gender.
```{r}
ggplot(sdat, aes(x = steak_prep, fill = female)) +
geom_bar(position="dodge",colour="black") +
ggtitle(paste0("Steak Preference by Gender: ",
params$region," Region")) +
xlab("Steak Preparation Preference") +
ylab("Number of Respondents") +
scale_fill_manual(values=c("skyblue","palevioletred"),
name="Gender",
breaks=c(FALSE,TRUE),
labels=c("Male", "Female")) +
theme_fivethirtyeight()
```
YAML updates for the parameters pull-down list
params:
region:
label:"Region:"
value: Mountain
input: select
choices: ["East North Central","East South Central",
"Middle Atlantic","Mountain", "New England",
"Pacific","South Atlantic", "West North Central",
"West South Central"]
R code for the render function
rmarkdown::render("steakArticleParams.Rmd",
params = list(region = "Mountain"))
R code to create the custom function
render_report <-function(regionvar){
template <-"steakArticleParams.Rmd"
outfile <-sprintf("steakArticle_%s.html",regionvar)
parameters <-list(region = regionvar)
rmarkdown::render(template,
output_file=outfile,
params=parameters)
invisible(TRUE)
}
render_report("Pacific")
R code to use the custom function with purrr
library(purrr)
params_list <- list(list("East North Central",
"East South Central", "Middle Atlantic",
"Mountain", "New England", "Pacific",
"South Atlantic", "West North Central",
"West South Central"))
purrr::pmap(params_list,render_report)