statistical project essay

Name:
Project Final Report Form
Instructions: Answer each question below using either Minitab or JMP software. All results from the
statistical software must be copied and pasted into the blank boxes below. Do not enter output by hand.
Wherever it says “display”, you must copy and paste the results obtained from Minitab/JMP.
Introduction
1. Is the sample biased? Why or why not?

2. What population is being studied?
For this project you will be using the Blood Alcohol Content Dataset. Here is a
description of the dataset:
How much alcohol can one consume before one’s blood alcohol content (BAC) is
above the legal limit? An undergraduate statistics project conducted at The Ohio
State University in Columbus, Ohio explored the relationship between BAC and
other variables such as amount of alcohol consumed, weight, gender, and age.
The experiment took place in February of 1986 at a student dormitory. Students who
volunteered for the experiment blew into a breathalyzer to confirm that their
initial BAC was zero. The number (between 1 and 9) of 12-ounce beers to be drunk
was assigned to each of the subjects by drawing tickets from a bowl. Thirty minutes
after consuming their final beer, students had their BAC measured by a police
officer of the OSU police department. The officer also administered a road sobriety
test before and after the alcohol consumption. This involved performing four simple
tasks, graded on a scale of 1 to 10 (ten being a perfect rating), demonstrating
coordination: balancing on one foot, touching the tip of one’s nose with a forefinger,
placing one’s head back with one’s eyes closed, and walking heel to toe. The police
officer was not aware of how much alcohol each subject had consumed.
The following variables are contained in the dataset:
Gender
Weight = weight of each subject in pounds
Beers = number of 12-ounce beers consumed
BAC = blood alcohol content
1ST SOB = combined score on the four road sobriety tests before alcohol
consumption
2ND SOB = combined score on the four road sobriety tests after alcohol consumption
Some of the variables in this dataset are extraneous to the project. It is up to you to
determine which variables to use. There are some questions below that give you the
freedom to choose any variable from the dataset; be sure to use the variable that
makes the most sense.
3. Display the following descriptive statistics for one of the numerical variables: mean, standard
deviation, variance, trimmed mean, minimum, maximum, range, number of missing
observations, total number of observations, first quartile, median, third quartile, and interquartile
range.
a. How many observations are in the dataset?
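The statistics requested in question 3 can be cross-checked by hand before pasting the software output. The following is a minimal Python sketch, not a substitute for the required Minitab/JMP display. The values are made up for illustration and are not taken from the Blood Alcohol Content Dataset; the 10% trimming proportion and the quartile interpolation rule are assumptions, so the software's trimmed mean and quartiles may differ slightly.

```python
import statistics

# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
# None marks a missing observation.
raw = [110, 125, 140, None, 150, 160, 170, 185, 200, 215, 245]
values = sorted(v for v in raw if v is not None)


def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the sorted values from each end."""
    k = int(proportion * len(data))  # number of values cut from each tail
    kept = sorted(data)[k:len(data) - k]
    return statistics.mean(kept)


# Quartiles by the (n + 1) * p position rule (an assumption; software may
# interpolate differently).
q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")

summary = {
    "total": len(raw),                          # all rows, including missing
    "missing": sum(v is None for v in raw),     # rows with no value
    "mean": statistics.mean(values),
    "stdev": statistics.stdev(values),          # sample standard deviation
    "variance": statistics.variance(values),    # sample variance (n - 1 divisor)
    "trimmed_mean": trimmed_mean(values, 0.10), # assumed 10% trim per tail
    "min": min(values),
    "max": max(values),
    "range": max(values) - min(values),
    "q1": q1,
    "median": median,
    "q3": q3,
    "iqr": q3 - q1,
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample variance is the square of the sample standard deviation, and the interquartile range is the third quartile minus the first, so those two pairs should agree in any software output as well.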
4. Display histograms for Weight, BAC, and 2nd SOB.
5. Characterize the shape of each histogram using all of the following that apply: unimodal,
bimodal, multimodal, symmetric, bell shaped, uniform, right skewed, or left skewed.
6. Display a boxplot for Weight, BAC, and 2nd SOB.
7. Are there any outliers in your boxplots? If yes, give the value(s).
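Boxplots commonly flag a point as an outlier when it lies more than 1.5 interquartile ranges beyond the first or third quartile. The sketch below applies that rule in Python so flagged values can be checked by hand. It assumes the 1.5 × IQR rule and one particular quartile interpolation method, and the numbers are made up for illustration, not taken from the dataset.

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Return the values outside the fences, plus the (lower, upper) fences."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr  # lower fence
    upper = q3 + k * iqr  # upper fence
    flagged = [x for x in data if x < lower or x > upper]
    return flagged, (lower, upper)


# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
made_up = [110, 125, 140, 150, 160, 170, 185, 200, 215, 330]

outliers, fences = iqr_outliers(made_up)
print("fences:", fences)
print("outliers:", outliers)
```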
8. Display the counts and percentages of data values that fall into each category of the categorical variable.
9. Display a bar graph for the categorical variable.
10. Display a scatterplot of BAC and 2nd SOB.
11. Display the value of the correlation coefficient between BAC and 2nd SOB.
12. Characterize the linear relationship between the variables as positive or negative; and as weak,
moderate, or strong. Give a reason for your answer.
13. In one display, include a scatterplot of BAC and 2nd SOB, the graph of the regression line, and
the regression equation.
14. What is the value of the slope of your regression line? Slope =

15. State the specific meaning of the value of the slope of your regression line in the context of your
data.
16. Use the regression equation to make one prediction. That is, for a value of the predictor variable,
calculate the value of the response variable, and enter both values below. Include units of
measure.
17. What is the value of the coefficient of determination, r², for your data?
18. State the specific meaning of the value of r² in the context of your data.
19. Is the prediction you made a reasonably good one? Give a reason for your answer.
Value of predictor variable =
Value of response variable =
Show calculation:
r² =
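The slope, the prediction, and the coefficient of determination in questions 14 through 19 all come from the same least-squares sums, which the following Python sketch works through. It assumes BAC is the predictor and the 2nd SOB score is the response; that choice, the function name, and the data values (made up, not from the dataset) are illustrative assumptions.

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r


# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
bac = [0.02, 0.04, 0.06, 0.08, 0.10]   # predictor (assumed)
score = [38, 36, 33, 34, 29]           # response: combined sobriety score (assumed)

slope, intercept, r = least_squares(bac, score)
r_squared = r ** 2

# Prediction for a predictor value inside the observed range.
new_bac = 0.05
predicted = intercept + slope * new_bac

print(f"score = {intercept:.2f} + ({slope:.2f}) * BAC")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
print(f"predicted score at BAC {new_bac}: {predicted:.2f}")
```

For these made-up values the slope is about -100 score points per unit of BAC, which reads more naturally as a drop of about 1 point for each 0.01 increase in BAC. A prediction is generally more trustworthy when the predictor value lies within the range of the observed data and r² is close to 1.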
20. Display the descriptive statistics for one of your numerical variables by each value of your
categorical variable.
21. Display comparative boxplots for one of your numerical variables by each value of your
categorical variable.
22. Use the comparative boxplots to discuss the similarities or differences in the medians, the ranges,
and the interquartile ranges.
23. Display a 98% confidence interval for the mean of BAC.

a. Interpret this result in the context of the data.
b. Find the margin of error for this estimate and state its meaning in the context of the data.
c. Does this margin of error seem larger than desirable? What could you do in a future
observational study to decrease the margin of error for this estimate?
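The margin of error in question 23 is the critical value multiplied by the standard error of the mean, so it shrinks as the sample size grows or as the confidence level is lowered. The Python sketch below shows the arithmetic under the assumption of a one-sample t interval. The data are made up (not from the dataset), and the critical value 2.998 is the t-table entry for 98% confidence with 7 degrees of freedom, which fits only this 8-value example.

```python
import math
import statistics


def t_interval(sample, t_star):
    """Return (lower, upper, margin) for a t interval with critical value t_star."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    margin = t_star * std_err                          # margin of error
    return mean - margin, mean + margin, margin


# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
made_up_bac = [0.03, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.16]

# t critical value for 98% confidence and n - 1 = 7 degrees of freedom,
# read from a t table; it must be looked up again for any other sample size.
T_STAR_98_DF7 = 2.998

lower, upper, margin = t_interval(made_up_bac, T_STAR_98_DF7)
print(f"98% CI: ({lower:.4f}, {upper:.4f}), margin of error = {margin:.4f}")
```

Because the standard error divides by the square root of n, roughly four times as many observations are needed to cut the margin of error in half.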
24. Suppose I guess that the average BAC of the students was 0.08. Display a test of the hypothesis
that the mean BAC equals 0.08 versus the alternative that it does not equal this value.
a. At the 5% significance level, what is your decision and why?
b. Write a standard interpretation in the context of the data.
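The decision in question 24 can be checked by computing the one-sample t statistic and comparing its absolute value with the two-sided critical value. The following Python sketch does this for made-up values (not from the dataset). The critical value 2.365 is the t-table entry for a 5% two-sided test with 7 degrees of freedom and applies only to this 8-value example; software normally reports a p-value instead, which leads to the same decision.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic for H0: mean = mu0."""
    n = len(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / std_err


# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
made_up_bac = [0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.09, 0.12]

# Two-sided 5% critical value for n - 1 = 7 degrees of freedom (from a t table).
T_CRIT_5PCT_DF7 = 2.365

t_stat = one_sample_t(made_up_bac, mu0=0.08)
reject_h0 = abs(t_stat) > T_CRIT_5PCT_DF7

print(f"t = {t_stat:.4f}, reject H0 at the 5% level: {reject_h0}")
```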
25. Display a hypothesis test for a difference between the means of the BAC variable for the two
subpopulations (male/female). You may use a one-tailed test if you explain why you would
anticipate the difference to be in a certain direction. Otherwise, use a two-tailed test.
a. At the 10% significance level, what is your decision and why?
b. Write a standard interpretation in the context of the data.
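For the two-sample comparison in question 25, the test statistic is the difference in sample means divided by its standard error. The Python sketch below uses Welch's unequal-variance form together with the Welch-Satterthwaite degrees of freedom; that is one common choice and an assumption here, since a pooled-variance test would give different degrees of freedom. The group values are made up and are not taken from the dataset.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Each group's contribution to the squared standard error.
    v_a = statistics.variance(group_a) / n_a
    v_b = statistics.variance(group_b) / n_b
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up illustrative values; NOT taken from the Blood Alcohol Content Dataset.
female_bac = [0.04, 0.06, 0.08, 0.10]
male_bac = [0.02, 0.03, 0.04, 0.05, 0.06]

t_stat, df = welch_t(female_bac, male_bac)
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

The sign of the t statistic depends on which group is listed first, which matters when a one-tailed alternative is used.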