- Mina Wheeler (2.5) http://rpubs.com/minawheeler/362648
- Iden Watanabe (2.19) http://rpubs.com/eyeden/362387
- Vinicio Haro (2.29) http://rpubs.com/vharo00/362885
February 21, 2018
coins <- sample(c(-1,1), 100, replace=TRUE)
plot(1:length(coins), cumsum(coins), type='l')
abline(h=0)
cumsum(coins)[length(coins)]
## [1] -12
samples <- rep(NA, 1000)
for(i in seq_along(samples)) {
  coins <- sample(c(-1,1), 100, replace=TRUE)
  samples[i] <- cumsum(coins)[length(coins)]
}
head(samples)
## [1] -8 8 -2 -10 -8 6
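The loop above can also be written more compactly. The following is a minimal sketch, not the original slide code: it assumes a fixed seed (added here only so the run is reproducible), uses `sum()` in place of the last element of `cumsum()`, and stores the result under the hypothetical name `samples_alt`.

```r
set.seed(606)  # assumption: seed added for reproducibility, not in the original

# Each replicate is the final position after 100 fair +1/-1 steps
samples_alt <- replicate(1000, sum(sample(c(-1, 1), 100, replace = TRUE)))

length(samples_alt)           # 1000
all(samples_alt %% 2 == 0)    # TRUE: 100 steps of +/-1 always sum to an even number
```

The second check explains why the histogram of `samples` only ever shows even totals.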
hist(samples)
(m.sam <- mean(samples))
## [1] 0.162
(s.sam <- sd(samples))
## [1] 9.883088
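These sample values sit close to what theory predicts. Each flip has mean 0 and variance 1, so the sum of 100 independent flips should have mean 0 and standard deviation sqrt(100) = 10. A short check, using the illustrative names `flip`, `mu_flip`, and `var_flip` (not from the slides):

```r
flip <- c(-1, 1)                       # the two equally likely outcomes
mu_flip  <- mean(flip)                 # expected value of one flip: 0
var_flip <- mean((flip - mu_flip)^2)   # population variance of one flip: 1

n <- 100                               # flips per simulated walk
c(mean = n * mu_flip, sd = sqrt(n * var_flip))   # mean 0, sd 10
```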
within1sd <- samples[samples >= m.sam - s.sam & samples <= m.sam + s.sam]
length(within1sd) / length(samples)
## [1] 0.677
within2sd <- samples[samples >= m.sam - 2 * s.sam & samples <= m.sam + 2 * s.sam]
length(within2sd) / length(samples)
## [1] 0.951
within3sd <- samples[samples >= m.sam - 3 * s.sam & samples <= m.sam + 3 * s.sam]
length(within3sd) / length(samples)
## [1] 0.999
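For comparison, the proportions an exactly normal distribution would give within 1, 2, and 3 standard deviations can be computed from `pnorm()`. This is a small sketch; the names `k` and `theory` are illustrative.

```r
k <- 1:3
theory <- pnorm(k) - pnorm(-k)   # P(-k < Z < k) for a standard normal
round(theory, 3)
# 0.683 0.954 0.997
```

The simulated proportions (0.677, 0.951, 0.999) are close to these values, which is the basis of the 68-95-99.7 rule.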
\[ f\left( x|\mu ,\sigma \right) =\frac { 1 }{ \sigma \sqrt { 2\pi } } { e }^{ -\frac { { \left( x-\mu \right) }^{ 2 } }{ { 2\sigma }^{ 2 } } } \]
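The density formula can be written out directly and checked against R's built-in `dnorm()`. The helper `normal_density()` below is a hypothetical name used only for illustration.

```r
# Hypothetical helper that writes out the density formula term by term
normal_density <- function(x, mu = 0, sigma = 1) {
  1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
}

x <- seq(-4, 4, length.out = 9)
all.equal(normal_density(x), dnorm(x))                # TRUE
all.equal(normal_density(x, 2, 3), dnorm(x, 2, 3))    # TRUE
```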
x <- seq(-4,4,length=200)
y <- dnorm(x,mean=0, sd=1)
plot(x, y, type = "l", lwd = 2, xlim = c(-3.5,3.5), ylab='', xlab='z-score', yaxt='n')
pnorm(15, mean=mean(samples), sd=sd(samples))
## [1] 0.9333678
1 - pnorm(15, mean=mean(samples), sd=sd(samples))
## [1] 0.06663219
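The upper-tail probability can also be requested directly with the `lower.tail = FALSE` argument of `pnorm()`. Because `samples` is random, this sketch plugs in the mean and standard deviation printed earlier (0.162 and 9.883088); the names `m`, `s`, and `upper` are illustrative.

```r
m <- 0.162      # sample mean reported above
s <- 9.883088   # sample standard deviation reported above

upper <- pnorm(15, mean = m, sd = s, lower.tail = FALSE)  # P(X > 15)
all.equal(upper, 1 - pnorm(15, mean = m, sd = s))         # TRUE
round(upper, 4)
# 0.0666
```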
SAT scores are distributed nearly normally with mean 1500 and standard deviation 300. ACT scores are distributed nearly normally with mean 21 and standard deviation 5. A college admissions officer wants to determine which of the two applicants scored better on their standardized test with respect to the other test takers: Pam, who earned an 1800 on her SAT, or Jim, who scored a 24 on his ACT?
\[ Z = \frac{\text{observation} - \text{mean}}{\text{SD}} \]
Converting Pam and Jim's scores to z-scores:
\[ Z_{Pam} = \frac{1800 - 1500}{300} = 1 \]
\[ Z_{Jim} = \frac{24-21}{5} = 0.6 \]
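Pam's z-score is higher, so she scored better relative to the other test takers. The same arithmetic in R, with `pnorm()` converting each z-score to an approximate percentile under the normal model (the names `z_pam` and `z_jim` are illustrative):

```r
z_pam <- (1800 - 1500) / 300   # SAT: mean 1500, SD 300
z_jim <- (24 - 21) / 5         # ACT: mean 21, SD 5

z <- c(Pam = z_pam, Jim = z_jim)
z                    # Pam 1.0, Jim 0.6
round(pnorm(z), 3)   # share of test takers scoring below each applicant
# Pam 0.841, Jim 0.726
```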
SAT scores are distributed nearly normally with mean 1500 and standard deviation 300.
To use the 68-95-99.7 rule, we must first verify the normality assumption. We will need to do the same later when we discuss various parametric models. Consider a sample of 100 male heights (in inches).
The histogram looks roughly normal, but we can overlay a normal curve (using the sample mean and standard deviation) to help evaluate it.
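A minimal sketch of such an overlay follows. The heights data are not reproduced here, so `heights` below is a hypothetical stand-in simulated with `rnorm()`; the mean of 70 and standard deviation of 3.3 inches are assumptions for illustration only.

```r
set.seed(606)                                # assumption: seed for reproducibility
heights <- rnorm(100, mean = 70, sd = 3.3)   # hypothetical stand-in data

# Draw the histogram on the density scale so the curve is comparable
hist(heights, freq = FALSE, main = "", xlab = "Height (inches)")

# Overlay a normal curve with the sample mean and standard deviation
curve(dnorm(x, mean = mean(heights), sd = sd(heights)), add = TRUE, lwd = 2)
```

Setting `freq = FALSE` matters: with the default count scale, the density curve would lie almost flat along the axis.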
DATA606::qqnormsim(heights)
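If the DATA606 package is not available, base R's `qqnorm()` and `qqline()` draw a single normal Q-Q plot of the data. The sketch below again uses a hypothetical simulated `heights` vector (the mean and standard deviation are assumptions) and adds the correlation between theoretical and sample quantiles as a rough numeric summary.

```r
set.seed(606)                                # assumption: seed for reproducibility
heights <- rnorm(100, mean = 70, sd = 3.3)   # hypothetical stand-in data

qq <- qqnorm(heights, main = "Normal Q-Q plot")  # returns theoretical (x) and sample (y) quantiles
qqline(heights)                                  # reference line through the quartiles

cor(qq$x, qq$y)   # values near 1 suggest the points fall close to a straight line
```

Points that track the reference line closely, with no systematic curvature in the tails, are consistent with the nearly normal assumption.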