Recall that we use capital $Y$ to indicate a random variable, and lower case $y$ to denote a potential value of the random variable. Since we will have $n$ observations, we have $n$ random variables $Y_i$ and their possible values $y_i$.
In the simple linear regression model, the $x_i$ are assumed to be fixed, known constants, and are thus notated with a lower case variable. The response $Y_i$ remains a random variable because of the random behavior of the error variable, $\epsilon_i$. That is, each response $Y_i$ is tied to an observable $x_i$ and a random, unobservable, $\epsilon_i$.
Essentially, we could explicitly think of the $Y_i$ as having a different distribution for each value of $X_i$, written $x_i$. In other words, $Y_i$ has a conditional distribution dependent on the value of $x_i$. Doing so, we still make no distributional assumptions of the $x_i$, since we are only interested in the distribution of the $Y_i$ for a particular value $x_i$.

\[
Y_i \mid X_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)
\]
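To make this concrete, below is a minimal simulation sketch in R that generates data according to this model. The particular parameter values ($\beta_0 = 3$, $\beta_1 = 0.5$, $\sigma = 2$), the sample size, and the grid of $x$ values are chosen only for illustration and are not part of the model itself.

```r
# Sketch: simulate one data set from the SLR model.
# The parameter values below are illustrative assumptions, not from the text.
set.seed(42)
n      = 25
x      = seq(1, 10, length.out = n)        # fixed, known x values
beta_0 = 3
beta_1 = 0.5
sigma  = 2

epsilon = rnorm(n, mean = 0, sd = sigma)   # random, unobservable errors
y       = beta_0 + beta_1 * x + epsilon    # each Y_i is tied to its x_i and epsilon_i

head(data.frame(x, y))
```

Each simulated $y_i$ is one realization from the conditional distribution $N(\beta_0 + \beta_1 x_i, \sigma^2)$ at its own $x_i$.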
The random $Y_i$ are a function of $x_i$, thus we can write its mean as a function of $x_i$,

\[
\text{E}[Y_i \mid X_i = x_i] = \beta_0 + \beta_1 x_i.
\]
However, its variance remains constant for each $x_i$,

\[
\text{Var}[Y_i \mid X_i = x_i] = \sigma^2.
\]
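As a quick sanity check of these two facts, the following sketch fixes a few $x$ values, simulates a large number of responses at each, and compares the sample mean to $\beta_0 + \beta_1 x$ and the sample variance to $\sigma^2$. The parameter values are the same illustrative assumptions used above.

```r
# Sketch: empirical check that E[Y | X = x] = beta_0 + beta_1 * x
# and that Var[Y | X = x] = sigma ^ 2 at every x.
set.seed(1)
beta_0 = 3
beta_1 = 0.5
sigma  = 2
sims   = 100000

for (x in c(2, 5, 8)) {
  y = beta_0 + beta_1 * x + rnorm(sims, mean = 0, sd = sigma)
  cat("x =", x,
      "| mean(y) =", round(mean(y), 3), "vs", beta_0 + beta_1 * x,
      "| var(y) =",  round(var(y), 3),  "vs", sigma ^ 2, "\n")
}
```

The sample means track the line $\beta_0 + \beta_1 x$, while the sample variances stay near $\sigma^2$ regardless of $x$.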
This is visually displayed in the image below. We see that for any value $x$, the expected value of $Y$ is $\beta_0 + \beta_1 x$. At each value of $x$, $Y$ has the same variance $\sigma^2$.
Often, we directly talk about the assumptions that this model makes. They can
be cleverly shortened to LINE.
• Linear. The relationship between $Y$ and $x$ is linear, of the form $\beta_0 + \beta_1 x$.
• Independent. The errors are independent.
• Normal. The errors, $\epsilon$, are normally distributed. That is, the “error” around the line follows a normal distribution.
• Equal Variance. At each value of $x$, the variance of $Y$ is the same, $\sigma^2$.
We are also assuming that the values of $x$ are fixed, that is, not random. We do not make a distributional assumption about the predictor variable.
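As a brief preview, and only as a sketch, data generated under these assumptions can be fit in R with `lm()`; with enough data the estimated coefficients should land near the $\beta_0$ and $\beta_1$ used in the simulation. The simulated data here again use the illustrative parameter values from above.

```r
# Sketch: fit a simple linear regression to simulated data and compare
# the estimated coefficients to the true beta_0 and beta_1.
set.seed(42)
n      = 100
x      = runif(n, min = 0, max = 10)   # generated once, then treated as fixed and known
beta_0 = 3
beta_1 = 0.5
sigma  = 2
y      = beta_0 + beta_1 * x + rnorm(n, mean = 0, sd = sigma)

slr_fit = lm(y ~ x)    # models E[Y | X = x] = beta_0 + beta_1 * x
coef(slr_fit)          # estimates of beta_0 (intercept) and beta_1 (slope)
```

How $\beta_0$ and $\beta_1$ are actually estimated from data is the subject of the sections that follow.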
As a side note, we will often refer to simple linear regression as SLR. Some
explanation of the name SLR:

