Simple Linear Regression Key Results
Assumptions
The model is
$$ Y_{i}=\beta_{0}+\beta_{1} x_{i}+\epsilon_{i}, \quad i=1,2, \ldots, n $$
with the following assumptions:
- $E\left(\epsilon_{i}\right)=0$ for all $i$
- $V\left(\epsilon_{i}\right)=\sigma^{2}$ for all $i$
- The $\epsilon_{i}$ are uncorrelated, i.e., $\operatorname{Cov}\left(\epsilon_{i}, \epsilon_{j}\right)=0$ for all $i \neq j$.
These three are the only assumptions required for least squares linear regression. If we additionally assume that the $\epsilon_i$ are normally distributed, we get stronger results, such as exact confidence intervals and hypothesis tests for the coefficients.
Remarks
- $\epsilon_i$ is called the irreducible error
- $\hat{\epsilon}_i = y_i - \hat{y}_i$ is the residual
Important Results
Define the following statistics, which are used throughout the results below (checked numerically in the sketch after this list).
- sample covariance: $s_{x y}=\frac{1}{n-1} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)$
- sample variance: $s_{x}^{2}=\frac{1}{n-1} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}$
- sample correlation: $r_{x y}=\frac{s_{x y}}{s_{x} s_{y}}$
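A quick numerical check, as a minimal NumPy sketch on made-up data (the simulated dataset and variable names are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

n = len(x)
xbar, ybar = x.mean(), y.mean()

# Sample covariance, variance, and correlation from the definitions above
s_xy = np.sum((x - xbar) * (y - ybar)) / (n - 1)
s_x2 = np.sum((x - xbar) ** 2) / (n - 1)
s_y2 = np.sum((y - ybar) ** 2) / (n - 1)
r_xy = s_xy / np.sqrt(s_x2 * s_y2)

# They agree with NumPy's built-ins (which also use the n-1 convention here)
assert np.isclose(s_xy, np.cov(x, y)[0, 1])
assert np.isclose(s_x2, np.var(x, ddof=1))
assert np.isclose(r_xy, np.corrcoef(x, y)[0, 1])
```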
RSS
The objective of least squares regression is to minimize the residual sum of squares (RSS):
$$ \operatorname{RSS}\left(\beta_{0}, \beta_{1}\right)=\sum_{i=1}^{n}\left(y_{i}-\left(\beta_{0}+\beta_{1} x_{i}\right)\right)^{2} $$
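As an illustration, a small sketch (same toy data as above; SciPy assumed available) minimizes RSS numerically; the closed-form minimizer is given in the next section:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

def rss(beta):
    """Residual sum of squares for a candidate (beta0, beta1)."""
    b0, b1 = beta
    return np.sum((y - (b0 + b1 * x)) ** 2)

# Numerically minimize RSS over (beta0, beta1)
fit = minimize(rss, x0=np.zeros(2))
print(fit.x)  # close to the values (2.0, 1.5) used to simulate the data
```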
Beta
$$ \widehat{\beta}_{1}=\frac{\sum_{i}\left(y_{i}-\bar{y}\right)\left(x_{i}-\bar{x}\right)}{\sum_{i}\left(x_{i}-\bar{x}\right)^{2}}=\frac{s_{x y}}{s_{x}^{2}}=r_{x y}\left(\frac{s_{y}}{s_{x}}\right) $$
The intercept estimate is $\widehat{\beta}_{0}=\bar{y}-\widehat{\beta}_{1} \bar{x}$.
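A sketch checking the three equivalent expressions for $\widehat{\beta}_{1}$ (and the intercept) against np.polyfit on the same toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

xbar, ybar = x.mean(), y.mean()
s_xy = np.cov(x, y)[0, 1]
s_x, s_y = x.std(ddof=1), y.std(ddof=1)
r_xy = np.corrcoef(x, y)[0, 1]

# Three equivalent forms of the slope estimate
b1_a = np.sum((y - ybar) * (x - xbar)) / np.sum((x - xbar) ** 2)
b1_b = s_xy / s_x**2
b1_c = r_xy * (s_y / s_x)
b0 = ybar - b1_a * xbar                     # intercept

slope, intercept = np.polyfit(x, y, deg=1)  # least squares fit for comparison
assert np.allclose([b1_a, b1_b, b1_c], slope)
assert np.isclose(b0, intercept)
```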
Beta Variance
$$ \begin{aligned} V\left(\widehat{\beta}_{1}\right)&=\frac{\sigma^{2}}{(n-1) s_{x}^{2}} \approx \frac{\widehat{\sigma}^{2}}{(n-1) s_{x}^{2}}\\ \operatorname{SE}\left(\widehat{\beta}_{1}\right)&\approx \frac{\widehat{\sigma}}{s_{x} \sqrt{n-1}} \end{aligned} $$
Sigma Squared
$$ \widehat{\sigma}^{2}=\left(\frac{n}{n-2}\right) \widehat{\sigma}_{\mathrm{MLE}}^{2}=\left(\frac{1}{n-2}\right) \sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}=\operatorname{RSS} /(n-2) $$
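A sketch computing $\widehat{\sigma}^{2}$ and $\operatorname{SE}(\widehat{\beta}_{1})$ from these formulas (same toy data; the printed numbers themselves are not meaningful):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data
n = len(x)

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x
rss = np.sum((y - y_hat) ** 2)

sigma2_hat = rss / (n - 2)                            # unbiased estimate of sigma^2
s_x = x.std(ddof=1)
se_b1 = np.sqrt(sigma2_hat) / (s_x * np.sqrt(n - 1))  # SE(beta1_hat)
print(sigma2_hat, se_b1)
```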
Coefficient of Determination
$$ R^{2}=1-\left[\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2} / \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}\right] $$
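In simple linear regression $R^{2}=r_{x y}^{2}$; a sketch checking both the definition and this identity on the toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

# R^2 from the definition, and as the squared sample correlation
r2_def = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
r2_corr = np.corrcoef(x, y)[0, 1] ** 2
assert np.isclose(r2_def, r2_corr)
```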
Questions
Advantages of Least Squares
- Simple derivation from calculus.
- Least squares gives the same estimates as maximum likelihood (under the normality assumption on $\epsilon$).
- The least squares estimators are the best linear unbiased estimators (BLUE) of the betas under the three basic assumptions (Gauss–Markov theorem).
Constraints
In general, a regression with $p$ predictors loses $p+1$ degrees of freedom. In simple linear regression ($p=1$) this is due to the following two constraints on the residuals, which follow from the normal equations (checked numerically in the sketch after this list).
- $\sum \hat{\epsilon}_{i} = \hat{\epsilon}^{\top} \underline{1}=0$
- $\hat{\epsilon}^{\top} \underline{x}=0$
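A quick numerical check of both constraints on the toy data (fit via np.polyfit):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

b1, b0 = np.polyfit(x, y, deg=1)
resid = y - (b0 + b1 * x)

# The two constraints: residuals sum to zero and are orthogonal to x
assert np.isclose(resid.sum(), 0.0, atol=1e-8)
assert np.isclose(resid @ x, 0.0, atol=1e-6)
```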
Switch the predictor and response
If we swap the roles and regress $x$ on $y$, the original slope is $r_{x y}\left(\frac{s_{y}}{s_{x}}\right)$, while the new slope is $r_{x y}\left(\frac{s_{x}}{s_{y}}\right)$. The two slopes are not reciprocals of each other; their product is $r_{x y}^{2}$, so they are reciprocals only when $\left|r_{x y}\right|=1$.
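A sketch illustrating this on the toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

slope_yx, _ = np.polyfit(x, y, deg=1)  # regress y on x
slope_xy, _ = np.polyfit(y, x, deg=1)  # regress x on y
r_xy = np.corrcoef(x, y)[0, 1]

# The slopes are not reciprocals; their product is r_xy^2 <= 1
print(slope_yx, 1 / slope_xy)                    # differ unless |r_xy| = 1
assert np.isclose(slope_yx * slope_xy, r_xy**2)
```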
Data Duplication
If every observation is duplicated, $\hat{\beta}$ and $R^2$ do not change. Remember that both statistics are derived from $s_x$, $s_y$, and $r_{xy}$, on which duplication has (almost) no impact; in the ratios defining $\hat{\beta}_1$ and $R^2$ the doubling cancels exactly.
However, the $t$ statistics are multiplied by approximately $\sqrt{2}$ (the exact factor is $\sqrt{2(n-1)/(n-2)}$), and the confidence intervals for the $\beta$'s become narrower. This narrowing is spurious: the duplicated observations are not independent, so the uncorrelated-errors assumption is violated.
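A sketch (toy data; statsmodels assumed available) that duplicates every row and compares the two fits:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # toy data

fit1 = sm.OLS(y, sm.add_constant(x)).fit()
x2, y2 = np.tile(x, 2), np.tile(y, 2)               # every observation duplicated
fit2 = sm.OLS(y2, sm.add_constant(x2)).fit()

print(fit1.params, fit2.params)      # identical coefficients
print(fit1.rsquared, fit2.rsquared)  # identical R^2
print(fit2.tvalues / fit1.tvalues)   # roughly sqrt(2) for each coefficient
```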