Simple Linear Regression Key Results

Model and Assumptions

$$ Y_{i}=\beta_{0}+\beta_{1} x_{i}+\epsilon_{i}, \quad i=1,2, \ldots, n $$

Assumptions

  1. $E\left(\epsilon_{i}\right)=0$ for all $i$
  2. $V\left(\epsilon_{i}\right)=\sigma^{2}$ for all $i$
  3. The $\epsilon_{i}$ are uncorrelated, i.e., $\operatorname{Cov}\left(\epsilon_{i}, \epsilon_{j}\right)=0$ for all $i \neq j$.

These three are the only assumptions needed for least-squares estimation. If we additionally assume that the $\epsilon_i$ are normally distributed, we obtain stronger results, such as exact confidence intervals and hypothesis tests for the coefficients.

Remarks

  1. $\epsilon_i$ is called the irreducible error.
  2. $\hat{\epsilon}_i = y_i - \hat{y}_i$ is the residual.

Important Results

Define the following statistics.

  1. sample covariance
  2. sample variance
  3. sample correlation
$$ \begin{aligned} s_{x y}&=\left(\frac{1}{n-1}\right) \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)\\ s_{x}^{2}&=\left(\frac{1}{n-1}\right) \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}\\ r_{x y}&=\frac{s_{x y}}{s_{x} s_{y}} \end{aligned} $$
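
As a quick sketch of these definitions (the NumPy arrays `x` and `y` below are made-up example data, not taken from this note):

```python
import numpy as np

# Hypothetical paired sample; any two equal-length arrays work here.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

s_xy = np.sum((x - x_bar) * (y - y_bar)) / (n - 1)   # sample covariance
s_x2 = np.sum((x - x_bar) ** 2) / (n - 1)            # sample variance of x
s_y2 = np.sum((y - y_bar) ** 2) / (n - 1)            # sample variance of y
r_xy = s_xy / np.sqrt(s_x2 * s_y2)                   # sample correlation

print(s_xy, s_x2, r_xy)
```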

RSS

Least squares chooses $\beta_0$ and $\beta_1$ to minimize the residual sum of squares:

$$ \operatorname{RSS}\left(\beta_{0}, \beta_{1}\right)=\sum_{i=1}^{n}\left(y_{i}-\left(\beta_{0}+\beta_{1} x_{i}\right)\right)^{2} $$

Beta

$$ \widehat{\beta}_{1}=\frac{\sum_{i}\left(y_{i}-\bar{y}\right)\left(x_{i}-\bar{x}\right)}{\sum_{i}\left(x_{i}-\bar{x}\right)^{2}}=\frac{s_{x y}}{s_{x}^{2}}=r_{x y}\left(\frac{s_{y}}{s_{x}}\right) $$
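
Continuing the sketch above, the slope follows directly from this formula; the intercept shown is the standard companion estimate $\widehat{\beta}_{0}=\bar{y}-\widehat{\beta}_{1} \bar{x}$ (not written out above).

```python
# Closed-form least-squares estimates, i.e. the minimizers of RSS.
beta1_hat = s_xy / s_x2                   # slope: s_xy / s_x^2
beta0_hat = y_bar - beta1_hat * x_bar     # intercept: y_bar - beta1_hat * x_bar

y_hat = beta0_hat + beta1_hat * x         # fitted values
resid = y - y_hat                         # residuals (epsilon-hat)
```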

Beta Variance

$$ \begin{aligned} V\left(\widehat{\beta}_{1}\right)&=\frac{\sigma^{2}}{(n-1) s_{x}^{2}} \approx \frac{\widehat{\sigma}^{2}}{(n-1) s_{x}^{2}}\\ \operatorname{SE}\left(\widehat{\beta}_{1}\right) &\approx \frac{\widehat{\sigma}}{s_{x} \sqrt{n-1}} \end{aligned} $$

Sigma Squared

$$ \begin{aligned} \widehat{\sigma}^{2}=\left(\frac{n}{n-2}\right) \sigma_{\mathrm{MLE}}^{2}=\left(\frac{1}{n-2}\right) \sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}=\operatorname{RSS} /(n-2) \end{aligned} $$
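
Still continuing the same sketch, the residuals give $\widehat{\sigma}^{2}$, the standard error from the Beta Variance section, and the usual $t$ statistic for testing $\beta_1 = 0$:

```python
# Unbiased estimate of sigma^2: RSS divided by n - 2 degrees of freedom.
rss = np.sum(resid ** 2)
sigma2_hat = rss / (n - 2)

# Standard error of the slope and the t statistic for H0: beta1 = 0.
se_beta1 = np.sqrt(sigma2_hat / ((n - 1) * s_x2))
t_stat = beta1_hat / se_beta1
```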

Coefficient of Determination

$$ R^{2}=1-\left[\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2} / \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}\right] $$
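
And the coefficient of determination, reusing `rss` from the previous snippet; by the last question in this note it should match $r_{xy}^{2}$ numerically:

```python
# R^2 = 1 - RSS / TSS, where TSS is the total sum of squares of y.
tss = np.sum((y - y_bar) ** 2)
r_squared = 1.0 - rss / tss
print(r_squared, r_xy ** 2)   # the two values agree up to floating-point error
```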

Questions

Advantages of Least Squares

Why do we use least squares to estimate the parameters?
  1. The estimators have a simple closed-form derivation from calculus.
  2. Least squares gives the same estimates as maximum likelihood (under the normality assumption on $\epsilon$).
  3. By the Gauss–Markov theorem, the least-squares estimators are the best linear unbiased estimators (BLUE) of the betas under the three basic assumptions.

Constraints

Why does the unbiased estimator of $\sigma^2$ divide by $n-2$?

In general, a regression with $p$ predictors loses $p+1$ degrees of freedom. In simple linear regression ($p=1$), the residuals satisfy the following two constraints, so only $n-2$ of them are free; a numerical check follows the list.

  1. $\sum \hat{\epsilon}_{i} = \hat{\epsilon}^{\top} \underline{1}=0$
  2. $\hat{\epsilon}^{\top} \underline{x}=0$
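
These two constraints are just the normal equations of the least-squares problem, so they can be checked numerically with the residuals from the sketch above:

```python
# Both sums are zero up to floating-point error.
print(np.sum(resid))        # constraint 1: residuals sum to zero
print(np.sum(resid * x))    # constraint 2: residuals are orthogonal to x
```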

Switch the predictor and response

If we switch the predictor and the response, how does the slope change?

Note that the original slope is $r_{x y}\left(\frac{s_{y}}{s_{x}}\right)$, while the new slope is $r_{x y}\left(\frac{s_{x}}{s_{y}}\right)$. They are not reciprocals of each other: their product is $r_{x y}^{2} \le 1$, with equality only when the points lie exactly on a line.
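
A quick numerical check, reusing the sample statistics from the first sketch, confirms that the product of the two slopes is $r_{xy}^{2}$ rather than 1:

```python
slope_y_on_x = s_xy / s_x2           # slope when regressing y on x
slope_x_on_y = s_xy / s_y2           # slope when regressing x on y
print(slope_y_on_x * slope_x_on_y)   # equals r_xy ** 2, which is <= 1
```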

Data Duplication

If the observed data set is included twice in the analysis, how do $\beta$, $R^2$, and the $t$ statistic change?

$\hat\beta$ and $R^2$ will not change. Both are derived from $s_x$, $s_y$, and $r_{xy}$, and duplicating the data (almost) has no impact on them.

But the $t$ statistic is (approximately) multiplied by $\sqrt{2}$: $\widehat{\beta}_1$ is unchanged while $\operatorname{SE}(\widehat{\beta}_1)$ shrinks by roughly a factor of $\sqrt{2}$, so the confidence interval for $\beta_1$ becomes narrower.
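
A minimal simulation sketch of this effect; the data below are randomly generated, and the $\sqrt{2}$ factor is a large-$n$ approximation (the exact ratio works out to $\sqrt{2(n-1)/(n-2)}$):

```python
import numpy as np

rng = np.random.default_rng(0)   # hypothetical simulated data

def slope_t_stat(x, y):
    """Return (beta1_hat, t statistic for H0: beta1 = 0) for a regression of y on x."""
    n = len(x)
    x_bar, y_bar = x.mean(), y.mean()
    s_xy = np.sum((x - x_bar) * (y - y_bar)) / (n - 1)
    s_x2 = np.sum((x - x_bar) ** 2) / (n - 1)
    beta1 = s_xy / s_x2
    resid = y - (y_bar - beta1 * x_bar) - beta1 * x
    sigma2 = np.sum(resid ** 2) / (n - 2)
    se = np.sqrt(sigma2 / ((n - 1) * s_x2))
    return beta1, beta1 / se

x_sim = rng.normal(size=200)
y_sim = 1.0 + 2.0 * x_sim + rng.normal(size=200)

b1, t1 = slope_t_stat(x_sim, y_sim)
b2, t2 = slope_t_stat(np.tile(x_sim, 2), np.tile(y_sim, 2))  # every observation included twice

print(b1, b2)    # the slope estimates are identical
print(t2 / t1)   # close to sqrt(2)
```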

Correlation between predicted values and residuals

Prove that $\operatorname{Corr}(\hat{y}, \hat{\epsilon})=0$.
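
A sketch of the argument, using the two residual constraints from the Constraints section: the residuals sum to zero, so $\bar{\hat{y}}=\bar{y}$ and $\hat{y}_{i}-\bar{y}=\widehat{\beta}_{1}\left(x_{i}-\bar{x}\right)$. Therefore

$$ \sum_{i=1}^{n}\left(\hat{y}_{i}-\bar{y}\right) \hat{\epsilon}_{i}=\widehat{\beta}_{1} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right) \hat{\epsilon}_{i}=\widehat{\beta}_{1}\left(\hat{\epsilon}^{\top} \underline{x}-\bar{x}\, \hat{\epsilon}^{\top} \underline{1}\right)=0, $$

so the sample covariance between $\hat{y}$ and $\hat{\epsilon}$ is zero, and hence so is their correlation.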

$R^2$ and $r_{xy}$

Prove that $R^2=r_{xy}^2$.
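
A sketch: since $\sum_{i}\left(\hat{y}_{i}-\bar{y}\right) \hat{\epsilon}_{i}=0$ (previous question), the total sum of squares decomposes as

$$ \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}=\sum_{i=1}^{n}\left(\hat{y}_{i}-\bar{y}\right)^{2}+\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}, $$

and therefore

$$ R^{2}=\frac{\sum_{i}\left(\hat{y}_{i}-\bar{y}\right)^{2}}{\sum_{i}\left(y_{i}-\bar{y}\right)^{2}}=\frac{\widehat{\beta}_{1}^{2} \sum_{i}\left(x_{i}-\bar{x}\right)^{2}}{\sum_{i}\left(y_{i}-\bar{y}\right)^{2}}=\frac{s_{x y}^{2}}{s_{x}^{2} s_{y}^{2}}=r_{x y}^{2}. $$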