---
title: "Bruce Campbell NCSU ST-503 HW 3"
subtitle: "Chapter 14 Problems 2,3,6,30,31 Rice, John A. Mathematical Statistics and Data Analysis, Cengage"
author: "Bruce Campbell"
date: "`r format(Sys.time(), '%d %B, %Y')`"
fontsize: 11pt
header-includes:
- \usepackage{bbm}
output: pdf_document
---
```{r setup, include=FALSE,echo=FALSE}
rm(list = ls())
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(dev = 'pdf')
knitr::opts_chunk$set(cache=TRUE)
knitr::opts_chunk$set(tidy=TRUE)
knitr::opts_chunk$set(prompt=FALSE)
knitr::opts_chunk$set(fig.height=5)
knitr::opts_chunk$set(fig.width=7)
knitr::opts_chunk$set(warning=FALSE)
knitr::opts_chunk$set(message=FALSE)
knitr::opts_knit$set(root.dir = ".")
library(latex2exp) #expmain <- TeX('$x_t = cos(\\frac{2\\pi t}{4}) + w_t$');x = ts(cos(2*pi*0:500/4) + rnorm(500,0,1));plot(x,main =expmain )
library(pander)
library(ggplot2)
library(GGally)
```
## Problem 14.2
For the following data points, plot $(x,y)$, fit and sketch the line $y = a + bx$ by the method of least squares, and fit and sketch the line given by $x = c + dy$.
\begin{table}[]
\centering
\caption{Data for Problem 14.2}
\label{tab:data-14-2}
\begin{tabular}{lllllllllll}
x & .34 & 1.38 & -.65 & .68 & 1.40 & -.88 & -.30 & -1.18 & .50 & -1.75 \\
y & .27 & 1.34 & -.53 & .35 & 1.28 & -.98 & -.72 & -.81 & .64 & -1.59
\end{tabular}
\end{table}
```{r,echo=FALSE}
x <- c(0.34, 1.38, -0.65, 0.68, 1.40, -0.88, -0.30, -1.18, 0.50, -1.75)
y <- c(0.27, 1.34, -0.53, 0.35, 1.28, -0.98, -0.72, -0.81, 0.64, -1.59)
df <- data.frame(x = x, y = y)
# (a) Regression of y on x (blue, solid): y = a + b x
lm.fit.yx <- lm(y ~ x, data = df)
# (b) Regression of x on y (red, dashed): x = c + d y, redrawn on the (x, y) axes as y = -c/d + (1/d) x
lm.fit.xy <- lm(x ~ y, data = df)
c.xy <- unname(coef(lm.fit.xy)["(Intercept)"])
d.xy <- unname(coef(lm.fit.xy)["y"])
p <- ggplot(data = df, aes(x = x, y = y)) + geom_point() + ggtitle(TeX("$(X,Y)$"))
p <- p + geom_abline(intercept = coef(lm.fit.yx)["(Intercept)"], slope = coef(lm.fit.yx)["x"], color = "blue")
p <- p + geom_abline(intercept = -c.xy / d.xy, slope = 1 / d.xy, color = "red", linetype = "dashed")
p
```
### c. Are the lines in parts (a) and (b) the same? If not, why not?
The lines are not the same. Geometrically, the model $y \sim x$ minimizes errors in the $y$ direction, while the model $x \sim y$ minimizes errors in the $x$ direction.
Going further, we can work out when the two regression lines coincide.
Denote the two lines by $y = \beta_0 + \beta_1 x$ and $x = \beta_0' + \beta_1' y$.
Since $\bar{y} = \hat{\beta_0} + \hat{\beta_1} \bar{x}$ and $\bar{x} = \hat{\beta_0'} + \hat{\beta_1'} \bar{y}$, both fitted lines pass through the point $(\bar{x},\bar{y})$, so they are the same line exactly when their slopes agree in the $(x,y)$ plane.
The least squares slopes are $\hat{\beta_1} = r \, \hat{\sigma}_y / \hat{\sigma}_x$ and $\hat{\beta_1'} = r \, \hat{\sigma}_x / \hat{\sigma}_y$, where $r$ is the sample correlation; rewritten in the $(x,y)$ plane, the second line has slope $1/\hat{\beta_1'}$.
Equality of the slopes, $\hat{\beta_1} = 1/\hat{\beta_1'}$, is therefore equivalent to
$$ \hat{\beta_1} \, \hat{\beta_1'} = r^2 = 1, $$
i.e., the two regression lines coincide only when the data lie exactly on a single line.
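A minimal numerical check on these data, reusing the `lm.fit.yx` and `lm.fit.xy` objects fitted in the chunk above: the product of the two fitted slopes equals the squared sample correlation, which is strictly less than one here, so the lines cannot coincide.

```{r}
# The product of the two fitted slopes equals the squared sample correlation r^2;
# the two regression lines coincide only when this product is 1.
b1  <- unname(coef(lm.fit.yx)["x"])  # slope of y ~ x
b1p <- unname(coef(lm.fit.xy)["y"])  # slope of x ~ y
c(slope.product = b1 * b1p, r.squared = cor(x, y)^2)
```
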
## Problem 14.3
Show that when $y_i = \mu + \epsilon_i$ with $\epsilon_i$ i.i.d., $E[\epsilon_i]=0$, and $Var(\epsilon_i)=\sigma^2$, the least squares estimate of $\mu$ is $\bar{y}$.
The least squares criterion we want to minimize is
$$ S(\mu) = \sum_{i=1}^n (y_i - \mu)^2 $$
Taking the derivative with respect to $\mu$ and setting it equal to zero gives
$$\frac{\partial S}{\partial \mu} = -2 \sum_{i=1}^n (y_i - \mu) = 0 \implies \sum_{i=1}^n y_i = n \mu $$
Since $\frac{\partial^2 S}{\partial \mu^2} = 2n > 0$, this critical point is a minimum, so $\hat{\mu} = \bar{y}$.
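As a quick sanity check, here is a small sketch with simulated data (the sample size, mean, and standard deviation below are arbitrary choices, not values from the text); the intercept-only `lm` fit reproduces the sample mean exactly:

```{r}
set.seed(503)                             # arbitrary seed, for reproducibility
y.sim <- rnorm(25, mean = 3, sd = 2)      # simulated observations, illustrative values only
# Least squares with only an intercept: the fitted coefficient is the sample mean.
c(lm.intercept = unname(coef(lm(y.sim ~ 1))[1]), sample.mean = mean(y.sim))
```
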
## Problem 14.5
_Three objects are located on a line at points p1 < p2 < p3. These locations are not precisely known. A surveyor makes the following measurements: a. He stands at the origin and measures the three distances from there to p1, p2, p3. Let these measurements be denoted by Y1, Y2, Y3. b. He goes to p1 and measures the distances from there to p2 and p3. Let these measurements be denoted by Y4, Y5. c. He goes to p2 and measures the distance from there to p3. Denote this measurement by Y6. He thus makes six measurements in all, and they are all subject to error. In order to estimate the values p1, p2, p3, he decides to combine all the measurements by the method of least squares. Using matrix notation, explain clearly how the least squares estimates would be calculated (you don't have to do the actual calculations)._
Rice, John A. Mathematical Statistics and Data Analysis (Available 2010 Titles Enhanced Web Assign) (Page 592). Cengage Textbook. Kindle Edition.
The entries of the design matrix are in $\{-1,0,1\}$, with one column per object; each row encodes how one of the measurements was taken. The coefficients $d_1, d_2, d_3$ are the unknown locations $p_1, p_2, p_3$, i.e., the distances from the origin. There is no intercept in this model.
The matrix equation that needs to be solved in this case is
$$ \textbf{Y} = \textbf{X} \textbf{d} $$
where
$$Y =\left(
\begin{array}{c}
Y_1\\
Y_2\\
Y_3\\
Y_4\\
Y_5\\
Y_6
\end{array}
\right) =
\left(
\begin{array}{ccc}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
-1 & 1 & 0\\
-1 & 0 & 1\\
0 & -1 & 1
\end{array}
\right)
\left(
\begin{array}{c}
d_1 \\
d_2 \\
d_3
\end{array}
\right)
$$
The least squares estimate is given by
$$\hat{\textbf{d}} = (\textbf{X}^\intercal \textbf{X} )^{-1} \; \textbf{X}^\intercal \textbf{Y},$$
which is derived by minimizing $\lVert \textbf{Y} - \textbf{X} \textbf{d} \rVert^2$ for the model $\textbf{Y} = \textbf{X} \textbf{d} + \boldsymbol\epsilon$. In practice the matrix equation is solved numerically, typically via the QR factorization of $\textbf{X}$ rather than by forming $(\textbf{X}^\intercal \textbf{X})^{-1}$ explicitly.
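A brief sketch of the computation in R (the true locations and the measurement noise level below are made-up values, chosen purely to illustrate the linear algebra, not part of the problem):

```{r}
# Design matrix: each row expresses one of the six measurements in terms of d1, d2, d3.
X <- matrix(c( 1,  0,  0,
               0,  1,  0,
               0,  0,  1,
              -1,  1,  0,
              -1,  0,  1,
               0, -1,  1), nrow = 6, byrow = TRUE)
d.true <- c(1, 3, 7)                          # assumed true locations (illustrative only)
set.seed(14)
Y.meas <- X %*% d.true + rnorm(6, sd = 0.05)  # simulated noisy measurements
d.hat.normal <- solve(t(X) %*% X, t(X) %*% Y.meas)  # (X'X)^{-1} X'Y
d.hat.qr     <- qr.coef(qr(X), Y.meas)              # same estimate via the QR factorization
cbind(d.hat.normal, d.hat.qr)
```
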
## Problem 14.30
Find $Var(\bar{X})$ where
$$ \textbf{X} = (X_1, \ldots , X_n) \;\; \text{with} \;\; Var(X_i)=\sigma^2 \;\; \forall \; i $$
and
$$ Cov(X_i,X_j)= \rho \, \sigma^2 \;\; \forall \; i \neq j.$$
The covariance matrix of the random vector $X$ is given by
$$\Sigma_{X \; X} =
\left(\begin{array}{cccc}
\sigma^2 & \rho\sigma^2 & \cdots & \rho\sigma^2 \\
\rho\sigma^2 & \sigma^2 & \cdots & \rho\sigma^2 \\
\vdots & \vdots & \ddots & \vdots \\
\rho\sigma^2 & \rho\sigma^2 & \cdots & \sigma^2
\end{array}
\right)
$$
Now write the sample mean as a linear transform of $\textbf{X}$:
$$Y = \frac{1}{n} \textbf{1}^\intercal X = \bar{X} \in \mathbb{R}^1$$
$\Sigma_{Y \, Y}$ is then a special case of the covariance expression for linear transforms of random vectors,
$$\textbf{Y}= \textbf{A} \textbf{X}, \;\; \textbf{Z}= \textbf{B} \textbf{X} \implies \Sigma_{Y \,Z} =\textbf{A} \, \Sigma_{X \,X} \, \textbf{B}^\intercal $$
We have that
$$\Sigma_{Y \, Y} = \frac{1}{n} \textbf{1}^\intercal \;\Sigma_{X X} \;\frac{1}{n} \textbf{1} =
\frac{1}{n^2} \textbf{1}^\intercal
\left(\begin{array}{c}
\sigma^2 + (n-1)\rho\sigma^2 \\
\vdots \\
\sigma^2 + (n-1)\rho\sigma^2 \\
\end{array}
\right)=
\frac{\sigma^2}{n} (1 + (n-1)\rho)
$$
Now by definition $\Sigma_{Y \, Y}=Var(\bar{X})$, so $Var(\bar{X}) = \frac{\sigma^2}{n} \left(1 + (n-1)\rho\right)$.
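A quick numerical check of this formula (the values of $n$, $\sigma^2$, and $\rho$ below are arbitrary illustrative choices): the quadratic form $\frac{1}{n^2}\textbf{1}^\intercal \Sigma_{X X} \textbf{1}$ agrees with the closed-form expression.

```{r}
n <- 8; sigma2 <- 2; rho <- 0.3   # arbitrary illustrative values
# Equicorrelated covariance matrix: sigma^2 on the diagonal, rho*sigma^2 off the diagonal.
Sigma <- sigma2 * ((1 - rho) * diag(n) + rho * matrix(1, n, n))
ones <- rep(1, n)
c(quadratic.form = as.numeric(t(ones) %*% Sigma %*% ones) / n^2,
  closed.form    = sigma2 * (1 + (n - 1) * rho) / n)
```
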
## Problem 14.31
Let $$Z= \left(\begin{array}{c}
Z_1 \\
Z_2 \\
Z_3 \\
Z_4
\end{array}
\right)\in \mathbb{R}^4$$ and $\Sigma_{Z \, Z} = \sigma^2 I$
$$U = Z_1 + Z_2 + Z_3 +Z_4$$ and $$V= (Z_1+Z_2) - (Z_3 + Z_4)$$
Find $Cov(U,V)$
First let's define $U$ and $V$ as linear forms: $U=\textbf{1}^\intercal Z$ and $V=a^\intercal Z$ with $a=(1,1,-1,-1)^\intercal$.
Then we have
$$Cov(U,V) = \Sigma_{U \, V} = \textbf{1}^\intercal \, \Sigma_{Z \, Z} \, a = \textbf{1}^\intercal
\left(\begin{array}{c}
\sigma^2 \\
\sigma^2 \\
-\sigma^2 \\
-\sigma^2
\end{array}
\right) =0$$
We've used the result about the cross-covariance of two linear transforms of a random vector,
$$\textbf{Y}= \textbf{A} \textbf{X}, \;\; \textbf{Z}= \textbf{B} \textbf{X} \implies \Sigma_{Y \,Z} =\textbf{A} \, \Sigma_{X \,X} \, \textbf{B}^\intercal $$
applied with $\textbf{A} = \textbf{1}^\intercal$ and $\textbf{B} = a^\intercal$. We don't need it here, but note that adding a constant vector to $\textbf{Y}$ or $\textbf{Z}$ does not change the cross-covariance, so the result holds for affine transforms as well.
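A minimal check of the algebra (any positive $\sigma^2$ gives zero; the value below is arbitrary):

```{r}
sigma2  <- 1.7                    # arbitrary illustrative value
Sigma.Z <- sigma2 * diag(4)       # covariance matrix of Z
a <- c(1, 1, -1, -1)
as.numeric(t(rep(1, 4)) %*% Sigma.Z %*% a)  # Cov(U, V) = 1' Sigma_Z a = 0
```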