---
title: "R Markdown demo"
author: "DP"
date: "September 4, 2016"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see .

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

\bigskip
\hrule\smallskip
``my cutoff''
\smallskip
\hrule

\newcommand{\bone}{\mathbbm1}
\def\Rlang{{\bfseries R}}
\setlength{\parindent}{2em}

Everything above the line (``my cutoff'') was created automatically when I created a new file by selecting \qquad File > New File > R Markdown \qquad in RStudio. I was prompted to provide a title and author. The file originally contained more about R Markdown, below my cutoff line. I replaced that material by some Linear Models stuff then clicked on ``Knit PDF'' in the toolbar.

```{r}
set.seed(10) # for reproducibility
mydata <- data.frame(y=rnorm(10), x1=1:10, x2=11:20, x3=0.5*(1:10)-3*(11:20))
out <- lm(y ~ ., data=mydata)
summary(out)
```

Now let's try to figure out what \Rlang\ has done. First determine which matrix \Rlang\ fed to the $qr()$ function:

```{r}
M <- model.matrix(out) # should have out$qr equal to qr(M)
round(cbind(M,mydata)[1:4,],3) # for comparison
```

\noindent As expected, \Rlang\ prepended a column of $1$'s to the predictors in $mydata$. You might want to compare $out\$qr$ with $qr(M)$. Now extract the matrices for the QR decomposition of the model matrix:

```{r}
Q <- qr.Q(out$qr)
R <- qr.R(out$qr)
# What would you expect
# round(cbind( Q %*% R,M),3)
# to show?
round(Q[1:4,],2) # Why are there four columns?
round(R,3) # Why is it 4 by 4 ?
```

\noindent Notice that R[3:4,] is all zeros.
That means that only the first two columns of $Q$ are being used to span the model space; the model matrix has rank $2$. Let me split both matrices in the way described in the QR.pdf handout:

```{r}
Q1 <- Q[,1:2]
R1 <- R[1:2,1:2]
R2 <- R[1:2,3:4]
# as a check look at
# round(cbind( M, Q1 %*% R1, Q1 %*% R2),3)
```

According to QR.pdf, the fitted vector $\widehat y$ should equal $Q_1Q_1^T y$:

```{r}
# round(out$fitted.values- Q1 %*% t(Q1) %*% mydata$y,4) # all zero?
```

\noindent The matrix $Q_1Q_1^T$ projects ten-dimensional Euclidean space orthogonally onto the two-dimensional subspace spanned by a column of $1$'s and $x_1,x_2,x_3$. If you read through QR.pdf you should see how to calculate other parts of the summary using only~$y$ and~$out\$qr$. Homework~1 essentially asks you to add some more calculations to this handout.
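
\noindent The claim that $Q_1Q_1^T$ is an orthogonal projection of rank $2$ can be checked numerically. The chunk below is only a sketch, not something taken from QR.pdf: it rebuilds the example data so that it runs on its own, and the names `k`, `H`, `max_abs` and `proj_checks` are invented here for illustration. It assumes that the `rank` component of $out\$qr$ reports the numerical rank that \Rlang\ found for the model matrix.

```{r}
# Self-contained sketch: rebuild the example data and fit used in this handout.
set.seed(10)
mydata <- data.frame(y = rnorm(10), x1 = 1:10, x2 = 11:20,
                     x3 = 0.5 * (1:10) - 3 * (11:20))
out <- lm(y ~ ., data = mydata)

k  <- out$qr$rank                                # numerical rank reported by lm
Q1 <- qr.Q(out$qr)[, seq_len(k), drop = FALSE]   # first k columns of Q
H  <- Q1 %*% t(Q1)                               # candidate projection matrix

# Hypothetical helper: largest absolute entry of a matrix or vector
max_abs <- function(A) max(abs(A))

proj_checks <- c(
  symmetric  = max_abs(H - t(H)),                # H should equal its transpose
  idempotent = max_abs(H %*% H - H),             # projecting twice changes nothing
  fitted     = max_abs(H %*% mydata$y - out$fitted.values),
  trace_gap  = abs(sum(diag(H)) - k)             # trace of a projection equals its rank
)
signif(proj_checks, 2)  # each entry should be at rounding-error level
```

\noindent A symmetric, idempotent matrix is an orthogonal projection, and its trace equals the dimension of the subspace it projects onto, so small values in all four entries are consistent with the description above.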