---
title: "R Notebook Example"
output:
  html_document:
    toc: true
    toc_depth: 2
    toc_float: true
---
# Introduction
The beginning of this `.Rmd` file has the following code:
```{r, eval = FALSE}
---
title: "R Notebook Example"
output:
  html_document:
    toc: true
    toc_depth: 2
    toc_float: true
---
```
This means that we want to generate an `.html` document with a table of contents that floats as we scroll up or down.
If we want to add an author name and email address, we can use:
```{r, eval = FALSE}
---
title: "Task X"
author: "Student Name, student.name@mif.stud.vu.lt"
...
---
```
This markdown file shows the basic functionality: how code chunks are evaluated and how to write formulas. It is meant as a proof of concept rather than a full-fledged tutorial.
[A more complete introduction to RMarkdown can be found here](http://rmarkdown.rstudio.com/lesson-1.html).
# Some code examples
## Basics
This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing `Ctrl+Shift+Enter`.
```{r, fig.height = 4, fig.width = 6}
plot(cars)
```
Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing `Ctrl+Alt+I`.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press `Ctrl+Shift+K` to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
Note the difference between:
```{r}
x1 <- rnorm(100)
print(mean(x1))
```
and
```{r, eval = FALSE}
x2 <- rnorm(100)
print(mean(x2))
```
By specifying `eval = FALSE` we are telling R to not evaluate this code. We can check this by looking at the list of variables in our environment:
```{r}
print(ls())
```
We can see that only the variable `x1` was created.
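The same check can also be made programmatically with `exists()`. The chunk below is a minimal sketch: `check_env` is a hypothetical scratch environment that stands in for the notebook session, so the example does not depend on which chunks have already been run.

```{r}
# Hypothetical scratch environment (an assumption, not part of the notebook above)
check_env <- new.env()
# Mimic the evaluated chunk: the assignment actually happens
assign("x1", rnorm(100), envir = check_env)
# The eval = FALSE chunk is never run, so nothing is ever assigned to x2
print(exists("x1", envir = check_env, inherits = FALSE))
print(exists("x2", envir = check_env, inherits = FALSE))
```

The first call reports that `x1` exists, the second that `x2` does not.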
## Code generation example
Open this notebook in RStudio and try executing the following code. If you click the green arrow a couple of times, the resulting output will be the same.
```{r}
set.seed(1233)
mean(rnorm(100))
```
Now, try executing the above code and then execute the below code immediately:
```{r}
mean(rnorm(100))
```
Now, execute only the second code block (the one without `set.seed`) several times: you will notice that the output is **different** each time! To avoid this, call the `set.seed` function in every code chunk where you are generating random data. (Note: if you use the 'Run All' command, each of the above chunks will reproduce its own result on every run, because the seed is set immediately before they are evaluated in order.)
To evaluate **all** of the chunks in an `*.Rmd` file, press `Ctrl+Alt+Enter` ('Run All' command).
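The effect of the seed can also be checked inside a single chunk. This is a small sketch with hypothetical variable names: resetting the seed reproduces a draw exactly, while continuing without a reset gives a new draw.

```{r}
# Hypothetical variable names, used only for this illustration
set.seed(1233)
first_draw <- mean(rnorm(100))
set.seed(1233)                  # resetting the seed restarts the random sequence
second_draw <- mean(rnorm(100))
third_draw <- mean(rnorm(100))  # no reset: the sequence simply continues
print(identical(first_draw, second_draw))
print(identical(second_draw, third_draw))
```

The first comparison holds because both draws start from the same seed; the second does not, because `third_draw` uses the next 100 numbers of the sequence.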
## Printing output
We can print a model output:
```{r}
my.ols <- lm(mpg ~ disp + cyl, data = mtcars)
summary(my.ols)
```
We can also print the data output:
```{r}
print(head(mtcars))
```
**However**, if we print a large data frame without `head()`, then all of its rows will be printed. If we want to format the output differently, we can use one of a number of libraries, for example `DT`:
```{r, results = "asis"}
DT::datatable(mtcars, width = 400)
```
Note that this interactivity will **not work** in a `.pdf` file.
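For a static output that also works in a `.pdf` file, one option is to print only a few rows together with the dimensions of the full data. A minimal sketch using base R only; `preview_rows` is a hypothetical name:

```{r}
preview_rows <- head(mtcars, n = 5)  # first five rows only
print(dim(mtcars))                   # number of rows and columns of the full data
print(preview_rows)
```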
In most cases, the standard `print()` function is enough for output. For models that produce a large amount of output, please `print()` only the required results. Example:
The coefficients:
```{r}
print(summary(my.ols)$coefficients)
```
We can also use inline code with `$R^2=$` \``r ` `round(summary(my.ols)$r.squared, 4)`\`:
Our $R^2=$ `r round(summary(my.ols)$r.squared, 4)`.
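Other individual results can be pulled out of the model summary in the same way. The chunk below is a sketch that refits the same regression under a hypothetical name (`fit_demo`) so that it can be run on its own, and then keeps only the estimates, their standard errors and the adjusted $R^2$:

```{r}
# Refit the same model under a hypothetical name so the chunk is self-contained
fit_demo <- lm(mpg ~ disp + cyl, data = mtcars)
fit_summary <- summary(fit_demo)
# Keep only the estimates and their standard errors
est_and_se <- fit_summary$coefficients[, c("Estimate", "Std. Error")]
print(round(est_and_se, 4))
# Adjusted R-squared as a single number
print(round(fit_summary$adj.r.squared, 4))
```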
# Formula examples
More info on [LaTeX formulas, matrices, etc.](https://en.wikibooks.org/wiki/LaTeX/Mathematics)
## General formulas
Formulas in Markdown are written between the `$` symbols for inline formulas, and between `$$` symbols for centered formulas.
For example, writing `$X_t = \sum_{j = 1}^t \epsilon_j$` produces the following output: $X_t = \sum_{j = 1}^t \epsilon_j$.
Writing `$$X_t = \sum_{j = 1}^t \epsilon_j$$` produces: $$X_t = \sum_{j = 1}^t \epsilon_j$$
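To connect this formula to code: $X_t = \sum_{j = 1}^t \epsilon_j$ is a cumulative sum, which R computes with `cumsum()`. The sketch below assumes standard normal shocks; the names `eps_demo` and `walk_demo` are hypothetical.

```{r}
set.seed(1)
eps_demo <- rnorm(10)          # shocks epsilon_1, ..., epsilon_10 (assumed N(0, 1))
walk_demo <- cumsum(eps_demo)  # X_t = epsilon_1 + ... + epsilon_t
# X_3 computed directly from the definition agrees with the cumulative sum
print(all.equal(walk_demo[3], sum(eps_demo[1:3])))
```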
## Matrices
If we want to write a matrix, we use the following (with either `$` or `$$`, and using `\quad` to separate the different matrices):
```{r, eval = FALSE}
$$
\begin{bmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{bmatrix}, \quad
\begin{pmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{pmatrix}
$$
```
which produces the following output:
$$
\begin{bmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{bmatrix}, \quad
\begin{pmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{pmatrix}
$$
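If the matrix already exists as an R object, the LaTeX code can be generated rather than typed. The chunk below is a rough sketch: `to_bmatrix()` is a hypothetical helper written for this example (it is not part of any package), and the chunk option `results = "asis"` is used so that the printed text is treated as Markdown.

```{r, results = "asis"}
# Hypothetical helper: build bmatrix code from an R matrix
to_bmatrix <- function(m) {
  # join the entries of each row with " & "
  rows <- apply(m, 1, function(r) paste(r, collapse = " & "))
  # join the rows with a LaTeX line break and wrap them in the environment
  paste0("\\begin{bmatrix}\n",
         paste(rows, collapse = " \\\\\n"),
         "\n\\end{bmatrix}")
}
example_matrix <- matrix(1:4, nrow = 2, byrow = TRUE)
cat("$$\n", to_bmatrix(example_matrix), "\n$$\n", sep = "")
```

Replacing `bmatrix` with `pmatrix` inside the helper would give round brackets instead of square ones.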
## Equation aligning
Using the `align` environment and writing `&` next to the symbols we want to align in each row lets us specify multiple equations aligned by the same symbol, for example:
```{r, eval = FALSE}
$$
\begin{align}
Y_{1,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t} \\
Y_{2,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t}
\end{align}
$$
```
which produces:
$$
\begin{align}
Y_{1,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t} \\
Y_{2,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t}
\end{align}
$$
Or if we need to write our equation in a different form:
```{r, eval = FALSE}
$$
\begin{align}
f(x) & = (a+b)^2 \\
& = a^2+2ab+b^2
\end{align}
$$
```
$$
\begin{align}
f(x) &= (a+b)^2 \\
&= a^2+2ab+b^2
\end{align}
$$
Or if we simply have longer names for our variables:
```{r, eval = FALSE}
$$
\begin{align}
Population_t &= \alpha_1 + \gamma_1 X_{1,t} + \gamma_2 X_{1,t-1} \\
Price_t &= \alpha_2 + \beta_1 Z_{1,t}
\end{align}
$$
```
$$
\begin{align}
Population_t &= \alpha_1 + \gamma_1 X_{1,t} + \gamma_2 X_{1,t-1} \\
Price_t &= \alpha_2 + \beta_1 Z_{1,t}
\end{align}
$$
## Writing equation systems
We can write systems of equations using the `cases` environment (note that we also use the `&` symbol to align the cases):
```{r, eval = FALSE}
$$
f(n) = \begin{cases}
n/2 &\mbox{if } n \equiv 0 \\
(3n +1)/2 & \mbox{if } n \equiv 1.
\end{cases} \pmod{2}
$$
```
$$
f(n) = \begin{cases}
n/2 &\mbox{if } n \equiv 0 \\
(3n +1)/2 & \mbox{if } n \equiv 1.
\end{cases} \pmod{2}
$$
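A piecewise definition like this maps directly onto `ifelse()` in R. The chunk below is a small sketch of the function written above; the name `f_cases` is hypothetical.

```{r}
# Hypothetical name; implements f(n) from the formula above
f_cases <- function(n) {
  ifelse(n %% 2 == 0,      # n is congruent to 0 (mod 2)
         n / 2,
         (3 * n + 1) / 2)  # n is congruent to 1 (mod 2)
}
print(f_cases(c(4, 3)))
```

Here `f_cases(4)` uses the first case and gives `2`, while `f_cases(3)` uses the second case and gives `5`.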
## Regression models
```{r, eval = FALSE}
$$
\text{wage} = \beta_0 + \beta_1 \cdot \text{educ}^2 + \epsilon
$$
```
$$
\text{wage} = \beta_0 + \beta_1 \cdot \text{educ}^2 + \epsilon
$$
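A model of this form can be estimated with `lm()`, wrapping the squared term in `I()`. The chunk below is only a sketch on simulated data: the variable names and every number in it (sample size, true coefficients, range of `educ_sim`) are assumptions made for illustration.

```{r}
# Simulated data: all numbers below are assumptions made for illustration
set.seed(42)
educ_sim <- runif(200, min = 8, max = 20)
wage_sim <- 2 + 0.05 * educ_sim^2 + rnorm(200)
# I() protects ^ so that it means arithmetic squaring inside the formula
fit_sim <- lm(wage_sim ~ I(educ_sim^2))
print(coef(fit_sim))
```

Without `I()`, the `^` operator would be interpreted as formula syntax rather than as squaring.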
## Estimated regression models, along with their standard errors:
We can write down the estimated regression model, along with the standard errors, using the following code:
```{r, eval = FALSE}
$\underset{(se)}{\widehat{\log(\text{wage})}} = \underset{(0.0702)}{1.5968} + \underset{(0.0048)}{0.0988} \cdot \text{educ}$
```
$\underset{(se)}{\widehat{\log(\text{wage})}} = \underset{(0.0702)}{1.5968} + \underset{(0.0048)}{0.0988} \cdot \text{educ}$
We can use `$$` instead of `$` around the formula in order to center the equation.
**Important: you can right-click on the formulas in the `.html` file to see the code for the mathematical expressions**
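Instead of copying the numbers by hand, the same kind of expression can be assembled from a fitted model. The chunk below is a rough sketch under several assumptions: it uses a hypothetical model on the built-in `mtcars` data (not the wage model above), it assumes a positive slope so that a plain `+` sign is appropriate, and it relies on `results = "asis"` so that the printed text is treated as Markdown.

```{r, results = "asis"}
# Hypothetical example model on the built-in mtcars data
fit_eq <- lm(hp ~ disp, data = mtcars)
est <- summary(fit_eq)$coefficients[, "Estimate"]
se <- summary(fit_eq)$coefficients[, "Std. Error"]
# sprintf() fills in the estimates and standard errors, rounded to 4 decimals
eq_code <- sprintf(
  "$\\underset{(se)}{\\widehat{\\text{hp}}} = \\underset{(%.4f)}{%.4f} + \\underset{(%.4f)}{%.4f} \\cdot \\text{disp}$",
  se[1], est[1], se[2], est[2]
)
cat(eq_code)
```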
# Multiple plots example
```{r}
nsample <- 1000
```
```{r}
set.seed(123)
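# a common x-axis grid and two simulated samples of the same length
x <- seq(from = 0, to = 100, length.out = nsample)
y1 <- rnorm(n = nsample)  # standard normal: mean = 0, sd = 1 (defaults)
y2 <- rexp(n = nsample)   # exponential: rate = 1 (default)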
```
```{r}
print(mean(y1))
print(mean(y2))
```
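We can see from the output that, with the default parameter values, the sample mean is:
- around `0` for the `rnorm()` function (default `mean = 0`);
- around `1` for the `rexp()` function (default `rate = 1`, so the theoretical mean is `1 / rate = 1`).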
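Both means can be changed through the function arguments: `rnorm()` takes `mean` and `sd`, and `rexp()` takes `rate`. The chunk below is a sketch with hypothetical names and parameter values, chosen only to show that the sample means follow the arguments:

```{r}
set.seed(123)
n_demo <- 10000
z_norm <- rnorm(n = n_demo, mean = 5, sd = 2)  # theoretical mean: 5
z_exp <- rexp(n = n_demo, rate = 2)            # theoretical mean: 1 / 2
print(mean(z_norm))
print(mean(z_exp))
```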
```{r}
# a 1-row, 2-column figure:
par(mfrow = c(1, 2))
# plots are added in the order that we plot them:
plot(x, y1, col = "cornflowerblue", type = "l",
     main = bquote("Plot of"~X~"~N("~mu~","~sigma^2~"), "~mu==0~", "~sigma==1))
plot(x, y2, col = "orange", type = "l",
     main = bquote("Plot of"~X~"~Exp("~lambda~"), "~lambda==1))
```
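When code like this is run interactively, the `par(mfrow = ...)` setting stays in effect for later plots on the same graphics device. A common pattern, sketched below with hypothetical names and simulated data, is to save the settings returned by `par()` and restore them afterwards:

```{r}
set.seed(123)
h1 <- rnorm(500)
h2 <- rexp(500)
old_par <- par(mfrow = c(1, 2))  # par() returns the previous settings
hist(h1, col = "cornflowerblue", main = "Normal sample")
hist(h2, col = "orange", main = "Exponential sample")
par(old_par)                     # back to the previous layout
```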
Again, examine the lecture notes/lecture slides for additional ways to plot the data.