---
title: "R Notebook Example"
output: 
  html_document:
    toc: true
    toc_depth: 2
    toc_float: true
---

# Introduction

The beginning of this `.Rmd` file has the following code:

```{r, eval = FALSE}
---
title: "R Notebook Example"
output: 
  html_document:
    toc: true
    toc_depth: 2
    toc_float: true
---
```

This means that we want to generate an `.html` document with a table of contents that floats as we scroll up or down.

If we want to add an author name and their email, we can use:
```{r, eval = FALSE}
---
title: "Task X"
author: "Student Name, student.name@mif.stud.vu.lt"

...

---
```

This R Markdown file shows the basics of how code is evaluated and how formulas are written. It is meant as a proof of concept rather than a full-fledged tutorial.

[A more complete introduction to RMarkdown can be found here](http://rmarkdown.rstudio.com/lesson-1.html).

# Some code examples

## Basics

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code. 

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing `Ctrl+Shift+Enter`. 

```{r, fig.height = 4, fig.width = 6}
plot(cars)
```

Add a new chunk by clicking the *Insert Chunk* button on the toolbar or by pressing `Ctrl+Alt+I`.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the *Preview* button or press `Ctrl+Shift+K` to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike *Knit*, *Preview* does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.

Note the difference between:

```{r}
x1 <- rnorm(100)
print(mean(x1))
```

and

```{r, eval = FALSE}
x2 <- rnorm(100)
print(mean(x2))
```

By specifying `eval = FALSE`, we tell knitr not to evaluate this chunk, so the code is shown but never run. We can check this by listing the variables in our environment:

```{r}
print(ls())
```

We can see that only the `x1` variable was created.

## Code generation example

Open this notebook in RStudio and try executing the following code. If you click the green arrow a couple of times, the output stays the same, because `set.seed` resets the random number generator on every run.

```{r}
set.seed(1233)
mean(rnorm(100))
```

Now execute the chunk above and then immediately execute the chunk below:

```{r}
mean(rnorm(100))
```

Now execute only the chunk directly above, the one without `set.seed`, a few times - you will notice that the output is **different** each time! To avoid this, call the `set.seed` function in every chunk that generates data. (Note: if you use the 'Run All' command, both chunks reproduce the same results on every run, because the seed is set before either chunk draws random numbers.)
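
The effect of the seed can also be checked directly. The chunk below is a minimal sketch (the names `draw_a` and `draw_b` are chosen here for illustration only): resetting the same seed before each draw should give identical random numbers.

```{r}
# Draw five normal values right after setting the seed
set.seed(1233)
draw_a <- rnorm(5)
# Reset the same seed and draw again
set.seed(1233)
draw_b <- rnorm(5)
# TRUE if both draws contain exactly the same numbers
identical(draw_a, draw_b)
```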

To evaluate **all** of the chunks in an `*.Rmd` file, press `Ctrl+Alt+R` ('Run All' command).

## Printing output

We can print a model output:

```{r}
my.ols <- lm(mpg ~ disp + cyl, data = mtcars)
summary(my.ols)
```

We can also print the data output:

```{r}
print(head(mtcars))
```

**However**, if we print a large data frame, then all of the data will be printed. If we want to format it differently, we can use a number of libraries:

```{r, results = "asis"}
DT::datatable(mtcars, width = 400)
```

Note that this interactivity will **not work** in a `.pdf` file.
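
For static output, one possible alternative is a plain table. The sketch below assumes the `knitr` package is available (it is needed to knit an `.Rmd` file in any case) and uses `knitr::kable()` on a few rows and columns; the column selection and the caption are illustrative choices, not part of the original example.

```{r}
# Keep only the first rows and a few columns so that the table stays small
small_table <- head(mtcars[, c("mpg", "cyl", "disp")])
# kable() produces a static table, so it does not rely on JavaScript
knitr::kable(small_table, digits = 1, caption = "First rows of mtcars")
```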

In most cases, the standard `print()` function is enough. For models that produce a large amount of output, please `print()` only the required results. Example:

The coefficients:
```{r}
print(summary(my.ols)$coefficients)
```

We can also use inline code with `$R^2=$` \``r ` `round(summary(my.ols)$r.squared, 4)`\`:

Our $R^2=$ `r round(summary(my.ols)$r.squared, 4)`.
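
Other parts of the model summary can be extracted in the same way. The chunk below is a sketch that refits the same model under the hypothetical names `fit` and `fit_summary`, so that it can be run on its own, and prints only a few selected quantities.

```{r}
# Refit the model so that this chunk is self-contained
fit <- lm(mpg ~ disp + cyl, data = mtcars)
fit_summary <- summary(fit)
# Point estimates only
print(round(coef(fit), 4))
# Adjusted R-squared and residual standard error
print(round(fit_summary$adj.r.squared, 4))
print(round(fit_summary$sigma, 4))
```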


# Formula examples

More info on [LaTeX formulas, matrices, etc.](https://en.wikibooks.org/wiki/LaTeX/Mathematics)

## General formulas
Formulas in Markdown are written between the `$` symbols for inline formulas, and between `$$` symbols for centered formulas. 

For example, writing `$X_t = \sum_{j = 1}^t \epsilon_j$` produces the following output: $X_t = \sum_{j = 1}^t \epsilon_j$. 

Writing `$$X_t = \sum_{j = 1}^t \epsilon_j$$` produces: $$X_t = \sum_{j = 1}^t \epsilon_j$$
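
As a further illustration (not part of the original examples), fractions and accents follow the same pattern. The snippet below writes a sample mean, where the symbols $\bar{X}$, $n$ and $X_i$ are chosen purely for demonstration:

```{r, eval = FALSE}
$$\bar{X} = \frac{1}{n} \sum_{i = 1}^n X_i$$
```

which should render as: $$\bar{X} = \frac{1}{n} \sum_{i = 1}^n X_i$$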

## Matrices

If we want to write a matrix, we use (either with `$` or `$$`, and using `\quad` to separate the different matrices):

```{r, eval = FALSE}
$$
\begin{bmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{bmatrix}, \quad 
\begin{pmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{pmatrix}
$$
```

which produces the following output:
$$
\begin{bmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{bmatrix}, \quad 
\begin{pmatrix}
\alpha& \beta^{*}\\
\gamma^{*}& \delta
\end{pmatrix}
$$

## Equation aligning

Using the `align` environment and writing `&` next to the symbols we want to align in each row lets us specify multiple equations aligned by the same symbol, for example:
```{r, eval = FALSE}
$$
\begin{align}
Y_{1,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t} \\
Y_{2,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t}
\end{align}
$$
```

Produces:

$$
\begin{align}
Y_{1,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t} \\
Y_{2,t} &= \alpha_1 + \beta_1 X_{1,t} + \epsilon_{1,t}
\end{align}
$$

Or if we need to write our equation in a different form:
```{r, eval = FALSE}
$$
\begin{align}
f(x) &= (a+b)^2 \\
&= a^2+2ab+b^2
\end{align}
$$
```

$$
\begin{align}
f(x) &= (a+b)^2 \\
&= a^2+2ab+b^2
\end{align}
$$

Or if we simply have longer names for our variables:

```{r, eval = FALSE}
$$
\begin{align}
Population_t &= \alpha_1 + \gamma_1 X_{1,t} + \gamma_2 X_{1,t-1} \\
Price_t &= \alpha_2 + \beta_1 Z_{1,t}
\end{align}
$$
```

$$
\begin{align}
Population_t &= \alpha_1 + \gamma_1 X_{1,t} + \gamma_2 X_{1,t-1} \\
Price_t &= \alpha_2 + \beta_1 Z_{1,t}
\end{align}
$$

## Writing equation systems

We can write the equation systems using the `cases` environment (note - we are also using the `&` symbol to align our equations):

```{r, eval = FALSE}
$$
f(n) = \begin{cases} 
n/2 &\mbox{if } n \equiv 0 \\
(3n +1)/2 & \mbox{if } n \equiv 1. 
\end{cases} \pmod{2} 
$$
```

$$
f(n) = \begin{cases} 
n/2 &\mbox{if } n \equiv 0 \\
(3n +1)/2 & \mbox{if } n \equiv 1. 
\end{cases} \pmod{2} 
$$

## Regression models


```{r, eval = FALSE}
$$
\text{wage} = \beta_0 + \beta_1 \cdot \text{educ}^2 + \epsilon
$$
```

$$
\text{wage} = \beta_0 + \beta_1 \cdot \text{educ}^2 + \epsilon
$$

## Estimated regression models, along with their standard errors:

We can write down the estimated regression model, along with the standard errors, using the following code:

```{r, eval = FALSE}
$\underset{(se)}{\widehat{\log(\text{wage})}} = \underset{(0.0702)}{1.5968} + \underset{(0.0048)}{0.0988} \cdot \text{educ}$
```
    
$\underset{(se)}{\widehat{\log(\text{wage})}} = \underset{(0.0702)}{1.5968} + \underset{(0.0048)}{0.0988} \cdot \text{educ}$

We can use `$$` instead of `$` around the formula in order to center the equation.
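
The same notation can also be generated from a fitted model instead of being typed by hand. The chunk below is only a sketch under stated assumptions: the model `mpg ~ disp` on `mtcars` and the names `fit`, `coefs`, `slope_sign` and `eq` are hypothetical choices for illustration, and the chunk option `results = "asis"` is used so that the printed string is treated as Markdown.

```{r, results = "asis"}
# Hypothetical example model (not one of the models discussed above)
fit <- lm(mpg ~ disp, data = mtcars)
coefs <- coef(summary(fit))
est <- coefs[, "Estimate"]
se  <- coefs[, "Std. Error"]
# Keep the sign of the slope outside \underset so that "+ -0.04" never appears
slope_sign <- if (est[2] < 0) "-" else "+"
# Standard errors go below the corresponding estimates
eq <- sprintf(
  "$$\\underset{(se)}{\\widehat{\\text{mpg}}} = \\underset{(%.4f)}{%.4f} %s \\underset{(%.4f)}{%.4f} \\cdot \\text{disp}$$",
  se[1], est[1], slope_sign, se[2], abs(est[2])
)
cat(eq)
```

Building the string with `sprintf()` keeps the reported numbers in sync with the fitted model if the data or the specification changes.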


**Important: you can right-click on the formulas in the `.html` file to see the code for the mathematical expressions**


# Multiple plots example

```{r}
# Number of observations to simulate
nsample <- 1000
```

```{r}
set.seed(123)
# Evenly spaced grid and two random samples (normal and exponential):
x  <- seq(from = 0, to = 100, length.out = nsample)
y1 <- rnorm(n = nsample)
y2 <- rexp(n = nsample)
```

```{r}
print(mean(y1))
print(mean(y2))
```

We can see from the output that, with the default parameters, the sample mean is:
    
- around `0` for the `rnorm()` function;
- around `1` for the `rexp()` function.
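
These defaults can be overridden through the function arguments. The sketch below uses illustrative values (`mean = 5`, `sd = 2`, `rate = 2`) and the hypothetical names `z1` and `z2`; the sample means should then be close to `5` and to `1 / rate = 0.5`, respectively.

```{r}
set.seed(123)
# Normal sample with a non-default mean and standard deviation
z1 <- rnorm(n = 1000, mean = 5, sd = 2)
# Exponential sample with a non-default rate; its theoretical mean is 1 / rate
z2 <- rexp(n = 1000, rate = 2)
print(mean(z1))
print(mean(z2))
```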

```{r}
# A 1-row, 2-column figure:
par(mfrow=c(1, 2))
# Panels are filled in the order in which the plots are drawn:
plot(x, y1, col = "cornflowerblue", type = "l", 
     main = bquote("Plot of"~X~"~N("~mu~","~sigma^2~"), "~mu==0~", "~sigma==1))
plot(x, y2, col = "orange", type = "l", 
     main = bquote("Plot of"~X~"~Exp("~lambda~"), "~lambda==1))
```
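
The same approach works for other layouts and plot types. The chunk below is a sketch that stacks two histograms in a 2-row, 1-column figure; the data are regenerated under the hypothetical names `h1` and `h2` so that the chunk can be run on its own. Saving and restoring the previous `par()` settings is mostly useful in an interactive session, where the layout would otherwise persist for later plots.

```{r, fig.height = 6, fig.width = 6}
# Simulated data, regenerated here so that the chunk is self-contained
set.seed(123)
h1 <- rnorm(n = 1000)
h2 <- rexp(n = 1000)
# par() returns the previous settings, which we keep for restoring later
old_par <- par(mfrow = c(2, 1))
hist(h1, col = "cornflowerblue", main = "Histogram of the normal sample")
hist(h2, col = "orange", main = "Histogram of the exponential sample")
# Restore the original layout
par(old_par)
```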

See the lecture notes and lecture slides for additional ways to plot the data.