---
title: 'STK2100'
subtitle: |
  | Machine Learning and Statistical Methods for Prediction and Classification
  | Mandatory Assignment 1
author: "Lars H. B. Olsen"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  pdf_document:
    fig_caption: true
    extra_dependencies:
      caption: ["labelfont={bf}"]
      amsmath: null
urlcolor: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Exercise 1

In this exercise, we look at the dataset \texttt{nuclear} and see how we can use regression to predict the costs of light-water reactors. The data are available in the file \texttt{nuclear.dat}, while a description of the data is available in \texttt{nuclear.txt}; both can be found on the course [data webpage](https://uio.no/studier/emner/matnat/math/STK2100/data). Our interest is in the \texttt{cost} variable, while the other variables serve as explanatory variables. Since \texttt{cost} is always positive, we will model this variable on the log scale.

## 1a)

Make the data available through the commands:

```{r 1a, echo=TRUE}
datadir = "http://www.uio.no/studier/emner/matnat/math/STK2100/data/"
nuclear = read.table(paste(datadir, "nuclear.dat", sep = ""), header = T)
n = nrow(nuclear)
```

Look at the first observations. We see that ...

```{r 1a:head}
library(knitr)
knitr::kable(head(nuclear), caption = "First six observations in the data set.")
```

Make also different plots in order to get some understanding of the data. We now make a pairs plot of the data.

```{r 1a:pairs, fig.cap="A pairs plot of the data. See, e.g., a strong linear dependence between \\texttt{date} and \\texttt{t1}. Note that in captions we need double backslashes."}
pairs(nuclear)
```

We see that ...

## 1b)

We will first look at the model
$$Y_i = \beta_0 + \beta_1 x_{i,1} + \dots + \beta_p x_{i,p} + \epsilon_i,$$
where $Y_i$ is \texttt{cost} on the log scale for observation $i$.
[Math in Rmarkdown](https://rpruim.github.io/s341/S19/from-class/MathinRmd.html)

### Assumptions

What are the standard assumptions about the noise terms $\epsilon_i$? Discuss also which of these assumptions are most important.

The standard assumptions are:

1. Zero mean: $\mathrm{E}(\epsilon_i) = 0$ for all $i$.
2. Constant variance (homoscedasticity): $\mathrm{Var}(\epsilon_i) = \sigma^2$ for all $i$.
3. Independence: the $\epsilon_i$'s are independent of each other (and of the covariates).
4. Normality: $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$.

Discussion of which assumptions are most important: ...

### Fit model:

Fit this model including all the observations, with log(cost) as response and all the other variables as covariates.

```{r 1b:lm}
# Fit a linear model with log of cost as response
model = lm(log(cost) ~ ., data = nuclear)
# Get a summary of the model with the estimated coefficients and whether they are significant.
summary(model)
```

### We now discuss the results:

```{r 1b:lm_plots, fig.cap="Summary plots of the linear model. Explanation in the main text."}
par(mfrow = c(2, 2))
plot(model)
```

## 1c)

We now remove the variable (\texttt{t1}) with the highest corresponding P-value ($0.81610$) and fit the new model.

```{r 1c:removing_variable}
# Fit with t1 taken away.
fit2 = lm(log(cost) ~ . - t1, data = nuclear)
```

This is a reasonable procedure because ...

Discuss potential changes of the P-values for the remaining variables. You can relate this to the correlations between the explanatory variables (e.g., inspect `cor(nuclear)`).

## 1d)

We continue to remove explanatory variables until all P-values are less than 0.05.

```{r 1d:removing_variables}
# Add code yourself
```
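A minimal sketch (left unevaluated) of how such a backward-elimination loop could look, starting from the full model in 1b) and dropping the least significant covariate until all remaining P-values are below 0.05. The object names (`pvals`, `worst`) are our own; in the report you may instead prefer to remove the variables one at a time and show each fit.

```{r 1d:elimination_sketch, eval=FALSE}
# A sketch: repeatedly drop the covariate with the largest P-value
# until every remaining covariate has P-value < 0.05.
fit = lm(log(cost) ~ ., data = nuclear)
repeat {
  pvals = summary(fit)$coefficients[-1, 4]  # P-values, excluding the intercept
  if (max(pvals) < 0.05) break
  worst = names(which.max(pvals))           # covariate with the largest P-value
  fit = update(fit, as.formula(paste(". ~ . -", worst)))
}
summary(fit)
```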
The final model we end up with is

```{r 1d:final_model}
# Add code yourself
```

We now make different plots in order to evaluate whether the model is reasonable.

```{r 1d:final_model_plots}
# Add code yourself
# CALL THE FINAL MODEL 'fit' FOR THE CODE BELOW TO WORK
```

## 1e)

We use the final model to predict the response and compute the average quadratic error, $\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2$, in order to evaluate how good the model is. We get the following results:

```{r 1e:final_model_MSE_error}
# Add code yourself
# REMOVE HASHTAG BELOW WHEN YOU HAVE FITTED THE MODEL 'fit' IN EXERCISE D).
# mean(fit$residuals^2)
```

The weakness of such a procedure is ... We can \textit{fix} it by ...
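One standard fix is cross-validation. A minimal leave-one-out sketch (left unevaluated), assuming the final model `fit` from 1d) has been created; the object names are our own:

```{r 1e:loocv_sketch, eval=FALSE}
# A sketch: leave-one-out cross-validation for the final model 'fit'.
# Each observation is predicted from a model fitted without it.
loo.pred = numeric(n)
for (i in 1:n) {
  fit.i = lm(formula(fit), data = nuclear[-i, ])
  loo.pred[i] = predict(fit.i, newdata = nuclear[i, ])
}
mean((log(nuclear$cost) - loo.pred)^2)  # cross-validated estimate of the MSE
```

For ordinary linear models the same quantity is also available in closed form via the hat values, `mean((fit$residuals / (1 - hatvalues(fit)))^2)`.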
## 1f)

We run the two commands:

```{r 1f:two_commands}
# We create the new observation
d.new = data.frame(date = 70.0, t1 = 13, t2 = 50, cap = 800, pr = 1,
                   ne = 0, ct = 0, bw = 1, cum.n = 8, pt = 1)
# We predict using the two commands
# REMOVE HASHTAG BELOW WHEN YOU HAVE FITTED THE MODEL 'fit' IN EXERCISE D).
# predict(fit, d.new, interval = "confidence")
# predict(fit, d.new, interval = "prediction")
```

The difference between the two predict commands is that the confidence interval quantifies the uncertainty about the expected (mean) response at \texttt{d.new}, while the prediction interval also accounts for the noise term $\epsilon$ of a single new observation and is therefore wider. ...

## 1g)

The intervals given in the previous point are for \texttt{cost} on the log scale. We now construct intervals for \texttt{cost} on the original scale ...

```{r 1g}
# Add code yourself
```
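A possible sketch (left unevaluated), assuming the final model `fit` from 1d) and the new observation `d.new` from 1f): since $\exp(\cdot)$ is monotone, applying it to the endpoints of the log-scale intervals gives valid intervals on the original scale.

```{r 1g:original_scale_sketch, eval=FALSE}
# A sketch: back-transform the log-scale intervals with exp().
# Note that exp(fitted log-cost) estimates the median cost on the
# original scale, not the mean.
exp(predict(fit, d.new, interval = "confidence"))
exp(predict(fit, d.new, interval = "prediction"))
```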
## 1h)

We will here use lasso regression on this data set. If you use cross-validation for selection of the penalty parameter, which variables are then included in the final model? Also compare this with the model you obtained earlier. Hint: Look at the \texttt{Hitters\textunderscore lasso.R} script.

```{r 1h}
# Create the design matrix and the response; drop the intercept column,
# as glmnet adds its own intercept.
x = model.matrix(log(cost) ~ ., data = nuclear)[, -1]
y = log(nuclear$cost)

library(glmnet)

# Grid of penalty values from 10^1 down to 10^-5.
grid = 10^seq(1, -5, length = 100)

# Fit the lasso (alpha = 1) over the whole grid and plot the coefficient
# paths as a function of the penalty.
lasso.mod = glmnet(x, y, alpha = 1, lambda = grid)
plot(lasso.mod, xvar = "lambda")

# Choose the penalty parameter by cross-validation.
set.seed(1)
cv.out = cv.glmnet(x, y, alpha = 1)
plot(cv.out)
cv.out
```
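To read off which variables the cross-validated lasso keeps, we can inspect the coefficients at the selected penalty. Whether to use `lambda.min` or the more parsimonious `lambda.1se` is a judgment call; both are stored in the `cv.glmnet` output.

```{r 1h:selected_variables}
# Coefficients at the cross-validated penalty: the variables with
# nonzero coefficients are the ones the lasso keeps in the final model.
coef(cv.out, s = "lambda.min")   # penalty minimising the CV error
coef(cv.out, s = "lambda.1se")   # one-standard-error rule
```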
# Exercise 2

We will in this exercise look at linear regression with qualitative (categorical) explanatory variables. We will look at a dataset from Devore & Berk (2012), exercise 11.5. The dataset consists of measurements of iron content in four different iron formations (\texttt{form}: 1=carbonate, 2=silicate, 3=magnetite, 4=hematite), with 10 observations within each type of iron formation.

## a)

We start by reading in the data by the following commands:

```{r 2a:read_data}
datadir = "http://www.uio.no/studier/emner/matnat/math/STK2100/data/"
Fe = read.table(paste(datadir, "fe.dat", sep = ""), header = T, sep = ",")
```

We take a quick look at the data:

```{r}
knitr::kable(head(Fe), caption = "First six observations in the data set.")
knitr::kable(summary(Fe))
```

We see that ...

We now fit a linear model to the data.

```{r 2a:fit_model_1_first_time}
options(contrasts = c("contr.treatment", "contr.treatment"))
fit1 = lm(Fe ~ form, data = Fe)
summary(fit1)
```

The reason this goes wrong is that ... We fix this mistake by ... and then refit the model.

```{r 2a:convert_to_factor_and_refit_model}
# Convert form to a factor
Fe$form = as.factor(Fe$form)
# Fit a linear model with the categorical covariate
fit1 = lm(Fe ~ form, data = Fe)
summary(fit1)
```

These results are more reasonable as ...

## b)

The reason such constraints are necessary is that ... The interpretation of $\beta_j$ is then ...

## c)

We now use an alternative constraint where $\beta_0 = 0$. We do this by the following commands.

```{r 2c:fit_model}
fit2 <- lm(Fe ~ form + 0, data = Fe)
summary(fit2)
```

The interpretation of the $\beta_j$'s in this case is ...

## d)

Another possibility is

```{r 2d:fit_model}
options(contrasts = c("contr.sum", "contr.sum"))
fit3 <- lm(Fe ~ form, data = Fe)
summary(fit3)
```

We see that ... We obtain $\hat{\beta}_4$ by ...
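A small sketch of the last point, assuming `fit3` from above and that its coefficients are printed in the order intercept, then the first three formation effects:

```{r 2d:fourth_effect}
# Under the sum-to-zero constraint (contr.sum) the four effects sum to
# zero, so the fourth effect is minus the sum of the three reported ones.
beta4.hat = -sum(coef(fit3)[2:4])
beta4.hat
```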
## e)

Do the results indicate that there are differences between the formations? Which of the fitted models do you find most suitable for answering this question?

## f)

We try out the following commands, which do ...

```{r 2f:predict_response_of_new_data}
newdata = data.frame(form = as.factor(c(1, 2, 3, 4)))
pred1 = predict(fit1, newdata)
pred2 = predict(fit2, newdata)
pred3 = predict(fit3, newdata)
print(rbind(pred1, pred2, pred3))
```

Compare the three predictions and comment on the results.

## g)

Based on the summary outputs from the different models, is it possible to simplify the model in some way?

# Exercise 3

This problem is a continuation of Exercise 2, but with more focus on mathematical derivations. These are **not** required in order to get the compulsory exercise accepted, but they do provide good exam training.

In the following we will use different symbols for the parameters in the different alternatives. In particular we write
\begin{alignat}{2}
Y_i &= \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \beta_3 x_{i,3} + \beta_4 x_{i,4} + \epsilon_i \hspace{2cm} \beta_1 &&= 0 \label{Ex3:models_1} \\
Y_i &= \alpha_0 + \alpha_1 x_{i,1} + \alpha_2 x_{i,2} + \alpha_3 x_{i,3} + \alpha_4 x_{i,4} + \epsilon_i \hspace{2cm} \alpha_0 &&= 0 \label{Ex3:models_2} \\
Y_i &= \gamma_0 + \gamma_1 x_{i,1} + \gamma_2 x_{i,2} + \gamma_3 x_{i,3} + \gamma_4 x_{i,4} + \epsilon_i \hspace{2cm} \sum_{j=1}^4 \gamma_j &&= 0 \label{Ex3:models_3}
\end{alignat}

## a)

We are here going to show that the three models in Eqs. (\ref{Ex3:models_1}) to (\ref{Ex3:models_3}) are equivalent. More precisely, we will write explicit relationships between $\boldsymbol\beta = (\beta_0, \beta_2, \beta_3, \beta_4)$, $\boldsymbol\alpha = (\alpha_1, \alpha_2, \alpha_3, \alpha_4)$ and $\boldsymbol\gamma = (\gamma_0, \gamma_1, \gamma_2, \gamma_3, \gamma_4)$.

Derive the relationships ... (a sketch of where these derivations should land is given after b) below).

## b)

We will in the following concentrate on the model in Eq. (\ref{Ex3:models_2}), since this version is somewhat simpler mathematically. Show that $\boldsymbol{X}^T\boldsymbol{X}$ is a diagonal matrix with diagonal elements $n_j$, where $n_j$ is the number of observations with $c_i = j$. Also show that $\boldsymbol{X}^T\boldsymbol{y}$ is a vector where the $j$-th element is equal to $\sum_{i:c_i = j} y_i$. Based on this, derive formulas for the least squares estimates for $\alpha_1, \dots, \alpha_4$. Discuss whether these estimates are reasonable.
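For reference, here is a sketch of where the derivations in a) and b) should land, using only the model definitions above; the group-mean notation $\bar{y}_j$ is ours. For a), equating the group means in the three parameterizations gives
\begin{align*}
\alpha_j &= \beta_0 + \beta_j, \quad j = 1, \dots, 4 \quad (\text{so } \alpha_1 = \beta_0 \text{ since } \beta_1 = 0), \\
\gamma_0 &= \frac{1}{4}\sum_{j=1}^{4} \alpha_j, \qquad \gamma_j = \alpha_j - \gamma_0, \quad j = 1, \dots, 4.
\end{align*}
For b), since $\boldsymbol{X}^T\boldsymbol{X} = \operatorname{diag}(n_1, \dots, n_4)$ and the $j$-th element of $\boldsymbol{X}^T\boldsymbol{y}$ is $\sum_{i:c_i = j} y_i$, the least squares estimates are
\begin{align*}
\hat{\boldsymbol\alpha} = (\boldsymbol{X}^T\boldsymbol{X})^{-1}\boldsymbol{X}^T\boldsymbol{y}
\quad \Longrightarrow \quad
\hat{\alpha}_j = \frac{1}{n_j}\sum_{i:c_i = j} y_i = \bar{y}_j, \quad j = 1, \dots, 4,
\end{align*}
i.e., each $\hat{\alpha}_j$ is simply the average response within formation $j$, which is intuitively reasonable.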
## c)

Based on the relationship between $\boldsymbol\beta$ and $\boldsymbol\alpha$, construct formulas for the estimates for $\boldsymbol\beta$. Argue why these estimates also are least squares estimates for $\boldsymbol\beta$.

We get the following derivations and results ...

## d)

Based on the relationship between $\boldsymbol\gamma$ and $\boldsymbol\alpha$, construct formulas for the estimates for $\boldsymbol\gamma$.

We get the following derivations and results ...

\newpage

Other stuff that can help you write/produce a nice R Markdown pdf. Just delete everything below before handing in the assignment! :)

## R

As STK1100 (where you maybe used Python) is recommended previous knowledge and many of you have taken STK1110, we assume that you have basic knowledge of coding. The R language is reasonably similar to Python, but with some differences: e.g., arrays and lists behave differently, indexing starts at 1, and so on.

R is often used in statistical analysis. For an introduction to R, I would recommend https://education.rstudio.com/learn/beginner/ and/or https://moderndive.netlify.app/1-getting-started.html. You can always Google when you have questions, and the [ISLR book](https://hastie.su.domains/ISLR2/ISLRv2_website.pdf) has very nice lab chapters where they include code and describe it in detail. The corresponding code is also available at https://www.statlearning.com/resources-second-edition.

## R Markdown

R Markdown is a file format for making dynamic documents with R. An R Markdown document is written in markdown (an easy-to-write plain text format) and contains chunks of embedded code, very similar to Jupyter notebooks.

Many excellent guides and resources to get to know R Markdown:

* https://rmarkdown.rstudio.com
* https://rmarkdown.rstudio.com/lesson-1.html (All lessons are excellent)
* https://rmarkdown.rstudio.com/authoring_basics.html
* https://rmarkdown.rstudio.com/gallery.html (Gallery examples)
* https://bookdown.org/yihui/rmarkdown/
* https://bookdown.org/yihui/rmarkdown-cookbook/
* https://bookdown.org/yihui/rmarkdown-cookbook/install-latex.html (`tinytex::install_tinytex()`)
* https://bookdown.org/yihui/bookdown/
* https://bookdown.org/dalzelnm/bookdown-demo/
* https://rmd4sci.njtierney.com
* https://www.statlearning.com/resources-second-edition has R Markdown files for each chapter in the ISLR book you can look at.

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

```{r cars}
summary(cars)
```

You can include LaTeX code just as usual, both inline ($\sum_{n=1}^{10}x_i = \dots$) and on separate lines:

\begin{align*}
a &= b \\
X & \sim \mathcal{N}(0,1)
\end{align*}

You can even define variables in chunks

```{r}
x <- 5 # radius of a circle
```

and then use them in the text. E.g., for a circle with the radius `r x`, its area is `r round(pi * x^2, 2)`, and there are `r nrow(cars)` cars in the data.

## Including Plots

You can also embed plots, for example:

```{r pressure, echo=FALSE, fig.cap = "\\label{fig:pressure}This is a figure caption! You *cannot* see me before you knit me!"}
plot(pressure)
```

Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot, i.e., Figure \ref{fig:pressure}. For more chunk options click [here](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf), and [here](https://yihui.org/knitr/options/) for more elegant output using knitr.

\newpage

## Markdown syntax

Plain text

End a line with two spaces to start a new paragraph.

*italics* and _italics_

**bold** and __bold__

superscript^2^

~~strikethrough~~

[link](www.rstudio.com)

# Header 1

## Header 2

### Header 3

#### Header 4

##### Header 5

###### Header 6

endash: --

emdash: ---

ellipsis: ...

inline equation: $A = \pi*r^{2}$

image: ![Image of UiO's logo](/Users/larsolsen/Downloads/UiO_logo.jpeg){width=10%}

> block quote

* unordered list
* item 2
    + sub-item 1
    + sub-item 2

1. ordered list
2. item 2
    + sub-item 1
    + sub-item 2

Table Header  | Second Header
------------- | -------------
Table Cell    | Cell 2
Cell 3        | Cell 4
Cell 5        | Cell 6
Cell 7        | Cell 8

\newpage

## Markdown highlighting {.unnumbered}

| Formatting | Code         |
|:-----------|:-------------|
| **bold**   | \*\*bold\*\* |
| __bold__   | \_\_bold\_\_ |
| *italic*   | \*italic\*   |
| _italic_   | \_italic\_   |

## Section references {.unnumbered #x99.4}

Section [99.4](#x99.4) comes from `Section [99.4](#x99.4)`

## Footnotes {.unnumbered}

^[A footnote] comes from `^[A footnote]`

## Formatting Text {.unnumbered}

| Appearance                                                | Code                                                    |
|:----------------------------------------------------------|:--------------------------------------------------------|
| `r xfun::n2w(3, cap = TRUE)`                              | `xfun::n2w(3, cap = TRUE)`                              |
| `r scales::percent(0.1447, accuracy = .1)`                | `scales::percent(0.1447, accuracy = .1)`                |
| `r scales::pvalue(0.1447, accuracy = .001, add_p = TRUE)` | `scales::pvalue(0.1447, accuracy = .001, add_p = TRUE)` |

## Figures {.unnumbered}

Name a figure chunk with a name like figure99-1 and include fig.cap="something", then reference it like this: `\@ref(fig:figure99-1)`

\newpage

## Displaying Formula {.unnumbered}

### Formatting Text Font {.unnumbered}

To tweak the appearance of words use these formats:

| Formatting        | Looks like              | Code                    |
|:------------------|:------------------------|:------------------------|
| plain text        | $\text{text Pr}$        | `\text{text Pr}`        |
| bold Greek symbol | $\boldsymbol{\epsilon}$ | `\boldsymbol{\epsilon}` |
| typewriter        | $x\ \tt{sentence}\ x$   | `\tt{sentence}`         |
| teletype          | $x\ \texttt{blah}\ x$   | `\texttt{blah}`         |
| slide font        | $\sf{blah}$             | `\sf{blah}`             |
| bold              | $\mathbf{x}$            | `\mathbf{x}`            |
| plain             | $\mathrm{text Pr}$      | `\mathrm{text Pr}`      |
| cursive           | $\mathcal{S}$           | `\mathcal{S}`           |
| Blackboard bold   | $\mathbb{R}$            | `\mathbb{R}`            |

### Symbols {.unnumbered}

| Symbols                                      | Code                             |
|:---------------------------------------------|:---------------------------------|
| $\stackrel{\text{def}}{=}$                   | `\stackrel{\text{def}}{=}`       |
| $\nabla$                                     | `\nabla`                         |
| $\partial$                                   | `\partial`                       |
| $\vert$ or use $\lvert a \rvert$             | `\vert` or use `\lvert a \rvert` |
| $\Vert$                                      | `\Vert`                          |
| $\mid$ in set notation; `\given` is missing  | `\mid`                           |

### Brackets and Parentheses {.unnumbered}

| Looks like                  | Code                        |
|:----------------------------|:----------------------------|
| $\big( \Big( \bigg( \Bigg($ | `\big( \Big( \bigg( \Bigg(` |
| $\big] \Big] \bigg] \Bigg]$ | `\big] \Big] \bigg] \Bigg]` |

### Notation {.unnumbered}

| Math | Code |
|:---------------------------------------------------------------|:--------------------------------------------------------------------|
| $x = y$ | `$x = y$` |
| $x \approx y$ | `$x \approx y$` |
| $f_k(X) \equiv Pr(X \mid Y = k)$ | `$f_k(X) \equiv Pr(X \mid Y = k)$` |
| $x < y$ | `$x < y$` |
| $x > y$ | `$x > y$` |
| $x \le y$ | `$x \le y$` |
| $x \ge y$ | `$x \ge y$` |
| $x \times y$ | `$x \times y$` |
| $x^{n}$ | `$x^{n}$` |
| $x_{n}$ | `$x_{n}$` |
| $x_1, x_2, \dots, x_n$ | `$x_1, x_2, \dots, x_n$` |
| $x_1 + x_2 + \cdots + x_n$ | `$x_1 + x_2 + \cdots + x_n$` |
| $\overline{x}$ | `$\overline{x}$` |
| $\hat{x}$ | `$\hat{x}$` |
| $\widehat{SE}$ | `$\widehat{SE}$` |
| $\tilde{x}$ | `$\tilde{x}$` |
| $\frac{a}{b}$ | `$\frac{a}{b}$` |
| $\displaystyle \frac{a}{b}$ | `$\displaystyle \frac{a}{b}$` |
| $\binom{n}{k}$ | `$\binom{n}{k}$` |
| $x_{1} + x_{2} + \cdots + x_{n}$ | `$x_{1} + x_{2} + \cdots + x_{n}$` |
| $x_{1}, x_{2}, \dots, x_{n}$ | `$x_{1}, x_{2}, \dots, x_{n}$` |
| $\mathbf{x} = \langle x_{1}, x_{2}, \dots, x_{n}\rangle$ | `$\mathbf{x} = \langle x_{1}, x_{2}, \dots, x_{n}\rangle$` |
| $x \in A$ | `$x \in A$` |
| $\lvert A \rvert$ | `$\lvert A \rvert$` |
| $\big \langle x_i, x_{i'}\big \rangle$ | `$\big \langle x_i, x_{i'}\big \rangle$` |
| $x \subset B$ | `$x \subset B$` |
| $x \subseteq B$ | `$x \subseteq B$` |
| $A \cup B$ | `$A \cup B$` |
| $A \cap B$ | `$A \cap B$` |
| $X \sim {\sf Binom}(n, \pi)$ | `$X \sim {\sf Binom}(n, \pi)$` |
| $\mathrm{P}(X \le x) = {\tt pbinom}(x, n, \pi)$ | `$\mathrm{P}(X \le x) = {\tt pbinom}(x, n, \pi)$` |
| $P(A \mid B)$ | `$P(A \mid B)$` |
| $\mathrm{P}(A \mid B)$ | `$\mathrm{P}(A \mid B)$` |
| $\{1, 2, 3\}$ | `$\{1, 2, 3\}$` |
| $\sin(x)$ | `$\sin(x)$` |
| $\log(x)$ | `$\log(x)$` |
| $\exp(x)$ | `$\exp(x)$` |
| $\exp{\big(\sum_{i=1}^p x_i\big)}$ | `$\exp{\big(\sum_{i=1}^p x_i\big)}$` |
| $\int_{a}^{b}$ | `$\int_{a}^{b}$` |
| $\left(\int_{a}^{b} f(x) \; dx\right)$ | `$\left(\int_{a}^{b} f(x) \; dx\right)$` |
| $\left[\int_{-\infty}^{\infty} f(x) \; dx\right]$ | `$\left[\int_{-\infty}^{\infty} f(x) \; dx\right]$` |
| $\left. F(x) \right\vert_{a}^{b}$ | `$\left. F(x) \right\vert_{a}^{b}$` |
| $\sum_{x = a}^{b} f(x)$ | `$\sum_{x = a}^{b} f(x)$` |
| $\prod_{x = a}^{b} f(x)$ | `$\prod_{x = a}^{b} f(x)$` |
| $\lim_{x \to \infty} f(x)$ | `$\lim_{x \to \infty} f(x)$` |
| $\displaystyle \lim_{x \to \infty} f(x)$ | `$\displaystyle \lim_{x \to \infty} f(x)$` |
| $RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}$ | `$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}$` |
| $(\texttt{pop} - \overline{\texttt{pop}})$ | `$(\texttt{pop} - \overline{\texttt{pop}})$` |
| $\hat{f}(x) \leftarrow \hat{f}(x) + \lambda\hat{f}^b(x)$ | `$\hat{f}(x) \leftarrow \hat{f}(x) + \lambda\hat{f}^b(x)$` |

\newpage

#### Stacked text (for finding maximum):

To get text stacked below a word for optimization formulas use `\mathop` with `$$`:

| Math                                      | Code                                        |
|:------------------------------------------|:---------------------------------------------|
| $$\mathop{\text{max}}_{k}(\hat{p}_{mk})$$ | `$$\mathop{\text{max}}_{k}(\hat{p}_{mk})$$` |

#### Adjust position of limits:

To have limits appear next to sigma or pi (not above):

| Math                          | Code                            |
|:------------------------------|:--------------------------------|
| $\sum\nolimits_{i=1}^{n}X_i.$ | `\sum\nolimits_{i=1}^{n}X_i.`   |

### Matrices {.unnumbered}

\begin{equation}
A =
\begin{bmatrix}
1      & \cdots & 3      \\
\vdots & \ddots & \vdots \\
7      & \cdots & 9
\end{bmatrix}
\end{equation}

Comes from this code:

```
\begin{equation}
A =
\begin{bmatrix}
1      & \cdots & 3      \\
\vdots & \ddots & \vdots \\
7      & \cdots & 9
\end{bmatrix}
\end{equation}
```

Options to surround the matrix:

| Type            | Code              |
|:----------------|:------------------|
| Nothing         | `\begin{matrix}`  |
| Parentheses     | `\begin{pmatrix}` |
| Square Brackets | `\begin{bmatrix}` |
| Curly brackets  | `\begin{Bmatrix}` |
| Pipes           | `\begin{vmatrix}` |
| Double Pipes    | `\begin{Vmatrix}` |

## Equations {.unnumbered}

These are formulas that appear with an equation number.

### Basic Equation {.unnumbered}

The names of equations cannot include `.` or `_`, but they can include `-`.

```
\begin{equation}
1 + 1 = 2 (\#eq:eq99-1)
\end{equation}
```

Which appears as:

\begin{equation}
\label{eq:eq99-1}
1 + 1 = 2
\end{equation}

The reference to the equation is \ref{eq:eq99-1}, which comes from this code: `\ref{eq:eq99-1}`.

### Case-When Equation (Large Curly Brace) {.unnumbered}

Case-when formula:

```{=tex}
\begin{equation}
\label{eq:eq99-2}
y =
\begin{cases}
0, & \text{if}\ a = 1 \\
1, & \text{otherwise}
\end{cases}
\end{equation}
```

Which comes from this code:

```{}
\begin{equation}
\label{eq:eq99-2}
y =
\begin{cases}
0, & \text{if}\ a = 1 \\
1, & \text{otherwise}
\end{cases}
\end{equation}
```

The reference to the equation is \ref{eq:eq99-2}, which comes from this code: `\ref{eq:eq99-2}`.

### Aligned with Underbraces {.unnumbered}

\begin{equation*}
\begin{aligned}
\mathrm{E}(Y-\hat{Y})^2 & = \mathrm{E}[f(X) + \epsilon -\hat{f}(X)]^2 \\
& = \underbrace{[f(X) -\hat{f}(X)]^2}_{\mathrm{Reducible}} + \underbrace{\mathrm{var}(\epsilon)}_{\mathrm{Irreducible}} \\
\end{aligned}
\end{equation*}

Comes from this code:

```{}
\begin{equation*}
\begin{aligned}
\mathrm{E}(Y-\hat{Y})^2 & = \mathrm{E}[f(X) + \epsilon -\hat{f}(X)]^2 \\
& = \underbrace{[f(X) -\hat{f}(X)]^2}_{\mathrm{Reducible}} + \underbrace{\mathrm{var}(\epsilon)}_{\mathrm{Irreducible}} \\
\end{aligned}
\end{equation*}
```

## Greek letters {.unnumbered}

::: {.col2}

| letters                   | code                        |
|:--------------------------|:----------------------------|
| $\alpha A$                | `$\alpha A$`                |
| $\beta B$                 | `$\beta B$`                 |
| $\gamma \Gamma$           | `$\gamma \Gamma$`           |
| $\delta \Delta$           | `$\delta \Delta$`           |
| $\epsilon \varepsilon E$  | `$\epsilon \varepsilon E$`  |
| $\zeta Z$                 | `$\zeta Z$`                 |
| $\eta H$                  | `$\eta H$`                  |
| $\theta \vartheta \Theta$ | `$\theta \vartheta \Theta$` |
| $\iota I$                 | `$\iota I$`                 |
| $\kappa K$                | `$\kappa K$`                |
| $\lambda \Lambda$         | `$\lambda \Lambda$`         |
| $\mu M$                   | `$\mu M$`                   |
| $\nu N$                   | `$\nu N$`                   |
| $\xi\Xi$                  | `$\xi\Xi$`                  |
| $o O$                     | `$o O$` (omicron)           |
| $\pi \Pi$                 | `$\pi \Pi$`                 |
| $\rho\varrho P$           | `$\rho\varrho P$`           |
| $\sigma \Sigma$           | `$\sigma \Sigma$`           |
| $\tau T$                  | `$\tau T$`                  |
| $\upsilon \Upsilon$       | `$\upsilon \Upsilon$`       |
| $\phi \varphi \Phi$       | `$\phi \varphi \Phi$`       |
| $\chi X$                  | `$\chi X$`                  |
| $\psi \Psi$               | `$\psi \Psi$`               |
| $\omega \Omega$           | `$\omega \Omega$`           |

:::

## Calligraphic Letters {.unnumbered}

| Letters                              | Code                                   | Meaning                         |
|:-------------------------------------|:---------------------------------------|:--------------------------------|
| $\mathcal{B}$                        | `$\mathcal{B}$`                        | Basis                           |
| $\mathcal{O}$                        | `$\mathcal{O}$`                        | Used for Big-O notation         |
| $\mathcal{P}$                        | `$\mathcal{P}$`                        | Power set, probability function |
| $\mathcal{S}$                        | `$\mathcal{S}$`                        | A set                           |
| $(\Omega, \mathcal{F}, \mathcal{P})$ | `$(\Omega, \mathcal{F}, \mathcal{P})$` | Probability space/triple        |