---
title: "Exercise 7.30"
author: "Per August Jarval Moen"
date: "26/10/2023"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Let $Y_i$ denote the annual number of shark attacks in Florida, for $i=1,\dots,n=13$. We want to test $H_0 : Y_i \overset{\text{ind}}{\sim}{\text{Poisson}}(\mu)$ for some $\mu >0$ versus the alternative hypothesis $H_1 : Y_i \overset{\text{ind}}{\sim}{\text{NegBin}}(\mu, \gamma)$\footnote{Here the negative binomial is parametrized by the mean $\mu>0$ and the "overdispersion" parameter $\gamma>0$, as on page 248 in the book.} for some $\mu>0$ and $\gamma>0$.
\newline
\newline
The pmf of the negative binomial distribution ${\text{NegBin}}(\mu, \gamma)$ converges to the pmf of a $\text{Poisson}(\mu)$ distribution as $\gamma \rightarrow 0$. Informally, the Poisson distribution corresponds to a negative binomial distribution with $\gamma = 0$, so the model implied by the null hypothesis $H_0$ is, in a limiting sense, nested in the model implied by the alternative hypothesis $H_1$. It is therefore reasonable to test the null hypothesis with a likelihood ratio test. The likelihood ratio statistic is given by
$$
Z_{\text{LR}} = -2\left\{ \underset{\mu>0}{\sup} \ \ell_{\text{Poisson}}(\mu) -\underset{\mu>0,\gamma>0}{\sup} \ \ell_{\text{NegBin}}(\mu,\gamma) \right\},
$$
where $\ell_{\text{Poisson}}(\mu)$ is the log-likelihood of the Poisson model given the observed data, and $\ell_{\text{NegBin}}(\mu,\gamma)$ is the log-likelihood of the negative binomial model given the observed data.
\newline
\newline
However, the negative binomial distribution is only defined for $\gamma>0$. Hence $\gamma=0$ lies on the boundary of the parameter space of the negative binomial model (the alternative hypothesis), which violates the assumptions of Wilks' theorem. It turns out that this is not a problem: it can be shown that under $H_0$,
\begin{equation}
Z_{\text{LR}} \overset{d}{\rightarrow} \frac{1}{2}\delta_0 + \frac{1}{2} \chi_1^2 \quad \text{as } n\rightarrow \infty \label{adist},
\end{equation}
where $\delta_0$ denotes the point mass at zero (a distribution that equals zero with probability $1$). So under $H_0$, the likelihood ratio statistic is asymptotically $0$ with probability $\frac{1}{2}$ and $\chi_1^2$-distributed with probability $\frac{1}{2}$. To obtain a correct p-value, we can therefore perform the test as if $Z_{\text{LR}}$ were $\chi_1^2$-distributed and then divide the resulting p-value by two.
\newline
\newline
So let's try it out. First we load the data:

```{r cache=TRUE}
Y = c(33,29,29,12,17,21,31,28,19,14,11,26,23)
```

We use the function glm.nb from the MASS package to fit the negative binomial model. Note that R uses the parameterization $\theta = 1/\gamma$ for the overdispersion parameter.

```{r cache=TRUE}
library(MASS)
negbin = glm.nb(Y~1)
```

We can test $H_0$ versus $H_1$ using the function odTest in the pscl package:

```{r cache=TRUE}
library(pscl)
odTest(negbin)
```

The p-value of the likelihood ratio test based on the result in \eqref{adist} is $0.005332$. This is very small, so we reject $H_0$ in favor of $H_1$.
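As a sanity check, here is a minimal sketch of how the same p-value can be computed by hand from the definition of $Z_{\text{LR}}$ and the result in \eqref{adist}: fit the Poisson null model with glm, compute the likelihood ratio statistic from the two log-likelihoods, and halve the upper $\chi_1^2$ tail probability. The object names (poisson_fit, Z_LR, p_value) are my own; the result should agree with the odTest output up to rounding.

```{r cache=TRUE}
# Sketch: compute the likelihood ratio statistic directly.
poisson_fit = glm(Y ~ 1, family = poisson)  # Poisson model under H0
Z_LR = as.numeric(2 * (logLik(negbin) - logLik(poisson_fit)))
# Halve the chi-square(1) tail probability, cf. the 1/2-1/2 mixture above:
p_value = 0.5 * pchisq(Z_LR, df = 1, lower.tail = FALSE)
c(Z_LR = Z_LR, p_value = p_value)
```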
Lastly, we check if there is any evidence of a positive linear trend over time. We fit a negative binomial model with a linear time trend:

```{r cache=TRUE}
Y = c(33,29,29,12,17,21,31,28,19,14,11,26,23)
time = 1:length(Y)
data = data.frame(Y,time)
negbin2 = glm.nb(Y~time,data=data)
summary(negbin2)
```

The estimated coefficient of time is not significantly different from zero, even in a one-sided test. Hence we find no evidence of a linear trend over time.
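To make the one-sided claim concrete, here is a small sketch of the one-sided Wald test of $H_0: \beta_{\text{time}} \le 0$ against $H_1: \beta_{\text{time}} > 0$, based on the z statistic in the summary table above. The column and row names are those returned by summary() for a glm.nb fit; the object names are my own.

```{r cache=TRUE}
# One-sided Wald test for a positive time trend (sketch):
coefs = summary(negbin2)$coefficients
z = coefs["time", "z value"]                # Wald z statistic for the time coefficient
p_one_sided = pnorm(z, lower.tail = FALSE)  # P(Z >= z), i.e. the one-sided p-value
p_one_sided
```

A large one-sided p-value here confirms the conclusion that there is no evidence of a positive trend.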