---
title: "Exercise 7.30"
author: "Per August Jarval Moen"
date: "26/10/2023"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Let $Y_i$ denote the annual number of shark attacks in Florida, for $i=1,\dots,n=13$. We want to test $H_0 : Y_i \overset{\text{ind}}{\sim}{\text{Poisson}}(\mu)$ for some $\mu >0$ versus the alternative hypothesis $H_1 : Y_i \overset{\text{ind}}{\sim}{\text{NegBin}}(\mu, \gamma)$\footnote{Here the negative binomial is parametrized by the mean $\mu>0$ and the "overdispersion" parameter $\gamma>0$, as on page 248 in the book.} for some $\mu>0$ and $\gamma>0$.
\newline
\newline
The pmf of the negative binomial distribution ${\text{NegBin}}(\mu, \gamma)$ converges to the pmf of a $\text{Poisson}(\mu)$ distribution as $\gamma \rightarrow 0$. Informally, the Poisson distribution corresponds to a negative binomial distribution with $\gamma = 0$, so the model implied by the null hypothesis $H_0$ is, in a limiting sense, nested in the model implied by the alternative hypothesis $H_1$. It is therefore reasonable to test the null hypothesis with a likelihood ratio test. The likelihood ratio statistic is given by
$$
Z_{\text{LR}} = -2\left\{ \underset{\mu>0}{\sup} \ \ell_{\text{Poisson}}(\mu) -\underset{\mu>0,\gamma>0}{\sup} \ \ell_{\text{NegBin}}(\mu,\gamma) \right\},
$$
where $\ell_{\text{Poisson}}(\mu)$ is the log-likelihood of the Poisson model given the observed data, and $\ell_{\text{NegBin}}(\mu,\gamma)$ is the log-likelihood of the negative binomial model given the observed data.
\newline
\newline
However, the negative binomial distribution is only defined for $\gamma>0$. Hence $\gamma=0$ lies on the boundary of the parameter space of the negative binomial model (the alternative hypothesis), which violates the assumptions of Wilks' theorem. It turns out that this is not a problem: it can be shown that under $H_0$,
\begin{equation}
Z_{\text{LR}} \overset{d}{\rightarrow} \frac{1}{2}\delta_0 + \frac{1}{2} \chi_1^2 \quad \text{as } n\rightarrow \infty \label{adist},
\end{equation}
where $\delta_0$ denotes the point mass at zero (a distribution that equals zero with probability $1$). So under $H_0$, the likelihood ratio statistic is asymptotically $0$ with probability $\frac{1}{2}$ and $\chi_1^2$-distributed with probability $\frac{1}{2}$. To obtain a correct p-value, we can therefore perform the test as if $Z_{\text{LR}}$ were $\chi_1^2$-distributed and then divide the resulting p-value by two.
\newline
\newline
So let's try it out. First we load the data:

```{r cache=TRUE}
Y = c(33,29,29,12,17,21,31,28,19,14,11,26,23)
```

We use the function glm.nb from the MASS package to fit the negative binomial model. Note that R uses the parameterization $\theta = 1/\gamma$ for the overdispersion parameter.

```{r cache=TRUE}
library(MASS)
negbin = glm.nb(Y~1)
```

We can test $H_0$ versus $H_1$ using the function odTest in the pscl package:

```{r cache=TRUE}
library(pscl)
odTest(negbin)
```

The p-value of the likelihood ratio test based on the result in \eqref{adist} is $0.005332$. This is very small, so we reject $H_0$ in favor of $H_1$.
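As a sanity check, here is a minimal sketch of how the same p-value can be computed by hand from the definition of $Z_{\text{LR}}$ and the result in \eqref{adist}: fit the Poisson null model with glm, compute the likelihood ratio statistic from the two log-likelihoods, and halve the upper $\chi_1^2$ tail probability. The object names (poisson_fit, Z_LR, p_value) are my own; the result should agree with the odTest output up to rounding.

```{r cache=TRUE}
# Sketch: compute the likelihood ratio statistic directly.
poisson_fit = glm(Y ~ 1, family = poisson)  # Poisson model under H0
Z_LR = as.numeric(2 * (logLik(negbin) - logLik(poisson_fit)))
# Halve the chi-square(1) tail probability, cf. the 1/2-1/2 mixture above:
p_value = 0.5 * pchisq(Z_LR, df = 1, lower.tail = FALSE)
c(Z_LR = Z_LR, p_value = p_value)
```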
Lastly, we check if there is any evidence of a positive linear trend over time. We fit a negative binomial model with a linear time trend:

```{r cache=TRUE}
Y = c(33,29,29,12,17,21,31,28,19,14,11,26,23)
time = 1:length(Y)
data = data.frame(Y,time)
negbin2 = glm.nb(Y~time,data=data)
summary(negbin2)
```

The estimated coefficient of time is not significantly different from zero, even in a one-sided test. Hence we find no evidence of a linear trend over time.
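To make the one-sided claim concrete, here is a small sketch of the one-sided Wald test of $H_0: \beta_{\text{time}} \le 0$ against $H_1: \beta_{\text{time}} > 0$, based on the z statistic in the summary table above. The column and row names are those returned by summary() for a glm.nb fit; the object names are my own.

```{r cache=TRUE}
# One-sided Wald test for a positive time trend (sketch):
coefs = summary(negbin2)$coefficients
z = coefs["time", "z value"]                # Wald z statistic for the time coefficient
p_one_sided = pnorm(z, lower.tail = FALSE)  # P(Z >= z), i.e. the one-sided p-value
p_one_sided
```

A large one-sided p-value here confirms the conclusion that there is no evidence of a positive trend.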