
Goodness-of-fit tests

Throughout the previous sections we assumed that the finite number of distances is the only source of error. In practice, the distribution of distances within the (identified) scaling region will usually not be given exactly by eq. (5.1), so we have to test whether this distribution fits the data. Other sources of error, e.g. lacunarity (a non-constant $\tilde{\phi}$), may distort the scaling behaviour, in which case the estimators and their variances derived in the previous sections may no longer be valid. Furthermore, such a test is particularly interesting if we lack prior knowledge about the attractor, in particular its dimension. Therefore, we will first estimate $\nu$ and $\phi$ and then test whether the chosen scaling region deserves its name. Note, however, that passing such a test only means that it is reasonable to call the scaling region straight, not that the slope is the true dimension. Let $r_{1},\; r_{2},\; \ldots,\; r_{n}$ be independent observations of a random variable (distances) with distribution function $P(r)$ which has a known form, but some unknown parameters. We now wish to test the hypothesis
\begin{displaymath}
H_{o} : P(r) = P_{o}(r) \quad (= \phi r^{\nu})
\end{displaymath} (5.58)

Any test of (5.58) is called a test of fit [Kendall and Stuart, 1979, §30.2]. The hypothesis is called simple if the distribution $P_{o}(r)$ is completely specified. In our case, we have to estimate the parameters of the distribution, so that our hypothesis is composite [Kendall and Stuart, 1979, §23.2].

The probability that we reject the hypothesis $H_{o}$ when it is true is called the size of the test [Kendall and Stuart, 1979, §22.6]. An ``$X^{2}$ test'' can be devised to determine a value of $\alpha$ such that a test of any size $\alpha_{o} \leq \alpha$ would not reject $H_{o}$. In other words, we calculate $\alpha$ (as discussed below) and reject the hypothesis $H_{o}$ if $\alpha < \alpha_{o}$. We will use the conventional value $\alpha_{o}=0.05$.

Now suppose that the range of the variate $r$ is divided into $k$ mutually exclusive classes. The probability under $H_{o}$ of an observation falling in class $i$ is denoted by $p_{0i}$ and the observed frequency in class $i$ by $n_{i}$, with $\sum_{i=1}^{k} n_{i} = n$. If we assume that the $n_{i}$ are multinomially distributed, then

\begin{displaymath}
X^{2} = \sum_{i=1}^{k} \frac{ {(n_{i}-np_{0i})}^{2} } {np_{0i}}
\end{displaymath} (5.59)

has an (approximate [Hogg and Craig, 1978, p.271]) $\chi^{2}$ distribution with $k-1$ degrees of freedom [Kendall and Stuart, 1979, §30.5]. However, we estimate the $p_{0i}$ using maximum likelihood estimators, so that the distribution of $X^{2}$ is bounded between a $\chi_{k-1}^{2}$ and a $\chi_{k-s-1}^{2}$ variable, where $s$ is the number of estimated parameters [Kendall and Stuart, 1979, §30.19]. For $k$ large enough, this poses no serious problem. The value of $\alpha$ we choose to use is given by the probability in the upper tail of the $\chi_{k-s-1}^{2}$ distribution beyond the observed $X^{2}$ [Kendall and Stuart, 1979, §30.6]. This $\alpha$ is smaller than the one we would obtain with the $\chi_{k-1}^{2}$ distribution, so we will reject $H_{o}$ more readily.
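To make the procedure concrete, the following is a minimal sketch in Python. It is an illustration under simplifying assumptions, not the computation of this chapter verbatim: it assumes the distances have been normalized to the upper cutoff of the scaling region, so that $P_{o}(r) = r^{\nu}$ on $(0,1]$ and only $\nu$ has to be estimated ($s=1$); the function name and the choice of $k=20$ equal-width classes are our own.

import numpy as np
from scipy.stats import chi2

def chi_square_test(r, k=20, alpha_0=0.05):
    # Goodness-of-fit test of H_o : P(r) = r^nu on (0, 1].
    # r : distances, assumed already normalized to the upper cutoff
    #     of the scaling region (hypothetical preprocessing step).
    n = len(r)
    # Maximum likelihood estimate of nu for P(r) = r^nu
    # (cf. the estimators of the previous sections).
    nu_hat = n / np.sum(np.log(1.0 / r))
    # k mutually exclusive classes with hypothesized probabilities
    # p_0i = P_o(edge_i) - P_o(edge_{i-1}).
    edges = np.linspace(0.0, 1.0, k + 1)
    p0 = np.diff(edges ** nu_hat)
    n_i, _ = np.histogram(r, bins=edges)
    # Test statistic X^2 of eq. (5.59).
    X2 = np.sum((n_i - n * p0) ** 2 / (n * p0))
    # alpha from the upper tail of chi^2 with k - s - 1 degrees
    # of freedom (s = 1 parameter estimated here).
    alpha = chi2.sf(X2, k - 1 - 1)
    return X2, alpha, alpha < alpha_0   # True means: reject H_o

As a check, r = np.random.rand(5000) ** (1.0 / 2.0) produces distances that satisfy $H_{o}$ with $\nu = 2$ (since $P(r \leq x) = x^{2}$), and the test should then reject only rarely.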


