Coming To A Point of View
From this point of view, the difference between data set 1 and data set 2 is that the closest
point pairs in data set 2 seem, in general, to be closer than the closest point pairs in data set 1. If this
is indeed the main characteristic of the difference, then the alternative hypothesis should not
be that one closest point pair is closer than expected by chance, but that some summary statistic
of the set of closest-point-pair distances for the points of data set 2 is smaller than the same
statistic for data set 1. In this case, the test statistic we might consider
is some measure of central location. We consider the arithmetic mean, the harmonic mean, and the geometric mean.
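As a minimal sketch of these candidate statistics, the Python below computes, for each point, the distance to its closest other point and then summarizes those distances. It assumes that "closest point pair" means the nearest neighbor within the same data set and that points are 2-D tuples; the function names are illustrative and not taken from the original analysis.

```python
import math
from statistics import fmean, geometric_mean, harmonic_mean


def nearest_neighbor_distances(points):
    """For each point, return the distance to its closest other point.

    Assumption: neighbors are searched within the same data set.
    """
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def summary_statistics(points):
    """Return the candidate test statistics for one data set."""
    d = nearest_neighbor_distances(points)
    return {
        "min": min(d),
        "arithmetic": fmean(d),
        # geometric_mean raises StatisticsError if any distance is zero,
        # which happens when the data set contains duplicate points.
        "geometric": geometric_mean(d),
        "harmonic": harmonic_mean(d),
    }


if __name__ == "__main__":
    # Three made-up points, used only to exercise the functions.
    print(summary_statistics([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]))
```

For positive distances the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the harmonic mean is the most sensitive of the three to a few very small distances.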
We will use a 1% significance level test, as before, with each kind of mean as the test statistic,
and we will calculate the misdetect rate of each statistic. The table below shows what
we find and compares it to the test statistic we tried first, which was the minimum of these distances.
| Test Statistic | Misdetect Rate |
| Arithmetic Mean | |
| Geometric Mean | |
| Harmonic Mean | |
Table showing the misdetect rate for different test statistics for a 1% significance level test
By looking at the table, we see that of the test statistics tried, the harmonic
mean has the smallest misdetect rate and, therefore, the highest power.
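A comparison of this kind can be estimated by simulation. The sketch below is hedged throughout: it assumes the null hypothesis is uniformly scattered points in the unit square and uses a hypothetical clustered alternative (each parent point gets one nearby child), neither of which is stated in the text. The test rejects the null when a statistic falls below the empirical 1% quantile of its simulated null distribution, and the misdetect rate is the fraction of alternative samples that fail to be rejected. All names and parameter values are illustrative.

```python
import math
import random
from statistics import fmean, geometric_mean, harmonic_mean


def nn_distances(points):
    """Distance from each point to its closest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def uniform_points(n, rng):
    """Assumed null model: n points uniform in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def clustered_points(n, rng, spread=0.02):
    """Hypothetical alternative: n // 2 parents, each with one nearby child."""
    pts = []
    for _ in range(n // 2):
        x, y = rng.random(), rng.random()
        pts.append((x, y))
        pts.append((x + rng.gauss(0, spread), y + rng.gauss(0, spread)))
    return pts


STATISTICS = {
    "min": min,
    "arithmetic": fmean,
    "geometric": geometric_mean,
    "harmonic": harmonic_mean,
}


def misdetect_rates(n=30, trials=500, alpha=0.01, seed=0):
    """Estimate the misdetect rate of each statistic at level alpha."""
    rng = random.Random(seed)
    null = [nn_distances(uniform_points(n, rng)) for _ in range(trials)]
    alt = [nn_distances(clustered_points(n, rng)) for _ in range(trials)]
    rates = {}
    for name, stat in STATISTICS.items():
        null_values = sorted(stat(d) for d in null)
        # Reject when the statistic is strictly below this threshold, so
        # about a fraction alpha of the null samples are rejected.
        threshold = null_values[int(alpha * trials)]
        misses = sum(stat(d) >= threshold for d in alt)
        rates[name] = misses / trials
    return rates


if __name__ == "__main__":
    for name, rate in misdetect_rates().items():
        print(f"{name:>10}: misdetect rate {rate:.3f}")
```

The rates this produces depend entirely on the assumed alternative and sample size, so they are not expected to reproduce the values in the table; the point is only the mechanics of fixing the false alarm rate and then counting misses.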
We might wonder whether this ordering, in which the harmonic mean is best, followed by the minimum,
holds only for the 1% significance level (false alarm rate). Perhaps at some other significance level
the harmonic mean is not the best. The receiver operating characteristic (ROC) curve will tell us.
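To make the idea concrete, here is a small sketch of how an empirical ROC curve could be traced for one test statistic. It assumes we already have simulated values of the statistic under the null and under the alternative, and that small values count as evidence for the alternative (rejecting when the value is at or below the threshold). The function names and the toy numbers are illustrative only.

```python
def roc_points(null_values, alt_values):
    """Return (false alarm rate, detection rate) pairs for every threshold.

    The test rejects the null when the statistic is <= the threshold.
    """
    thresholds = sorted(set(null_values) | set(alt_values))
    points = [(0.0, 0.0)]  # a threshold below all values rejects nothing
    for t in thresholds:
        false_alarm = sum(v <= t for v in null_values) / len(null_values)
        detect = sum(v <= t for v in alt_values) / len(alt_values)
        points.append((false_alarm, detect))
    return points


def misdetect_at(points, alpha):
    """Smallest misdetect rate among operating points with false alarm <= alpha."""
    best_detect = max(d for f, d in points if f <= alpha)
    return 1.0 - best_detect


if __name__ == "__main__":
    # Made-up statistic values, used only to show the calling pattern.
    null_values = [3, 4, 5, 6]
    alt_values = [1, 2, 3, 7]
    curve = roc_points(null_values, alt_values)
    for false_alarm, detect in curve:
        print(f"false alarm {false_alarm:.2f}  detection {detect:.2f}")
    print("misdetect rate at 25% false alarm:", misdetect_at(curve, 0.25))
```

Plotting one such curve per statistic on the same axes shows whether one statistic dominates at every false alarm rate or whether the curves cross.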