The Results

Skip Range    Number of Maximal ELS    Number of Maximal ELS    P-value
              Phrases, Torah Text      Phrases, Monkey Text
2-100         40,977                   41,694                   14/1000
2-1000        406,206                  410,668                  795/1000
1002-2000     391,500                  395,582                  879/1000

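For the small skip range of 2-100, the p-value of 14/1000 = 0.014 lets us reject the null hypothesis at the 5% significance level; it falls just short of the 1% level. For the two larger ranges of skips, with p-values of 0.795 and 0.879, we are not able to reject the null hypothesis.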
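The decision rule can be made explicit with a short sketch. The p-values are the fractions reported in the table; the names `P_VALUES`, `rejects_null` and `decisions`, and the default level of 0.05, are illustrative assumptions rather than part of the original protocol.

```python
from fractions import Fraction

# P-values exactly as reported in the results table (skip range -> fraction).
P_VALUES = {
    "2-100": Fraction(14, 1000),
    "2-1000": Fraction(795, 1000),
    "1002-2000": Fraction(879, 1000),
}


def rejects_null(p_value, alpha=Fraction(5, 100)):
    """Return True when the p-value is at or below the significance level.

    The default alpha of 0.05 is an assumption made for illustration.
    """
    return p_value <= alpha


def decisions(p_values, alpha=Fraction(5, 100)):
    """Map each skip range to True (reject) or False (fail to reject)."""
    return {skip: rejects_null(p, alpha) for skip, p in p_values.items()}


if __name__ == "__main__":
    for skip, reject in decisions(P_VALUES).items():
        verdict = "reject" if reject else "fail to reject"
        print(f"{skip}: p = {float(P_VALUES[skip]):.3f}, {verdict}")
```

Exact fractions are used so that the comparison with the significance level involves no floating-point rounding. Passing a different `alpha` shows how the verdict for a given skip range depends on the chosen level.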


The different behavior at the smaller skips is probably due to a higher-order dependency among the successive letters of ELS phrases taken from the Torah text than among those taken from the monkey text. This would tend to make the maximal ELS phrases coming from the Torah text somewhat more statistically coherent than those coming from a monkey text.

We can conclude that, under our experimental protocol and with respect to the two features we used, difficulty class and average conditional entropy per letter (4th order Markov), there is no statistically significant difference for large skips between the maximal ELS phrases generated by the Torah text and those generated by a letter-permuted Torah text.
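To make the second feature more concrete, here is a minimal sketch of one plausible reading of "average conditional entropy per letter" under a 4th order Markov model: the average of -log2 P(letter | previous four letters) over the positions of a phrase. The function names, the add-one smoothing, and the choice of training text are all assumptions made for illustration; the protocol's actual estimator may differ.

```python
import math
from collections import Counter


def train_markov_counts(text, order=4):
    """Count each context of `order` letters and each (context, next letter) pair."""
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(order, len(text)):
        context = text[i - order:i]
        context_counts[context] += 1
        pair_counts[(context, text[i])] += 1
    return context_counts, pair_counts


def avg_conditional_entropy(phrase, context_counts, pair_counts,
                            alphabet_size, order=4):
    """Average of -log2 P(letter | previous `order` letters), in bits per letter.

    Add-one smoothing (an assumption) gives unseen pairs a nonzero probability.
    Returns None when the phrase is too short to contain one full context.
    """
    if len(phrase) <= order:
        return None
    total_bits = 0.0
    positions = 0
    for i in range(order, len(phrase)):
        context = phrase[i - order:i]
        numer = pair_counts[(context, phrase[i])] + 1
        denom = context_counts[context] + alphabet_size
        total_bits -= math.log2(numer / denom)
        positions += 1
    return total_bits / positions
```

Under this reading, a phrase whose letters follow the patterns of the training text scores fewer bits per letter than a phrase whose letter transitions are rare or unseen, which is the sense in which a lower value indicates a more statistically coherent phrase.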

Our protocol has a built-in assumption about the class conditional distributions of the maximal ELS feature vectors from the Torah text and from the monkey text, namely that these distributions are Gaussian. We will next explore what happens when we assume a more general ellipsoidally symmetric form for the class conditional distributions.

