Some statistics of Jesus’ words

As I was reading John’s Gospel recently [1], I was struck by how often Jesus there connects himself with his Father and with his audience, and therefore, by extension, with us. I had the impression that he was using personal and possessive pronouns (I, you, my, your, …) to a greater degree than in the synoptic Gospels, and a train of thought started in me about what the motives of such a pattern might be.

Then, I started wondering about whether such a characterisation of Jesus’ words in John was rooted in fact, or merely the fruit of some psychological bias that I brought to the text. This, naturally, led to a bit of an exploration of some statistics of Jesus’ words in the four Gospels. I proceeded to parse the four texts and separate the words that the narrator attributed to Jesus from the words of the narrator proper, resulting in a pair of texts for each Gospel – its full text and Jesus’ words only.

Let’s start with some high-level descriptive statistics: word and verse totals for the two texts per Gospel, and Jesus’ words (JWs) as a share of the whole in each case:

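| | Matthew | Mark | Luke | John |
|---|---|---|---|---|
| All words | 22893 | 14309 | 25103 | 18813 |
| Jesus’ words (JWs) | 12903 | 5017 | 11978 | 7849 |
| JWs as % of all | 56% | 35% | 48% | 42% |
| All verses | 1071 | 678 | 1150 | 781 |
| Verses with JWs | 647 | 183 | 585 | 411 |
| Verses with JWs as % of all | 60% | 27% | 51% | 53% |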

Mark’s Gospel is the shortest and contains the fewest of Jesus’ words, but while Luke is 75% longer than Mark, it is Matthew who recounts the most words by Jesus – more than twice as many as Mark. In other words, Matthew is most about sharing what Jesus said, while Mark’s focus is on what Jesus did.
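The relative figures quoted here can be re-derived from the raw word counts in the table. The following is a minimal Python sketch that simply copies those totals; the names `counts` and `share` are illustrative and not part of the original analysis.

```python
# Word totals copied from the table: (all words, Jesus' words) per Gospel.
counts = {
    "Matthew": (22893, 12903),
    "Mark": (14309, 5017),
    "Luke": (25103, 11978),
    "John": (18813, 7849),
}


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Jesus' words as a percentage of each Gospel's full text.
jw_share = {gospel: share(jw, total) for gospel, (total, jw) in counts.items()}

# How much longer Luke is than Mark, in percent of Mark's length.
luke_vs_mark = share(counts["Luke"][0], counts["Mark"][0]) - 100

# Ratio of Jesus' words in Matthew to Jesus' words in Mark.
matthew_vs_mark = counts["Matthew"][1] / counts["Mark"][1]

print(jw_share)
print(f"Luke is {luke_vs_mark}% longer than Mark")
print(f"Matthew has {matthew_vs_mark:.2f}x Mark's count of Jesus' words")
```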

Before turning to the question about pronouns, I also wondered about the broader features of these eight texts in terms of complexity and understandability, which are summarised in the following table:

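| | Matthew | Mark | Luke | John |
|---|---|---|---|---|
| Flesch reading ease score (all) | 81 | 81 | 80 | 84 |
| Flesch reading ease score (JWs) | 84 | 87 | 83 | 90 |
| Automated readability index (all) | 5.1 | 4.7 | 5.6 | 4.5 |
| Automated readability index (JWs) | 4.8 | 3.6 | 5.4 | 3.3 |
| Lexical density (all) | 36% | 36% | 36% | 31% |
| Lexical density (JWs) | 33% | 32% | 33% | 25% |
| Lexical diversity (all) | 12% | 14% | 12% | 9% |
| Lexical diversity (JWs) | 15% | 21% | 16% | 11% |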

Here the Flesch score looks at the mean number of words per sentence and the mean number of syllables per word to derive a score that relates to the level of education required of a reader to understand a text (e.g., Reader’s Digest and written assignments of 12-year-olds score around 65, while the Harvard Law Review would come in around 30). All eight texts score very high in reading-ease terms, but there is some difference nonetheless between Luke’s Gospel as a whole (80) and Jesus’ words in John (90). The automated readability index, which is based on characters rather than syllables, paints a similar picture, and its scores, which are expressed as school grades, suggest that Luke’s Gospel as a whole would be accessible to an 11-year-old, while an 8-year-old could follow Jesus’ words in John.
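For reference, both scores are simple functions of a few counts. The sketch below uses the standard published formulas for the Flesch reading ease score and the automated readability index; which tool produced the figures in the table is not stated, so this should be read as an illustration rather than a reproduction. The function names and the sample counts are made up for the example, and counting syllables and sentences in real text needs heuristics that are left out here.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch reading ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Automated readability index: roughly a school grade level."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Made-up counts for a short, simple passage (not taken from the Gospels):
# 100 words in 10 sentences, 130 syllables, 400 letters and digits.
fre = flesch_reading_ease(words=100, sentences=10, syllables=130)
ari = automated_readability_index(characters=400, words=100, sentences=10)
print(f"Flesch reading ease: {fre:.1f}")
print(f"Automated readability index: {ari:.1f}")
```

Shorter sentences and shorter words push the Flesch score up and the automated readability index down, which is the direction in which Jesus’ words in John differ from Luke’s Gospel as a whole.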

Next, we can also look at the lexical properties of these texts, where the characteristics of lexical words (nouns, adjectives, verbs, and adverbs – i.e., “content” words) are compared with those of function words (everything else – i.e., “grammar” words). Lexical density is the proportion of lexical words among all words (i.e., the more “content” words, the denser a text), while lexical diversity is the number of unique lexical words relative to all lexical words in a text. Here again John comes out as both least dense and least diverse, while the density of the synoptic Gospels is similar and diversity is highest in Mark. All levels of lexical density here are extremely low though – for comparison, the lexical density of a 10-year-old’s spontaneous narrative is 30-40% and that of an adult around 60%, with essays written by adults scoring around 90%. In other words, the Gospels are written in an unusually simple way as far as vocabulary is concerned.
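The two definitions can be made concrete with a small sketch. It assumes the text has already been tagged with parts of speech by some tagger; the coarse tag names (`NOUN`, `VERB`, `ADJ`, `ADV`), the function `lexical_stats` and the toy sentence are assumptions for illustration, and how the original analysis tagged words or treated auxiliaries and proper nouns is not known.

```python
# Tags treated as "content" words; everything else counts as a function word.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_stats(tagged_tokens):
    """Return (lexical density, lexical diversity) for (word, tag) pairs."""
    lexical = [word.lower() for word, tag in tagged_tokens if tag in LEXICAL_TAGS]
    if not tagged_tokens or not lexical:
        return 0.0, 0.0
    density = len(lexical) / len(tagged_tokens)    # lexical words / all words
    diversity = len(set(lexical)) / len(lexical)   # unique lexical / all lexical
    return density, diversity


# Hand-tagged toy sentence, not a quotation from any translation.
sample = [
    ("The", "DET"), ("Father", "NOUN"), ("loves", "VERB"), ("the", "DET"),
    ("Son", "NOUN"), ("and", "CONJ"), ("the", "DET"), ("Father", "NOUN"),
    ("sends", "VERB"), ("the", "DET"), ("Son", "NOUN"),
]
density, diversity = lexical_stats(sample)
print(f"density={density:.0%}, diversity={diversity:.0%}")
```

Diversity computed this way tends to fall as a text gets longer, because common words keep recurring, so comparisons between texts of very different lengths should be read with that in mind.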

This tendency towards linguistic simplicity and frugality across the board, and especially in John, can also be seen if we look more closely at the number of unique words used in each of the eight texts. As a benchmark, the typical vocabulary of a 4-year-old English speaker is 5 000 words and that of an adult 20 000.

| | Matthew | Mark | Luke | John |
|---|---|---|---|---|
| Unique words (all) | 2391 | 1864 | 2765 | 1500 |
| Unique words (JWs) | 1704 | 1004 | 1819 | 802 |
| JWs as % of all | 71% | 54% | 66% | 53% |
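Counting unique words comes down to collecting the distinct tokens of a text. The sketch below is one plausible way to do it, assuming case-insensitive matching and no lemmatisation (so "say" and "said" count separately); whether the original counts were produced this way is not stated. The function name and the toy strings are illustrative.

```python
import re


def unique_words(text: str) -> set:
    """Return the set of distinct lowercased words in the text."""
    # Letters only, keeping contractions such as "don't" as a single word.
    return set(re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower()))


# Toy pair: a "full text" and the part of it spoken by Jesus.
full_text = "Jesus said to them, I am the bread of life."
jesus_words = "I am the bread of life."

all_vocab = unique_words(full_text)
jw_vocab = unique_words(jesus_words)
jw_vocab_share = round(100 * len(jw_vocab) / len(all_vocab))
print(len(all_vocab), len(jw_vocab), f"{jw_vocab_share}%")
```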

While we are looking at how words are used in these texts, let us also take a look at what the most frequent ones are, focusing on Jesus’ words in the four gospels, and let’s do so for nouns and verbs separately:

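| Top nouns | Matthew | Mark | Luke | John |
|---|---|---|---|---|
| 1. | father | god | god | father |
| 2. | heaven | man | son | world |
| 3. | man | son | man | life |
| 4. | kingdom | kingdom | father | god |
| 5. | son | father | kingdom | son |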

The top 5 noun lists paint an interesting picture! In all four Gospels, Jesus’ words include father and son in the top five and this does make a great deal of sense. The gospels are all about God the Father and his Son. There is a notable difference between the synoptics, which are very homogeneous from this perspective, and John, where Jesus says world and life, instead of kingdom and man. Clearly this is only a hint at what goes on in these texts that are so rich in meaning, while being constructed with extreme simplicity and a highly sparing use of variety.

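| Top verbs | Matthew | Mark | Luke | John |
|---|---|---|---|---|
| 1. | be | be | be | do |
| 2. | do | do | do | have |
| 3. | have | have | have | be |
| 4. | say | come | say | know |
| 5. | come | say | come | sent |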

The top 5 verbs are even more consistent, with the synoptics using the same set of five (be, do, have, say, come) and John sharing the top three of these and replacing say and come with know and sent. An intriguing hint at Jesus’ focus on being sent by the Father in John, instead of the invitation to come and join/follow him in the synoptics. The say-know pair is also telling and, to me, points to another axis between the interior and exterior in which John and the other three differ.
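Top-five lists like these can be produced by counting base forms within one part of speech. The sketch below assumes the words have already been reduced to base forms and tagged by some tool; the `(lemma, tag)` input format, the tag names and the function `top_lemmas` are hypothetical, and the toy data is not drawn from the Gospel texts.

```python
from collections import Counter


def top_lemmas(tagged_lemmas, pos, k=5):
    """Return the k most frequent lemmas carrying the given part-of-speech tag."""
    counts = Counter(lemma for lemma, tag in tagged_lemmas if tag == pos)
    return [lemma for lemma, _ in counts.most_common(k)]


# Toy input: (lemma, tag) pairs as a tagger might emit them.
sample = (
    [("father", "NOUN")] * 3
    + [("son", "NOUN")] * 2
    + [("world", "NOUN")]
    + [("be", "VERB")] * 2
    + [("know", "VERB")]
)
print(top_lemmas(sample, "NOUN", k=2))
print(top_lemmas(sample, "VERB"))
```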

Now, we can finally turn to the question that sparked this little exploration for me. Does John’s Jesus use personal and possessive pronouns [2] more than the Jesus of the other gospels? The simple answer is: yes, he does! In the synoptic gospels, 13% of the words Jesus says are in this category, while in John they make up a whopping 20% – every fifth word that Jesus says in John’s gospel is a reference to the Father, to himself or to his audience. It is about who Jesus is, who the Father is and who we are as we relate to them and to each other in them.
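The pronoun share itself is a straightforward count against the list in footnote 2. Here is a minimal sketch, assuming simple case-insensitive tokenisation on letters; the names `PRONOUNS`, `tokenize` and `pronoun_share` are illustrative, and the sample is the opening of John 14:28 with its inner quotation marks removed.

```python
import re

# The pronoun list from footnote 2, lowercased.
PRONOUNS = {
    "i", "he", "her", "hers", "him", "his", "it", "its", "me", "mine", "my",
    "our", "ours", "she", "their", "theirs", "them", "they", "us", "we",
    "you", "your", "yours",
}


def tokenize(text: str) -> list:
    """Split text into lowercased words, keeping contractions together."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def pronoun_share(text: str) -> float:
    """Return the fraction of words that are personal or possessive pronouns."""
    words = tokenize(text)
    if not words:
        return 0.0
    return sum(word in PRONOUNS for word in words) / len(words)


verse = "You heard me tell you, I am going away and I will come back to you."
print(f"{pronoun_share(verse):.1%} of the words are pronouns")
```

A single verse can of course sit far above the 20% average; the figure in the text is the share over all of Jesus’ words in a Gospel.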

Verses like the following are prime examples of a pattern that jumped out at me: “For the Father loves his Son and shows him everything that he himself does, and he will show him greater works than these, so that you may be amazed.” (5:20), “Just as the living Father sent me and I have life because of the Father, so also the one who feeds on me will have life because of me.” (6:57) or “You heard me tell you, ‘I am going away and I will come back to you.’ If you loved me, you would rejoice that I am going to the Father; for the Father is greater than I.” (14:28) They are examples of a message that is rooted in and permeated by relationships among persons – divine and human; relationships that are deeply personal and all-possessing.


  1. I used the New American Bible Revised Edition (NABRE) translation, which is also the basis of the analysis that follows.
  2. I, he, her, hers, him, his, it, its, me, mine, my, our, ours, she, their, theirs, them, they, us, we, you, your, yours.
