Geekspeak: Why Life + Mathematics = Happiness

The same approach can be used to estimate your vocabulary. Sample the ‘population’ of words by opening the dictionary at random 100 times. Each time, look at the first entry at the top of the page. Do you know the meaning of this word? If the answer is yes, add one to your word score. At the end of the exercise, divide your score by the sample size of one hundred to get an estimate of the fraction of words in the dictionary that you know. Multiply that fraction by the total number of words in the dictionary to make an estimate of your vocabulary size.
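The recipe above can be tried out on a computer before reaching for a real dictionary. What follows is only a minimal sketch: the function name `estimate_vocabulary`, the toy dictionary of 50,000 numbered 'words' and the pretend reader who knows 60% of them are all assumptions made for illustration, and the code picks entries directly rather than opening pages.

```python
import random

def estimate_vocabulary(knows_word, dictionary, sample_size=100, rng=None):
    """Estimate how many dictionary entries a reader knows by random sampling."""
    rng = rng or random.Random()
    # Pick entries at random, standing in for opening the dictionary at random.
    sample = [rng.choice(dictionary) for _ in range(sample_size)]
    # Add one to the word score for every sampled word the reader knows.
    score = sum(1 for word in sample if knows_word(word))
    # Fraction known in the sample, scaled up to the whole dictionary.
    fraction_known = score / sample_size
    return fraction_known * len(dictionary)

# Hypothetical stand-in for a real dictionary: 50,000 numbered "words",
# of which this pretend reader knows every entry below 30,000 (60%).
toy_dictionary = list(range(50_000))
estimate = estimate_vocabulary(lambda word: word < 30_000, toy_dictionary,
                               sample_size=100, rng=random.Random(1))
print(round(estimate))
```

Run it a few times with different seeds and the answer wobbles around 30,000, which is exactly the difficulty with small samples.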

This method works, but you need to be careful: how many times should you dip into the dictionary at random to get a good estimate? Say you do the test twice and find that you know the first word, but not the second. That means that you know 50% of the words in the tiny bit of the dictionary you examined.

But common sense tells you that this estimate is unreliable. It is true that you might know half the dictionary, but it is also possible that you know 10% or 90% of all the words. The two words you chanced upon might have been unusually uncommon, or unusually common. Two out of however many thousand words the dictionary defines is not a representative sample.

Do the trial 10 times, and confidence in the result is greater; 100 times, even better. If you did the trial 1,000 times and found that you knew 500 words, you could argue quite strongly that you really do know about half of all the words in the dictionary.
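One way to put numbers on that growing confidence is the statisticians' margin of error for a sampled fraction. The sketch below assumes the usual normal approximation with a 95% level (z = 1.96), which the text does not mention and which is a poor guide for very small samples such as two words; the function name `margin_of_error` is made up for the example.

```python
import math

def margin_of_error(fraction_known, sample_size, z=1.96):
    """Approximate 95% margin of error for a sampled fraction."""
    # Standard error of a proportion: sqrt(p * (1 - p) / n).
    standard_error = math.sqrt(fraction_known * (1 - fraction_known) / sample_size)
    return z * standard_error

# Knowing half the sampled words, for the sample sizes discussed in the text.
for n in (2, 10, 100, 1000):
    print(n, round(margin_of_error(0.5, n), 3))
```

On these assumptions, knowing 500 words out of 1,000 pins the true fraction down to roughly 47% to 53%, while ten trials leave it anywhere from about 20% to 80%.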

To complete the estimate of your vocabulary you’ll need to know the total number of words in the dictionary – preferably without having to count them. This is quite easy: look up the number of the last page in the dictionary, and take that as the number of pages. Next, open the dictionary at random and count the number of different words listed on that page. Multiply the number of pages by the number of words per page, and you have an estimate of the number of words in the dictionary.

I thought I’d better test myself using this statistical sampling technique. The dictionary I used has about 60 entries on each page, and over 800 pages. That’s around 48,000 words altogether.

I opened the dictionary 125 times, and made a tick on a piece of paper if I knew the meaning of the word at the top of the page, and a cross if I didn’t. Like me, you’ll probably find it hard to stop yourself jumping ahead to other entries if the first is unfamiliar. Don’t – that’s cheating, and invalidates the statistical sampling!

The result: there were 25 words whose meaning I didn’t know, which means I knew the other 100. On that basis, my passive vocabulary is 48,000 multiplied by 100/125. That’s 38,400, or around 40,000 words. It sounds high, but it includes all the possible extensions of the stem of each word. For example, take the word ‘abstract’. The dictionary will include ‘abstractedly’, ‘abstractedness’, and so on. The number of stem words I know is a lot less than 40,000.
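For anyone who wants to check the arithmetic, here it is as a short script using the figures quoted above (60 entries a page, 800 pages, 125 samples, 25 unknown words). The rough range at the end is an addition of mine, not part of the original calculation, and assumes a normal-approximation 95% margin.

```python
import math

pages = 800             # "over 800 pages"
entries_per_page = 60   # "about 60 entries on each page"
sample_size = 125       # number of times the dictionary was opened
unknown = 25            # words whose meaning was not known

dictionary_size = pages * entries_per_page
fraction_known = (sample_size - unknown) / sample_size
vocabulary = fraction_known * dictionary_size

# Assumption: normal-approximation 95% margin of error on the sampled fraction.
margin = 1.96 * math.sqrt(fraction_known * (1 - fraction_known) / sample_size)
low = (fraction_known - margin) * dictionary_size
high = (fraction_known + margin) * dictionary_size

print(dictionary_size, round(vocabulary), round(low), round(high))
```

The central estimate is 38,400 words, and under that assumption the plausible range runs from about 35,000 to about 42,000.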

Still, I’m feeling pretty good about myself, so I’m going to exercise my gigantic male vocabulary by introducing the next chapter:

‘The, er, next chapter is, er, fucking interesting…’

SPEAK GEEK

‘IT IS A TRUTH UNIVERSALLY ACKNOWLEDGED THAT A SINGLE MAN IN POSSESSION OF A GOOD FORTUNE MUST BE IN WANT OF A WIFE.’

Some authors are instantly recognisable from their vocabulary. For example, everyone recognises the style of Jane Austen, and many would say that her writing’s distinguishing feature is its abundance of long words. But is this true? A bit of statistical analysis can reveal the answer.

The four longest words used by Jane Austen in Pride and Prejudice have 16 or 17 characters. They are ‘superciliousness’, ‘communicativeness’, ‘disinterestedness’ and ‘misrepresentation’. But just looking at the longest words is not enough: we need to examine the distribution of word lengths over her entire vocabulary, as shown in the graph below:

For comparison, here is the ‘fingerprint’ of the writer Ian McEwan, showing that his vocabulary includes many shorter words:
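A word-length 'fingerprint' of this kind can be worked out for any text. The following is a minimal sketch rather than the method used for the graphs in this book: the function name `word_length_fingerprint` and the rule for what counts as a word (runs of letters, with straight apostrophes allowed inside) are assumptions. It counts each distinct word once, since the comparison is over a writer's vocabulary rather than over every word on the page.

```python
import re
from collections import Counter

def word_length_fingerprint(text):
    """Return {word length: share of distinct words having that length}."""
    # Lower-case the text and keep runs of letters; straight apostrophes
    # are allowed inside a word but not counted towards its length.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    vocabulary = set(words)  # each distinct word counts once
    counts = Counter(len(word.replace("'", "")) for word in vocabulary)
    total = len(vocabulary)
    return {length: counts[length] / total for length in sorted(counts)}

sample = ("It is a truth universally acknowledged that a single man in "
          "possession of a good fortune must be in want of a wife.")
for length, share in word_length_fingerprint(sample).items():
    print(length, f"{share:.2f}")
```

Fed the full text of a novel instead of a single sentence, the same function gives the shares needed to draw a fingerprint and compare one writer with another.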

And, what about this book? In this work I intend to speak with candour, and without misrepresentation or superciliousness, of the accomplishments of the irreproachable retrospections…

