In our daily lives, we come across English so often that it becomes part of our language. But have you ever paid attention to how often each letter is used? Which letters appear most frequently, and which least? Is the most common one A, S, or I? To answer the question, let me give you some history behind the commonness of the letters.
On April 27, 1791, Samuel Morse, the inventor of Morse code, was born. He was educated at Yale College, where he gradually developed an interest in art, and after graduating he became a well-known artist. In 1832, on a voyage home to New York, he overheard a conversation on the ship about electromagnetism that inspired him to invent a telegraph system. Other telegraphs of the period were more elaborate: the British system built by Cooke and Wheatstone used five magnetic needles that could be pointed around a panel of letters and numbers by an electric current. Morse realised that pulses of electric current alone could convey information over wires. With help from Leonard Gale, a professor of science, and Alfred Vail, who contributed his mechanical skills, Morse eventually produced a single-circuit telegraph. Pushing the operator key down completed the electric circuit of the battery, which sent an electrical signal across a wire to a receiver at the other end.
In 1837, Morse revealed his first telegraph device. It was a one-wire system that recorded the signal as dips in a line drawn on paper, and those dips had to be decoded into letters and numbers using a dictionary composed by Morse. However, the pen or pencil didn’t always write clearly. He therefore improved the system by creating a dot-and-dash code in which different combinations of dots and dashes represent the letters of the English alphabet and the ten digits. At an exhibition of his telegraph in 1838, Morse transmitted ten words per minute using the code, a form of which would become standard throughout the world.
Here comes the question: how did Morse and Vail determine the most common letter in English? Morse’s aim was to keep messages as short as possible, which meant that the commonest letters should have the shortest codes. He came up with an idea and went to his local newspaper. In the old days, printers made up each page by putting individual letters (pieces of type) together into a block, covering the block with ink and pressing paper on top. The printers kept the type in cases, with each letter in a separate compartment. Morse counted the number of pieces of type for each letter. He found that there were more e’s than any other letter, so he gave ‘e’ the shortest code, a single ‘dit’. Here ‘dit’ refers to a dot and ‘dah’ refers to a dash.
A dit lasts 1 unit of time
A dah lasts 3 units of time (the length of 3 dits)
The pause between dits and dahs within a letter takes 1 unit of time
The pause between letters takes 3 units of time
The pause between words takes 7 units of time
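The timing rules above are enough to work out how long a message takes to send. Here is a minimal sketch in Python. The function name `morse_duration` and the input notation ('.' for a dit, '-' for a dah, a space between letters and '/' between words) are my own assumptions for illustration, not part of any standard.

```python
# Lengths in time units, taken from the timing rules listed above.
DIT, DAH = 1, 3
GAP_SYMBOL, GAP_LETTER, GAP_WORD = 1, 3, 7


def morse_duration(code: str) -> int:
    """Return how many time units a Morse message takes to send.

    Assumed notation: '.' is a dit, '-' is a dah, letters are
    separated by spaces and words by '/'.
    """
    total = 0
    for w, word in enumerate(code.split("/")):
        if w > 0:
            total += GAP_WORD          # pause between words
        for l, letter in enumerate(word.split()):
            if l > 0:
                total += GAP_LETTER    # pause between letters
            for s, symbol in enumerate(letter):
                if s > 0:
                    total += GAP_SYMBOL  # pause inside a letter
                if symbol == ".":
                    total += DIT
                elif symbol == "-":
                    total += DAH
                else:
                    raise ValueError(f"unexpected symbol: {symbol!r}")
    return total


print(morse_duration("."))          # the letter e: 1 unit
print(morse_duration("... .- --"))  # S A M: 23 units
```

A single dot costs 1 unit while a letter made of several dashes costs much more, which is why giving the commonest letter the shortest code saves time.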
The chart below shows how each letter and number is written in Morse code.
If you want to learn how dits and dahs combine, the flow chart below lays it out clearly.
A movement to the left represents a long duration, a dah (dash); a movement to the right represents a short duration, a dit (dot). If you check the Morse code chart again, you will notice that the letter e is in fact a single dot. How do you know? Start where it says START. Then move down and to the right along the blue arrow labelled dit, and you arrive at e: one dot. As another example, the Morse code for Samuel Morse would be:
S: • • •   A: • ⁃   M: ⁃ ⁃   U: • • ⁃   E: •   L: • ⁃ • •
M: ⁃ ⁃   O: ⁃ ⁃ ⁃   R: • ⁃ •   S: • • •   E: •
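The same lookup can be done in a few lines of Python. This is only a sketch: the table holds the International Morse patterns for the 26 letters and ten digits (the patterns used in the example above), and the name `to_morse` and the plain-text output format ('.' for dit, '-' for dah, '/' between words) are my own choices.

```python
# International Morse code for letters and digits.
MORSE = {
    "A": ".-",    "B": "-...",  "C": "-.-.",  "D": "-..",   "E": ".",
    "F": "..-.",  "G": "--.",   "H": "....",  "I": "..",    "J": ".---",
    "K": "-.-",   "L": ".-..",  "M": "--",    "N": "-.",    "O": "---",
    "P": ".--.",  "Q": "--.-",  "R": ".-.",   "S": "...",   "T": "-",
    "U": "..-",   "V": "...-",  "W": ".--",   "X": "-..-",  "Y": "-.--",
    "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}


def to_morse(text: str) -> str:
    """Encode text as Morse: spaces between letters, ' / ' between words.

    Characters that are not in the table (punctuation, accents) are
    skipped; that is a simplification made for this sketch.
    """
    encoded_words = []
    for word in text.upper().split():
        letters = [MORSE[ch] for ch in word if ch in MORSE]
        encoded_words.append(" ".join(letters))
    return " / ".join(encoded_words)


print(to_morse("Samuel Morse"))
# ... .- -- ..- . .-.. / -- --- .-. ... .
```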
After counting the pieces of type in the printers’ sets, he found that e was the most frequent letter, with 12,000 pieces, and z the least frequent, with 200. The frequency of each letter is shown below:
In the old days, however, a distress signal spelled out in full in Morse code would get ridiculously long. Imagine using the code to send an urgent message of more than a hundred words. To reduce the inconvenience, shorthand forms were developed over the years. The most well-known is the distress signal SOS, three dashes with three dots on either side, which looks like this: • • • ⁃ ⁃ ⁃ • • •. SOS is a ‘prosign’ (procedural signal): two or three letters sent as a single Morse pattern, without the usual space between letters, and understood as a signal rather than as text. It was a way to send signals clearly and quickly, without confusion, when radiotelegraph machines were first installed on ships in the early 20th century.
One thing that intrigues me about letter frequency is the missing letter ‘e’ in a novel called Gadsby by Ernest Vincent Wright. The story is written as a lipogram, which means that one or more letters are omitted intentionally. Wright did it deliberately as a challenge, which meant he couldn’t use words such as ‘the’, ‘they’, ‘their’, and so on.
According to Wright, the entire manuscript was written with the E type bar of the typewriter tied down, making it impossible for that letter to be printed. This was done so that the vowel could not slip in accidentally. Wright did not use abbreviations such as Mr., whose full form contains an e, and he avoided writing out numbers that contain the letter. The word ‘said’ appears frequently because ‘replied’ and ‘asked’ could not be used. “Pronouns also caused trouble; for such words as he, she, they, them, theirs, her, herself, myself, himself, yourself, etc., could not be utilised,” wrote Wright. Likewise, being unable to use ‘the’ was particularly troublesome for him, as it is the most common article in English. The book took him five and a half months to write and was newsworthy even before he had finished it. Gadsby tells the story of a fictional city called Branton Hills, which is revitalised by the efforts of its new mayor, John Gadsby, who helps increase the town’s population. The book was little noticed for some time but gradually gained attention over the years. Today, it has become a classic despite the awkwardness of its language.
The first paragraph of the story goes like this:
“If youth, throughout all history, had had a champion to stand up for it; to show a doubting world that a child can think; and, possibly, do it practically; you wouldn’t constantly run across folks today who claim that “a child don’t know anything.” A child’s brain starts functioning at birth; and has, amongst its many infant convolutions, thousands of dormant atoms, into which God has put a mystic possibility for noticing an adult’s act, and figuring out its purport.”
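If you want to check a passage like this for yourself, a lipogram test is easy to sketch in Python. The function names `is_lipogram` and `words_containing` are made up for this illustration, and the sample text is just the opening clause of the paragraph quoted above.

```python
import string


def is_lipogram(text: str, banned: str = "e") -> bool:
    """Return True if none of the banned letters appear in the text."""
    banned_set = set(banned.lower())
    return not any(ch in banned_set for ch in text.lower())


def words_containing(text: str, banned: str = "e") -> list[str]:
    """List the words that break the lipogram rule, in order."""
    banned_set = set(banned.lower())
    hits = []
    for raw in text.split():
        word = raw.strip(string.punctuation)  # drop commas, full stops, etc.
        if banned_set & set(word.lower()):
            hits.append(word)
    return hits


opening = "If youth, throughout all history, had had a champion to stand up for it"
print(is_lipogram(opening))                        # True
print(words_containing("They said it was fine."))  # ['They', 'fine']
```

The second function is the more useful one while drafting, because it points at the exact words that would need replacing.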
Writing becomes rather difficult once the most common letter is taken away: short words such as ‘of’, ‘to’, and ‘in’ are still available, but most of the words that would normally sit next to them contain an ‘e’. According to The Atlantic, the blogger David Taylor analysed the book and found that, with ‘e’ removed, Wright used ‘g’ more than any other letter. The chart below illustrates the frequency of each letter once the most common letter is out of the picture.
From the chart, we notice that the letter ‘g’ has an occurrence of 1571 and that the letter ‘f’ is the least used letter in the story.
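A letter count of this kind takes only a few lines of Python. To be clear, this is not the method Taylor used, which I don't know; it is a minimal sketch with a made-up function name, `letter_counts`, and a toy sentence in place of the novel.

```python
from collections import Counter


def letter_counts(text: str, ignore: str = "") -> Counter:
    """Count each letter a-z in the text, skipping any letters in `ignore`."""
    ignored = set(ignore.lower())
    return Counter(
        ch
        for ch in text.lower()
        if ch.isascii() and ch.isalpha() and ch not in ignored
    )


counts = letter_counts("banana bread", ignore="e")
print(counts.most_common(1))  # [('a', 4)]
print("e" in counts)          # False
```

Running the same function over the full text of a book, with and without `ignore="e"`, would give the numbers needed for a chart like the one above.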
To conclude, it is always interesting to look at language and data together, whether you are carrying out research or are simply curious about the linguistic side of a language. Again, it is challenging to write without the most common letter, and doing so dramatically affects the writing. So, unless you’re up for a challenge, it won’t be worth a penny.