Watch 'The Root: The Birth of Big Brother' and all of 'The Root' episodes HERE
Catch up on the research from segment 1 HERE
Segment 2: William Friedman; Edgar Allan Poe; Solve the Cipher
In this segment, Glenn explained how Yardley’s operation was shut down, but the practice of domestic spying wasn’t. By 1929, surveillance and the coding and decoding of messages were established as crucial skills to have, maintain, and advance. So Secretary of State Stimson may have put an end to Yardley’s operation, but Yardley’s files were not destroyed. Instead, they were sent to the desk of one of the most brilliant minds in cryptology: William Friedman.
Edgar Allan Poe
Friedman was a pioneer in the field, and as you’ll see in the next segment, inspired the creation of the only known cipher machine to never be cracked. But how Friedman got his start in the field is an incredible tale all its own. Friedman was a big fan of Edgar Allan Poe, and Poe was a big fan of ciphers. In fact, he once bragged that he could decipher anything. Friedman loved deciphering messages, and one of his favorite Poe stories was ‘The Gold Bug’.
The code in Poe’s ‘Gold Bug’ is an example of a substitution cipher, in which the letters of the alphabet are replaced by other letters or symbols. With 26 letters mapped onto 26 symbols, the number of possible keys is 26 factorial, roughly 4 × 10^26, which is far beyond the trillions. For hundreds of years, substitution ciphers were considered impossible to crack.
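To put a number on that, assume each of the 26 letters is assigned its own distinct symbol. The count of possible keys is then 26 factorial, which is far larger than a trillion. Here is a quick check in Python (a minimal sketch; the variable names are ours, chosen for illustration):

```python
import math

# Each of the 26 letters gets a distinct symbol, so the number of
# possible keys is 26 * 25 * 24 * ... * 1, written 26!
ALPHABET_SIZE = 26
key_count = math.factorial(ALPHABET_SIZE)

print(f"{key_count:,}")    # exact count, with thousands separators
print(f"{key_count:.2e}")  # the same number in scientific notation
```

Trying every key by hand is hopeless at that scale, which is why the shortcut Poe describes matters so much.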
The secret to solving substitution ciphers is analyzing syntax. Languages follow a set of rules. There are only so many ways you can combine letters so that they form intelligible words. Even though the cipher uses symbols, it still has to follow those rules. By looking for these rules, we can soon start to see patterns within the message. All that’s needed is a key.
The foundation for all codebreaking is letter frequency analysis. As Poe's Legrand pointed out in the story, the most commonly used letter in English is E. Locating the most frequently used vowels and consonants is the key to letter frequency analysis. Legrand argued that the letters of the alphabet, ordered from most frequent to least frequent, looked like this:
e a o i d h n r s t u y c f g l m w b k p q x z
Here’s the section that contains the cipher and the explanation of how to decrypt it (this is the portion Glenn animated on the program):
Here Legrand submitted the parchment to my inspection. The following characters were rudely traced between the death's-head and the goat:
"But," said I, returning him the slip, "I am as much in the dark as ever. Were all the jewels of Golconda awaiting me upon my solution of this enigma, I am quite sure that I should be unable to earn them."
"And yet," said Legrand, "the solution is by no means so difficult as you might be led to imagine from the first hasty inspection of the characters. These characters, as any one might readily guess, form a cipher — that is to say, they convey a meaning; but then, from what is known of Kidd, I could not suppose him capable of constructing any of the more abstruse cryptographs. I made up my mind, at once, that this was of a simple species — such, however, as would appear, to the crude intellect of the sailor, absolutely insoluble without the key."
"And you really solved it?"
"Readily; I have solved others of an abstruseness ten thousand times greater. Circumstances, and a certain bias of mind, have led me to take interest in such riddles, and it may well be doubted whether human ingenuity can construct an enigma of the kind which human ingenuity may not, by proper application, resolve. In fact, having once established connected and legible characters, I scarcely gave a thought to the mere difficulty of developing their import.
"In the present case — indeed in all cases of secret writing — the first question regards the language of the cipher; for the principles of solution, so far, especially, as the more simple ciphers are concerned, depend upon, and are varied by, the genius of the particular idiom. In general, there is no alternative but experiment (directed by probabilities) of every tongue known to him who attempts the solution, until the true one is attained. But, with the cipher now before us, all difficulty was removed by the signature. The pun upon the word 'Kidd' is appreciable in no other language than the English. But for this consideration I should have begun my attempts with the Spanish and French, as the tongues in which a secret of this kind would most naturally have been written by a pirate of the Spanish main. As it was, I assumed the cryptograph to be English.
"You observe there are no divisions between the words. Had there been divisions, the task would have been comparatively easy. In such case I should have commenced with a collation and analysis of the shorter words, and, had a word of a single letter occurred, as is most likely, (a or I, for example,) I should have considered this solution as assured. But, there being no division, my first step was to ascertain the predominant letters, as well as the least frequent. Counting all, I constructed a table, thus:
Of the character 8 there are 33.
; " 26.
4 " 19.
‡ ) " 16.
* " 13.
5 " 12.
6 " 11.
† 1 " 8.
0 " 6.
9 2 " 5.
: 3 " 4.
? " 3.
¶ " 2.
— . " 1.
"Now, in English, the letter which most frequently occurs is e. Afterwards, the succession runs thus: a o i d h n r s t u y c f g l m w b k p q x z. E predominates so remarkably that an individual sentence of any length is rarely seen, in which it is not the prevailing character.
"Here, then, we have, in the very beginning, the groundwork for something more than a mere guess. The general use which may be made of the table is obvious — but, in this particular cipher, we shall only very partially require its aid. As our predominant character is 8, we will commence by assuming it as the e of the natural alphabet. To verify the supposition, let us observe if the 8 be seen often in couples — for e is doubled with great frequency in English — in such words, for example, as 'meet,' 'fleet,' 'speed,' 'seen,' 'been,' 'agree,' &c. In the present instance we see it doubled no less than five times, although the cryptograph is brief.
"Let us assume 8, then, as e. Now, of all words in the language, 'the' is most usual; let us see, therefore, whether there are not repetitions of any three characters, in the same order of collocation, the last of them being 8. If we discover repetitions of such letters, so arranged, they will most probably represent the word 'the.' Upon inspection, we find no less than seven such arrangements, the characters being ;48. We may, therefore, assume that ; represents t, 4 represents h, and 8 represents e — the last being now well confirmed. Thus a great step has been taken.
"But, having established a single word, we are enabled to establish a vastly important point; that is to say, several commencements and terminations of other words. Let us refer, for example, to the last instance, but one, in which the combination ;48 occurs — not far from the end of the cipher. We know that the ; immediately ensuing is the commencement of a word, and, of the six characters succeeding this 'the,' we are cognizant of no less than five. Let us set these characters down, thus, by the letters we know them to represent, leaving a space for the one unknown —
"Here we are enabled, at once, to discard the 'th' as forming no portion of the word commencing with the first t; since, by experiment of the entire alphabet for a letter adapted to the vacancy, we perceive that no word can be formed of which this th can be a part. We are thus narrowed into
t ee,
and, going through the alphabet, if necessary, as before, we arrive at the word 'tree,' as the sole possible reading. We thus gain another letter, r, represented by (, with the words 'the tree' in juxtaposition.
"Looking beyond these words, for a short distance, we again see the combination ;48, and employ it by way of termination to what immediately precedes. We have thus this arrangement:
the tree ;4(‡?34 the,
or, substituting the natural letters, where known, it reads thus:
the tree thr‡?3h the.
"Now, if, in place of the unknown characters, we leave blank spaces, or substitute dots, we read thus:
the tree thr...h the,
when the word 'through' makes itself evident at once. But this discovery gives us three new letters, o, u and g, represented by ‡ ? and 3.
"Looking, now, narrowly, through the cipher for combinations of known characters, we find, not very far from the beginning, this arrangement,
83(88, or egree,
which, plainly, is the conclusion of the word 'degree,' and gives us another letter, d, represented by †.
"Four letters beyond the word 'degree,' we perceive the combination
;46(;88*
"Translating the known characters, and representing the unknown by dots, as before, we read thus:
th.rtee.,
an arrangement immediately suggestive of the word 'thirteen,' and again furnishing us with two new characters, i and n, represented by 6 and *.
"Referring, now, to the beginning of the cryptograph, we find the combination,
53‡‡†.
"Translating, as before, we obtain
.good,
which assures us that the first letter is A, and that the first two words are 'A good.'
"To avoid confusion, it is now time that we arrange our key, as far as discovered, in a tabular form. It will stand thus:
5 represents a
† " d
8 " e
3 " g
4 " h
6 " i
* " n
‡ " o
( " r
; " t
"We have, therefore, no less than ten of the most important letters represented, and it will be unnecessary to proceed with the details of the solution. I have said enough to convince you that ciphers of this nature are readily soluble, and to give you some insight into the rationale of their development. But be assured that the specimen before us appertains to the very simplest species of cryptograph. It now only remains to give you the full translation of the characters upon the parchment, as unriddled. Here it is:
'A good glass in the bishop's hostel in the devil's seat forty-one degrees and thirteen minutes northeast and by north main branch seventh limb east side shoot from the left eye of the death's-head a bee line from the tree through the shot fifty feet out.' "
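Legrand's ten-entry key from the story can be applied mechanically. The sketch below is our own illustration, not anything from Poe or the program: `PARTIAL_KEY` and `apply_partial_key` are hypothetical names, and the two test strings are symbol runs taken from the passage (the opening symbols that the key reads as "a good", and the run Legrand reads as "the tree through the"). Any symbol not yet in the key is shown as a dot, just as Legrand does.

```python
# Legrand's key "as far as discovered", copied from the table in the story.
PARTIAL_KEY = {
    "5": "a", "†": "d", "8": "e", "3": "g", "4": "h",
    "6": "i", "*": "n", "‡": "o", "(": "r", ";": "t",
}

def apply_partial_key(ciphertext, key, unknown="."):
    """Replace known symbols with letters; mark unknown symbols with a dot."""
    return "".join(
        ch if ch.isspace() else key.get(ch, unknown)
        for ch in ciphertext
    )

print(apply_partial_key("53‡‡†", PARTIAL_KEY))              # agood
print(apply_partial_key(";48;(88;4(‡?34;48", PARTIAL_KEY))  # thetreethro.ghthe
```

The single dot in the second result is the symbol ?, which Legrand identifies as u but leaves out of his ten-letter table.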
And that’s how you solve a cipher. Easy, right? We thought we’d give you a little challenge to see how you stack up against the brilliant cipher minds of history. Using the Gold Bug cipher, can you decrypt our code?
Solve the Cipher
1(68†95* ]5) 6*).6(8† 2: 8†35( 5005* .‡8 5*† 28-598 5 95);8( -(:.;5*50:); ;48 38(95* 8*6395 95-46*8 ]5) †8-6.48(8† 2?; 1(68†95*) *8¶8( ]5) 6; ]5) -5008† )63525 ;‡ ;46) †5: *‡ ‡*8 8¶8( )?--8))1?00: †8-(:.;8† ;48 598(6-5* )63525 95-46*8
Write out the cipher on a piece of paper, leaving enough space between the lines. You’ll need the space to write your solutions underneath. (example below)
1(68†95* ]5) 6*).6(8† 2: 8†35( 5005* .‡8 5*† 28-598 5 95);8( -(:.;5*50:); ;48 38(95* 8*6395
95-46*8 ]5) †8-6.48(8† 2?; 1(68†95*) *8¶8( ]5) 6; ]5) -5008† )63525 ;‡ ;46) †5: *‡ ‡*8 8¶8(
)?--8))1?00: †8-(:.;8† ;48 598(6-5* )63525 95-46*8
Count how many times each symbol or number appears in the cipher and write the results in a table, ordered from most to least frequent. You’ll start to see that one or two symbols stand out from the rest. The most frequent symbol is your first and biggest clue: it most likely stands for the most commonly used letter in the English language, E. In a message this short, though, another common letter can tie it or edge it out, so treat the match as a guess and check it against the words it produces. Write that letter next to each occurrence of the symbol in the cipher. (example below)
character 8 = 10 times
character ; = 5 times
character 4 = 2 times
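If you would rather let a computer do the tallying, the counting step can be sketched in a few lines of Python. This is only an illustration under our own assumptions: `symbol_frequencies` is a hypothetical helper name, and the sample string is the short run of symbols from Poe's parchment that Legrand reads as "the tree through the", not the challenge cipher.

```python
from collections import Counter

def symbol_frequencies(ciphertext):
    """Return (symbol, count) pairs, most frequent first, ignoring whitespace."""
    counts = Counter(ch for ch in ciphertext if not ch.isspace())
    return counts.most_common()

# Sample run of symbols from Poe's parchment, as quoted in the story.
fragment = ";48;(88;4(‡?34;48"

for symbol, count in symbol_frequencies(fragment):
    print(f"character {symbol} = {count} times")
```

Paste in any cipher text in place of `fragment` to get the same kind of table for it.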
Now we have, as Legrand said, “the groundwork” for more. Look for recurring combinations of symbols to try to guess words. Legrand identified THE as the most commonly used word. After that, the most frequently used words, in order, are OF, AND, TO, and IN. Once you’ve spotted all the THEs, you’ve discovered two more letters, T and H. Write them down on your table accordingly. Each new letter you decipher fills in more fragments of the message and suggests the words around them. Eventually you’ll get the entire message decoded.
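Legrand's hunt for the word THE amounts to counting every group of three symbols that ends in the symbol you believe is E. Here is a minimal Python sketch of that idea. The name `trigrams_ending_in` is hypothetical, and the sample is the run of symbols from Poe's parchment that reads "the tree through the", where 8 stands for E.

```python
from collections import Counter

def trigrams_ending_in(ciphertext, last_symbol):
    """Count every run of three symbols whose final symbol is last_symbol."""
    text = "".join(ciphertext.split())  # drop spaces and line breaks
    return Counter(
        text[i:i + 3]
        for i in range(len(text) - 2)
        if text[i + 2] == last_symbol
    )

fragment = ";48;(88;4(‡?34;48"
candidates = trigrams_ending_in(fragment, "8")
print(candidates.most_common(1))  # [(';48', 2)]
```

This version strips spaces because Poe's parchment has no word breaks. Our challenge cipher keeps its word breaks, so there you can simply look at the three-symbol words instead.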
As technology progressed, machines and advanced mathematics made ciphers more and more complex. Amazingly, the same principles that Poe’s Legrand used to break the Gold Bug cipher still apply today.
Good luck! If you successfully solve it, email the answer to firstname.lastname@example.org and the first responders will get a message from Glenn!
Note: This article originally said to send answers to email@example.com. Please re-send to firstname.lastname@example.org