
# This special report created for http://www.IncreaseBrainpower.com

## The Book Of Codes by Steve Gillman

Legal Notice and Disclaimer

The author has used his best efforts to verify the information contained in this e-book, but makes no warranties with respect to the accuracy or applicability of the information. The author shall not be held liable for loss or damage resulting from use or misuse of the material here. All rights reserved. No part of this book may be reproduced or used without permission from the author, except for brief quotations that are properly credited.

Introduction
This is a relatively simple look at how to make codes and how to break them, which includes examples and exercises. In other words, this book, like many, needs no long introduction, so unlike others, it has a short one.

Part 1: Letter Frequency and Code Breaking

Part 2: Decoding a Secret Message - An Example

Part 3: Cryptograms and Ciphers

Part 4: Two Cryptogram Puzzles

Part 5: Creating Secret Codes

## Part 1 Letter Frequency and Code Breaking

Have you ever tried code breaking? How about code making? Either is a good activity for stimulating the brain. Breaking a secret code starts with an understanding of how often certain letters normally appear in written language, also known as "letter frequency."
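
Letter Frequency in English

| Letter | Frequency | Letter | Frequency |
|--------|-----------|--------|-----------|
| a | 8.167% | n | 6.749% |
| b | 1.492% | o | 7.507% |
| c | 2.782% | p | 1.929% |
| d | 4.253% | q | 0.095% |
| e | 12.702% | r | 5.987% |
| f | 2.228% | s | 6.327% |
| g | 2.015% | t | 9.056% |
| h | 6.094% | u | 2.758% |
| i | 6.966% | v | 0.978% |
| j | 0.153% | w | 2.360% |
| k | 0.772% | x | 0.150% |
| l | 4.025% | y | 1.974% |
| m | 2.406% | z | 0.074% |

Source: Wikipedia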

Letter frequency varies by language, of course. It also varies according to topic. If you're speaking about jail and jokes told in jail, you'll use the letter "j" far more often than it is normally used. There are some differences between dialects, due to different word frequency and spellings. For example, the ending "ise" is more common in British English, while "ize" is used more in American English. As a result, the use of "z" is more common in American English.

Finally, everyone writes differently, a fact sometimes used to prove or disprove authorship, since the average sentence length, word length, and letter frequencies can be determined for each author. A forger can rarely forge an author's "style," as determined in part by these measurements. Interestingly, during World War Two, British code breakers used this kind of analysis to identify not just what a secret message said, but who was sending it.

Simple Code Breaking

Below are the letters of the English alphabet, arranged in order of prevalence (on average). Using a frequency table like this is the first step in code breaking.

e-t-a-o-i-n-s-h-r-d-l-c-u-m-w-f-g-y-p-b-v-k-j-x-q-z

The simplest secret codes simply replace each letter in a message with a different letter (w = a, x = b, d = c, ...). Periods and other punctuation may or may not be used. Simple code breaking, then, begins with identifying the most common letter in a message and replacing each of these with an "e", since that is the most common and therefore most likely letter. If that doesn't seem to work, a "t" or "a" would be tried next.

There are other ways to speed up the process. As you look at the coded message below, for example, you can see a few words that are just one letter. The obvious candidate is "a", although it could be the letter "i" as well. A three-letter word starting several sentences is likely to be "the", in which case (if the assumption is correct) you would know three more of the letters.

Starting with these simple code breaking guidelines, see how quickly you can break the code and read the message below.

A Secret Message

lbyeavb ha hpb lagyc at eacb sgbxzqwr. hpb zbm, xi mak vxm pxjb wahqebc, qi ha ihxgh lqhp hpb iqvdyb gkybi xsakh ybhhbg tgbfkbwem, xwc hgm wbl xddgaxepbi awym lpbw hpbib txqy. qw ahpbg lagci, mak dyxm hpb acci ha ihxgh, xwc hpbw xcukih makg ihgxhbrm xi wbbcbc. hpqi qi x rgbxh sgxqw bnbgeqib. egbxhqwr makg alw, daiiqsym kwsgbxzxsyb eacb qi xyia x tkw lxm ha bnbgeqib makg sgxqw. hpbgb lqyy sbe vagb aw hpxh qw xw kdeavqwr qiikb at hpb sgxqwdalbg wbliybhhbg. hpb iksiegqdhqaw tagv qi aw hpb pavb dxrb. lpm wah ra iqrw kd wal?

Of course no good code creator would make such a simple code, so don't get too excited if you figured this one out. Breaking a tough code requires a bit more knowledge than the simple rules laid out here.
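Counting letters by hand gets tedious, so here is a minimal Python sketch of that first step. The function names (`letter_counts`, `frequency_report`) are made up for this illustration, and the sample is just the opening two sentences of the secret message, so treat the output as a starting hint rather than an answer.

```python
from collections import Counter

def letter_counts(text):
    """Count how often each letter appears, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def frequency_report(text, top=5):
    """Return (letter, count, percent) tuples for the most common letters."""
    counts = letter_counts(text)
    total = sum(counts.values())
    return [(letter, n, round(100 * n / total, 1))
            for letter, n in counts.most_common(top)]

# Opening two sentences of the secret message (an excerpt, not the whole text).
sample = ("lbyeavb ha hpb lagyc at eacb sgbxzqwr. hpb zbm, xi mak vxm pxjb "
          "wahqebc, qi ha ihxgh lqhp hpb iqvdyb gkybi xsakh ybhhbg tgbfkbwem, "
          "xwc hgm wbl xddgaxepbi awym lpbw hpbib txqy.")

for letter, n, percent in frequency_report(sample):
    print(letter, n, f"{percent}%")
```

The letters at the top of the report are the first candidates to try as "e", "t" or "a". With a short sample the ranking can be misleading, so it is better to run it on the whole message.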


## Part 2 Decoding a Secret Message - An Example

That makes it 11.4% of the total letters (413 total), close to the 12.7% normal frequency of the letter "e" in written English. Of course, you don't need to figure the percentages. It is enough to note that it is the most frequent letter, and so it is likely to represent "e". You try that first. If that didn't work, you would try the next most common letter ("t") and so on.

If you look for the common three-letter words with "b" in them (which represents "e"), you'll quickly see that "hpb" occurs several times. One of the more common three-letter words ending in "e", of course, is "the". Assuming "hpb" is "the", you now have three of the letters decoded. Change every "h", "p" and "b" to "t", "h" and "e", and the message gets easier and easier to decode.

By the way, there are tools that can help you do this, including some that you may have on your computer. If you have a spell checker, for example, start it just before the first "hpb", insert "the" as the correct spelling, and click the "change all" button. That will speed things up. You can do this repeatedly as you decipher each word.

It also helps to write the original message with double or triple spacing. That way you have space to try several possibilities when deciphering it.

Remember too that "a" and "I" are the only one-letter words in English. In the above message, this makes finding the code letter for "a" very easy, and then it is easy to find the letter "n" in "an" and the "d" in "and".

Here is the code used to create the message:

a=x b=s c=e d=c e=b f=t g=r h=p i=q j=u k=z l=y m=v n=w o=a p=d q=f r=g s=i t=h u=k v=j w=l x=n y=m z=o
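Once you have the key, decoding is purely mechanical. The following Python sketch builds lookup tables from the key listed above (with every code letter written in lowercase) and applies them to one sentence of the message. The helper names `build_tables` and `substitute` are hypothetical, chosen only for this example.

```python
# The key from the text: plaintext letter = code letter.
KEY = ("a=x b=s c=e d=c e=b f=t g=r h=p i=q j=u k=z l=y m=v "
       "n=w o=a p=d q=f r=g s=i t=h u=k v=j w=l x=n y=m z=o")

def build_tables(key):
    """Return (encode, decode) dictionaries built from 'plain=code' pairs."""
    pairs = [item.split("=") for item in key.split()]
    encode = {plain: code for plain, code in pairs}
    decode = {code: plain for plain, code in pairs}
    return encode, decode

def substitute(text, table):
    """Replace each letter found in the table; leave spaces and punctuation alone."""
    return "".join(table.get(ch, ch) for ch in text)

ENCODE, DECODE = build_tables(KEY)

print(substitute("hpqi qi x rgbxh sgxqw bnbgeqib.", DECODE))
# prints: this is a great brain exercise.
```

The same `substitute` function encodes a message when it is given the `ENCODE` table instead.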

## Part 3 Cryptograms and Ciphers

The terms "cryptograms," "ciphers" (sometimes spelled "cyphers") and "secret codes" are often used interchangeably. They are closely related, as you can see from these dictionary definitions:

Code
1. A system of signals used to represent letters or numbers in transmitting messages.
2. A system of symbols, letters, or words given certain arbitrary meanings, used for transmitting messages requiring secrecy or brevity.

Cryptogram
1. A piece of writing in code or cipher. Also called cryptograph.
2. A figure or representation having a secret or occult significance.

Cipher
1. A cryptographic system in which units of plain text of regular length, usually letters, are arbitrarily transposed or substituted according to a predetermined code.
2. The key to such a system.
3. A message written or transmitted in such a system.

## Let's look at these a little deeper...

Cryptograms
Cryptograms are messages delivered in secret codes, and were first used for secure wartime communications. One of the oldest versions known was a strip of paper wrapped around a stick. This was used by the Spartan Army over two thousand years ago. A strip of paper was wrapped around the stick or staff, edge-to-edge without overlapping, and the message was written vertically. To read it, the receiver had to wrap the paper strip around a stick of the exact same diameter as the one used to create the message, so the letters would line up correctly. The receiver knew what diameter stick to use, of course. Meanwhile, any messages intercepted would take some time to be decoded, because even if the enemy knew to use a stick, he had to find one of the right diameter.

Cryptograms are used primarily for entertainment now. They are usually created using a simple substitution cipher, in which each letter is replaced by another letter or number. The Caesar Cipher, invented by Julius Caesar, may have been the first of this type. These secret codes have been used as entertaining puzzles for over a thousand years.

Solving a cryptogram is usually done using "frequency analysis." This involves looking for the coded letters which are most frequent in a message, and substituting the real letters which occur most often in common usage. In English, the most common letter used is "e," followed by "t" and "a". You also look for one-letter words, since these typically can only be "a" or "i".

A Cryptogram Example
A Caesar Cipher is a simple "shift cipher." You simply substitute for each letter another letter that is a fixed number of positions away in the alphabet. For example, if you were to use a "shift" of five letters, the letter "a" would be represented by "f", "b" would be represented by "g", and so on. Here is the complete code:

a=f, b=g, c=h, d=i, e=j, f=k, g=l, h=m, i=n, j=o, k=p, l=q, m=r, n=s, o=t, p=u, q=v, r=w, s=x, t=y, u=z, v=a, w=b, x=c, y=d, z=e

A short coded message:

Ymnx xnruqj rjxxflj nx bwnyyjs zxnsl f Hfjxfw Hnumjw.

Of course, if the code breaker suspects that this cryptogram is a simple shift cipher, she could start with the single-letter word "f", which would almost certainly be "a". Counting the five letters from "a" to "f", the code would be broken. The message could be decoded in minutes and read as follows:

"This simple message is written using a Caesar Cipher."

As you can imagine, any cryptogram as simple as this can be easily broken. Since there are only 26 "shifts" possible in English, you could break such a code quickly by trial and error. A computer program could try all 26 in seconds, then display the 26 versions, and the viewer (or computer) would immediately recognize which was readable.

This is why simple substitution ciphers, while used for entertaining puzzles, are not used by themselves for truly secret messages. They may be used as a start, however. The Vigenère cipher, for example, uses a shift, but the shift changes from letter to letter, with each shift value determined by a repeating keyword. There are many other ways to make a cryptogram or secret code more difficult to break.
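To make the trial-and-error idea concrete, here is a short Python sketch of a shift cipher and of the "try all 26" approach. The names `caesar` and `all_shifts` are invented for this example, and the sketch only handles the 26 unaccented English letters.

```python
import string

ALPHABET = string.ascii_lowercase

def caesar(text, shift):
    """Shift each letter by `shift` positions, keeping case; other characters pass through."""
    result = []
    for ch in text:
        lower = ch.lower()
        if lower in ALPHABET:
            shifted = ALPHABET[(ALPHABET.index(lower) + shift) % 26]
            result.append(shifted.upper() if ch.isupper() else shifted)
        else:
            result.append(ch)
    return "".join(result)

def all_shifts(ciphertext):
    """Return 26 candidate readings; entry n undoes a shift of n letters."""
    return [caesar(ciphertext, -shift) for shift in range(26)]

message = "Ymnx xnruqj rjxxflj nx bwnyyjs zxnsl f Hfjxfw Hnumjw."
for shift, candidate in enumerate(all_shifts(message)):
    print(shift, candidate)
```

Only one of the 26 printed lines is readable English, and its number is the shift that was used (5 in this case).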

Ciphers or Cyphers?
Ciphers, also spelled "cyphers," are the same as "codes" in common usage.
However, there is a technical distinction used by cryptographers. A code is something that works at the level of meaning. In a coded message, for example, "big daddy" might refer to a person or a boat, or anything else. An otherwise meaningless string of letters or numbers, like "wwx23", could represent a word or a whole phrase, such as, "Meet me at the usual place." (Because of this, codes can actually shrink the length of a message, or the space and time needed to create it.)

A cipher (or cypher), on the other hand, works at the level of individual letters or at least small groups of letters, or even bits of information in the case of modern computer encryption. A simple substitution cipher, for example, might replace each letter with a two-digit number (a=11, b=47, etc.). Using both ciphers and codes in the same system makes messages even harder to decipher.

The problem with codes is that they can require a large and ever-growing code book that both the sender and receiver of secret messages must have. Some cryptographers also argue that coded messages are ultimately easier to decode or decipher than messages based on a good cipher. Ciphers, rather than codes, have become the dominant method of encryption in modern cryptography. (Cryptography is "The process or skill of communicating in or deciphering secret writings or ciphers.")

Creating a Cipher
A couple more definitions: A cipher (or cypher) is essentially an algorithm - a procedure for enciphering (encrypting) and deciphering (decrypting) information or messages. The encrypted message, sometimes called "scrambled", is referred to as "ciphertext". The decrypted message is referred to as "plaintext" (after it has been "unscrambled"). Ideally, good codes or ciphers should be "unbreakable", meaning they are impossible to decipher without having the key.

What is the key? If the cipher uses simple letter substitution, the key may simply be a chart showing which letter represents which: a=d, b=t, c=f, etc. However, a key may also be a bit of information that determines which algorithm is used. For example, a system may use twenty different letter-substitution ciphers, but which one is used may change at every fourth letter. The number "4" could be the key telling the receiver the frequency of the changes.

In a more complicated scheme, even the frequency of the changes may change. For example, suppose a sender and receiver each have the same list of ten different letter-substitution ciphers. A key might be sent separately from a message, or encoded in a different way in the message, and consist of a string of digits; for example, "3468". This key could tell the receiver to change the cipher used after 3 letters, and again after 4 letters, and again after 6 letters, and again after 8 letters, and again after 3 letters, and so on. The receiver works his way down the list of ciphers, changing to the next one at each position indicated by the key.

As you can imagine, analyzing letter frequency, one of the traditional ways of breaking codes, as outlined in Part 1, wouldn't be very helpful here. Without the key, even a message using such simple letter-substitution ciphers can be very difficult to decipher.
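Here is one way the "3468" key could be turned into a working procedure, sketched in Python. Two things are assumptions made only for illustration: the ten ciphers on the shared list are stood in for by ten simple shift values (`SHIFTS`), and the list wraps around to the first cipher when the end is reached. The function names are hypothetical as well.

```python
# Hypothetical stand-ins for the ten ciphers on the shared list: one shift value each.
SHIFTS = [3, 11, 7, 19, 5, 14, 23, 2, 9, 17]

def cipher_schedule(key, length, num_ciphers):
    """List which cipher (0 = first on the list) applies to each letter position."""
    runs = [int(digit) for digit in key]
    if not any(runs):
        raise ValueError("the key needs at least one non-zero digit")
    schedule = []
    step = 0
    while len(schedule) < length:
        run = runs[step % len(runs)]                  # how many letters this cipher covers
        schedule.extend([step % num_ciphers] * run)   # move down the list, wrapping around
        step += 1
    return schedule[:length]

def transform(text, key, shifts, decrypt=False):
    """Encrypt (or decrypt) lowercase letters, switching ciphers as the key dictates."""
    text = text.lower()
    letter_count = sum("a" <= ch <= "z" for ch in text)
    schedule = iter(cipher_schedule(key, letter_count, len(shifts)))
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            shift = shifts[next(schedule)]
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)                            # spaces and punctuation are untouched
    return "".join(out)

print(cipher_schedule("3468", 24, len(SHIFTS)))
secret = transform("meet me at the usual place", "3468", SHIFTS)
print(secret)
print(transform(secret, "3468", SHIFTS, decrypt=True))
```

The first line printed shows the pattern from the text: three letters under the first cipher, four under the second, six under the third, eight under the fourth, then three under the fifth.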

## Part 4 Cryptogram Puzzles

Here are two cryptogram puzzles for you to test your cryptography skills on. Don't have any cryptography skills yet? Then read Part Two again and practice.

## Two Cryptogram Puzzles

Puzzle # 1

Let's start with a simple Caesar cipher. If you have worked on cryptograms before, you'll want to skip past this one, as it will not be much of a challenge. Decipher the following quote from a famous mathematician:

fq pqv yqtta cdqsv aqwt fkhhkewnvkgu kp ocvjgocvkeu. k ecp cuuwtg aqw okpg atg uvknn itgcvgt. - cndgtv gkpuvgkp

Puzzle # 2

This one uses numbers in place of letters. Decipher the following quote about intelligence.

3325863186 2432 881621163412 3216 24313124331933248826 1932 3216878621163412 53243325 89863232 248833868989242686884286 198834 87163186 3286883286 33251988 5386 25195786. - 341688 258631168934

As you might imagine, a cipher using numbers can be tougher than one using letters. There are only 26 letters in English after all, while even just using a two-digit number for each letter allows for 100 possible substitutions. This isn't a very difficult cryptogram, however. It still uses a simple alphanumeric-substitution cipher, and so can be solved using letter-frequency analysis or even a "brute force attack," in which you try out the various possibilities one after the other.

Ready for the solutions to these cryptogram puzzles? Hopefully you tried to get at least one of them first, but you'll find the solutions and an explanation of how to arrive at them below.

*************** ************** ************* ************ *********** ********** ********* ******** ******* ****** ***** **** *** ** *


Cryptogram Solutions

Cryptogram Puzzle # 1 (Quote from a famous mathematician.)

fq pqv yqtta cdqsv aqwt fkhhkewnvkgu kp ocvjgocvkeu. k ecp cuuwtg aqw okpg atg uvknn itgcvgt. - cndgtv gkpuvgkp

The Solution: Do not worry about your difficulties in mathematics. I can assure you mine are still greater. - Albert Einstein

This is an easy one, to say the least. If you have been paying attention, you know that a Caesar cipher simply substitutes for each letter another letter that is a fixed number of positions away in the alphabet. In this case the shift value is two, so a=c, b=d, c=e, etc.

If you suspect a simple shift cipher (in this case it was a given), you don't need to use letter frequency analysis. What is referred to as a "brute force attack" will work fine. Start with a shift of one and test a piece of the ciphertext. In this case, "fq" would be "ep" (move back one position in the alphabet for each letter). It isn't a word, so you can stop there. A shift of two gives you "do," which is a word. Looking at the alphabet in front of you (a b c d e f g h i j k l m n o p q r s t u v w x y z), you can easily move back two positions for each letter in the second word: "pqv" = "not". The rest can be deciphered in a couple of minutes now that you have the key.

Interestingly, in this case, you could also have guessed that a "famous mathematician" might be Albert Einstein, and compared the name to the obvious attribution at the end of the ciphertext. Sure enough, the number of letters matches, and "gkpuvgkp" has a repeating pair of letters, just like the "ei" that repeats in "Einstein". If the shift had been greater than two, this might have been the faster way to solve the puzzle.

Cryptogram Puzzle # 2 (Quote about intelligence.)

3325863186 2432 881621163412 3216 24313124331933248826 1932 3216878621163412 53243325 89863232 248833868989242686884286 198834 87163186 3286883286 33251988 5386 25195786. - 341688 258631168934

The Solution: There is nobody so irritating as somebody with less intelligence and more sense than we have. - Don Herold

First of all, you know that there cannot be one digit substituted for each letter, since there are only 10 digits. On the other hand, numbers of two, three or more digits could be substituted for each letter, so how do you determine how many are used? You make an educated guess.

You might notice that there is an even number of digits in each word, so you know that either two or four digits are being used for the letters. Why not more? Because there are four words with just four digits, which have to have at least one letter, right? But in fact, four one-letter words in a short quote seems less likely than four two-letter words, so two-digit numbers are most probable (and if you are wrong, you start over and try it another way). The six-digit words, such as "198834", point the same way, since six digits cannot be split evenly into four-digit letters.

Now that you have decided the letters are represented by two-digit numbers, it will be easier to solve the cryptogram by separating out the letters. You can copy and paste the puzzle into any word processing program to do this, or use pen and paper. I would also use dividers for words, as in this example:

33 25 86 31 86 | 24 32 | 88 16 21 16 34 12 | 32 16 | 24 31 31 24 33 19 33 24 88 26 | 19 32 | 32 16 87 86 21 16 34 12 | 53 24 33 25 | 89 86 32 32 | 24 88 33 86 89 89 24 26 86 88 42 86 | 19 88 34 | 87 16 31 86 | 32 86 88 32 86 | 33 25 19 88 | 53 86 | 25 19 57 86. - | 34 16 88 | 25 86 31 16 89 34

Now you can more easily see the numbers that represent the letters. Solving cryptograms like this usually involves letter frequency analysis, as already discussed. In Part 1 there is a table showing the statistical averages for how often letters show up in English, but here is a simple distribution of letters from most frequent to least frequent:

e-t-a-o-i-n-s-h-r-d-l-c-u-m-w-f-g-y-p-b-v-k-j-x-q-z

If you write a list of the numbers used in the cryptogram, and then count how many times each appears, you can see that "86" is the most common:

| Number | Count | Number | Count |
|--------|-------|--------|-------|
| 11 | 0 | 32 | 8 |
| 12 | 2 | 33 | 6 |
| 13 | 0 | 34 | 5 |
| 14 | 0 | 42 | 1 |
| 15 | 0 | 53 | 2 |
| 16 | 8 | 55 | 0 |
| 19 | 5 | 57 | 1 |
| 21 | 2 | 86 | 13 |
| 24 | 7 | 87 | 2 |
| 25 | 5 | 88 | 8 |
| 26 | 2 | 89 | 4 |
| 27 | 0 | 90 | 0 |
| 31 | 5 | 91 | 0 |

In fact, it occurs 13 times, which is 5 more than the next closest number. It is almost certainly one of the first three letters in the frequency table, so you start with "e", since it is the most common by far.

Replacing every "86" with an "e", you can look for any other related clues. The next to last word before the author's name is "53 86", for example, meaning it is a two-letter word ending in "e", probably "be", "he", "me", or "we". "53" then, is one of the four letters: b, h, m or w. Since it shows up only twice, it is most likely a "b" or "w", since these are less common than the others.

You can play with this for a minute, to see if it yields anything. For example, "53 24 33 25" (8th word) is the only other word with "53" in it, so what four-letter words start with "b" or "w"? There are fewer "w" words, so start with those. "With" and "what" and "when" come to mind, but we can eliminate the last because there is no "86" in the word ("e"). "With" would make the following true: 53=w, 24=i, 33=t, 25=h.

Applying these to the first word of the cryptogram, you get "t-h-e-31-e". That sure looks like a word if the "31" is an "r". Furthermore, if the first word is "there", what could the two-letter second word be? Almost certainly it would be "is", giving us 24=i and 32=s.

This is a speculative guessing game to some extent, and it isn't likely you took this route to a solution. But you can guess like this and test your guesses. Otherwise you return to the frequency table and note that "16", "32" and "88" tie for the second most common numbers, appearing 8 times each. Any one of these is likely to be one of the next five or six letters in the frequency table.

You may notice that a small sample of ciphertext makes using the frequency table more difficult. In this case, for example, the letter "t" is not the second most common letter, as is normal in the English language, but sixth.
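The splitting and counting described above can also be done by a few lines of Python. This is only a sketch: the function names are invented for the example, the two-digit width is the educated guess made in the text, and `KNOWN` holds nothing more than the guesses worked out so far (86=e, 53=w, 24=i, 33=t, 25=h, 31=r, 32=s).

```python
from collections import Counter

PUZZLE = ("3325863186 2432 881621163412 3216 24313124331933248826 1932 "
          "3216878621163412 53243325 89863232 248833868989242686884286 "
          "198834 87163186 3286883286 33251988 5386 25195786. - 341688 258631168934")

def split_words(ciphertext, width=2):
    """Turn each run of digits into a list of fixed-width number tokens."""
    words = []
    for chunk in ciphertext.split():
        digits = "".join(ch for ch in chunk if ch.isdigit())
        if digits:                      # skips the lone "-" before the author's name
            words.append([digits[i:i + width] for i in range(0, len(digits), width)])
    return words

def count_tokens(words):
    """Count how many times each number appears across all words."""
    return Counter(token for word in words for token in word)

def partial_decode(words, known):
    """Fill in the letters guessed so far; unknown numbers show as '_'."""
    return " ".join("".join(known.get(token, "_") for token in word) for word in words)

# Guesses made in the text so far.
KNOWN = {"86": "e", "53": "w", "24": "i", "33": "t", "25": "h", "31": "r", "32": "s"}

words = split_words(PUZZLE)
print(count_tokens(words).most_common(4))
print(partial_decode(words, KNOWN))
```

Each new guess is tested by adding it to `KNOWN` and printing the partial solution again, which is the computer version of the double-spaced worksheet.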

Essentially you keep testing and looking for patterns. If a given assumption doesn't work to produce a word, you drop it and try another. If you notice that the word "89 86 32 32" ends in a double letter, you might stop to think about the possibilities. An "ss" or "ll" is most likely, and you know that "86" is probably "e", so you can look at possible four-letter "ess" or "ell" words to determine the first letter.

Of course, don't limit yourself to clues within the cryptogram for solutions. I did mention in the original puzzle that it was a quote about intelligence. That made it likely that the word "intelligence" might be in the message. Look for a 12-letter word with a double letter in it and you'll find one: "24 88 33 86 89 89 24 26 86 88 42 86". That gives you a lot of information to work with.
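Matching a guessed word against the ciphertext comes down to comparing patterns of repeated symbols. Here is a hedged Python sketch of that idea; `pattern` and `find_matches` are names invented for this example, and the three cipher words are copied from the puzzle.

```python
def pattern(symbols):
    """Number each distinct symbol by first appearance, e.g. 'less' -> (0, 1, 2, 2)."""
    seen = {}
    return tuple(seen.setdefault(symbol, len(seen)) for symbol in symbols)

def find_matches(cipher_words, guess):
    """Return the cipher words whose repetition pattern matches the guessed word."""
    target = pattern(guess)
    return [word for word in cipher_words if pattern(word) == target]

cipher_words = [
    "53 24 33 25".split(),
    "89 86 32 32".split(),
    "24 88 33 86 89 89 24 26 86 88 42 86".split(),
]

for word in find_matches(cipher_words, "intelligence"):
    # Pair each number with the letter it would stand for if the guess is right.
    print(dict(zip(word, "intelligence")))
```

The same comparison works for letter ciphers: `pattern("gkpuvgkp")` equals `pattern("einstein")`, which is the name-matching shortcut mentioned for Puzzle # 1.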

## Part 5 Creating Secret Codes

If you have read through the information above on codes and ciphers, and tried one or two of the cryptograms, you probably understand that simple substitution ciphers don't offer much real security for a secret message. If you want to create an unbreakable code (or as close as you can get to one), you have to do something more. On the other hand, you don't want to get too complicated. During World War Two, some Russian army units were confused by more complicated codes, and so reverted to simple Caesar ciphers. As a result, their messages were easily deciphered by the Germans. Obviously, simplicity can be a virtue in codes and ciphers, as long as they are also difficult to decipher.

## Ways to Make Tougher Codes and Ciphers

Make Mistakes on Purpose

A subscriber to my Brainpower Newsletter (now The Mind Power Report) pointed out a mistake I made in a cryptogram. A few mistakes like that would make it hard to decipher, he said. True, but for the receiver of a message who had the key, there would be no problem. For example, suppose in deciphering a message you thought "w" was "e", and then came to a word that should be "the", but was coded as "ghs". That might throw you off, but a receiver of a message would have no trouble deciphering and then understanding a message full of misspellings: "Tha munny is in an envalope unda the rok." Purposeful mistakes can make a message more secure without any real problems for the receiver.


Don't Punctuate

Another very simple way to further disguise a message is to leave out the punctuation. This simple change makes the code-breaker's job much tougher. Look at the following cryptogram from the Cryptogram Puzzles Page, with punctuation and word spacing removed:

3325863186243288162116341232162431312433193
3248826193232168786211634125324332589863232
2488338689892426868842861988348716318632868
8328633251988538625195786341688258631168934

With no separation of words, it's difficult to even guess whether letters are represented by 2 digits or 5. But will it cause any problems for the receiver? probablynotyoucanreadthiscantyou?

Mix Codes and Ciphers

Suppose "The eggs are in the basket," was code for "The money has arrived." If it was then also enciphered using numbers to replace letters, it would be doubly encrypted. Even if the code breaker deciphers the string of digits, he is left with a message that doesn't mean anything to him. A system of this sort does require that the sender and receiver have a code book in addition to a cipher key, however, so it is not as simple to use as other methods.

Use Multiple Ciphers

If each word in a message was encrypted with a different cipher, it would be very difficult to decipher. But could you keep such a system simple enough to use for communication? Yes. Suppose you have twenty simple alphanumeric ciphers. In one perhaps a=3, b=e, c=5, and so on. Each other cipher is different, but all twenty could be listed on one page. Now if each party had that key, the understanding could be that the user would simply start with the first cipher for the first word, then the second for the second word, and after twenty words start back at the first cipher, and so on. That would be a tough code to break without the key.

You might think it would be virtually unbreakable. It seems that letter-frequency analysis would be useless since the cipher changes continually. And how could anyone guess that there were twenty different ciphers being used? Actually, there is a way.

First of all, at some point in the process a code breaker would probably realize that there was more than one cipher involved. With 10 digits and 26 letters, there is a limited number of possible codes based on a simple substitution cipher (the number is far larger than it might seem, since the 26 letters alone can be rearranged in 26! ways, but you get the point: it is limited). Testing one or two words against the possible ciphers would show that no single cipher fits the whole message, which means more than one cipher was used.

At that point the code breaker might notice that there is some repetition of one-letter words, which are likely "a", and three-letter words, likely to be "and" or "the". For example, if the ciphertext was long enough he might see "6" occur three times, and "y7k" appear four times, including once at the start of a sentence. These are likely "a" and "the".

A rotating set of ciphers is not a new idea, so he considers how to determine how many ciphers were used. He counts the number of words between each "6", and "y7k", to arrive at the maximum number of ciphers that could be rotated through the message. Suppose there are only 24 words between one "y7k" and another. Both are almost certainly the same cipher (it is possible that "y7k" represents a different word in another cipher, but not likely), so he now knows that there are no more than 23 ciphers. If there were more, "y7k" could not show up again after just 24 words.

Now, on the assumption that the ciphers are simply rotated through, he takes every 23rd word out and applies letter-frequency analysis to these. This yields nothing, so he tries every 22nd word, and then every 21st word. Finally, when he tries every 20th word, he breaks one cipher, and more importantly, determines that there are 20 rotating ciphers in the message.
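The rotation scheme, and the way the code breaker pools every n-th word, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the exact system described: the twenty ciphers are random letter-for-letter substitution tables generated by a hypothetical `make_ciphers` helper (the text's ciphers also use digits), and all function names are made up.

```python
import random
import string

def make_ciphers(count, seed=0):
    """Build `count` random substitution tables (stand-ins for the shared one-page key)."""
    rng = random.Random(seed)
    tables = []
    for _ in range(count):
        shuffled = list(string.ascii_lowercase)
        rng.shuffle(shuffled)
        tables.append(dict(zip(string.ascii_lowercase, shuffled)))
    return tables

def encrypt_words(message, ciphers):
    """Encrypt word 1 with cipher 1, word 2 with cipher 2, and so on, wrapping around."""
    words = message.lower().split()
    return ["".join(ciphers[i % len(ciphers)].get(ch, ch) for ch in word)
            for i, word in enumerate(words)]

def decrypt_words(cipher_words, ciphers):
    """Reverse encrypt_words for a receiver who holds the same list of ciphers."""
    inverse = [{code: plain for plain, code in table.items()} for table in ciphers]
    return " ".join("".join(inverse[i % len(inverse)].get(ch, ch) for ch in word)
                    for i, word in enumerate(cipher_words))

def every_nth(cipher_words, n, start=0):
    """Words the attacker pools together when testing the guess that n ciphers rotate."""
    return cipher_words[start::n]

ciphers = make_ciphers(20)
secret = encrypt_words("the money is in an envelope under the rock", ciphers)
print(secret)
print(decrypt_words(secret, ciphers))
```

If the guess for `n` is right, every word returned by `every_nth` was made with the same cipher, so ordinary letter-frequency analysis applies to that group. Changing `start` from 0 to 1, 2, and so on gives the groups for the remaining ciphers.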

He can then start with the second word of the message and pull out every 20th word after that to break the second cipher. Following the same procedure, he soon has every cipher figured out. Then he can reassemble the message to read it.

As you can imagine, with the help of computers, almost any code you can invent could be broken. But it can take a long time and a lot of work to break a code that is well-constructed.

By Steve Gillman
(the end)
