
Crypto Primer: Understanding encryption, public/private key, signatures and certificates

Planky 22 Oct 2010 4:57 PM


So much of what I will write on this blog will assume a knowledge of crypto. I thought I'd create a post I could
reference back to for many future posts to keep things simple and easy to understand.
Algorithms and keys
We all know what an algorithm is. In cryptography it's not the algorithm you keep secret. The algorithm should be
designed in such a way that if it is discovered, unless the snooper has the key, it is useless to him/her. Many
algorithms are publicly published. Key secrecy is what's important. It even goes as far as to say that if you know
the algorithm, in other words, you know what mathematics was used and you also know the ciphertext itself, a
good encryption algorithm will still keep the plaintext data secret! It sounds impossible because if somebody told
me the answer is 27 and I know to get that number the algorithm divided by 10, added 4 then multiplied by 3,
I'd be able to pretty quickly calculate the input was 50. But these algorithms use special one-way functions (which
we'll look at in a moment) which make that impossible (or at least so difficult you'd never bother to attempt it) to
do.
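To see why an ordinary arithmetic recipe keeps no secrets, here is a minimal sketch of the "divide by 10, add 4, multiply by 3" example. The function names are made up for illustration; the point is only that every step can be undone in reverse order.

```python
def forward(x):
    """Divide by 10, add 4, then multiply by 3 (the steps from the text)."""
    return (x / 10 + 4) * 3


def backward(y):
    """Undo each step in reverse order: divide by 3, subtract 4, multiply by 10."""
    return (y / 3 - 4) * 10


print(forward(50))   # 27.0
print(backward(27))  # 50.0
```

A one-way function is one where nobody knows how to write a practical `backward`.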
There are 2 broad classes of algorithm: symmetric and asymmetric. Symmetric algorithms use the same key for
both encryption and decryption. Asymmetric algorithms, such as public/private key cryptography, use one key for
encryption and a different, though mathematically related, key for decryption. That last sentence sounds
counter-intuitive. As does the idea that publishing your algorithm still maintains secrecy. But that's the truth. Of
course an algorithm is only secure until/if somebody finds a weakness.
There is a scenario known as the doomsday scenario. This is the day somebody publishes a fast algorithm for
splitting a huge number back in to the prime numbers that were multiplied to make it. Much of public/private key
cryptography (used by protocols such as SSL/TLS) is based on the notion that multiplying 2 giant primes is easy,
but working backwards from the product to the primes is impractically slow.
There is a possibility that somebody has already come up with a fast factoring algorithm. It would certainly be
in their interest to keep it secret!
The encryption algorithm has 2 inputs: plaintext and the key. It has one output: ciphertext.
The decryption algorithm also has 2 inputs: ciphertext and the key. It has one output: plaintext.
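The two-inputs, one-output shape can be sketched with a toy symmetric cipher. This is an assumption for illustration only: a repeating-key XOR is not secure and is not what real protocols use, but it shows the same key doing both jobs.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with the repeating key.

    Running it twice with the same key gets you back where you started.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"secret"                                   # hypothetical shared key
ciphertext = xor_cipher(b"attack at dawn", key)   # inputs: plaintext + key
plaintext = xor_cipher(ciphertext, key)           # inputs: ciphertext + key

print(plaintext)  # b'attack at dawn'
```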

One Way Functions
Mathematical functions where it is difficult (or impossible) to get back to the source values, knowing only the
output values, are known as one-way functions. There are many, but modular arithmetic gives us a method that is
used extensively in cryptography.
A simple example is where, on a 12 hour clock face, you add 5 hours to 9 am. The answer is 2 pm. Or written down
we could say:
9 + 5 = 2
Because this is an example of modular arithmetic where the modulus is 12, we'd actually write:
9 + 5 = 2 (mod 12)
Let's take a simple function:
3^x
where x = 2
This is a function for turning 2 in to 9, because it's the same as 3 * 3, which equals 9. There is a direct relationship
between the magnitude of x and the magnitude of the function result. Using modular arithmetic can give the
function a great property: unpredictability.

x              1    2    3    4    5    6
3^x            3    9   27   81  243  729
3^x (mod 7)    3    2    6    4    5    1
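The table can be reproduced in a few lines. This sketch uses Python's built-in three-argument `pow`, which computes a power and takes the remainder in one step.

```python
xs = range(1, 7)
plain = [3 ** x for x in xs]          # grows steadily with x
wrapped = [pow(3, x, 7) for x in xs]  # jumps around once we take mod 7

print(plain)    # [3, 9, 27, 81, 243, 729]
print(wrapped)  # [3, 2, 6, 4, 5, 1]
```

Given a value from the first list you can guess roughly how big x was. Given a value from the second list, its size tells you nothing about x.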
Your password in Windows is stored as the output of a one-way function, albeit one that is considerably more
complex than what you've just seen.
Hashes
There are some very useful algorithms that produce fixed length strings. For example it doesn't matter how much
data you feed in to an MD4 hashing algorithm, you'll always get a 128 bit string out of it. That means there must
be rather a lot of input strings that will produce exactly the same output strings and indeed there are; they're
called collisions. Stumbling on one by accident is so rare you might as well consider it to never happen, in just the
same way you are very unlikely to ever have 2 GUIDs of the same value if the programming framework you use has
true randomness and uses big enough numbers for GUIDs. A good hash is a complicated one-way function, so
predicting collisions is, to all intents and purposes, impossible, unless you keep trying inputs and comparing the
output strings. This is, in theory, the only way to find one. It's called a brute force attack. You just use computing
power to very quickly run through all possible combinations. This could take a long time. A very long time. (MD4
and MD5 have since been shown to have shortcuts, so newer hashes such as SHA-256 are preferred for real work,
but they keep the examples here short.) Hashes are very useful because they allow you to perform comparisons
very quickly. If you have a large message, you can create an MD4 hash of the message and send the hash to
somebody over a slow network. They can then run a hash on data they hold, which they believe to be identical,
and compare the hash you sent them with the one they just generated. If they are the same, it means the 2
datasets are, to all practical purposes, the same.
So say if I take the string "1" and put it through hashing functions I'll end up with:
MD4: 8BE1EC697B14AD3A53B371436120641D
MD5: C4CA4238A0B923820DCC509A6F75849B
However, if I take the string "2", which is numerically almost identical to the first string, I'll get massively different
results.
MD4: 2687049D90DA05D5C9D9AEBED9CDE2A8
MD5: C81E728D9D4C2F636F067F89CC14862C
All this stuff comes in very useful with digital signatures which I'll describe a little later.
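The MD5 values can be checked with Python's standard `hashlib` module. MD4 is left out of this sketch because whether `hashlib` offers it depends on how the underlying OpenSSL library was built; the helper name `md5_hex` is made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of an ASCII string as upper-case hex."""
    return hashlib.md5(text.encode("ascii")).hexdigest().upper()


print(md5_hex("1"))  # C4CA4238A0B923820DCC509A6F75849B
print(md5_hex("2"))  # C81E728D9D4C2F636F067F89CC14862C
```

Both results are 32 hex characters (128 bits) regardless of how long the input is.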
Key Distribution
The problem with encrypting/decrypting data in the way I've told you so far is that you have to somehow get the
decryption key safely to your partner. It could easily be intercepted in this day of us all being permanently online.
It's the age-old problem that generals have had for centuries of communicating with their officers in the field. If
the messenger who has the key is captured, all your communication can be decrypted, whether or not
subsequent messengers know the key. This is called the key distribution problem.
3 mathematicians came up with an answer to this problem: Diffie, Hellman and Merkle. They do an exchange of
data which can be intercepted by anybody but which allows both sender and receiver to generate the same key
but doesn't allow the interceptor to generate the key. Sounds incredible? It's very simple to explain, now that you
understand modular arithmetic.
Follow the steps 1 through 4. In the last step both Alice and Bob have the same key: 9. From this point on they
can use 9 as their encryption and decryption key. All I can tell you is that it works. Why it works I have no idea. I
am very poorly educated! However, I'm happy to live with the fact that it just works. If it's beyond you, you're not
alone. Maybe you are also poorly educated! I do think it's really clever, neat and cool though.
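The diagram with the 4 steps has not survived in this copy, so the numbers below are an assumption: a well-known textbook choice (base 7, modulus 11, private numbers 3 and 6) that happens to arrive at the same shared key of 9. Treat it as a sketch of the exchange rather than a copy of the original figure.

```python
g, p = 7, 11            # public values, agreed in the open (assumed here)
alice_secret = 3        # Alice never sends this
bob_secret = 6          # Bob never sends this

# Steps 1 and 2: each side raises g to its secret, mod p.
alice_public = pow(g, alice_secret, p)   # 2
bob_public = pow(g, bob_secret, p)       # 4

# Step 3: the two results are swapped over the open network.

# Step 4: each side raises what it received to its own secret, mod p.
alice_key = pow(bob_public, alice_secret, p)
bob_key = pow(alice_public, bob_secret, p)

print(alice_key, bob_key)  # 9 9
```

As for why it works: Alice ends up with (g^b)^a and Bob with (g^a)^b, and both are simply g^(a*b) (mod p). A snooper sees only 2 and 4, and getting from those back to 3 or 6 is the one-way part.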
Asymmetric Key Encryption
We use one key to encrypt, and a related key to decrypt data. You can actually swap the keys round. But the point
is you don't have just one key. This gets round the key distribution problem. There's a great way of describing the
difference between symmetric and asymmetric key encryption. It involves the use of a box to put messages in and
we have to assume the box, its clasps and the padlock used to lock it are impossible to penetrate.
Symmetric Key: You send a messenger out with a copy of the key. He gets it to your recipient who lives 10 miles
away. On the way he stops at a pub and has his pocket picked. The key is whisked off to a locksmith who copies it
and it is then secreted back in to the messenger's pocket.
Some time later you send a messenger with the box containing your message. You are confident that your
recipient is the only one who can read the message because the original messenger returned and reported
nothing unusual about the key. The second messenger stops at the same pub. He is pick-pocketed. The copy key
is used to unlock the box and read the message. The box with its message intact is secreted back in to the
messenger's pocket. You and your recipient have no idea that your communication has been compromised. There
is no secrecy.
Asymmetric Key: Your recipient has a padlock and key. He keeps the key in a private place about his person. Let's
therefore call it a private key. He puts the padlock in to the box, but leaves it unlocked. He doesn't mind if
anybody sees the padlock. It's publicly viewable. Even though it's not really a key, let's call it a public key. He
sends a messenger to you with the box. The messenger stops at the pub and is pick-pocketed. All the snoopers
see is an open padlock. They secretly return the box. The messenger arrives at your door. You take the padlock
out of the box and put your message in to it. You use the open padlock to lock the box, snapping it shut, and you
send the messenger on his way. He again stops at the pub and is pick-pocketed. They find only a padlocked box.
No key. They have no way of getting in to the box. They secretly return the box to the messenger's pocket. The
messenger gets to your recipient, who takes the key he secreted in a private place about his person (the private
key) and uses it to unlock the padlock and read the message. Secrecy is maintained.
You can see the process is a bit more complicated for asymmetric key than for symmetric key, so it's not
something you'd want to do often. So what is often done is that instead of putting a message in the box and
padlocking it, a symmetric key is put in the box and padlocked. That way, you solve the key distribution problem.
That's what happens with computer cryptography mostly. Public/private key cryptography is used to transport a
symmetric key that is used for message exchanges. One reason for doing this is that asymmetric key crypto, or
public/private key crypto, as it is known, is expensive in terms of computing power, whereas symmetric key
crypto is much more lightweight. When you see that a web site uses 256 bit encryption, they are talking about the
symmetric key that is used after the public/private key crypto was used to transport the symmetric key from
sender to receiver. Often the key length for public/private key cryptography is 2048 bits. You may have found
yourself confused when setting up IIS with 256 bit SSL encryption and seeing keys of 1024 or 2048 bits. This is
why: it's the difference between what's called the session key and the public/private keys.
Although the diagram above explains how 2 keys are used, where does all this public and private key malarkey
come in to play?
Let's take the example of an ecommerce web server that wants to provide SSL support so you can send your
credit card details securely over the Internet. Look at the public and private keys in the following diagram.
The public and private keys are held on the ecommerce web server. The private key is heavily protected in the
keystore. Many organisations will go as far as to have a special tamper-proof hardware device to protect their
private keys. The public key doesn't need to be protected because it's, well, public. You could have daily printouts
of it in the newspapers and have it broadcast every hour, on the hour, on the radio. The idea is that it doesn't
matter who sees it.
The website generates the public and private keys. They have to be generated as a key-pair because they are
mathematically related to each other. You retrieve the public key from the website and use it as your encryption
key. You're not just going to send your credit card information across the Internet yet. You're actually going to
generate a symmetric key and that is going to become the plain-text input data to the asymmetric encryption
algorithm. The cipher-text will traverse the Internet and the ecommerce site will now use its private key to decrypt
the data. The resulting output plain-text will be the symmetric key you sent. Now that both you and the
ecommerce site have a symmetric key that was transported secretly, you can encrypt all the data you exchange.
This is what happens with a URL that starts https://.
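The flow can be sketched end to end with toy numbers. Everything here is an assumption made for illustration: the key pair is the tiny one this post derives further down (public 187 and 7, private 23), the session key is a single small number, and the symmetric cipher is a throwaway XOR. Real SSL/TLS uses far larger keys and proper ciphers.

```python
import secrets
from itertools import cycle

N, E = 187, 7   # the site's public key (toy values from later in the post)
D = 23          # the site's private key, which never leaves the server


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Throwaway symmetric cipher standing in for the real session cipher."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Browser: invent a session key and lock it with the site's public key.
session_key = secrets.randbelow(N - 2) + 2      # random number from 2 to 186
wrapped_key = pow(session_key, E, N)            # this is what crosses the Internet

# Server: unlock the session key with the private key.
unwrapped_key = pow(wrapped_key, D, N)

# Both sides now hold the same symmetric key and can talk cheaply.
ciphertext = xor_cipher(b"card 4111", bytes([session_key]))
recovered = xor_cipher(ciphertext, bytes([unwrapped_key]))

print(recovered)  # b'card 4111'
```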
There are still a couple of problems to solve here, but let's put them on to the back-burner for a little while. We
need to understand digital signatures and certificates for those problems. In the meantime let's have a peek at
the mathematics inside the public/private key algorithm. There is an interesting little story-ette around this
algorithm. A researcher at the UK's GCHQ called Clifford Cocks invented the algorithm in 1973. However, working
for GCHQ, his work was secret, so he couldn't tell anybody. In 1977, 3 mathematicians, Ron Rivest, Adi
Shamir and Leonard Adleman, also invented it. They went on to create the security company RSA (which stands for
Rivest, Shamir and Adleman). It is said the RSA algorithm is the most widely used piece of software in the world.
First, we'll generate the public key. We pick 2 random giant prime numbers. In this case, I'll pick 2 small primes to
keep it simple: 17 and 11. We multiply them to get 187. We then pick another prime: 7. That's our public key: 2
numbers. Pretty simple.
Now we use the 2 primes and the public key to generate the private key. We run them through the algorithm in the
diagram above. You can see we use modular arithmetic. Obviously the numbers would be massive in real life. But
here, we end up with a private key of 23. The 160 comes from the 2 primes: (17 - 1) * (11 - 1) = 160, so only
somebody who knows the primes can work it out. The function, 7 * d = 1 (mod 160), has that look of simplicity, but
it's not like that at all. With large numbers we'd need to use the Extended Euclidean Algorithm. I have to say, my
eyes glazed over and I was found staring in to the distance when I read this about it:
The extended Euclidean algorithm is particularly useful when a and b are coprime, since x is the modular
multiplicative inverse of a modulo b.
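For the small numbers in this post the private key can be worked out with a short recursive version of the Extended Euclidean Algorithm. This is a minimal sketch; the function name `extended_gcd` is made up, and real libraries use more careful implementations.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


p, q, e = 17, 11, 7          # the 2 primes and the public exponent from the text
N = p * q                    # 187, the other half of the public key
phi = (p - 1) * (q - 1)      # 160

g, x, _ = extended_gcd(e, phi)
assert g == 1                # e and phi must be coprime for an inverse to exist
d = x % phi                  # the private key

print(N, d)              # 187 23
print((e * d) % phi)     # 1
```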
Now we want to use that to encrypt a message.
To keep things simple, we'll send a single character: X. ASCII for X is 88. As we are the sender, we only know the
public key's 2 values: 187 and 7, or N and e. Running 88 through the simple algorithm gives us the value 11. We
send the ciphertext value 11 to the ecommerce web server.
The web server has access to the private key, so it can decrypt the ciphertext.
The web server passes the ciphertext through the algorithm shown above and gets us the original X that was
sent. The bit that says:
Plaintext = 11^23 (mod 187)
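Both directions can be checked in a few lines, using only the numbers already given in the post.

```python
N, e, d = 187, 7, 23

plaintext = ord("X")                 # 88
ciphertext = pow(plaintext, e, N)    # sender: 88^7 (mod 187)
recovered = pow(ciphertext, d, N)    # server: 11^23 (mod 187)

print(ciphertext)       # 11
print(chr(recovered))   # X
```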
OK, there's actually a problem here. In this message, every X would come out in ciphertext as the value 11. We
could perform a frequency analysis attack on the message. In the English language, certain letters tend to appear
more frequently than others. The letters "e" and "i" for example are very common, but "x" and "z" are uncommon.
That gives the ciphertext a statistical signature that could be used to find the content of a message. We therefore
need to encrypt much larger blocks of data than just one byte at a time.
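The leak is easy to demonstrate. In this sketch the message "XOXOXX" is made up for illustration; each character is encrypted separately with the toy public key, and the repeated letters show straight through.

```python
from collections import Counter

N, e = 187, 7
message = "XOXOXX"   # hypothetical message, encrypted one character at a time

ciphertext = [pow(ord(ch), e, N) for ch in message]
counts = Counter(ciphertext)

print(ciphertext)             # every X becomes the same number, 11
print(counts.most_common(1))  # the most frequent value gives away the most frequent letter
```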
Digital Signatures
Asymmetric keys, as mentioned earlier, can be swapped around. If you use one key for encryption, you must use
the other key for decryption. This feature comes in very handy for the creation of digital signatures. You've heard
of digitally signed documents, Authenticode, digitally signed applications, digital certificates and so on.
In the diagram you can see all we've done is combined some plaintext in to the same message as its equivalent
ciphertext. When it's time to check a digital signature, we reverse the process:
To check a message, we decrypt the encrypted portion, and get back plain text. We then compare that to the
plain text in the message. If the 2 sets of plain text are different, it means either:
The plaintext in the message has been altered and that accounts for the difference.
The ciphertext in the message has been altered and that accounts for the difference.
They have both been altered and that accounts for the difference.
In order to have a consistent message, the attacker would need to have access to the key that was used to
generate the ciphertext.
Do you remember earlier, I talked about hashes? Well, because a message might be quite large, it's often best to
generate a hash of the message and encrypt that. If it's an MD5 hash, it means you'll only have to encrypt 128
bits (16 bytes). When you come to perform the validation of the signature, you have to take the plain text portion
and generate a hash before you do the comparison. It just uses the CPU more efficiently.
In this case, the message consists of a small section of ciphertext because the string-size of the input plaintext
was reduced through hashing before it was encrypted. It also includes the plaintext of the message.
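Hash-then-sign can be sketched with the toy key pair from earlier in the post (187, 7 and 23). The helper names and the message are made up, and squeezing the MD5 hash down to fit under 187 is purely an assumption to make the small numbers work; real signatures keep the full hash and add padding.

```python
import hashlib

N, e, d = 187, 7, 23   # toy key pair from earlier in the post


def toy_digest(message: bytes) -> int:
    """Hash the message, then shrink it to fit under N (toy step only)."""
    return int.from_bytes(hashlib.md5(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Encrypt the hash with the PRIVATE key."""
    return pow(toy_digest(message), d, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the PUBLIC key and compare with a fresh hash."""
    return pow(signature, e, N) == toy_digest(message)


message = b"Pay Bob 10 pounds"
signature = sign(message)

print(verify(message, signature))                # True
print(verify(message, (signature + 1) % N))      # False: altered signature
```

With a modulus this small, an altered message still has roughly a 1 in 187 chance of passing by luck, which is one more reason real keys are thousands of bits long.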
Depending on the data you are looking at, you'll often even find the keys you can use to decrypt the message in
plaintext within the message body. It seems like complete madness because anybody who intercepts the message
could simply modify the plaintext portion of the message and then use the included key to generate a new
ciphertext equivalent. That would make the message consistent.
However, if the plaintext key included in the message is the message-issuer's public key, then the attacker would
need access to the corresponding private key, which they won't get because it's, well, private.
But even with this there is still a problem. How do you know the message came from the sender it purports to
come from? As an attacker, I could easily generate my own key-pair. I could then create a message that says I am
the issuer and use my private key to create the encrypted part of the message.
When you come to check the message you'll know that it definitely wasn't tampered with in transit, but how do
you know you can trust the public key embedded in to the message? How do you know that it's me that created
the message? That's where digital certificates come in to play.
Certificates
Certificates are data structures that conform to a specification: X.509. But really they are just documents that do
what we just talked about. The plain text data is the public key, plus other distinguishing information like the
issuer, the subject name, common name and so on. It is then hashed and the hash is encrypted using the private
key of a special service called a certification authority (CA), a service that issues certificates.
When we protect a web server with an SSL certificate, we go through a 2 stage process: generating a certificate
request, and then finishing it off by receiving and installing the certificate. The request part generates a public
and private key. The public key plus the distinguishing information is sent to the CA, which then creates a digitally
signed document, signed using the CA's private key. The document conforms to X.509 certificate standards. The
certificate is returned by the CA, and we install it on our web server.
Anytime anybody connects to the web server over SSL, they retrieve the certificate and perform a signature
validation on it. Remember it was signed by the CA's private key. So they have to have the CA's public key to
perform the validation. If you go in to Internet Explorer's Internet Options and then to the Content tab, you'll see a
Certificates button. That shows you all the CAs you have certificates (and therefore public keys) for. It means if you
see a certificate that was signed by a CA on a web site, in theory, the CA did a check to make sure the requester
was indeed the requester before issuing the certificate. It means that you have to trust that the CA did a good job
of checking the requester's validity before issuing the certificate.
Even this creates a minor problemette: how do you know the CA's certificate wasn't created by an imposter of
some description? Well, it can have its certificate signed by a CA higher up the food-chain than itself. Eventually
you get to a CA at the top of the food chain and this is called a Root CA. Internet Explorer and all the other
browsers have all the main Root CAs for the Internet built-in. These are trusted organisations. They have to be!
They are so trusted, they are able to sign their own certificates.
You may from time to time play with a Visual Studio command line tool called makecert.exe. It's a tool that
creates self-signed certificates. If you are just using them for development purposes on a local machine they are
probably fine. You trust yourself, presumably. Sometimes you can use self-signed certificates on Internet-facing
services. For example if you upload your own self-signed certificate to a service and you are sure nobody
intercepted it while you were uploading it (because you were using SSL maybe), it means you can have private
conversations with the service and you can be sure the service is the service you issued the certificate to. If you
just sent the naked certificate, they'd be able to encrypt messages that only you could decrypt, because you'd
have the private key. It's possible to also include the private key when you create a certificate. If you send one of
these certificates to an Internet service, they can digitally sign messages they send to you with your private key.
Because you are assured that you gave the private key only to the service, you can be sure the messages are
genuinely coming from that service and not an imposter. You have to trust that the service does a good job of
keeping your private key safe.
Of course it wouldn't be practical if every time you wanted to buy something on the Internet, in order to create an
SSL connection you had to first upload a self-signed certificate. That's why there is a large infrastructure of CAs
and Root CAs built on the Internet. This infrastructure is called a Public Key Infrastructure or PKI. Many
organisations have their own internal PKIs.
Above: Internet Explorer's list of Trusted Root CAs.
You can also see the chain of CAs up to that chain's corresponding Root CA when you look at a certificate.
This shows an expired certificate that was issued to my BPOS (Office 365) account by the CA called "Microsoft
Online Svcs BPOS EMEA CA1". Its certificate was in turn issued by "Microsoft Services PCA" which had its
certificate issued by "Microsoft Root Certificate Authority". As it's a Root CA, it appears in the Trusted Root CAs
container in Internet Explorer. As you walk up the chain you have to eventually get to a point where you trust the
certificate. If you don't, you'll get a certificate error warning and a lot of messages advising you not to continue.
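Walking up the chain can be sketched as a simple loop. This is a heavily simplified assumption: the dictionary, the label "my BPOS certificate" and the function name are made up for illustration, and the sketch only follows issuer names. A real browser also checks every signature on the way up, plus validity dates and revocation.

```python
# Who issued whom, using the CA names from the chain described above.
issued_by = {
    "my BPOS certificate": "Microsoft Online Svcs BPOS EMEA CA1",
    "Microsoft Online Svcs BPOS EMEA CA1": "Microsoft Services PCA",
    "Microsoft Services PCA": "Microsoft Root Certificate Authority",
    # A Root CA signs its own certificate.
    "Microsoft Root Certificate Authority": "Microsoft Root Certificate Authority",
}

trusted_roots = {"Microsoft Root Certificate Authority"}


def chain_is_trusted(name, issued_by, trusted_roots):
    """Follow issuer links upwards until we reach a trusted root or run out."""
    seen = set()
    while name not in seen:
        if name in trusted_roots:
            return True
        seen.add(name)
        issuer = issued_by.get(name)
        if issuer is None:
            return False        # unknown issuer: the chain is broken
        name = issuer
    return False                # self-signed at the top, but not one we trust


print(chain_is_trusted("my BPOS certificate", issued_by, trusted_roots))  # True
print(chain_is_trusted("my BPOS certificate", issued_by, set()))          # False
```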
I'll write another post soon that goes through a complete SSL handshake. That's a great way to explain what's
happening in crypto.

Planky
Comments
kunku # 23 Oct 2010 11:17 PM
thanx planky.......i am only a beginner in the field of information security. this article helped me to
understand easily the concepts behind cryptography...
...Planky # 28 Oct 2010 3:49 AM
Great! Thanks.
I did a quick search for a crypto primer and only unearthed things primarily aimed at the advanced
mathematician so thought an article that skimmed the maths but gave an overview of the principles
involved was needed. Plus - my mathematics isn't strong enough to go any deeper than this article!
Planky
Keith # 29 Oct 2010 7:05 AM
I am pretty sure Euclid's algorithm is not used to calculate
Plaintext = 11^23(mod 187)
There is nothing special about that calculation. On the other hand, calculating the private key d=23 in
7 * d = 1 (mod 160)
does require Euclid's algorithm (when larger numbers are involved). Actually, it requires an extended
version because Euclid's algorithm only calculates the greatest common divisor of two numbers.
See en.wikipedia.org/.../Extended_Euclidean_algorithm
...Planky # 29 Oct 2010 2:58 PM
Thanks Keith. I'll update the post to show how improbable it is to calculate the private key, which is the
actual problem space.
I was always in danger of having somebody who knows what they are talking about picking me up on
the mathematics (or the "math" as they say in the US. Often wondered why the "mathematics" is plural
but the shortened version is singular)..
I might even edit the post to include elements of the article you reference and make it seem as though I
actually know all about the "math". But I'm sure I'll be rumbled.
Thanks for sticking with it actually Keith, I know when I come across stuff I already know, I just skim-read
it... Or perhaps this mistake was so obvious on the page it was shouting "I AM WRONG!".... :-)
Keith # 29 Oct 2010 4:29 PM
I should have added that the primer is a good introduction. I am currently taking a computer security
course and this is explained in terms very similar to how my professor did it.
...Planky # 30 Oct 2010 1:28 AM
I wonder if that means I could become a professor? Professor Planky has quite a nice ring to it doesn't
it?
Thanks Keith.
Tom # 31 Oct 2010 8:57 AM
Excellent article Planky. I definitely fall under the 'skim-read' category due to previous knowledge and as
such I didn't check your mathematics thoroughly but your front to back explanation is excellent and
concise. I think this may be one of the best that I have seen. Definitely a great primer for anyone looking
to understand basic encryption techniques and as it applied to secured web traffic. Well done!
Keep up the good work :)
...Planky # 31 Oct 2010 1:26 PM
Well - that's good to know! Looks like a professorship is in the bag! Just probably not for mathematics...
Thanks Tom.
Planky
No-one # 31 Oct 2010 3:18 PM
You should listen to the first 100 or so episodes of "Security Now", a podcast with Steve Gibson and Leo
Laporte. It complements and extends this material.
...Planky # 1 Nov 2010 3:14 AM
I just had a look at the site and downloaded a bunch of general interest stuff I'll listen to on my WP7
when I'm travelling/in the gym/driving....
Quite a range of material, including stuff I've always been interested in but never took the time to
investigate...
Thanks for the pointer...
Planky
Ashish # 26 Dec 2010 9:24 AM
Thanks a lot , Very good explanation !!!!
dirce silviapinto dovalke1109n9aten dirce9 # 5 Feb 2011 12:52 AM
casa dia 1109n9fone1106n6dirce6= antonio almeida1106n6 orlanda dias
soares1061107n7f40232306dirce6=banco santanderdia 1109n9ate dirce9fone1109n9
Planky # 5 Feb 2011 3:53 AM
that last comment some kind of crypto?
Ramesh # 3 Mar 2011 4:53 AM
Excellent explain. Last line is crypto :-) ?
senthil murugan # 26 Apr 2011 12:26 AM
just i started to learn cryptography today. its an gud introduction on cryptography with the whole
picture in mind.
uoy knaht,
planky..
...Planky # 26 Apr 2011 3:06 AM
Hi Ramesh,
"dirce silviapinto dovalke1109n9aten dirce9" (now that's what I call a name!) wrote, as the last line of his
comment "soares1061107n7f40232306dirce6=banco santanderdia 1109n9ate dirce9fone1109n9". Surely
that's some cryptotext?
Hi Senthil Murugan,
"uoy knaht" to you too - I'm glad you found it useful...
azra # 13 Jun 2012 8:43 AM
thanks.. it z a nice article
