* 12 *

Encryption

Fact of the week

Quantum computing is a theoretical way of using the properties of matter at the quantum mechanical (microscopic) level) in order to build computational systems. Quantum computers would not work like ordinary digital computers, but on a whole new principle. It has been theorized that quantum computers could break standard encryption codes is a fraction of a second. The race is on to build them.

Chapter 12 Gollmann
Some years ago it was thought that encryption (encipherment) had nothing to offer computer systems, today the reverse is true: encryption has become the latest buzzword for selling products. It is presented as the solution to everybody's problems. As usual, the truth lies somewhere in between the extremes.

To round off our discussion of software security, let's spend a lecture looking at a few of the central issues involved in encrypting information. Since encryption is such a huge subject we shall aim only for two rather modest accomplishments:

Encryption can be used either for transferring data over some communcations link or for storage. In practice, any useful encryption scheme requires a secret key. The security of the encryption depends on the security of the key. We shall take it for granted that keys are secure in this lecture.

Ciphers

A cipher is a scheme for transforming readable text into unreadable form. Ciphers were used as early as Julius Caesar, by permuting letters of the alphabet according to well defined rules. For instance, in the film 2001: A space odyssey, many have speculated about the name of the computer HAL. If we shift every letter one character to the right in the alphabet, we get IBM. Was this a crude cipher?

In times of war, ciphers and encryption have been developed into an extensive technology. These days we use general ciphers based on secret keys in order to increase their flexibility. Why do we need a key? A key is a parameter to an encryption algorithm. It turns a single encryption algorithm into a family of encryption algorithms. If we had only only one coding transformation, then it could only be used between those who shared a knowledge of the algorithm. Everyone would need to have their own separate, private algorithms in order to ensure private communication. With a parameter key, everyone can customize the algorithm for their own use, by using a variant of the common code. Moreover, sooner or later a code will be broken, if we have to rewrite an entire algorithm in order to recover secure communication, it will cost us a lot.

The general principle of ciphers is to change text into an unreadable form, according to a well defined algorithm. The algorithm must be designed in such a way that it is impractically difficult to invert the procedure without intimate knowledge of some secret information (a key).

If we only have to change only one parameter, then our encryption scheme is much cheaper to maintain. The algorithm makes the message unreadable, the key makes it private.

As Gollmann says:

Cryptography is rarely the solution to a security problem. Cryptography
is a translation mechanism, which converts a communications security problem
into a key-management problem. It is a classic case of security through obscurity.

Cryptography is a complex and highly mathematical discipline. It is well beyond the scope of this course. The purpose of this lecture is to provide a taster of how the two most important encryption schemes work, and how they are used.

DES

The standard encryption mechanism in the western world, for the last twenty years, has been DES, or the Data Encryption Standard. It is based on a symmetric cipher, i.e. a secret key which is shared by both parties, not an asymmetric cipher like RSA. The algorithm was developed during the 1970's by the American government. It is still in use today, in an extended form called triple-DES, or 3DES. Whereas the original DES used only one 56-bit key, 3DES uses three 56 bit keys (plus one parity byte) in order to increase the difficulty of breaking the cipher. It is an iteration of DES:
3DES_encrypt(k1,k2,k3,message) = DES_encrypt(k1,DES_decrypt(k2,DES_encrypt(k3,message)))
DES is based on the idea of transformation, permutation and masking. DES takes 32 bits on input at a time and uses a non-linear function to expand them into a 48 bit block. The 56 bit key is used as a database for extracting a number of 48 bit subkeys which are then used to mask out bits with a reversible exclusive-OR. These are then permuted and converted into output.

The basic DES alghorithm is used in a number of ways:

DES encryption requires the input/output buffers to be a multiple of 8-bytes, owing to the nature of the algorithm.

RSA

Asymmetric public/private key encryption came of age with the RSA algorithm. RSA is based on the idea of modulo arithmetic (clock arithmetic) and the difficulty of factoring products of large prime numbers which is made ambiguous by the cyclic nature of clock arithmetic.

Modulo arithemtic divides integers into equivalence classes. If

\begin{displaymath}a = b ~{\rm mod}~m \end{displaymath}

it means that

\begin{displaymath}b = \lambda m + a \end{displaymath}

for any integer $\lambda$. In other words, when we go a whole revolution around the clock, we come back to the same place. It is not difficult to convince oneself that the following properties hold modulo m:

\begin{displaymath}(a ~{\rm mod}~m) + (b ~{\rm mod}~m) = (a + b) ~{\rm mod}~m \end{displaymath}

More surprisingly perhaps:

\begin{displaymath}(a ~{\rm mod}~m) \times (b ~{\rm mod}~m) = (a \times b) ~{\rm mod}~m
\end{displaymath}

This can be understood, if we write it like this:

\begin{displaymath}(a - \lambda m) \times (b -\lambda' m) = (ab - \lambda'' m)
\end{displaymath}

Note that this means that

\begin{displaymath}(a ~{\rm mod}~m)^n = a^n ~{\rm mod}~m
\end{displaymath}

One strange property of modulo arithmetic is that division and multiplication are the same thing, modulo m, since we can always get to where we started by going further around the clock, rather than by going backwards. So, while the inverse a-1 of a number a, in normal arithmetic is always 1/a, that is not the case in modulo arithmetic (which always deals with integers). If the number m is a prime number p, then there are theorems which reveal special properties. In fact, for any integer which is not a multiple of p, it is always possible to find a-1such that

\begin{displaymath}a.a^{-1} = 1 ~{\rm mod}~p.
\end{displaymath}

Fermat's little theorem (no relation to its big brother) states the remarkable result that, for any prime number p,

\begin{displaymath}a^{p-1} = 1 ~{\rm mod}~p,
\end{displaymath}

provided a is not a mulitple of p. If we generalize this by taking powers of both sides, then we have

\begin{displaymath}(a^{p-1})^\alpha = a^{\alpha(p-1)}= 1^\alpha ~{\rm mod}~p = 1 ~{\rm mod}~p.
\end{displaymath}

The RSA algorithm is based on these properties. It begins by generating two very large prime numbers p and q and their product pq. The quantities p-1 and q-1 have a special significance due to Fermat's theorm. The encryption key is then any random number which has no common factors (is relatively prime) to (p-1)(q-1). The decryption key can then be calculated by

\begin{displaymath}d = e^{-1} ~{\rm mod}~(p-1)(q-1)
\end{displaymath}

If we now wish to encrypt a message M (whose length is less than pq) we compute the cipher text

\begin{displaymath}C(M) = M^e ~{\rm mod}~pq.
\end{displaymath}

The decryption is given by
C-1(M') = $\displaystyle {M'}^d ~{\rm mod}~pq$  
  = $\displaystyle (M^e ~{\rm mod}~pq)^d ~{\rm mod}~pq$  
  = $\displaystyle M^{ed} ~{\rm mod}~pq$  
  = $\displaystyle M^{\lambda(p-1)(q-1)+1} ~{\rm mod}~pq$  
  = $\displaystyle M ~{\rm mod}~pq$ (1)

The security of the RSA algorithm has been tested extensively and all indications are that it is very good, provided the product pq is sufficiently large.

The RSA algorithm is expensive to compute, so it is normally used only during the establishment of credentials. Thereafter, session keys are created and a cheaper form of encryption is used to transmit actual data.

Random numbers

In order to use encryption schemes we have to generate random numbers. Random numbers are used to create session keys and also to create long-term shared keys. Most random number generators require a seed. Truly random numbers are difficult to generate. Many programmers use the time of day as a random number seed, but this is easily guessed. A better way to generate a random number is to take the state of the system (e.g. the process table text) and crunch it through MD5 to produce a uniqe representation of the state of the system. This should be impossible to guess.
DESMD5Random(digest)

char digest[16];

   /* Make a decent random number by crunching some system states through
      MD5. We can use this as a seed for pseudo random generator */

{ MD5_CTX context;
  int len;
  unsigned char buffer[bufsize],buff[bufsize];
  FILE *pp;
  char pscomm[bufsize];

printf("Looking for a good random number seed...\n");
cfMD5Init (&context);
sprintf(buffer,"%d%d",APPSTARTTIME,digest);
cfMD5Update (&context, buffer,bufsize);

/* Have to use /bin/ps -ef or /usr/ucb/ps aux for Solaris */
 
if ((pp = cfpopen("/bin/ps aux","r")) == NULL)
   {
   printf("Couldn't open the process list with command %s\n",pscomm);
   perror("cfpopen");
   return;
   }

while (!feof(pp))
   {
   ReadLine(buffer,bufsize,pp);
   cfMD5Update (&context,buffer,bufsize);
   }

cfpclose(pp);
 
cfMD5Final (digest, &context);
}

This can be used to generate three random numbers for use as 3DES keys:
 char digest[16];
 char seed[8],key[8];

 DESMD5Random(digest);
 bcopy(digest,seed,8);
 des_random_seed(seed);
 des_random_key(key);
 printf("%8.8s",key);
 des_random_key(key);
 printf("%8.8s",key);
 des_random_key(key);
 printf("%8.8s",key);

3DES encryption

Programming

The encryption functions require a little setting up. Let us suppose that we are going to send a whole file (many buffers in succession) using a chaining cipher.
sprintf(in,"Message");

cfencrypt(in,out,DES1,DES2,DES3,strlen(in));

if ((send(sd,out,bufsize,0)) == -1)
   {
   return false;
   }

...(transfer)

if ((n_read = recv(sd,inbuffer,bufsize,0)) == -1)
   {
   return false;
   }

cfdecrypt(inbuffer,outbuffer,DES1,DES2,DES3,bufsize);

printf("%s",outbuffer);
The functions are defined as follows, in terms of the standard DES library functions from OpenSSL:
cfencrypt(in,out,key1,key2,key3,len)

char *in,*out;
char key1[8],key2[8];
int len;

{ char workvec[bufsize];
  des_key_schedule ks1,ks2,ks3;

des_set_key((C_Block *)key1,ks1);
des_set_key((C_Block *)key2,ks2);
des_set_key((C_Block *)key3,ks3);

memset(workvec,0,bufsize);

des_ede3_cbc_encrypt(in,out,(long)len,ks1,ks2,ks3,(C_Block *)workvec,DES_ENCRYPT);
}

The decryption must be set up like this:
cfdecrypt(in,out,key1,key2,key3,len)

char *in,*out;
char key1[8],key2[8],key3[8];
int len;

{ char workvec[bufsize];
  des_key_schedule ks1,ks2,ks3;

memset(workvec,0,bufsize);
des_set_key((C_Block *)key1,ks1);
des_set_key((C_Block *)key2,ks2);
des_set_key((C_Block *)key3,ks3);

des_ede3_cbc_encrypt(in,out,(long)len,ks1,ks2,ks3,(C_Block *)workvec,DES_DECRYPT);

out[len] = '\0';

Debug("Decrypted %d to [%d]\n",len,strlen(out)); 
}

If we don't initialize the workvectors and keys correctly, then the decryption will not work.

Sending session keys

If we want to user a one-time cipher, i.e. a different set of keys for each transmission, to confuse a potential attacker, then we still need a shared secret to get started.
  1. Establish a connection.
  2. Generate random session key(s).
  3. Encrypt session keys with pre-agreed keys.
  4. Send random session key(s) to remote host as part of protocol.
  5. Remote host descrypts session keys with pre-agreed keys.
  6. Remote host replies with message encrypted using session keys.
After the hosts have exchanged keys, they can continue to communicate using the session keys. Only the initial key agreement protocol requires the shared secret key.

SSL

SSL is the industry standard protocol for the web. Developed by Netscape, SSL is a transparent set of wrappers which allow normal socket based communication to be encrypted using the combination of RSA at the start with a message transfer cipher. The OpenSSL contains everything necessary to create applications which use SSL and includes a few examples of its use. Web browsers use this in order to pipe normal HTTP protocol over a secure line (HTTPS).

The OpenSSL package contains examples of how SSL applications work. We shall not repeat these here, since they take up a lot of space. However, let's compare a standard client connection with an SSL client connection. A normal client-side socket connection proceeds with calls to

sd = socket(..);
connect(sd,address..);

  {
  write(sd..)  or send(sd ..)
  read(sd... ) or recv(sd ..)
  }

close(sd);

Under SSL, a client opens a socket and then starts an SSL session on that socket. After this, it is simply a matter of swapping normal read/recv and write/send commands for their SSL counterparts.

sd = socket (..);

err = connect(sd ... );

  /* Now we have TCP conncetion. Start SSL negotiation. */

ssl = SSL_new (ctx);
SSL_set_fd (ssl, sd);

err = SSL_connect (ssl);

server_cert = SSL_get_peer_certificate (ssl);

printf ("Server certificate:\n");

str = X509_NAME_oneline (X509_get_subject_name (server_cert),0,0);

  {
  err = SSL_write (ssl, "Hello World!", strlen("Hello World!"));
  err = SSL_read (ssl, buf, sizeof(buf) - 1);                    
  }

SSL_shutdown (ssl);  /* send SSL/TLS close_notify */
close (sd);
SSL_free (ssl);
SSL_CTX_free (ctx);                                             
Thus, apart from the negotiations at the start and some extra cleaning up to do at the end, there is very little work to do for the programmer. However, the SSL certificates do have to be generated in a similar way to the procedure we went through to set up the secure shell in an earlier exercise.

Thought for the week

Some system administration packages boast that they are safe because they work over encrypted connections. Does encryption do anything to increase your trust of the data you receive from someone? Is encryption useful only as a means of privacy? What about authenticity?

Back