M ASTER  S OFTWARE QUALITY SOFTWARE SINCE 1958 |
  |
PROTECT YOUR PRIVACY -- PROTECT YOUR SECURITY
with the strongest line of data file and message encryption software available. |
1. INTRODUCTION There have been many cryptographic methods invented since writing itself was developed. This overview is designed to give prospective users of a cryptographic system a broad idea of where their system fits into the whole spectrum of cryptographic methods and algorithms. It also can give them an idea of how strong a particular method might be in relation to other methods, and a general concept of how the strength of a cryptographic system can be judged. We have tried to write this overview using a minimum of math, so that non-technical readers can understand most or all of the material. None of the subjects is covered to any depth, in the hope that the entire survey can be read in less than 1 hour. This treatment deals only with the different kinds of algorithms or methods used in cryptography. There are many other considerations that go into choosing a cryptographic algorithm. Most older methods are intended only for sending hand-written messages. Among modern methods, some are best-suited for protecting files on a computer, others are suited for encrypting messages, while others are specialized for exchanging cryptographic keys. These issues are considered elsewhere on this website. See www.mastersoftware.biz/crypt-ch.htm for example. TABLE OF CONTENTS 1. INTRODUCTION 2. CRYPTOGRAPHIC METHODS 2.1. Secret Method Cryptography 2.2. Inherent Key Cryptography 2.3. Hidden Key Cryptography 2.3.1. No key required 3. SECRET KEY CRYPTOGRAPHY 3.1. Substitution Ciphers 3.1.1. Data Encryption Standard (DES) 3.1.2. Advanced Encryption Standard (AES) 3.1.3. GK-Crypt 3.1.4. Linearity 3.2. Transposition Ciphers 3.3. Stream Ciphers 3.3.1. Random numbers 3.3.2. One-time pad 3.3.3. OP-Crypt 4. PUBLIC KEY CRYPTOGRAPHY 5. PRIVATE KEY CRYPTOGRAPHY 2. CRYPTOGRAPHIC METHODS Most cryptographic methods, from ancient to modern, fall into a few broad classes: Secret method Inherent key Hidden key Secret key Public key Private key In secret method cryptography the strength of the cipher comes from keeping the method itself secret. Most of the encryption methods of ancient times, such as the atbash cipher mentioned in "The DaVinci Code," were secret method ciphers. Inherent key methods rely on a secret key that is built right into the device, or right into the software. The scytale or skytale used by the pharaohs of ancient Egypt was an inherent key device; the diameter of the cylinder was the key. Some of the machine ciphers used during World War II were of this type. The machine would be constructed using a moving chain with lugs, or a turning wheel with pins that drove the mechanism. The chain or wheel could have billions of possible settings. When the government bought the machine, technicians would set the lugs or pins to some fixed arrangement before issuing the machines to the field. All of the machines in use would be fixed at the same setting. Hidden key methods depend on hiding the key for their strength. The most common method is to hide the key within the message itself. For example, it might be agreed that the fifth block of the message would contain the key. A more sophisticated variant is to combine the information hidden in the message with some other information, such as the date when the message was sent, to form the key. In modern computer file encryption programs, the key is sometimes placed in a key file which may be encrypted with some inherent key method. A slightly stronger version uses a control file which contains the name of the key file, so the user can give the key file any desired name, for example an innocent-sounding name like chicken_recipe. Secret key methods get their strength from a secret key that is shared by the sender and the receiver of the message. Users may try to keep the method secret, but they don't rely on that. The strength rests in the key, which must be carefully chosen. Secret key methods are sometimes called symmetric because the sender and the receiver use the same key. Public key methods rely on mathematical functions which are called one-way functions because they are easy to compute, but very difficult to reverse. One example is multiplying two large prime numbers. (Prime numbers are numbers that cannot be formed by multiplying two other numbers that are both greater than 1. So 2, 3, 5, 7, 11 and 13 are prime, but 4=2x2, 6=2x3 and 9=3x3 are not prime.) It is easy for a computer to multiply two 100-digit primes to get a 200-digit product, but very hard to find those 100-digit primes if you are given the 200-digit product. Each user of the public key system will have a public key, known to everyone, that people can use to send messages, and a private key that is used to read the messages. Public key methods are sometimes called asymmetric methods because the sender and receiver use different keys. Private key cryptography is the newest form. In private key methods, the sender and the receiver each have a separate private key, known only to themselves, and not known to each other. The sender encrypts the message with the sender's private key. Then certain mathematical functions are used to change the message so that the sender's private key is removed and replaced by the receiver's private key. This allows the receiver to decrypt and read the message. You can imagine the message as a jewel in a strongbox with two locks. First the sender's lock is placed on the box, then the receiver's lock is added. When the sender's lock is removed the receiver can remove his own lock to obtain the jewel. The NK-Crypt and N2-Crypt encryption packages use mathematical functions that can accomplish this securely. The following sections will discuss these classes of encryption methods in more detail. 2.1. Secret Method Cryptography In Secret Method cryptography, the method, and often the fact that cryptography is being used, are kept secret. This is its only source of security. One example is a cipher designed for young children where A is represented by a picture of an apple or an ape, B is represented by a bird or a boat, C by a carrot or a cat, and so forth. A message is a series of such pictures. The method is useful for teaching children the sounds of the letters, but offers no secrecy. Secret method cryptography is primarily of historic interest. In ancient times, the message might be worked into the design on the messenger's clothing, or written on the messenger's shaved scalp. In the weeks it took to deliver the message, the hair would grow back and conceal the message. A modern analogue would be the hidden patterns on the backs of marked playing cards. Secret method cryptography, no matter how clever, is not considered safe because anyone who learns the method or obtains a copy of the device, can read all of the messages, both past and future. Unfortunately, some of the computer encryption methods in use today are still of this type. For example, one method is to exclusive-or (combine) the contents of each computer file with its file name, scrambled in a fixed manner, over and over. For such weak secret key methods, simply obtaining a single message and its translation would allow an opponent to break the system. Once broken, a secret method is broken forever, since there are no keys the users can change to prevent reading future messages. These methods are still occasionally used either because the maker is not aware of the danger, or because secret method encryption is both easy to implement and easy to use. If an encryption product says that no keys are needed, and no secret files are used, then it is a good bet that it is a secret method product. 2.2. Inherent Key Cryptography Inherent Key methods rely on a secret key that is built right into the device, or right into the software. The turning grille cipher, which was used from the 17th through the early 20th centuries was an inherent key device. It used a thin square metal sheet with holes cut in it at various spots. The sender would place the grille over a sheet of paper, and write one letter of the message through each hole. When all of the holes had been used, the grille would be turned and letters would be written through the holes in their new positions. The receiver would read the message the same way. The positions of the holes constitute the secret key. Since the holes cannot be moved, the key is inherent in the device itself. Inherent key cryptography is nearly as weak as secret method cryptography. It is not considered safe because anyone who obtains a copy of the same software, or the same device, can read all of your messages, both past and future. Thus the vendor, any government agency, and even any other purchaser could read your files. Some inherent key software products are distributed with a different key built into each copy of the program, just as combination locks are sold with different settings. This is better, but anyone who copies the software from your computer, or from a backup disk, or from your online backup service, can read all of your files or messages. Since inherent key cryptographic products are generally produced by people with little knowledge of cryptography, the built-in keys and the encryption method are also likely to be weak. Once again, if the product says that no keys or passwords are required, be very wary. It is not likely to be a strong product. 2.3. Hidden Key Cryptography Hidden Key methods depend on hiding the cryptographic key for their strength. It is like people hiding the keys to their home under the doormat, on the lintel, or inside a fake stone in a flowerpot. Hidden key cryptographic software is now very popular with the less-experienced suppliers of privacy products. It offers the user the convenience of not needing to enter a password or keyword each time the software is used, but this convenience comes with the price of reduced security. These products typically keep the key in a file that is itself encrypted with an inherent key that is built into the program. Anyone who gets access to your computer, or to your backup disks, can copy that key file, and then can read your files or your messages. There is nothing inherently wrong with the idea of storing file keys or message keys in a key file, sometimes called a master file. Many of the best encryption products also do this. But the master file must itself be protected with a strong key, sometimes called a master key, otherwise the file keys are vulnerable. The master key must not be an inherent key (see the preceding section) or a hidden key, but must be a secret key kept by the user. Some products will take the hidden key idea several steps. They will have separate key files, which are encrypted with key file superkeys kept in a superkey file, which is encrypted with a master key stored in a hidden file whose name is stored in a control file which is encrypted with an inherent key. This is still not safe!! Someone who gets access to your computer, or to some other computer on your local network, can use the encryption software on your computer to read your files. They don't need any keys or special knowledge to do this, because the encryption software already has that ability built in. As long as the top-level key is kept on your computer, whether it is in a key file or built into the encryption software, someone can find that key and use it to read your files. 2.3.1. No key required The same blanket warning applies equally to all cryptographic packages that can be run without entering a key or password. These products may use secret method cryptography, inherent key cryptography, or hidden key cryptography. Use caution. If the product says that no keys or passwords are required, then you are taking a serious risk. If the product allows only a short password, then you are taking a serious risk. If the product allows long passwords, but you choose a short simple password, then you are taking a serious risk. The risk is present even when a very strong encryption algorithm is being used. For example, a disk encryption product may use 256-bit AES encryption, and automatically produce the 256-bit key using the latest random-number generating techniques, but if the user can access the contents of the disk without a password, or by using at most a simple password, then the product is not safe. 3. SECRET KEY CRYPTOGRAPHY Nearly all of the cryptographic methods developed over the past 400 years are secret key methods. Nearly all of the cryptographic methods in use today by govermments, and especially by the military, are secret key methods. Secret key methods generally fall into two classes, block ciphers and stream ciphers. Block ciphers are further divided into substitution ciphers and transposition ciphers. Substitution ciphers take blocks of one or more characters at at time and replace them with other blocks according to some fixed rule. The most common example is the simple substitution cipher, often found on newspaper puzzle pages, where each letter of the alphabet is consistently replaced by a different letter of the alphabet. A slightly more complex example might use a table of 2-letter or 3-letter combinations, and use that table to substitute for 2 or 3 letters of the message at a time. Transposition ciphers change the order of the characters in a message. A simple example is where the letters of the message are written into a rectangular grid going left to right across the rows. Then the letters are taken out of the grid going top to bottom down the columns. As a general rule, ciphers that use only transposition, no matter how complex, are not considered secure. The transposition must be combined with other encryption steps to be secure. Stream ciphers generate a stream of keys, such as letters or numbers. Each key in the stream is used to encrypt one character of the message. In classical methods the key stream was simply some key word or key phrase repeated over and over for the length of the message. The most common modern method is to use a random number generator to produce the stream of keys. As a general rule, the random number generators used in most statistical applications are not strong enough to produce a secure encryption. 3.1. Substitution Ciphers Substitution ciphers generally take one character or block of characters at a time and replace it with another. The simplest form of substitution cipher uses a table in which all of the blocks of a given size appear in scrambled order. You replace a given block with the corresponding block from the table. For example, using blocks of 3 uppercase letters (trigrams), you might have AAA WCY AAB GNR AAC TDW AAD ZOF etc. Various methods are used to make this type of substitution depend upon the key. For example the key could be used to scramble the order of the blocks in either or both of the lists. The table method becomes impractical for blocks of more than 3 characters, especially on computers, where there are 256 different characters rather than 26. A table of 4-character blocks would contain 256x256x256x256 entries, or about 4.3 billion entries, each containing 4 characters. For larger blocks of characters the substitution is usually accomplished by doing a series of operations on single characters or pairs of characters within the block. A simple example using blocks of N characters would use a table of pairs. First the pair 1,2 is substituted, then the pair 2,3, and so forth. When the end of the block is reached, it would wrap around using pair N,1. This method starts to be secure when at least 5 complete cycles are taken, particularly if a different pair substitution table is used for each round. Here is an example of one round COMPUTER CO becomes XM XMMPUTER MM becomes KD XKDPUTER DP becomes RH XKRHUTER HU becomes FL XKRFLTER LT becomes BC XKRFBCER CE becomes WT XKRFBWTR TR becomes SV XKRFBWSV VX becomes LD DKRFBWSL 3.1.1. Data Encryption Standard (DES) Some modern block substitution ciphers are the Data Encryption Standard (DES), the Advanced Encryption Standard (AES), and the newer GK-Crypt algorithm. The Data Encryption Standard treats each block of 8 bytes, or 64 bits, as 16 units of 4-bits each. (4-bit units are sometimes called nibbles or nybbles.) The left half of the block is used to generate eight 4-bit quantities. These 32 bits are permuted (shuffled) in a fixed manner and then exclusive-ored (combined) with the right half. Then the left and right halves are swapped, and this operation is repeated for 16 rounds. (There are also some bit permutations at the beginning and end, but they add no strength. They are there to make the encryption hard to simulate in software.) The DES was state-of-the-art for the mid-1970's, but its small 56-bit key size made it obsolete by about 1990. The trend has been away from bitwise hardware-oriented encryption. 3.1.2. Advanced Encryption Standard (AES) The Advanced Encryption Standard (AES) is also a block cipher, using a block size of 16 bytes, or 128 bits, and a key size of 128 bits in current implementations. The key may be increased to 192 bits or 256 bits in the future, when 128 becomes inadequate. The first step in AES is to expand the key to 128 bytes, to make 10 round keys of 16 bytes each. This is done by performing repeated substitution and exclusive-or operations on the key bytes. The 10 round keys are then used to perform 10 rounds of encryption. Each round consists of 3 steps. First the message is combined with the key using an exclusive-or operation. In the second step the 16-byte message is considered as 4 words of 4 bytes each. These 4 words are shifted left by 1, 2, 3 and 4 byte positions, respectively. In the third step each column of the 4x4 block of bytes is mixed using a linear transformation, that is, each column vector is multiplied by certain matrix of values. This last step is called a Hill cipher for Lester Hill who invented it around 1920. It was used by both the US and Russian military in the period between the two World Wars. Methods for breaking the Hill cipher were developed by Jack Levine at UNC Raleigh. Some features of the AES are (1) it has a fixed block size, (2) it uses the same key for every block, (3) it uses the same permutation in every round, (4) the permutations are simple shifts applied within a single 4-byte word. 3.1.3. GK-Crypt The new GK-Crypt encryption algorithm is also a block cipher, using 9 rounds of substitution and permutation. Version 1 of GK-Crypt uses 16-byte, or 128-bit blocks, and 80-byte, or 640-bit keys. Version 2 of GK-Crypt uses variable-length blocks of 16 to 32 bytes, or 128 to 256 bits, and 640-bit keys. The first step in GK-Crypt is to expand the key from 80 bytes to 360 bytes for version 1, or to 690 bytes for version 2. Like the DES and the AES, each round of GK-Crypt begins by combining the message with the key for that round. Then each byte of the message is combined with another message byte. In version 1 of GK-Crypt, this is done in a fixed way, which is the same for every round. In version 2 of GK-Crypt, the choices of which bytes to combine depend on the key for each round. Finally, the message block is permuted (shuffled). Unlike both DES and AES, this is done in a key-dependent way, and is different for each round. After each block has been encrypted, a new key is generated for the next block. This process also uses a sequence of substitutions and key-dependent permutations. A separate portion of the key is set aside for generating the next key, and this portion is also changed for each new block. This means that not only is the key for the next block different, but the transformation to generate the next key is also different. GK-Crypt is much stronger than AES, or even quadruple AES, because (1) it has a variable block size, (2) it uses a different key for every block, (3) it uses a different permutation in every round, (4) it permutes the entire block in a complex way. 3.1.4. Linearity Linearity is an important concept in the design of all substitution ciphers. It can be illustrated by the familiar process of flipping a coin. Suppose that you flip a fair coin 100 times, and you get 56 heads. You immediately know that you got 44 tails, because the number of heads plus the number of tails always add up to the total number of flips. Addition is a linear function. The exclusive-or operation which is widely used in computer cryptography is also a linear function. When a message byte and a key byte are exclusive-ored, the bits of each byte are lined up. Everywhere that there is a 1 in the key byte, the corresponding bit of the message is inverted. That is, if the message bit was a 0 it becomes a 1, and if it was a 1 it becomes a 0. Everywhere there is a 0 in the key, the corresponding bit of the message stays unchanged. Message 0 0 0 0 1 1 1 1 Key 0 1 1 0 1 0 0 1 --------------- Result 0 1 1 0 0 1 1 0 For cryptography, linear functions are the weakest kind. Systems of linear equations are easy to solve. The methods have been known for hundreds of years. (Notice that the matrix multiplication step of the AES is a purely linear operation.) Therefore, if an encryption method can be described by linear equations, and a few characters of the message are known, it is easy to solve the linear equations and determine the key. That let's you find the rest of the message. When one bit of the key in a linear system changes, a fixed set of bits in the message will all change simultaneously. This is a weakness which is easy to exploit. For example, if the same message has been encrypted with 2 different keys, an attacker can look at which bits are the same in each message, and which bits are different, and determine a good deal about the keys. The ideal in cryptography is to have a perfectly non-linear system. In such a system, when one bit of the key changes, or any combination of key bits change, each bit of the cipher message will have exactly a 50-50 chance of changing. A perfect non-linear substitution is not possible, but every well-designed encryption system tries to make its substitutions as non-linear as possible. The 8 S-boxes (substitution functions) in DES are highly non-linear. The substitution table in AES has been mathematically designed to be highly non-linear. However, this mathematical regularity means that the substitution can be described by a mathematical function, which is itself a weakness. GK-Crypt strikes a balance between strong non-linearity and mathematical regularity. GK-Crypt uses 3 substitution tables. One is highly structured, similar to the AES table, and the other two are somewhat random. All 3 tables were constructed to be strongly non-linear. They have been chosen so that they are also as independent as possible from one another. That is, there are no strong linear relationships between the 3 tables. Such relationships might also weaken the encryption. 3.2. Transposition Ciphers The best-known transposition ciphers are the route ciphers. In a typical route cipher, the letters of the message are written into a rectangular grid one at a time, going across the rows, something like this T H I S I S A R O U T E C I P H E R Then the letters will be read out of the grid in a different order, for example, reading down the columns TRPHOHIUESTRIESCAI or diagonally TRHPOIHUSETIRESCAI or reading alternately up and down the columns PRTHOHEUISTREISCIA. The Union used these types of route ciphers in the US Civil War, and it is believed that they were never broken by the Confederacy. Today, these simple route transpositions make fairly elementary hobbyist ciphers, easily solved by paper and pencil methods. In the late 19th century the first transposition cipher utilizing a key was developed. It is called a columnar transposition, or more precisely, a complete columnar transposition if all of the columns are the same length, and an incomplete columnar transposition otherwise. The letters of the message are written into a rectangular grid, as before, but this time the columns are numbered in some order. This numbering of the columns is the key. 3 7 1 6 4 8 2 5 --------------- T H I S I S A C O L U M N A R T R A N S P O S I T I O N The letters are read out of the grid by reading down the columns, but the columns are taken out in the order specified by the key. In this example, the column numbered 1 is read out first, IUNO, then the column numbered 2, ARS, and so forth. IUNOARSTORTINPCTISMSNHLAISAO This is a major improvement over the route transposition, but this type of cipher can still be solved by skilled hobbyists using paper and pencil methods. Double columnar transposition was used by the US, Germany, and other nations from about 1900 to 1945. The message was scrambled by a columnar transposition, and that message was scrambled again by a second columnar transposition, usually using the same key for both steps. The French broke the German version, but it is believed that the Germans did not break the American version. Today advanced hobbyists routinely solve such messages using paper and pencil. In general, a transposition cipher by itself, no matter how sophisticated, is not secure. However, in the 1890's a French cryptographer named Etienne Bazeries showed that even a simple transposition cipher combined with a simple substitution cipher could provide a decent degree of security. This fact is at the heart of most modern block ciphers, including the AES and GK-Crypt. 3.3. Stream Ciphers The basic idea in a stream cipher is that each character in a message is enciphered using one character of the key. The first stream ciphers, which were called polyalphabetic ciphers, were developed in the 16th to 18th centuries. They used a short key, such as a single word, which was repeated as many times as needed for the length of the message. Typically, the key word would be written across the top of the page during encryption, and the message would be written in columns beneath it, leaving blank lines for writing in the encrypted message. Most of these classical stream ciphers used a substitution table, or tableau. For English this is normally a 26x26 grid of letters. Each of the 26 rows contains the 26 letters of the alphabet in some order. For the simpler ciphers, the letters might be the normal alphabet shifted by one position, left or right for each successive row, or written backwards and shifted. For the more sophisticated versions, each row might be scrambled. To encrypt one letter of the message, the letter would be found on the top row of the grid, and the key letter would be found along the left column of the grid. The letter lying in the same column as the message letter, on the row corresponding to the key letter would be the enciphered letter. Decryption would be the reverse process. The key letter would be found along the left edge, and the encrypted letter would be found in the corresponding row. Then the letter at the top of that column would be the letter from the original message. Slightly more complex variations of these classical ciphers would have the message letter written on an index row above the top row of the grid, and the key letters written in an index column to the left of the grid. This provided for more keying possibilities because different index rows and columns could be used with any grid. Most modern stream ciphers are simply elaborations on these principles. The message is combined character-by-character with the key stream to produce the encrypted message. In some of the oldest and weakest systems, the key stream was produced by taking the keyword and cycling the bits around. That is, the leftmost bit of a character would be moved to the righthand end, and the rest of the bits would be moved left one position. This would form the next character of the key stream. One of the simplest techniques was to use the file name, followed by the file name with each character cycled 1 bit position, then 2 bit positions, etc., until, after 8 shifts, it was back to the original file name. This produced a repeating key that was 8 times as long as the file name. This method is scarcely any stronger than the 16th century method of repeating the keyword over and over. 3.3.1. Random numbers Far more common is to use a random number generator to produce the key stream. The two most common forms of random number generators, more properly called pseudorandom number generators, are linear congruential and linear feedback generators. These will be described briefly, since a more detailed treatment would require deeper math. There is one basic mathematical concept that needs to be introduced in order to understand linear congruential generators. This is the idea of modular division. If n and p are positive integers, then the residue of n modulo p means the remainder when n is divided by p. For example, if n is 42 and p is 10, then the residue of n modulo p is 2. This is denoted 42 = 2 (mod 10). The basic idea in a linear congruential generator is that you start with some number, called the seed, then you multiply this number by a fixed multiplier, add a fixed constant, and then take the residue of this quantity modulo a fixed divisor. The result is the next random, or pseudorandom, number. This number becomes the next seed, and the process can be repeated to get a sequence of pseudorandom numbers. The process can be denoted as mr + c (mod d). Here m is the multiplier, r is the last random number, c is the constant, and d is the divisor. The operation mod indicates the modular division or remainder operation. There are two types linear congruential generators that are most commonly used in practice. The most common type uses a prime number as the modular divisor. This prime is usually made as large as one computer word can hold. When the divisor is a prime the constant c is usually 0. As a simple example, let d be 7, let m be 2, and let c be 0. Start with a seed value of 1. The next term will be 2x1(mod 7), or 2(mod 7), which is just 2. The third term will be 2x2(mod 7), which is 4. The fourth term is 2x4(mod 7), or 8(mod 7), which is 1. Since the first and fourth terms are the same, the sequence will repeat. We say that the period of the sequence is 3, since it contains just 3 numbers 1, 2, 4 that repeat endlessly. When 7 is the divisor 2 is not a good choice for the multiplier, since the period is only 3. If 3 were chosen as the multiplier, then the sequence would be 1, 3, 2, 6, 4, 5 with a period of 6. So 3 is a better choice. The second common type of linear congruential generator uses a power of 2 as the divisor. In this case c must be an odd number, and the multiplier m must be one of the numbers 5, 9, 13, etc. This type of generator is faster than the prime generators because the modular division can be done very fast with a power of 2. The values of m, c and d must be carefully chosen so that the generator will have good random properties, and will have a long period, that is, will generate many numbers before repeating. There are many statistical tests that can be used to test the random properties of such a generator. Random number generators of this type generally have good distribution properties, but do poorly on what is called the runs-up test. They tend to produce sequences of increasing numbers, such as 2, 19, 31, 177, 2161, 327648, 471255, etc., that are too long for real randomness. More importantly, linear congruential generators are not cryptographically strong. Each term in the sequence is determined from just the one term before, and that term fits within a single computer word, usually no more than 32 bits. It is easy for an attacker to try all 232 values until the correct seed is guessed. The data which is used to calculate the terms in the sequence is called the internal state of the generator. With a 32-bit computer 1 bit is used for the sign of the number and the other 31 bits In order to achieve today's standard of 128-bit security, it is necessary to add together the outputs of at least 5 linear congruential generators. These 5 generators may all use the same value of p, but they must use different values of m. Naturally, using 5 generators takes 5 times as long to compute each random number. Another type of random number generator is the linear feedback random number generator. This type of generator uses a larger seed of k computer words, x1, x2, x3, ..., xk. Each new term in the sequence is generated by xn = xn-a $ xn-b $ xn-c $ ... $ xn-k Here $ represents some combining operation, usually addition or exclusive-or. The terms xn-a, xn-b, etc., represent one or more of the last k terms of the sequence. Sometimes a more general linear function is used, such as xn = c1xn-1 + c2xn-2 + ... + ckxn-k where c1, c2, etc. are integer coefficients (multipliers). Usually most of these coefficients will be 0 for the sake of speed. The problem with all of the linear feedback generators is that the random numbers can be expressed by linear equations, and linear equations are easy to solve. Given a small amount of known, or guessed text, an attacker can solve the equations to determine the seeds. Once the seeds are known, the entire random key stream is known, so the entire message can be read. The bottom line is that neither type of random number generator is safe. Linear congruential generators have small seeds which can be exhaustively tried. Linear feedback generators have large seeds, but lead to simple equations that are easily solved. 3.3.2. One-time pad The original One-Time Pad was an encryption method where each character of the message was combined with a separate character from the keystream. It is called "one-time pad" because secret agents were actually sent into the field with pads of special water-soluble paper on which hundreds of completely random characters were written. These characters were used as keys which could be used for encrypting a message by the tableau method described above. After each message was sent, the agent would flush away that sheet. The person receiving the message would use an identical sheet, perhaps a carbon copy, to decipher the message. The key to the success of this method is that the keys have to be completely random. During World War II the British actually had a woman draw lettered balls out of a Bingo cage, while the Germans used a system with a swinging pendulum. With true random keys, any character of the message can become any other character in the encrypted message with equal likelihood. This can be formally proven to be perfectly secure. That is why so many encryption products claim to be like the one-time pad, regardless of whether they are similar or not. To achieve this absolute security, each character of the random key must be completely independent of any other character, and no part of the random key stream can ever be used again. This cannot be done using pseudorandom numbers. Most products on the market today that claim to use the One-Time Pad use a pseudorandom number generator as described in the preceding section. Such systems are insecure because pseudorandom numbers eventually repeat, because they can be determined by exhaustive trial, and/or they are generated by simple linear relationships that can be solved easily. 3.3.3. OP-Crypt The OP-Crypt package makes several improvements over other products which claim to use the one-time pad. Data is not encrypted one character at a time. It is encrypted in blocks using a block cipher stronger than AES (the Advanced Ecryption Standard). If an opponent knew some of the message, they could not determine the key that was used to produce the encrypted message. Therefore they could not determine any part of the keystream. The key stream is not generated by a linear mathematical function, which can be guessed or solved. The key stream is generated from true random data derived from random physical processes such as gamma rays and solar flares. Each user is furnished with more than 10,000,000 characters of such true random data. The key for each message is obtained by encrypting randomly chosen portions of the already-random key data with a different encryption key for each block. There are actually 4 keys for each block of the message, two key-encrypting keys which encrypt two separate portions of the random data to obtain two message-encrypting keys. Each message block is encrypted twice, making it impossible for an attacker to determine the two message keys. Of course the user need not be aware of any of this. All of the keys are generated internally by the software, and the user never sees them. 4. PUBLIC KEY CRYPTOGRAPHY Public Key cryptography is a relatively new idea, dating back only to the 1970's. The basic idea is that each party has two functions, E to encrypt a message, and D to decrypt the message. The function E is known to everyone. It is publicly available, either by asking the person with whom you wish to communicate, or by some kind of key-distributing network. The leading method for public key cryptography uses two large prime numbers, say two 150-digit primes. Each user of the system secretly chooses two such primes, and forms their 300-digit product. This product is made public along with a large number E. Whenever a person wants to send a message to the receiver, they treat the message as a series of 300-digit numbers. They then encrypt each block by raising the number to the E power modulo that 300-digit product. (See the discussion of modular division in the section on random numbers.) The receiver, who knows the two primes, can easily calculate the secret decrypting number D. The receiver raises each encrypted block of the message to the D power modulo the 300-digit product, and gets back the original message. A third party who wanted to read the encrypted message would need to factor the 300-digit product to determine the two primes in order to calculate D. Factoring large numbers can be very difficult and can take massive amounts of computing time. There are two drawbacks to this form of public key cryptography. First, it is very slow. Raising a large number to a large power modulo a large number takes a lot of time, even with the most sophisticated methods. This is why some public key software packages include data compression functions. Second, improvements in factoring algorithms may make this form of encryption insecure. Some years ago mathematicians learned how to use elliptic curves to factor large numbers. Suddenly all of the public key messages that had been sent using 50-digit or 100-digit primes were no longer secure. People who had collected and recorded such messages could now read them, and learn the secrets they contained. People who were using public key cryptography had to switch to 150-digit or 200-digit primes if they wanted security. 5. PRIVATE KEY CRYPTOGRAPHY The newest form of encryption is Private Key Cryptography. In this form, each party has a private key which nobody else knows. The private key is never shared with other parties, so there is no need for any form of key exchange or key distribution. The great advantage of private key cryptography is that any two parties anywhere who have the private key software can securely exchange messages without having to make any prior arrangements. No third party who intercepts the messages can read them, even if the third party also has the same private key software. The method is to treat each message as a series of numbers. The sender performs a mathematical encryption operation on each block of the message using the sender's encryption key. The receiver then performs a second mathematical encryption operation on each block of the message using the receiver's encryption key. These mathematical operations are chosen so that they commute, that is you would get the same result if you did the two encryptions in the opposite order. For example, ordinary addition commutes, so that 2+5 gives the same result as 5+2. This makes it possible for the sender to remove the first encryption by decrypting the message using the sender's decryption key. That leaves the message encrypted only with the receiver's key. So now the receiver can decrypt the message using the receiver's own decryption key. The disadvantage, of course, is that the message must be transmitted 3 times so that the sender and receiver can perform their respective encryption and decryption steps. For many applications, however, the advantage of not needing to exchange encryption keys outweighs this disadvantage.