The Code Book - Part 1

Part 1

The Code Book.

The evolution of secrecy from Mary Queen of Scots to quantum cryptography.

Simon Singh.

Introduction.

For thousands of years, kings, queens and generals have relied on efficient communication in order to govern their countries and command their armies. At the same time, they have all been aware of the consequences of their messages falling into the wrong hands, revealing precious secrets to rival nations and betraying vital information to opposing forces. It was the threat of enemy interception that motivated the development of codes and ciphers: techniques for disguising a message so that only the intended recipient can read it.

The desire for secrecy has meant that nations have operated codemaking departments, responsible for ensuring the security of communications by inventing and implementing the best possible codes. At the same time, enemy codebreakers have attempted to break these codes, and steal secrets. Codebreakers are linguistic alchemists, a mystical tribe attempting to conjure sensible words out of meaningless symbols. The history of codes and ciphers is the story of the centuries-old battle between codemakers and codebreakers, an intellectual arms race that has had a dramatic impact on the course of history.

In writing The Code Book, I have had two main objectives. The first is to chart the evolution of codes. Evolution is a wholly appropriate term, because the development of codes can be viewed as an evolutionary struggle. A code is constantly under attack from codebreakers. When the codebreakers have developed a new weapon that reveals a code's weakness, then the code is no longer useful. It either becomes extinct or it evolves into a new, stronger code. In turn, this new code thrives only until the codebreakers identify its weakness, and so on. This is analogous to the situation facing, for example, a strain of infectious bacteria. The bacteria live, thrive and survive until doctors discover an antibiotic that exposes a weakness in the bacteria and kills them. The bacteria are forced to evolve and outwit the antibiotic, and, if successful, they will thrive once again and reestablish themselves. The bacteria are continually forced to evolve in order to survive the onslaught of new antibiotics.

The ongoing battle between codemakers and codebreakers has inspired a whole series of remarkable scientific breakthroughs. The codemakers have continually striven to construct ever-stronger codes for defending communications, while codebreakers have continually invented more powerful methods for attacking them. In their efforts to destroy and preserve secrecy, both sides have drawn upon a diverse range of disciplines and technologies, from mathematics to linguistics, from information theory to quantum theory. In return, codemakers and codebreakers have enriched these subjects, and their work has accelerated technological development, most notably in the case of the modern computer.

History is punctuated with codes. They have decided the outcomes of battles and led to the deaths of kings and queens. I have therefore been able to call upon stories of political intrigue and tales of life and death to illustrate the key turning points in the evolutionary development of codes. The history of codes is so inordinately rich that I have been forced to leave out many fascinating stories, which in turn means that my account is not definitive. If you would like to find out more about your favorite tale or your favorite codebreaker then I would refer you to the list of further reading, which should help those readers who would like to study the subject in more detail.

Having discussed the evolution of codes and their impact on history, the book's second objective is to demonstrate how the subject is more relevant today than ever before. As information becomes an increasingly valuable commodity, and as the communications revolution changes society, so the process of encoding messages, known as encryption, will play an increasing role in everyday life. Nowadays our phone calls bounce off satellites and our e-mails pass through various computers, and both forms of communication can be intercepted with ease, so jeopardizing our privacy. Similarly, as more and more business is conducted over the Internet, safeguards must be put in place to protect companies and their clients. Encryption is the only way to protect our privacy and guarantee the success of the digital marketplace. The art of secret communication, otherwise known as cryptography, will provide the locks and keys of the Information Age.

However, the public's growing demand for cryptography conflicts with the needs of law enforcement and national security. For decades, the police and the intelligence services have used wire-taps to gather evidence against terrorists and organized crime syndicates, but the recent development of ultra-strong codes threatens to undermine the value of wire-taps. As we enter the twenty-first century, civil libertarians are pressing for the widespread use of cryptography in order to protect the privacy of the individual. Arguing alongside them are businesses, who require strong cryptography in order to guarantee the security of transactions within the fast-growing world of Internet commerce. At the same time, the forces of law and order are lobbying governments to restrict the use of cryptography. The question is, which do we value more: our privacy or an effective police force? Or is there a compromise?

Although cryptography is now having a major impact on civilian activities, it should be noted that military cryptography remains an important subject. It has been said that the First World War was the chemists' war, because mustard gas and chlorine were employed for the first time, and that the Second World War was the physicists' war, because the atom bomb was detonated. Similarly, it has been argued that the Third World War would be the mathematicians' war, because mathematicians will have control over the next great weapon of war: information. Mathematicians have been responsible for developing the codes that are currently used to protect military information. Not surprisingly, mathematicians are also at the forefront of the battle to break these codes.

While describing the evolution of codes and their impact on history, I have allowed myself a minor detour. Chapter 5 describes the decipherment of various ancient scripts, including Linear B and Egyptian hieroglyphics. Technically, cryptography concerns communications that are deliberately designed to keep secrets from an enemy, whereas the writings of ancient civilizations were not intended to be indecipherable: it is merely that we have lost the ability to interpret them. However, the skills required to uncover the meaning of archaeological texts are closely related to the art of codebreaking. Ever since reading The Decipherment of Linear B, John Chadwick's description of how an ancient Mediterranean text was unraveled, I have been struck by the astounding intellectual achievements of those men and women who have been able to decipher the scripts of our ancestors, thereby allowing us to read about their civilizations, religions and everyday lives.

Turning to the purists, I should apologize for the title of this book. The Code Book is about more than just codes. The word "code" refers to a very particular type of secret communication, one that has declined in use over the centuries. In a code, a word or phrase is replaced with a word, number or symbol. For example, secret agents have codenames, words that are used instead of their real names in order to mask their identities. Similarly, the phrase Attack at dawn could be replaced by the codeword Jupiter, and this word could be sent to a commander in the battlefield as a way of baffling the enemy. If headquarters and the commander have previously agreed on the code, then the meaning of Jupiter will be clear to the intended recipient, but it will mean nothing to an enemy who intercepts it. The alternative to a code is a cipher, a technique that acts at a more fundamental level, by replacing letters rather than whole words. For example, each letter in a phrase could be replaced by the next letter in the alphabet, so that A is replaced by B, B by C, and so on. Attack at dawn thus becomes Buubdl bu ebxo. Ciphers play an integral role in cryptography, and so this book should really have been called The Code and Cipher Book. I have, however, forsaken accuracy for snappiness.
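The difference between a code and a cipher can be made concrete with a short Python sketch. It is only an illustration: the names CODEBOOK, encode_phrase, shift_encrypt and shift_decrypt are invented here, and the codebook holds just the single entry mentioned above.

```python
# A code replaces whole phrases using a list agreed in advance.
# CODEBOOK is a hypothetical one-entry codebook for illustration.
CODEBOOK = {"Attack at dawn": "Jupiter"}


def encode_phrase(phrase: str) -> str:
    """Look up a whole phrase in the codebook (raises KeyError if absent)."""
    return CODEBOOK[phrase]


def shift_encrypt(plaintext: str, shift: int = 1) -> str:
    """A cipher works on letters: move each one `shift` places along the alphabet."""
    result = []
    for ch in plaintext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # Wrap around so that Z is followed by A.
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation are left alone
    return "".join(result)


def shift_decrypt(ciphertext: str, shift: int = 1) -> str:
    """Undo shift_encrypt by shifting in the opposite direction."""
    return shift_encrypt(ciphertext, -shift)


print(encode_phrase("Attack at dawn"))   # Jupiter
print(shift_encrypt("Attack at dawn"))   # Buubdl bu ebxo
print(shift_decrypt("Buubdl bu ebxo"))   # Attack at dawn
```

The code can only handle phrases that have been listed in advance, whereas the cipher will encrypt any message at all, which is one reason codes have declined in use.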

As the need arises, I have defined the various technical terms used within cryptography. Although I have generally adhered to these definitions, there will be occasions when I use a term that is perhaps not technically accurate, but which I feel is more familiar to the non-specialist. For example, when describing a person attempting to break a cipher, I have often used codebreaker rather than the more accurate cipherbreaker. I have done this only when the meaning of the word is obvious from the context. There is a glossary of terms at the end of the book. More often than not, though, crypto-jargon is quite transparent: for example, plaintext is the message before encryption, and ciphertext is the message after encryption.

Before concluding this introduction, I must mention a problem that faces any author who tackles the subject of cryptography: the science of secrecy is largely a secret science. Many of the heroes in this book never gained recognition for their work during their lifetimes because their contribution could not be publicly acknowledged while their invention was still of diplomatic or military value. While researching this book, I was able to talk to experts at Britain's Government Communications Headquarters (GCHQ), who revealed details of extraordinary research done in the 1970s which has only just been declassified. As a result of this declassification, three of the world's greatest cryptographers can now receive the credit they deserve. However, this recent revelation has merely served to remind me that there is a great deal more going on, of which neither I nor any other science writer is aware. Organizations such as GCHQ and America's National Security Agency continue to conduct classified research into cryptography, which means that their breakthroughs remain secret and the individuals who make them remain anonymous.

Despite the problems of government secrecy and classified research, I have spent the final chapter of this book speculating about the future of codes and ciphers. Ultimately, this chapter is an attempt to see if we can predict who will win the evolutionary struggle between codemaker and codebreaker. Will codemakers ever design a truly unbreakable code and succeed in their quest for absolute secrecy? Or will codebreakers build a machine that can decipher any message? Bearing in mind that some of the greatest minds work in classified laboratories, and that they receive the bulk of research funds, it is clear that some of the statements in my final chapter may be inaccurate. For example, I state that quantum computers, machines potentially capable of breaking all today's ciphers, are at a very primitive stage, but it is possible that somebody has already built one. The only people who are in a position to point out my errors are also those who are not at liberty to reveal them.

1 The Cipher of Mary Queen of Scots

On the morning of Saturday, October 15, 1586, Queen Mary entered the crowded courtroom at Fotheringhay Castle. Years of imprisonment and the onset of rheumatism had taken their toll, yet she remained dignified, composed and indisputably regal. Assisted by her physician, she made her way past the judges, officials and spectators, and approached the throne that stood halfway along the long, narrow chamber. Mary had assumed that the throne was a gesture of respect toward her, but she was mistaken. The throne symbolized the absent Queen Elizabeth, Mary's enemy and prosecutor. Mary was gently guided away from the throne and toward the opposite side of the room, to the defendant's seat, a crimson velvet chair.

Mary Queen of Scots was on trial for treason. She had been accused of plotting to assassinate Queen Elizabeth in order to take the English crown for herself. Sir Francis Walsingham, Elizabeth's Principal Secretary, had already arrested the other conspirators, extracted confessions, and executed them. Now he planned to prove that Mary was at the heart of the plot, and was therefore equally culpable and equally deserving of death.

Walsingham knew that before he could have Mary executed, he would have to convince Queen Elizabeth of her guilt. Although Elizabeth despised Mary, she had several reasons for being reluctant to see her put to death. First, Mary was a Scottish queen, and many questioned whether an English court had the authority to execute a foreign head of state. Second, executing Mary might establish an awkward precedent: if the state is allowed to kill one queen, then perhaps rebels might have fewer reservations about killing another, namely Elizabeth. Third, Elizabeth and Mary were cousins, and their blood tie made Elizabeth all the more squeamish about ordering her execution. In short, Elizabeth would sanction Mary's execution only if Walsingham could prove beyond any hint of doubt that she had been part of the assassination plot.

[image]

Figure 1 Mary Queen of Scots. (photo credit 1.1)

The conspirators were a group of young English Catholic noblemen intent on removing Elizabeth, a Protestant, and replacing her with Mary, a fellow Catholic. It was apparent to the court that Mary was a figurehead for the conspirators, but it was not clear that she had actually given her blessing to the conspiracy. In fact, Mary had authorized the plot. The challenge for Walsingham was to demonstrate a palpable link between Mary and the plotters.

On the morning of her trial, Mary sat alone in the dock, dressed in sorrowful black velvet. In cases of treason, the accused was forbidden counsel and was not permitted to call witnesses. Mary was not even allowed secretaries to help her prepare her case. However, her plight was not hopeless because she had been careful to ensure that all her correspondence with the conspirators had been written in cipher. The cipher turned her words into a meaningless series of symbols, and Mary believed that even if Walsingham had captured the letters, then he could have no idea of the meaning of the words within them. If their contents were a mystery, then the letters could not be used as evidence against her. However, this all depended on the assumption that her cipher had not been broken.

Unfortunately for Mary, Walsingham was not merely Principal Secretary, he was also England's spymaster. He had intercepted Mary's letters to the plotters, and he knew exactly who might be capable of deciphering them. Thomas Phelippes was the nation's foremost expert on breaking codes, and for years he had been deciphering the messages of those who plotted against Queen Elizabeth, thereby providing the evidence needed to condemn them. If he could decipher the incriminating letters between Mary and the conspirators, then her death would be inevitable. On the other hand, if Mary's cipher was strong enough to conceal her secrets, then there was a chance that she might survive. Not for the first time, a life hung on the strength of a cipher.

The Evolution of Secret Writing

Some of the earliest accounts of secret writing date back to Herodotus, "the father of history" according to the Roman philosopher and statesman Cicero. In The Histories, Herodotus chronicled the conflicts between Greece and Persia in the fifth century B.C., which he viewed as a confrontation between freedom and slavery, between the independent Greek states and the oppressive Persians. According to Herodotus, it was the art of secret writing that saved Greece from being conquered by Xerxes, King of Kings, the despotic leader of the Persians.

The long-running feud between Greece and Persia reached a crisis soon after Xerxes began constructing a city at Persepolis, the new capital for his kingdom. Tributes and gifts arrived from all over the empire and neighboring states, with the notable exceptions of Athens and Sparta. Determined to avenge this insolence, Xerxes began mobilizing a force, declaring that "we shall extend the empire of Persia such that its boundaries will be God's own sky, so the sun will not look down upon any land beyond the boundaries of what is our own." He spent the next five years secretly assembling the greatest fighting force in history, and then, in 480 B.C., he was ready to launch a surprise attack.

However, the Persian military buildup had been witnessed by Demaratus, a Greek who had been expelled from his homeland and who lived in the Persian city of Susa. Despite being exiled he still felt some loyalty to Greece, so he decided to send a message to warn the Spartans of Xerxes' invasion plan. The challenge was how to dispatch the message without it being intercepted by the Persian guards. Herodotus wrote:

As the danger of discovery was great, there was only one way in which he could contrive to get the message through: this was by scraping the wax off a pair of wooden folding tablets, writing on the wood underneath what Xerxes intended to do, and then covering the message over with wax again. In this way the tablets, being apparently blank, would cause no trouble with the guards along the road. When the message reached its destination, no one was able to guess the secret, until, as I understand, Cleomenes' daughter Gorgo, who was the wife of Leonidas, divined and told the others that if they scraped the wax off, they would find something written on the wood underneath. This was done; the message was revealed and read, and afterward passed on to the other Greeks.

As a result of this warning, the hitherto defenseless Greeks began to arm themselves. Profits from the state-owned silver mines, which were usually shared among the citizens, were instead diverted to the navy for the construction of two hundred warships.

Xerxes had lost the vital element of surprise and, on September 23, 480 B.C., when the Persian fleet approached the Bay of Salamis near Athens, the Greeks were prepared. Although Xerxes believed he had trapped the Greek navy, the Greeks were deliberately enticing the Persian ships to enter the bay. The Greeks knew that their ships, smaller and fewer in number, would have been destroyed in the open sea, but they realized that within the confines of the bay they might outmaneuver the Persians. As the wind changed direction the Persians found themselves being blown into the bay, forced into an engagement on Greek terms. The Persian princess Artemisia became surrounded on three sides and attempted to head back out to sea, only to ram one of her own ships. Panic ensued, more Persian ships collided and the Greeks launched a full-blooded onslaught. Within a day, the formidable forces of Persia had been humbled.

Demaratus' strategy for secret communication relied on simply hiding the message. Herodotus also recounted another incident in which concealment was sufficient to secure the safe passage of a message. He chronicled the story of Histaiaeus, who wanted to encourage Aristagoras of Miletus to revolt against the Persian king. To convey his instructions securely, Histaiaeus shaved the head of his messenger, wrote the message on his scalp, and then waited for the hair to regrow. This was clearly a period of history that tolerated a certain lack of urgency. The messenger, apparently carrying nothing contentious, could travel without being harassed. Upon arriving at his destination he then shaved his head and pointed it at the intended recipient.

Secret communication achieved by hiding the existence of a message is known as steganography, derived from the Greek words steganos, meaning "covered," and graphein, meaning "to write." In the two thousand years since Herodotus, various forms of steganography have been used throughout the world. For example, the ancient Chinese wrote messages on fine silk, which was then scrunched into a tiny ball and covered in wax. The messenger would then swallow the ball of wax. In the sixteenth century, the Italian scientist Giovanni Porta described how to conceal a message within a hard-boiled egg by making an ink from a mixture of one ounce of alum and a pint of vinegar, and then using it to write on the shell. The solution penetrates the porous shell, and leaves a message on the surface of the hardened egg albumen, which can be read only when the shell is removed. Steganography also includes the practice of writing in invisible ink. As far back as the first century A.D., Pliny the Elder explained how the "milk" of the thithymallus plant could be used as an invisible ink. Although the ink is transparent after drying, gentle heating chars it and turns it brown. Many organic fluids behave in a similar way, because they are rich in carbon and therefore char easily. Indeed, it is not unknown for modern spies who have run out of standard-issue invisible ink to improvise by using their own urine.

The longevity of steganography illustrates that it certainly offers a modicum of security, but it suffers from a fundamental weakness. If the messenger is searched and the message is discovered, then the contents of the secret communication are revealed at once. Interception of the message immediately compromises all security. A thorough guard might routinely search any person crossing a border, scraping any wax tablets, heating blank sheets of paper, shelling boiled eggs, shaving people's heads, and so on, and inevitably there will be occasions when the message is uncovered.

Hence, in parallel with the development of steganography, there was the evolution of cryptography, derived from the Greek word kryptos, meaning "hidden." The aim of cryptography is not to hide the existence of a message, but rather to hide its meaning, a process known as encryption. To render a message unintelligible, it is scrambled according to a particular protocol which is agreed beforehand between the sender and the intended recipient. Thus the recipient can reverse the scrambling protocol and make the message comprehensible. The advantage of cryptography is that if the enemy intercepts an encrypted message, then the message is unreadable. Without knowing the scrambling protocol, the enemy should find it difficult, if not impossible, to recreate the original message from the encrypted text.

Although cryptography and steganography are independent, it is possible to both scramble and hide a message to maximize security. For example, the microdot is a form of steganography that became popular during the Second World War. German agents in Latin America would photographically shrink a page of text down to a dot less than 1 millimeter in diameter, and then hide this microdot on top of a full stop in an apparently innocuous letter. The first microdot to be spotted by the FBI was in 1941, following a tip-off that the Americans should look for a tiny gleam from the surface of a letter, indicative of smooth film. Thereafter, the Americans could read the contents of most intercepted microdots, except when the German agents had taken the extra precaution of scrambling their message before reducing it. In such cases of cryptography combined with steganography, the Americans were sometimes able to intercept and block communications, but they were prevented from gaining any new information about German spying activity. Of the two branches of secret communication, cryptography is the more powerful because of this ability to prevent information from falling into enemy hands.

In turn, cryptography itself can be divided into two branches, known as transposition and substitution. In transposition, the letters of the message are simply rearranged, effectively generating an anagram. For very short messages, such as a single word, this method is relatively insecure because there are only a limited number of ways of rearranging a handful of letters. For example, three letters can be arranged in only six different ways, e.g., cow, cwo, ocw, owc, wco, woc. However, as the number of letters gradually increases, the number of possible arrangements rapidly explodes, making it impossible to get back to the original message unless the exact scrambling process is known. For example, consider this short sentence. It contains just 35 letters, and yet there are more than 50,000,000,000,000,000,000,000,000,000,000 distinct arrangements of them. If one person could check one arrangement per second, and if all the people in the world worked night and day, it would still take more than a thousand times the lifetime of the universe to check all the arrangements.
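These figures can be checked with a few lines of Python. The sketch below counts distinct arrangements by dividing the factorial of the number of letters by the factorial of each letter's repeat count, since swapping two identical letters does not produce a new arrangement. The function name distinct_arrangements is invented for illustration, and the world population and the age of the universe are rough assumptions rather than figures taken from the text.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_arrangements(text: str) -> int:
    """Count the distinct orderings of the letters in `text` (case ignored)."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    count = factorial(len(letters))
    # Identical letters are interchangeable, so divide out their orderings.
    for repeats in Counter(letters).values():
        count //= factorial(repeats)
    return count


# Three different letters give 3! = 6 arrangements.
print(sorted("".join(p) for p in permutations("cow")))

sentence = "For example, consider this short sentence"
total = distinct_arrangements(sentence)
print(f"{total:.3e} distinct arrangements")

# Rough assumptions, for the time estimate only.
WORLD_POPULATION = 6e9        # assumed number of people checking in parallel
UNIVERSE_AGE_YEARS = 1.4e10   # assumed age of the universe in years
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years_needed = total / WORLD_POPULATION / SECONDS_PER_YEAR
universe_lifetimes = years_needed / UNIVERSE_AGE_YEARS
print(f"about {universe_lifetimes:.0f} lifetimes of the universe")
```

Under these assumptions the total comes out a little above the quoted 50,000,000,000,000,000,000,000,000,000,000, and the search takes comfortably more than a thousand lifetimes of the universe.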

A random transposition of letters seems to offer a very high level of security, because it would be impractical for an enemy interceptor to unscramble even a short sentence. But there is a drawback. Transposition effectively generates an incredibly difficult anagram, and if the letters are randomly jumbled, with neither rhyme nor reason, then unscrambling the anagram is impossible for the intended recipient, as well as an enemy interceptor. In order for transposition to be effective, the rearrangement of letters needs to follow a straightforward system, one that has been previously agreed by sender and receiver, but kept secret from the enemy. For example, schoolchildren sometimes send messages using the "rail fence" transposition, in which the message is written with alternate letters on separate upper and lower lines. The sequence of letters on the lower line is then tagged on at the end of the sequence on the upper line to create the final encrypted message. For example:

[image]

The receiver can recover the message by simply reversing the process. There are various other forms of systematic transposition, including the three-line rail fence cipher, in which the message is first written on three separate lines instead of two. Alternatively, one could swap each pair of letters, so that the first and second letters switch places, the third and fourth letters switch places, and so on.
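The two-line rail fence and the pair-swapping alternative are simple enough to sketch in Python. The function names and the sample message ATTACK AT DAWN are chosen here for illustration; this is not the example shown in the original figure. Spaces and punctuation are dropped before the letters are rearranged, which is an assumption of this sketch.

```python
def rail_fence_encrypt(message: str) -> str:
    """Write alternate letters on two lines, then append the lower line to the upper."""
    letters = [ch for ch in message.upper() if ch.isalpha()]
    upper = letters[0::2]   # 1st, 3rd, 5th, ... letters
    lower = letters[1::2]   # 2nd, 4th, 6th, ... letters
    return "".join(upper + lower)


def rail_fence_decrypt(ciphertext: str) -> str:
    """Split the ciphertext back into two lines and interleave them."""
    half = (len(ciphertext) + 1) // 2   # the upper line holds any extra letter
    upper, lower = ciphertext[:half], ciphertext[half:]
    plain = []
    for i in range(half):
        plain.append(upper[i])
        if i < len(lower):
            plain.append(lower[i])
    return "".join(plain)


def swap_pairs(message: str) -> str:
    """Swap the 1st and 2nd letters, the 3rd and 4th, and so on."""
    letters = list(message)
    for i in range(0, len(letters) - 1, 2):
        letters[i], letters[i + 1] = letters[i + 1], letters[i]
    return "".join(letters)


print(rail_fence_encrypt("ATTACK AT DAWN"))   # ATCADWTAKTAN
print(rail_fence_decrypt("ATCADWTAKTAN"))     # ATTACKATDAWN
print(swap_pairs("ATTACKATDAWN"))             # TAATKCTAADNW
```

Swapping pairs is its own inverse, so the receiver applies exactly the same procedure as the sender, whereas the rail fence needs a separate step to interleave the two halves again.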

Another form of transposition is embodied in the first ever military cryptographic device, the Spartan scytale, dating back to the fifth century B.C. The scytale is a wooden staff around which a strip of leather or parchment is wound, as shown in Figure 2. The sender writes the message along the length of the scytale, and then unwinds the strip, which now appears to carry a list of meaningless letters. The message has been scrambled. The messenger would take the leather strip, and, as a steganographic twist, he would sometimes disguise it as a belt with the letters hidden on the inside. To recover the message, the receiver simply wraps the leather strip around a scytale of the same diameter as the one used by the sender. In 404 B.C. Lysander of Sparta was confronted by a messenger, bloody and battered, one of only five to have survived the arduous journey from Persia. The messenger handed his belt to Lysander, who wound it around his scytale to learn that Pharnabazus of Persia was planning to attack him. Thanks to the scytale, Lysander was prepared for the attack and repulsed it.

[image]

Figure 2 When it is unwound from the sender's scytale (wooden staff), the leather strip appears to carry a list of random letters: S, T, S, F, .... Only by rewinding the strip around another scytale of the correct diameter will the message reappear.
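One way to model the scytale is as a grid: each row is a line of writing along the staff, and the unwound strip is read column by column. The sketch below rests on assumptions that are not in the historical record: the parameter `rows` stands in for the staff's diameter (the number of letters that fit around it), the padding letter `x` is arbitrary, and the sample message is invented.

```python
def scytale_encrypt(message: str, rows: int, pad: str = "x") -> str:
    """Write the message in `rows` lines along the staff, then unwind the strip."""
    letters = [c for c in message.lower() if c.isalpha()]
    while len(letters) % rows:          # pad so that every line has the same length
        letters.append(pad)
    cols = len(letters) // rows         # number of turns of the strip
    # Unwinding the strip reads the grid one column (one turn) at a time.
    return "".join(letters[r * cols + c]
                   for c in range(cols) for r in range(rows)).upper()


def scytale_decrypt(strip: str, rows: int) -> str:
    """Rewind the strip on a staff of the same size and read along its length."""
    cols = len(strip) // rows
    return "".join(strip[c * rows + r]
                   for r in range(rows) for c in range(cols)).lower()


strip = scytale_encrypt("send more troops", rows=4)
print(strip)                        # SMTPEORSNROXDEOX
print(scytale_decrypt(strip, 4))    # sendmoretroopsxx
```

Rewinding with the wrong value of `rows` yields a different jumble, which is the sense in which the diameter of the staff acts as the shared secret.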

The alternative to transposition is substitution. One of the earliest descriptions of encryption by substitution appears in the Kāma-Sūtra, a text written in the fourth century A.D. by the Brahmin scholar Vātsyāyana, but based on manuscripts dating back to the fourth century B.C. The Kāma-Sūtra recommends that women should study 64 arts, such as cooking, dressing, massage and the preparation of perfumes. The list also includes some less obvious arts, namely conjuring, chess, bookbinding and carpentry. Number 45 on the list is mlecchita-vikalpā, the art of secret writing, advocated in order to help women conceal the details of their liaisons. One of the recommended techniques is to pair letters of the alphabet at random, and then substitute each letter in the original message with its partner. If we apply the principle to the Roman alphabet, we could pair letters as follows:

[image]

Then, instead of meet at midnight, the sender would write CUUZ VZ CGXSGIBZ. This form of secret writing is called a substitution cipher because each letter in the plaintext is substituted for a different letter, thus acting in a complementary way to the transposition cipher. In transposition each letter retains its identity but changes its position, whereas in substitution each letter changes its identity but retains its position.

The first documented use of a substitution cipher for military purposes appears in Julius Caesar's Gallic Wars. Caesar describes how he sent a message to Cicero, who was besieged and on the verge of surrendering. The substitution replaced Roman letters with Greek letters, rendering the message unintelligible to the enemy. Caesar described the dramatic delivery of the message:

The messenger was instructed, if he could not approach, to hurl a spear, with the letter fastened to the thong, inside the entrenchment of the camp. Fearing danger, the Gaul discharged the spear, as he had been instructed. By chance it stuck fast in the tower, and for two days was not sighted by our troops; on the third day it was sighted by a soldier, taken down, and delivered to Cicero. He read it through and then recited it at a parade of the troops, bringing the greatest rejoicing to all.

Caesar used secret writing so frequently that Valerius Probus wrote an entire treatise on his ciphers, which unfortunately has not survived. However, thanks to Suetonius' Lives of the Caesars LVI, written in the second century A.D., we do have a detailed description of one of the types of substitution cipher used by Julius Caesar. He simply replaced each letter in the message with the letter that is three places further down the alphabet. Cryptographers often think in terms of the plain alphabet, the alphabet used to write the original message, and the cipher alphabet, the letters that are substituted in place of the plain letters. When the plain alphabet is placed above the cipher alphabet, as shown in Figure 3, it is clear that the cipher alphabet has been shifted by three places, and hence this form of substitution is often called the Caesar shift cipher, or simply the Caesar cipher. A cipher is the name given to any form of cryptographic substitution in which each letter is replaced by another letter or symbol.

[image]

Figure 3 The Caesar cipher applied to a short message. The Caesar cipher is based on a cipher alphabet that is shifted a certain number of places (in this case three), relative to the plain alphabet. The convention in cryptography is to write the plain alphabet in lower-case letters, and the cipher alphabet in capitals. Similarly, the original message, the plaintext, is written in lower case, and the encrypted message, the ciphertext, is written in capitals.
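The shift described by Suetonius can be sketched as a pair of translation tables. The function names and the sample message are illustrative assumptions; the code follows the convention of lower-case plaintext and upper-case ciphertext, and leaves spaces and punctuation untouched.

```python
import string


def caesar_alphabets(shift: int = 3):
    """Return the plain alphabet and the cipher alphabet shifted by `shift` places."""
    plain = string.ascii_lowercase
    shift %= 26
    cipher = (plain[shift:] + plain[:shift]).upper()
    return plain, cipher


def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    plain, cipher = caesar_alphabets(shift)
    return plaintext.lower().translate(str.maketrans(plain, cipher))


def caesar_decrypt(ciphertext: str, shift: int = 3) -> str:
    plain, cipher = caesar_alphabets(shift)
    return ciphertext.upper().translate(str.maketrans(cipher, plain))


print(caesar_alphabets(3)[1])               # DEFGHIJKLMNOPQRSTUVWXYZABC
print(caesar_encrypt("veni, vidi, vici"))   # YHQL, YLGL, YLFL
print(caesar_decrypt("YHQL, YLGL, YLFL"))   # veni, vidi, vici
```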

Although Suetonius mentions only a Caesar shift of three places, it is clear that by using any shift between 1 and 25 places it is possible to generate 25 distinct ciphers. In fact, if we do not restrict ourselves to shifting the alphabet and permit the cipher alphabet to be any rearrangement of the plain alphabet, then we can generate an even greater number of distinct ciphers. There are over 400,000,000,000,000,000,000,000,000 such rearrangements, and therefore the same number of distinct ciphers.
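The two counts can be checked directly. The number of rearrangements of a 26-letter alphabet is 26 factorial, which is where the figure of over 400,000,000,000,000,000,000,000,000 comes from; the function names below are illustrative only.

```python
import math


def count_caesar_keys(alphabet_size: int = 26) -> int:
    """Shifts of 1 to 25 places; a shift of 0 would leave the message unchanged."""
    return alphabet_size - 1


def count_substitution_keys(alphabet_size: int = 26) -> int:
    """Every ordering of the alphabet is a possible cipher alphabet."""
    return math.factorial(alphabet_size)


print(count_caesar_keys())                   # 25
print(f"{count_substitution_keys():,}")      # 403,291,461,126,605,635,584,000,000
```

Strictly speaking, the total includes the one arrangement that leaves every letter in place, which would not disguise anything.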

Each distinct cipher can be considered in terms of a general encrypting method, known as the algorithm, and a key, which specifies the exact details of a particular encryption. In this case, the algorithm involves substituting each letter in the plain alphabet with a letter from a cipher alphabet, and the cipher alphabet is allowed to consist of any rearrangement of the plain alphabet. The key defines the exact cipher alphabet to be used for a particular encryption. The relationship between the algorithm and the key is illustrated in Figure 4.

An enemy studying an intercepted scrambled message may have a strong suspicion of the algorithm, but would not know the exact key. For example, they may well suspect that each letter in the plaintext has been replaced by a different letter according to a particular cipher alphabet, but they are unlikely to know which cipher alphabet has been used. If the cipher alphabet, the key, is kept a closely guarded secret between the sender and the receiver, then the enemy cannot decipher the intercepted message. The significance of the key, as opposed to the algorithm, is an enduring principle of cryptography. It was definitively stated in 1883 by the Dutch linguist Auguste Kerckhoffs von Nieuwenhof in his book La Cryptographie militaire:

"Kerckhoffs' Principle: The security of a cryptosystem must not depend on keeping secret the crypto-algorithm. The security depends only on keeping secret the key."

[image]

Figure 4 To encrypt a plaintext message, the sender passes it through an encryption algorithm. The algorithm is a general system for encryption, and needs to be specified exactly by selecting a key. Applying the key and algorithm together to a plaintext generates the encrypted message, or ciphertext. The ciphertext may be intercepted by an enemy while it is being transmitted to the receiver, but the enemy should not be able to decipher the message. However, the receiver, who knows both the key and the algorithm used by the sender, is able to turn the ciphertext back into the plaintext message.

In addition to keeping the key secret, a secure cipher system must also have a wide range of potential keys. For example, if the sender uses the Caesar shift cipher to encrypt a message, then encryption is relatively weak because there are only 25 potential keys. From the enemy's point of view, if they intercept the message and suspect that the algorithm being used is the Caesar shift, then they merely have to check the 25 possibilities. However, if the sender uses the more general substitution algorithm, which permits the cipher alphabet to be any rearrangement of the plain alphabet, then there are 400,000,000,000,000,000,000,000,000 possible keys from which to choose. One such is shown in Figure 5. From the enemy's point of view, if the message is intercepted and the algorithm is known, there is still the horrendous task of checking all possible keys. If an enemy agent were able to check one of the 400,000,000,000,000,000,000,000,000 possible keys every second, it would take roughly a billion times the lifetime of the universe to check all of them and decipher the message.
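The contrast between the two key spaces can be made concrete with a short sketch. The first function simply tries all 25 Caesar shifts; the second repeats the back-of-envelope estimate for the general substitution cipher. The sample ciphertext and the figure of 13.8 billion years for the age of the universe are assumptions introduced here, not values taken from the text.

```python
import math
import string


def brute_force_caesar(ciphertext: str) -> dict:
    """Return the candidate plaintext for each of the 25 possible shifts."""
    plain = string.ascii_lowercase
    candidates = {}
    for shift in range(1, 26):
        cipher = (plain[shift:] + plain[:shift]).upper()
        table = str.maketrans(cipher, plain)
        candidates[shift] = ciphertext.upper().translate(table)
    return candidates


SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9   # assumption: a commonly quoted estimate


def universe_lifetimes_to_search(keys: int, keys_per_second: float = 1.0) -> float:
    """How many lifetimes of the universe an exhaustive key search would take."""
    return keys / keys_per_second / SECONDS_PER_YEAR / AGE_OF_UNIVERSE_YEARS


for shift, guess in brute_force_caesar("YHQL, YLGL, YLFL").items():
    print(shift, guess)          # only shift 3 reads as sensible text

print(f"{universe_lifetimes_to_search(math.factorial(26)):.1e}")   # of order 1e9
```

A human or a dictionary check still has to pick the sensible candidate out of the 25, but that is a trivial task compared with searching the full set of rearrangements.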

[image]

Figure 5 An example of the general substitution algorithm, in which each letter in the plaintext is substituted with another letter according to a key. The key is defined by the cipher alphabet, which can be any rearrangement of the plain alphabet.

The beauty of this type of cipher is that it is easy to implement, but provides a high level of security. It is easy for the sender to define the key, which consists merely of stating the order of the 26 letters in the rearranged cipher alphabet, and yet it is effectively impossible for the enemy to check all possible keys by the so-called brute-force attack. The simplicity of the key is important, because the sender and receiver have to share knowledge of the key, and the simpler the key, the less the chance of a misunderstanding.

In fact, an even simpler key is possible if the sender is prepared to accept a slight reduction in the number of potential keys. Instead of randomly rearranging the plain alphabet to achieve the cipher alphabet, the sender chooses a keyword or keyphrase. For example, to use JULIUS CAESAR as a keyphrase, begin by removing any spaces and repeated letters (JULISCAER), and then use this as the beginning of the jumbled cipher alphabet. The remainder of the cipher alphabet is merely the remaining letters of the alphabet, in their correct order, starting where the keyphrase ends. Hence, the cipher alphabet would read as follows.

[image]

The advantage of building a cipher alphabet in this way is that it is easy to memorize the keyword or keyphrase, and hence the cipher alphabet. This is important, because if the sender has to keep the cipher alphabet on a piece of paper, the enemy can capture the paper, discover the key, and read any communications that have been encrypted with it. However, if the key can be committed to memory it is less likely to fall into enemy hands. Clearly the number of cipher alphabets generated by keyphrases is smaller than the number of cipher alphabets generated without restriction, but the number is still immense, and it would be effectively impossible for the enemy to unscramble a captured message by testing all possible keyphrases.
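The construction of a cipher alphabet from a keyphrase follows directly from the recipe above: strip spaces and repeated letters, then continue through the alphabet from where the keyphrase ends, skipping letters already used. The function names below are hypothetical, and the sample message is reused from earlier in the chapter.

```python
import string


def keyphrase_alphabet(keyphrase: str) -> str:
    """Build a cipher alphabet that begins with the de-duplicated keyphrase."""
    head = []
    for c in keyphrase.upper():
        if c in string.ascii_uppercase and c not in head:
            head.append(c)                       # drop spaces and repeated letters
    # Continue the alphabet from the letter after the last keyphrase letter.
    start = string.ascii_uppercase.index(head[-1]) + 1 if head else 0
    rotated = string.ascii_uppercase[start:] + string.ascii_uppercase[:start]
    tail = [c for c in rotated if c not in head]
    return "".join(head + tail)


def substitute(plaintext: str, cipher_alphabet: str) -> str:
    """Encrypt lower-case plaintext into upper-case ciphertext."""
    table = str.maketrans(string.ascii_lowercase, cipher_alphabet)
    return plaintext.lower().translate(table)


alphabet = keyphrase_alphabet("JULIUS CAESAR")
print(alphabet)                                   # JULISCAERTVWXYZBDFGHKMNOPQ
print(substitute("meet at midnight", alphabet))   # XSSH JH XRIYRAEH
```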

This simplicity and strength meant that the substitution cipher dominated the art of secret writing throughout the first millennium A.D. Codemakers had evolved a system for guaranteeing secure communication, so there was no need for further development; without necessity, there was no need for further invention. The onus had fallen upon the codebreakers, those who were attempting to crack the substitution cipher. Was there any way for an enemy interceptor to unravel an encrypted message? Many ancient scholars considered that the substitution cipher was unbreakable, thanks to the gigantic number of possible keys, and for centuries this seemed to be true. However, codebreakers would eventually find a shortcut to the process of exhaustively searching all keys. Instead of taking billions of years to crack a cipher, the shortcut could reveal the message in a matter of minutes. The breakthrough occurred in the East, and required a brilliant combination of linguistics, statistics and religious devotion.

The Arab Cryptanalysts

At the age of about forty, Muhammad began regularly visiting an isolated cave on Mount Hira just outside Mecca. This was a retreat, a place for prayer, meditation and contemplation. It was during a period of deep reflection, around A.D. 610, that he was visited by the archangel Gabriel, who proclaimed that Muhammad was to be the messenger of God. This was the first of a series of revelations which continued until Muhammad died some twenty years later. The revelations were recorded by various scribes during the Prophet's life, but only as fragments, and it was left to Abū Bakr, the first caliph of Islam, to gather them together into a single text. The work was continued by Umar, the second caliph, and his daughter Hafsa, and was eventually completed by Uthmān, the third caliph. Each revelation became one of the 114 chapters of the Koran.

The ruling caliph was responsible for carrying on the work of the Prophet, upholding his teachings and spreading his word. Between the appointment of Abū Bakr in 632 and the death of the fourth caliph, Alī, in 661, Islam spread until half of the known world was under Muslim rule. Then in 750, after a century of consolidation, the start of the Abbasid caliphate (or dynasty) heralded the golden age of Islamic civilization. The arts and sciences flourished in equal measure. Islamic craftsmen bequeathed us magnificent paintings, ornate carvings, and the most elaborate textiles in history, while the legacy of Islamic scientists is evident from the number of Arabic words that pepper the lexicon of modern science such as algebra, alkaline and zenith.

The richness of Islamic culture was to a large part the result of a wealthy and peaceful society. The Abbasid caliphs were less interested than their predecessors in conquest, and instead concentrated on establishing an organized and affluent society. Lower taxes encouraged businesses to grow and gave rise to greater commerce and industry, while strict laws reduced corruption and protected the citizens. All of this relied on an effective system of administration, and in turn the administrators relied on secure communication achieved through the use of encryption. As well as encrypting sensitive affairs of state, it is documented that officials protected tax records, demonstrating a widespread and routine use of cryptography. Further evidence comes from many administrative manuals, such as the tenth-century Adab al-Kuttāb ("The Secretaries' Manual"), which include sections devoted to cryptography.

The administrators usually employed a cipher alphabet which was simply a rearrangement of the plain alphabet, as described earlier, but they also used cipher alphabets that contained other types of symbols. For example, a in the plain alphabet might be replaced by # in the cipher alphabet, b might be replaced by +, and so on. The monoalphabetic substitution cipher is the general name given to any substitution cipher in which the cipher alphabet consists of either letters or symbols, or a mix of both. All the substitution ciphers that we have met so far come within this general category.

Had the Arabs merely been familiar with the use of the monoalphabetic substitution cipher, they would not warrant a significant mention in any history of cryptography. However, in addition to employing ciphers, the Arab scholars were also capable of destroying ciphers. They in fact invented cryptanalysis, the science of unscrambling a message without knowledge of the key. While the cryptographer develops new methods of secret writing, it is the cryptanalyst who struggles to find weaknesses in these methods in order to break into secret messages. Arabian cryptanalysts succeeded in finding a method for breaking the monoalphabetic substitution cipher, a cipher that had remained invulnerable for several centuries.

Cryptanalysis could not be invented until a civilization had reached a sufficiently sophisticated level of scholarship in several disciplines, including mathematics, statistics and linguistics. The Muslim civilization provided an ideal cradle for cryptanalysis, because Islam demands justice in all spheres of human activity, and achieving this requires knowledge, or ilm. Every Muslim is obliged to pursue knowledge in all its forms, and the economic success of the Abbasid caliphate meant that scholars had the time, money and materials required to fulfill their duty. They endeavored to acquire the knowledge of previous civilizations by obtaining Egyptian, Babylonian, Indian, Chinese, Farsi, Syriac, Armenian, Hebrew and Roman texts and translating them into Arabic. In 815, the Caliph al-Ma'mūn established in Baghdad the Bait al-Hikmah ("House of Wisdom"), a library and center for translation.

At the same time as acquiring knowledge, the Islamic civilization was able to disperse it, because it had procured the art of papermaking from the Chinese. The manufacture of paper gave rise to the profession of warraqīn, or "those who handle paper," human photocopying machines who copied manuscripts and supplied the burgeoning publishing industry. At its peak, tens of thousands of books were published every year, and in just one suburb of Baghdad there were over a hundred bookshops. As well as such classics as Tales from the Thousand and One Nights, these bookshops also sold textbooks on every imaginable subject, and helped to support the most literate and learned society in the world.

In addition to a greater understanding of secular subjects, the invention of cryptanalysis also depended on the growth of religious scholarship. Major theological schools were established in Basra, Kufa and Baghdad, where theologians scrutinized the revelations of Muhammad as contained in the Koran. The theologians were interested in establishing the chronology of the revelations, which they did by counting the frequencies of words contained in each revelation. The theory was that certain words had evolved relatively recently, and hence if a revelation contained a high number of these newer words, this would indicate that it came later in the chronology. Theologians also studied the Hadīth, which consists of the Prophet's daily utterances. They tried to demonstrate that each statement was indeed attributable to Muhammad. This was done by studying the etymology of words and the structure of sentences, to test whether particular texts were consistent with the linguistic patterns of the Prophet.

Significantly, the religious scholars did not stop their scrutiny at the level of words. They also analyzed individual letters, and in particular they discovered that some letters are more common than others. The letters a and l are the most common in Arabic, partly because of the definite article al-, whereas the letter j appears only a tenth as frequently. This apparently innocuous observation would lead to the first great breakthrough in cryptanalysis.

Although it is not known who first realized that the variation in the frequencies of letters could be exploited in order to break ciphers, the earliest known description of the technique is by the ninth-century scientist Abū Yūsūf Ya'qūb ibn Is-hāq ibn as-Sabbāh ibn 'omrān ibn Ismaīl al-Kindī. Known as "the philosopher of the Arabs," al-Kindī was the author of 290 books on medicine, astronomy, mathematics, linguistics and music. His greatest treatise, which was rediscovered only in 1987 in the Sulaimaniyyah Ottoman Archive in Istanbul, is entitled A Manuscript on Deciphering Cryptographic Messages; the first page is shown in Figure 6. Although it contains detailed discussions on statistics, Arabic phonetics and Arabic syntax, al-Kindī's revolutionary system of cryptanalysis is encapsulated in two short paragraphs:

One way to solve an encrypted message, if we know its language, is to find a different plaintext of the same language long enough to fill one sheet or so, and then we count the occurrences of each letter. We call the most frequently occurring letter the "first," the next most occurring letter the "second," the following most occurring letter the "third," and so on, until we account for all the different letters in the plaintext sample.

Then we look at the ciphertext we want to solve and we also classify its symbols. We find the most occurring symbol and change it to the form of the "first" letter of the plaintext sample, the next most common symbol is changed to the form of the "second" letter, and the following most common symbol is changed to the form of the "third" letter, and so on, until we account for all symbols of the cryptogram we want to solve.

Al-Kindī's explanation is easier to explain in terms of the English alphabet. First of all, it is necessary to study a lengthy piece of normal English text, perhaps several, in order to establish the frequency of each letter of the alphabet. In English, e is the most common letter, followed by t, then a, and so on, as given in Table 1. Next, examine the ciphertext in question, and work out the frequency of each letter. If the most common letter in the ciphertext is, for example, J then it would seem likely that this is a substitute for e. And if the second most common letter in the ciphertext is P, then this is probably a substitute for t, and so on. Al-Kindī's technique, known as frequency analysis, shows that it is unnecessary to check each of the billions of potential keys. Instead, it is possible to reveal the contents of a scrambled message simply by analyzing the frequency of the characters in the ciphertext.
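Al-Kindī's two steps translate almost line for line into code: rank the letters of a reference sample, rank the symbols of the ciphertext, and pair them off by rank. The sketch below is a naive first pass only, with hypothetical function names and a toy input; ties are broken alphabetically, which is an arbitrary choice made here so that the result is repeatable.

```python
from collections import Counter


def rank_letters(text: str) -> list:
    """Letters of `text` ordered from most to least frequent."""
    counts = Counter(c for c in text.lower() if c.isalpha())
    # Sort by descending count; break ties alphabetically (an arbitrary choice).
    return sorted(counts, key=lambda c: (-counts[c], c))


def naive_frequency_guess(ciphertext: str, reference_text: str) -> str:
    """Replace the n-th most common cipher symbol with the n-th most common
    letter of the reference sample."""
    cipher_ranked = rank_letters(ciphertext)
    plain_ranked = rank_letters(reference_text)
    mapping = dict(zip(cipher_ranked, plain_ranked))
    # Symbols without a partner (and spaces or punctuation) are left as they are.
    return "".join(mapping.get(c.lower(), c) for c in ciphertext)


print(rank_letters("aaabbc"))                      # ['a', 'b', 'c']
print(naive_frequency_guess("XXXYYZ", "eeetta"))   # eeetta
```

On real messages the rank-for-rank pairing is rarely right for every letter; it gives a starting point that then has to be refined by looking at which guesses produce plausible words.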

[image]

Figure 6 The first page of al-Kindī's manuscript On Deciphering Cryptographic Messages, containing the oldest known description of cryptanalysis by frequency analysis. (photo credit 1.2)

However, it is not possible to apply al-Kindī's recipe for cryptanalysis unconditionally, because the standard list of frequencies in Table 1 is only an average, and it will not correspond exactly to the frequencies of every text. For example, a brief message discussing the effect of the atmosphere on the movement of striped quadrupeds in Africa would not yield to straightforward frequency analysis: "From Zanzibar to Zambia and Zaire, ozone zones make zebras run zany zigzags." In general, short texts are likely to deviate significantly from the standard frequencies, and if there are fewer than a hundred letters, then decipherment will be very difficult. On the other hand, longer texts are more likely to follow the standard frequencies, although this is not always the case. In 1969, the French author Georges Perec wrote La Disparition, a 200-page novel that did not use words that contain the letter e. Doubly remarkable is the fact that the English novelist and critic Gilbert Adair succeeded in translating La Disparition into English, while still following Perec's shunning of the letter e. Entitled A Void, Adair's translation is surprisingly readable (see Appendix A). If the entire book were encrypted via a monoalphabetic substitution cipher, then a naive attempt to decipher it might be stymied by the complete lack of the most frequently occurring letter in the English alphabet.

Table 1 This table of relative frequencies is based on passages taken from newspapers and novels, and the total sample was 100,362 alphabetic characters. The table was compiled by H. Beker and F. Piper, and originally published in Cipher Systems: The Protection Of Communication.

Letter  Percentage
a        8.2
b        1.5
c        2.8
d        4.3
e       12.7
f        2.2
g        2.0
h        6.1
i        7.0
j        0.2
k        0.8
l        4.0
m        2.4
n        6.7
o        7.5
p        1.9
q        0.1
r        6.0
s        6.3
t        9.1