Science Focus (Issue 21)

The spat about WhatsApp’s new user agreement has prompted another flurry of discussion of online security. We often see messages stating “this conversation is encrypted” – but how exactly does modern day encryption work, and how safe are we?

If you’ve been to an escape room, you’ve probably encountered a cipher or two – scrambled messages that you have to solve in order to escape. The best known of encryption methods is probably the Caesar cipher, dating back to the Romans, which shifts all the letters by a certain number of positions in the alphabetical order. Unfortunately, this is also the easiest cipher to break, because of something called frequency analysis. In English, certain characters occur most frequently; for example, at this point in the article (*), the letter “e” has appeared 81 times, more than any other letter. One can look at a Caesar cipher text and see which letter appears the most; it is then not very difficult to figure out the amount of shifting involved.

Thus began the evolution of ciphers; instead of shifting every letter by the same amount, people tried shifting each letter by an amount set by a so-called code word, and later by whole sentences, using the so-called Vigenère cipher. This culminated in the famous Enigma, which basically amounts to a series of rather clever shifts, but that was also famously cracked during the Second World War by Alan Turing and his colleagues at Bletchley Park. When that happened, people realized that a new way of encryption was needed.

Enter RSA. Named after its inventors (footnote 1), Rivest, Shamir and Adleman, RSA is one of the main methods of encryption in the modern day, but is fundamentally different from the ciphers I mentioned above. The difference lies in the types of keys used; while the Caesar cipher and Enigma both use symmetric keys, RSA is an example of asymmetric, or public key cryptography.
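To make the earlier Caesar cipher discussion concrete, here is a minimal Python sketch of the shift and of a frequency-analysis attack. The function names and the sample message are illustrative assumptions, not part of the original article, and the attack only works if “e” really is the most common letter in the hidden message.

```python
from collections import Counter
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift every ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_letters:
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)

def guess_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent cipher letter stands for 'e'."""
    letters = [ch for ch in ciphertext.lower() if ch in string.ascii_lowercase]
    most_common_letter = Counter(letters).most_common(1)[0][0]
    return (ord(most_common_letter) - ord("e")) % 26

# Hypothetical sample message, chosen so that "e" is clearly the most common letter.
message = "meet me here at seven, we need these letters"
ciphertext = caesar_shift(message, 3)
recovered_shift = guess_shift(ciphertext)
print(ciphertext)
print(recovered_shift)
print(caesar_shift(ciphertext, -recovered_shift))
```

On short messages the most common letter is often not “e”, so a real attack would compare the whole letter distribution rather than a single letter.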
In symmetric cryptography the key used to encrypt the message and decrypt the received message is the same; however, RSA makes use of two different keys, the public and private keys – the key locking up the box is not the same key used to unlock it! It all seems rather unintuitive at first, but the encryption actually relies on only two concepts – prime numbers and modular arithmetic.

Prime numbers are numbers greater than one that can only be divided by one and themselves; they have been the subject of a previous article in Science Focus (Issue 020), and are very interesting in their own right. Primes can also get arbitrarily large – currently the largest known prime that we can actually write down has 24,862,048 digits. RSA encryption relies on the fact that factoring large numbers is generally very slow, even with enormous computing power, so breaking the code takes so much computation that it is simply not worth it (footnote 2).

The other concept is what mathematicians call modular arithmetic, which is, essentially, a generalization of telling the time. In the 24-hour time system, we tell the time modulo 24 – three hours after 23:00 is always 02:00 and never 26:00 (footnote 3). This is actually just division in disguise: to get the value of n mod a, divide n by a (equivalently, keep subtracting a from n until you reach a value smaller than a) – the remainder of the division is exactly the value we require.

With these two concepts in mind, the RSA algorithm is not too complicated – we’ll break down the steps in some detail here [1]. To demonstrate, we pick two prime numbers, say p = 11 and q = 13. Multiply them together to get n = 143, which forms part of the public key. (In reality these primes are huge – the current recommendation is for n to be 2048 bits long (footnote 4), but for the sanity of our editor, we’ll keep them relatively small!) We also choose another number e which has no common factors with (i.e. is coprime to) both 10 (i.e. p - 1) and 12 (i.e. q - 1); here we choose e = 7.
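The clock example and the choice of p, q, n and e above can be checked with a few lines of Python. This is only a sketch of the setup described so far; the helper names `mod_by_subtraction` and `is_valid_exponent` are made up for illustration.

```python
from math import gcd

def mod_by_subtraction(n: int, a: int) -> int:
    """Compute n mod a (n >= 0, a > 0) by subtracting a until the value is below a."""
    while n >= a:
        n -= a
    return n

def is_valid_exponent(e: int, p: int, q: int) -> bool:
    """Check that e shares no common factor with p - 1 or with q - 1."""
    return gcd(e, p - 1) == 1 and gcd(e, q - 1) == 1

# Clock arithmetic: three hours after 23:00 on a 24-hour clock.
hour = (23 + 3) % 24
print(hour)                        # 2
print(mod_by_subtraction(26, 24))  # 2, the same answer as the % operator

# The toy numbers from the text.
p, q = 11, 13
n = p * q
e = 7
print(n)                           # 143
print(is_valid_exponent(e, p, q))  # True
print(is_valid_exponent(3, p, q))  # False, since 3 divides q - 1 = 12
```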
The value of n is different for everyone, and e and n together are known as the “public key”. They are published in a public directory that computers can use when their owners want to send messages to each other. Say Cliff wants to send me a very simple message – the simplest one possible – “Hi”. As expected, if you want a computer to work with letters, you have to turn them into numbers. Thankfully, a system already exists – the American Standard Code for Information Interchange (ASCII), which turns letters and symbols into 7-bit numbers, works just fine. In ASCII “H” is represented by the number 72, and “i” by the number 105.

重門深鎖：現代密碼學
Locking Up: Modern Day Encryption
By Sonia Choy 蔡蒨珩
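The conversion of “Hi” into ASCII numbers can be reproduced with a short Python sketch. The helper names are assumptions for illustration; Python’s built-in `ord` returns Unicode code points, which coincide with ASCII codes for plain English letters.

```python
def text_to_codes(text: str) -> list[int]:
    """Turn each character into its ASCII number (valid for plain ASCII text)."""
    return [ord(ch) for ch in text]

def codes_to_text(codes: list[int]) -> str:
    """Turn a list of ASCII numbers back into text."""
    return "".join(chr(code) for code in codes)

codes = text_to_codes("Hi")
print(codes)                                    # [72, 105]
print([format(code, "07b") for code in codes])  # ['1001000', '1101001']
print(codes_to_text(codes))                     # Hi
```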