This article was machine-translated from the Japanese version.
Introduction
Hello. I go by Unigiri.
This is an article for Day 7 of Table Game Tansu Advent Calendar 2024, but the content has nothing to do with board games or TRPGs.
Like last year, this is an article where I talk about things I love.
This year, I’ll talk about cryptography to my heart’s content.
Since my only motivation is to fill a single open slot in the calendar, I’ve barely revised this, written it in a colloquial style, and done no fact-checking.
I’d be happy if you read it as though you were listening to a labmate ramble casually over drinks.
Prerequisites
What is cryptography
Let’s start by defining terms. Definitions are important.
Cryptography is a way of making text suitably hard to read.
When you have text that you don’t want just anyone to read, there are broadly two kinds of countermeasure.
Steganography and cryptography.
Steganography is a method of hiding the existence of the text.
It’s like writing on paper with mandarin orange juice and letting it dry — it looks like blank paper at a glance, but when you hold it over fire, the text appears.
Cryptography is a method of making the text unreadable as-is.
Anyone may learn that the unreadable text exists, but to read it you need to know how to convert it back into a readable form.
The original text is called plaintext, the unreadable version is called ciphertext, the process of making it unreadable is called encryption, and the process of restoring the original text is called decryption.
Also, when a third party who doesn’t know the decryption method tries to recover the plaintext from the ciphertext, that is called cryptanalysis (codebreaking). This article refers to such a third party as an attacker.
What is commonly called “暗号” (angō) in Japanese corresponds to cryptography.
As an aside, if the ciphertext can’t be turned back into the plaintext, it isn’t cryptography.
In other words, hash values are not ciphertext. What a hash is will be explained later.
Types of cryptography
Cryptography also broadly falls into two types.
Codes and ciphers.
A code is an alternative expression of some text.
It’s like rephrasing “I have succeeded in a surprise attack” as “Tora tora tora,” or rephrasing going to the restroom in the food service industry as “Number 4.”
A cipher makes text unreadable by applying one consistent method, whatever the plaintext says.
A code cannot express anything beyond pre-agreed information, but a cipher can express various kinds of information.
Here is a slightly more involved example: if you make a rule that assigns numbers in order, “あ” to 1, “い” to 2, “う” to 3 and so on, you can write all sorts of text, such as “あいうえお” or “こんにちは”, using only numbers.
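To make the difference concrete, here is a minimal Python sketch. The names `CODEBOOK`, `HIRAGANA`, and `to_numbers` are made up for illustration, and numbering the 46 basic hiragana in gojūon order is just one assumed way to continue the “あ” to 1, “い” to 2 rule.

```python
# Hypothetical codebook: a code can only express phrases agreed on in advance.
CODEBOOK = {
    "I have succeeded in a surprise attack": "Tora tora tora",
    "going to the restroom": "Number 4",
}

# Assumed numbering rule: the 46 basic hiragana in gojuon order, starting at 1.
HIRAGANA = (
    "あいうえおかきくけこさしすせそたちつてと"
    "なにぬねのはひふへほまみむめもやゆよらりるれろわをん"
)
CHAR_TO_NUMBER = {ch: i for i, ch in enumerate(HIRAGANA, start=1)}


def to_numbers(text: str) -> list[int]:
    """Apply the same rule to every character, whatever the text says."""
    return [CHAR_TO_NUMBER[ch] for ch in text]


print(CODEBOOK["going to the restroom"])  # only works for pre-agreed phrases
print(to_numbers("あいうえお"))
print(to_numbers("こんにちは"))
```

The codebook fails on any phrase that isn’t listed, while the numbering rule handles any text made of those characters.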
This article will primarily deal with ciphers.
Introduction to Several Ciphers
Caesar cipher
Now that the necessary definitions are done, let’s finally talk about actual cipher methods.
Starting with a simple one.
The Caesar cipher is simple, has existed since ancient times, and is very famous.
It’s also called a shift cipher, and in English it’s sometimes called ROT (as in ROT13).
The method is simple: shift each character by N positions in the alphabet.
Take a shift of 1 toward the end of the alphabet: a becomes b, b becomes c, c becomes d, and so on, with z wrapping around to a.
If the plaintext is abcde, the ciphertext becomes bcdef, and Hello, I'm Unigiri. Glad to have you here. becomes Ifmmp, J'n Vojhjsj. Hmbe up ibwf zpv ifsf. With longer text it looks quite ciphertext-like.
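As a rough Python sketch of the shift described above (the function name `caesar` is my own, and I’m assuming only ASCII letters are shifted while everything else is left alone):

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping from z back to a."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)


print(caesar("abcde", 1))
print(caesar("Hello, I'm Unigiri. Glad to have you here.", 1))
```

Decryption is the same function with the opposite shift, for example `caesar(ciphertext, -1)`.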
This method has two weaknesses: it is vulnerable to frequency analysis and to brute-force search.
In the Caesar cipher, a given character is always converted to the same character. With a shift of 1, a always becomes b.
So if a appears 20 times in the plaintext, b will necessarily appear 20 times in the ciphertext.
Each language has characters that tend to be used frequently, so you can make guesses like “this character appearing many times probably was originally this character…” and work through the cryptanalysis.
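Here is a small Python sketch of that guessing process. The sample sentence, the helper names, and the assumption that the most frequent plaintext letter is e are all mine. On short or unusual texts the guess can easily be wrong.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def caesar(text: str, shift: int) -> str:
    """Lowercase the text and shift its letters by `shift` positions."""
    k = shift % 26
    table = str.maketrans(ALPHABET, ALPHABET[k:] + ALPHABET[:k])
    return text.lower().translate(table)


def guess_shift(ciphertext: str, assumed_most_common: str = "e") -> int:
    """Guess the shift by assuming the most frequent letter was `assumed_most_common`."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    top_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(assumed_most_common)) % 26


plaintext = "meet me here at seven, see the secret tree"  # made-up sample, heavy on e
ciphertext = caesar(plaintext, 3)

# The count of e in the plaintext reappears as the count of h in the ciphertext.
print(Counter(plaintext)["e"], Counter(ciphertext)["h"])

shift = guess_shift(ciphertext)
print(shift, caesar(ciphertext, -shift))
```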
Also, if you know it’s encrypted with a Caesar cipher, you can always crack it by checking all patterns.
In English the alphabet has 26 letters, so if you try shifting back by 1, then 2, then 3, and so on through all 26 patterns, the plaintext is guaranteed to be among the 26 results. [1]
And generally, plaintext is meaningful text, so you can tell at a glance which one is the plaintext.
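The brute-force search can be sketched in a few lines of Python. The names `caesar` and `brute_force` are illustrative, and the sketch assumes the text uses only the 26 ASCII letters.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    k = shift % 26
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Undo every possible shift (0 to 25) and return all 26 candidates."""
    return [(shift, caesar(ciphertext, -shift)) for shift in range(26)]


for shift, candidate in brute_force("Ifmmp, J'n Vojhjsj."):
    print(shift, candidate)
```

A human scanning the 26 printed lines can spot the one that reads as real text.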
Overall, it falls into the category of being easy to use as a cipher method but also easy to crack.
Vigenère cipher
The weakness of the Caesar cipher is that the number of positions to shift is always constant.
So wouldn’t it be harder to crack if we varied the shift, moving the 1st character by 1, the 2nd by 5, the 3rd by 9, and so on? That’s the basic idea behind the Vigenère cipher.
However, if the shifts were completely random, nobody could decrypt the result, so the shifting pattern is agreed on in advance.
For example, you might decide that the 1st character shifts by 5, the 2nd by 10, and the 3rd by 15, and that from the 4th character onward the shifts repeat: 5, 10, 15, 5, 10, 15, …
This repeating pattern of shifts (here 5, 10, 15) is called the key.
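A minimal Python sketch of the repeating-shift idea, using the example key 5, 10, 15. The function name `shift_letters` is hypothetical. One choice that is my assumption, not something the text specifies: non-letters are copied through without using up a position in the key.

```python
from itertools import cycle
import string


def shift_letters(text: str, key: list[int], direction: int = 1) -> str:
    """Shift each letter by the next key value; direction=-1 undoes the shift."""
    shifts = cycle(key)  # 5, 10, 15, 5, 10, 15, ...
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            base = ord("a")
        elif ch in string.ascii_uppercase:
            base = ord("A")
        else:
            out.append(ch)  # assumption: non-letters do not consume a key value
            continue
        out.append(chr((ord(ch) - base + direction * next(shifts)) % 26 + base))
    return "".join(out)


KEY = [5, 10, 15]
print(shift_letters("abcdef", KEY))
print(shift_letters("aaaaaa", KEY))  # the same letter no longer maps to one letter
print(shift_letters(shift_letters("Hello, world", KEY), KEY, direction=-1))
```

Because the same plaintext letter can become different ciphertext letters, simple letter counting no longer lines up the way it does for the Caesar cipher.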
I won’t explain the strengths and weaknesses of this method in detail here, but the important point is that a third party attempting cryptanalysis doesn’t know the key length.
For details, please look up Vigenère cipher or Vigenère square.
By the way, the person who devised the cryptanalysis method for the Vigenère cipher was Charles Babbage. Amazing.
Enigma
The Caesar cipher and Vigenère cipher can be encrypted and decrypted easily with just paper and pen.
But as time passed, humans started using machines, and something called computers appeared.
By using these, it became possible to handle complex encryption procedures that would take too long with paper and pen.
The prime example is Enigma.
Enigma is not the name of a cipher, but the name of a machine that performs encryption and decryption.
Inside the machine are rotating wheels (rotors) and a plugboard with removable cables, and changing their settings produces complex encryption.
The basic idea isn’t that different from the Caesar cipher — it just converts one character to another — but the conversion rules are so complex that the difficulty of cryptanalysis is incomparable to the Caesar or Vigenère ciphers.
Enigma was used by Nazi Germany during World War II.
It carried German military communications, and Britain, on the Allied side, worked out how to break it, building on earlier work by Polish cryptanalysts.
There are many fascinating episodes related to Enigma, but they diverge from the article’s theme so I’ll omit them.
If you’re interested, please watch the film “The Imitation Game,” which depicts the British side attempting to crack it.
There are many parts that differ from historical facts, but you can get the general atmosphere and enjoy Benedict Cumberbatch’s acting, so I recommend it.
How computers represent natural language
Here the discussion veers slightly off track.
That’s because I want to talk about how computers handle text in languages like English and Japanese.
First, computers fundamentally live in a world of only 0s and 1s, but they can still handle larger numbers such as 2 or 3.
Writing numbers of 2 and above using only the digits 0 and 1 is called binary.
Explaining binary would turn this into a prep article for the Fundamental Information Technology Engineer exam, so I’ll skip it. It’s enough to understand that computers can work with numbers of 2 and above.
Then, to express English and Japanese on computers, numbers are assigned to each character of each language.
In other words, it’s exactly the example from earlier in the article:
a rule that assigns “あ” to 1, “い” to 2, “う” to 3 and so on, so that text like “あいうえお” or “こんにちは” can be written as numbers.
There are many different rules for which number goes with which character, and they are collectively called character codes (character encodings).
Mojibake (garbled text) that sometimes occurs when using computers happens because the devices exchanging text are using different character codes.
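You can poke at this from Python. This is only a sketch: it uses Unicode code points and the UTF-8 and Shift_JIS encodings as examples of real character codes, which assign different numbers from the illustrative “あ” to 1 rule above.

```python
text = "こんにちは"

# Each character has a number (its Unicode code point), storable in binary.
print([ord(ch) for ch in text])
print(format(ord("あ"), "b"))  # the same number written with only 0s and 1s

# Two different character codes turn the same text into different bytes.
utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")
print(len(utf8_bytes), len(sjis_bytes))

# Mojibake: bytes written with one character code, read back with another.
garbled = utf8_bytes.decode("shift_jis", errors="replace")
print(garbled)

# Reading the bytes with the character code they were written in works fine.
print(utf8_bytes.decode("utf-8"))
```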
By the way, the concept underlying the idea of assigning numbers to non-numeric things is called Gödel numbering, but that would be a massive tangent so I’ll skip it here.
Converting human language into numbers that computers can handle has one major advantage.
It brings text into the world of mathematics.
The good thing about that is that it lets you do suitably complicated things. (What kind of explanation is that…)
I’m not a mathematician so I can only explain this vaguely, but once everything is computation, you can apply processing far more complex than Enigma’s.
Symmetric-key cryptography and public-key cryptography
Returning to cryptography: explaining modern, computer-era cryptography gets incredibly tedious, so I’ll skim over most of it.
Here I’ll only mention that it falls into two categories, symmetric-key cryptography and public-key cryptography.
Symmetric-key cryptography
In the Vigenère cipher section, I mentioned that the rule established for converting characters is called a key.
A method in which the person encrypting and the person decrypting use the same key is called symmetric-key cryptography.
The Vigenère cipher and Enigma are symmetric-key ciphers.
The two parties need to share the key in advance by some means.
This key must also be kept secret, because if it leaks, anyone can decrypt the messages.
Public-key cryptography
In contrast, a method in which the person encrypting and the person decrypting use different keys is called public-key cryptography.
You might think “Is that even possible!?” for a moment, but it becomes possible with the power of math.
A representative example, RSA, relies on the difficulty of prime factorization, but naturally the explanation gets difficult, so I’ll skip it here.
The advantage of public-key cryptography is that it doesn’t matter if the encryption key is exposed to anyone and everyone.
Let’s call the key for encryption A and the key for decryption B.
First, the person who wants to receive encrypted messages creates keys A and B, publishes only key A where anyone can see it, and says “Please use this key when sending me ciphertext!”
Other people who see this encrypt text using key A and send the ciphertext.
The recipient decrypts the ciphertext using key B, which they keep secret. This completes the encrypted communication.
An attacker can know key A and the ciphertext, but cannot crack it because they don’t know key B for decryption.
The nice part is that you can skip the step symmetric-key cryptography requires: secretly sharing the key in advance.
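For the curious, the key A / key B flow can be imitated with a toy RSA example in Python. This is a sketch under loud assumptions: the primes 61 and 53 are far too small to be secure, real systems add padding and use vetted libraries, and the names `key_a`, `key_b`, `encrypt`, and `decrypt` are purely illustrative.

```python
# Toy key generation. Real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                # 3233, shared by both keys
phi = (p - 1) * (q - 1)  # 3120, easy to compute only if you know p and q
e = 17                   # public exponent, chosen to share no factor with phi
d = pow(e, -1, phi)      # private exponent: modular inverse (Python 3.8+)

key_a = (e, n)  # published where anyone can see it
key_b = (d, n)  # kept secret by the recipient


def encrypt(message: int, public_key: tuple[int, int]) -> int:
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


message = 65  # a message already converted to a number smaller than n
ciphertext = encrypt(message, key_a)  # anyone can do this with key A
print(ciphertext)
print(decrypt(ciphertext, key_b))     # only the holder of key B can undo it
```

An attacker who sees `key_a` and the ciphertext would need to factor `n` back into `p` and `q` to find `d`. That is trivial for 3233 but is believed to be impractical for the key sizes used in practice.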
By the way, digital signatures can also be created using the idea of public-key cryptography, but that slightly diverges from the cryptography topic so I’ll skip it here.
Things That Are Not Ciphers But Look Like Ciphers
I’ve roughly said what I wanted to say about cryptography, so here I’ll talk about things that at first glance look like ciphers but actually aren’t.
Hash
Hashes are an example of what I said earlier: if you can’t get from the ciphertext back to the plaintext, it isn’t cryptography.
A hash function generates a fairly short value from some data, and the short value it generates is called a hash value.
For example, if you give a text file containing こんにちは to a hash function called MD5, you get the hash value f5271ace09a56600e1cef7663d932807.
Hash functions also have the property that you only get the same hash value when the original data is exactly the same. [2]
The hash value f5271ace09a56600e1cef7663d932807 is only obtained when the original data contains exactly こんにちは.
Between their appearance and the property that different texts produce different hash values, hash values do seem rather cipher-like.
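If you want to try this yourself, Python’s standard `hashlib` module provides MD5. A small sketch follows, where the helper name `md5_hex` is my own. I’m hashing the UTF-8 bytes of the string directly, so the digest printed for こんにちは may differ from the value quoted above, which depends on the exact bytes in the file (encoding, trailing newline).

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hash of the UTF-8 bytes of `text` as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


print(md5_hex("こんにちは"))
print(md5_hex("こんにちは"))  # same data, same hash value
print(md5_hex("こんばんは"))  # slightly different data, unrelated-looking hash
print(len(md5_hex("a" * 10_000)))  # long input still gives a short value
```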
You probably don’t handle hash functions much in everyday life, but the human body has something similar. Fingerprints.
Fingerprints differ from person to person, and no two people have the same fingerprint pattern. [3]
Therefore, if you already know what a certain person’s fingerprint looks like, you can tell that person touched something just by examining the fingerprint left on a desk.
However, without that information, the fingerprint is something like a cipher whose owner cannot be inferred. That’s a bit of a stretch, but please let it slide.
Returning to the hash value discussion.
Hash values look like ciphertext, but they are not ciphertext.
That’s because a hash function can generate a hash value from the original data, but the original data cannot be restored from the hash value. [4]
Being able to decrypt from ciphertext to plaintext is a condition for calling something a cipher, and hash functions and hash values do not satisfy this condition.
They are not ciphers, but this thing called a hash is quite convenient and is frequently used in the computing world.
Undeciphered languages
Generally, languages spoken by humans are not ciphers.
But if the people who can understand the meaning of those words disappear, it becomes something like a cipher.
One such example is Linear B.
Linear B was a script used in mainland Greece and Crete in antiquity, but it remained an undeciphered script for a long time.
It was ultimately deciphered in 1952 by an amateur researcher named Michael Ventris, and the decipherment process has aspects in common with cipher cryptanalysis.
I truly love Linear B, but this article’s theme is cryptography, not ancient languages, so I’ll omit details.
If you’re interested, please read The Man Who Deciphered Linear B: The Story of Michael Ventris. New copies may no longer be in bookstores, but large libraries might have it.
Conclusion
I’m satisfied having talked a lot!
Cryptography is a very enjoyable field. It’s interesting as theory, and it’s also practical.
This time I skipped most of the theoretical aspects and only wrote what I wanted to write, but I’d be happy if even a little of the fun came through.
Finally, I’d like to introduce reference materials.
If anyone wants to know a bit more about cryptography, I recommend these two books.
The Code Book by Simon Singh
It’s written in quite plain language and is very easy to read.
Simon Singh is truly skilled at treating seemingly arcane themes in an understandable way.
It’s essentially the proper version of my article.
By the way, in the anime The Melancholy of Haruhi Suzumiya, Nagato is reading this book.
This “Nagato is reading it” reference is a classic, but it feels like it’s about to stop landing with people…
Introduction to Cryptography, 3rd Edition: Alice in the Land of Secrets by Hiroshi Yuki
This is a technical book written for people with some prerequisite knowledge, but it also covers historical ciphers like the Caesar cipher and is very carefully written.
If you want to know about cryptographic technologies handled by computers, I believe there’s nothing more understandable than this book.
[1] Some of you might think “Isn’t it 25 patterns?”, but here I’m counting 26 so as to include the meaningless encryption of shifting by 0 characters.
[2] Hash collisions are not discussed here.
[3] Fingerprint collisions are not discussed here.
[4] Methods for recovering plaintext from hash values, such as rainbow tables, are not discussed here.