Cryptography: Everything You Never Wanted to Know
Ever since humans first started using written language, they’ve been keeping it secret. Not the writing itself, obviously, but the messages contained therein. What we call “cryptography” (the practice of keeping messages private in the face of adversaries, often with coded text) has been a fundamental concept since hieroglyphics, some 4,000 years ago.
But it should come as no surprise that in today’s fast-paced world of digital communications and payment systems, cryptography has become even more essential to ensuring data security.
In fact, cryptography is the heartbeat of keeping your and your customers’ data safe, your communications with online payment mechanisms secure, and mission critical secrets for enterprise and government, well, secret. It is essential to the triad of confidentiality, integrity, and availability (CIA), a common model that forms the basis for the development of security systems, and even to saving lives!
Not only is cryptography the super cool science of keeping secrets, but it extends its reach into mathematics, science, technology, politics, and human rights. At Salesforce, it helps us live up to our number one value of Trust. And it’s also one of the biggest buzzwords in tech, right along with cryptocurrency and NFTs.
There are also concerns that existing cryptography standards could be broken in the future by quantum computers, which could drastically reduce the overhead of cryptanalysis and make it possible to break some cryptosystems. The field working on algorithms designed to withstand that is called (get ready for some jargon) post-quantum cryptography. We’re still quite a while away from that so don’t get too worried…yet. We’re gonna cover what has been and what is first.
Every Superhero has an Origin Story
Throughout time, cryptography has been used to save lives. Unfortunately, in some cases the opposite has been true, where bad or broken cryptography has led to dramatic, sometimes deadly events. For instance:
The Caesar Cipher: A cipher (a set of rules or a combination of two or more algorithms) named in honor of Julius Caesar, who used it to encrypt military and official government messages. Using a shift of three, each letter of the plaintext is rotated three letters down the alphabet to produce the ciphertext. So the word “trust” would have the ciphertext output “wuxvw”. W is three letters down from T, and so on.
The Babington Plot: In 1586, Mary, Queen of Scots, was imprisoned and kept under close watch. Finding it nearly impossible to communicate with the outside world, she used a cipher built on a nomenclator (a set of symbols standing in for letters and common words) and sent messages to her allies by smuggling them out in beer barrels. Eventually the cipher was cracked using a method called frequency analysis (identifying the most common ciphertext symbols and guessing which plaintext letters they stand for), and Mary's plot of treason against Queen Elizabeth met a grim fate: she was executed in 1587. If she had changed her nomenclator more frequently it may have saved her life.
The Enigma Machine: During WWII, the German military used this electromechanical cryptographic device to encrypt messages and make them undecipherable to unintended snoopers. Think of a typewriter stuck in a box with multiple rotors attached. When you pressed a key, a lamp would light up on the board, highlighting the output letter. The rotors stepped with every key press and changed the wiring, so pressing the letter A twice in a row would not give you the same output letter twice. Building on earlier breakthroughs by Polish cryptanalysts, Alan Turing and his colleagues at Bletchley Park eventually cracked the device. It is believed that the cracking of the Enigma machine shortened WWII and saved the lives of countless people.
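To make the Caesar cipher above concrete, here is a minimal Python sketch. The function name `caesar_shift` is made up for this illustration, and the last few lines show why a cipher with only 26 possible shifts is easy to brute-force.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Rotate each ASCII letter by `shift` positions; leave everything else alone."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)

ciphertext = caesar_shift("trust", 3)    # shift of three, as in the example above
plaintext = caesar_shift(ciphertext, -3)  # shifting back decrypts
print(ciphertext, plaintext)

# With only 26 possible shifts, an attacker can simply try them all.
candidates = [caesar_shift(ciphertext, -k) for k in range(26)]
print("trust" in candidates)
```

Decryption is just the same rotation in the opposite direction, which is why the whole secret lives in the shift value.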
Let’s Look Behind the Mask
Now, you might be wondering what all this has to do with you and your data. But it’s important to understand how the different eras of cryptography eventually led us to how we communicate securely on the internet today and how we protect ourselves against threat actors. Not only is it technology, it is a fine art we must be grateful for in today's cybersecurity world. It’s also important to understand some basic terms when diving into a super complex subject. Some basics that will help you go from zero to cryptographic hero:
Plaintext: A message in its pure original form
Ciphertext: Altered form of plaintext message, unreadable to (if we’ve done our job well) anyone except the intended recipient, also known as a cryptogram
Cipher: A series of well-defined steps that helps implement encryption; a cryptographic algorithm, also known as an encryption engine
Encryption: The art of turning plaintext into ciphertext; putting plaintext through the encryption engine
Decryption: The reverse process of encryption
Key: A crypto key is a string of bits used by a cryptographic algorithm/cipher to transform plaintext into ciphertext and back; it controls the operation of a cryptographic algorithm and is also known as a cryptovariable
Key Space: Total number of possible values for a key in a cryptographic algorithm or other security measure such as a password. For example, a four-digit PIN has a key space of only 10,000: the numbers 0000 through 9999, so at most 10,000 guesses.
Hash: A one-way mathematical function that takes a variable-length input and produces a fixed-length output (a digest rather than ciphertext, since it can’t be decrypted). More on this later in the blog.
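The key space arithmetic from the glossary is easy to check. This small Python sketch uses a hypothetical helper, `key_space`, to compare the four-digit PIN with a 128-bit key (each bit has two possible values).

```python
def key_space(symbols: int, length: int) -> int:
    """Number of possible keys: choices per position raised to the number of positions."""
    return symbols ** length

pin_space = key_space(10, 4)      # digits 0-9, four positions
aes128_space = key_space(2, 128)  # two values per bit, 128 bits

print(f"4-digit PIN: {pin_space:,} possible values")
print(f"128-bit key: {aes128_space:,} possible values")
```

The gap between those two numbers is the reason a PIN needs lockouts and rate limits while a well-chosen 128-bit key cannot realistically be guessed.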
Hidden in Plain Sight
Today, the algorithms and keys we use on the Internet are pretty much everywhere unless for some reason you went out of your way to disable your security. In fact, presuming that you are reading this through a browser, you should notice a lock symbol on the left-hand side of your address bar. That symbol denotes that your connection to the website is secure, and that security is put in place using, you guessed it, cryptography.
That protocol in particular is Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL), and it is what we use to encrypt Hypertext Transfer Protocol (HTTP) traffic. HTTP is what allows us to transmit information over the internet and populate information to a website. This can be both metadata and values you add in fields, like passwords and credit card information. However, plain HTTP sends all of this in plaintext, so anyone snooping could see it. By negotiating with the website using public key encryption, we can use the website's certificate to create an encrypted and authenticated channel of communication. You can verify that the public key belongs to the website by checking that its certificate was issued by what is known as a Certificate Authority.
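If you want to peek at that negotiation yourself, Python's standard `ssl` module can do it. The sketch below is one way to do it, not how any particular browser works; `make_context` and `describe_tls` are names invented for this example, and the hostname in the commented usage is only a placeholder.

```python
import socket
import ssl

def make_context() -> ssl.SSLContext:
    """Default client settings: verify the certificate chain and the hostname."""
    return ssl.create_default_context()

def describe_tls(host: str, port: int = 443) -> dict:
    """Open a TLS connection and report what was negotiated (needs network access)."""
    context = make_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),      # negotiated TLS version
                "cipher": tls.cipher()[0],      # negotiated cipher suite name
                "subject": cert.get("subject"),  # who the certificate was issued to
                "issuer": cert.get("issuer"),    # the CA that issued it
            }

# Example usage (placeholder hostname, requires an internet connection):
# print(describe_tls("example.com"))
```

If the certificate can't be validated against a trusted Certificate Authority, `wrap_socket` raises an error instead of returning a connection, which is the programmatic cousin of the browser warning page.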
Keys to the Universe
You can have the strongest algorithms in the world but your key is the mechanism that locks and unlocks it. If anyone were to get your key, they could decipher all of your communications. The key is the heart of cryptography and keeping it secure and private is sacred to ensuring your crypto systems are in good standing.
When I want to encrypt something at rest, let’s say a large volume of data sitting on a disk, I would use what’s known as a symmetric key. One key to rule them all. This means the same key performs both the encryption and the decryption operation. Symmetric encryption is great for bulk and resource-heavy encryption (think terabytes of information). Some algorithms/ciphers used for symmetric key encryption include DES, 3DES and AES, though DES and 3DES are now considered legacy and AES is the current standard. So when protecting customer data, we would use this type of algorithm to keep you and your customers’ data safe. I take the plaintext, I put it through an encryption engine (the cipher/algorithm), I put my key into the engine and that enables the cryptography to take place. Just like starting a car, the key is needed to turn it on.
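Here is the "one key does both jobs" idea as a Python sketch. To stay within the standard library it uses a toy stream cipher built from SHA-256, which is an assumption made purely for illustration: it is not AES and should never protect real data. The function names are hypothetical.

```python
import hashlib
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: hash key + nonce + counter until we have enough bytes."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(stream[:length])

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream; prepend the random nonce."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, rebuild the same keystream, XOR again."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

key = secrets.token_bytes(16)  # the single shared secret (128 bits)
blob = encrypt(key, b"customer data at rest")
print(decrypt(key, blob))
```

Notice that `decrypt` needs exactly the same `key` that `encrypt` used. Anyone holding it can read everything, which is what makes getting that key to the other party safely the next problem to solve.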
Ok cool, now we have looked at encrypting data at rest. But what if I want to communicate with you via the wire and I need to send a key to you? Surely if someone were to snoop on that communication channel they could see the secret key? Yes, they could. There are some ways around this and thus we enter the world of public key encryption.
With symmetric keys, there is one key to rule them all. With asymmetric encryption, there are two keys to rule them all. One is a public key, which means the whole public internet can see it. The second is your private key that only you, and you alone, should ever be in possession of. The two keys are mathematically linked: what is encrypted with one can only be decrypted with the other. This is more resource-intensive than symmetric encryption due to higher computational overhead.
So, let’s say I wanted to send you an encrypted message over the wire. I would first reach out to you and get your public key. With your public key, I would then encrypt the message I intend only to be sent to you. So again, everyone will have your public key but they don’t have access to your private key. No one can decrypt the message except the holder of the private key.
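The exchange above can be sketched with textbook RSA in a few lines of Python. Everything here is a toy assumption for illustration: the two primes are small known Mersenne primes, there is no padding, and the function names are made up. Real systems use vetted libraries and much larger keys.

```python
# Recipient's toy key pair (assumed values, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1        # two known Mersenne primes
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def encrypt(message: bytes, e: int, n: int) -> int:
    """Anyone can do this with the recipient's public key (e, n)."""
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)

def decrypt(ciphertext: int, d: int, n: int) -> bytes:
    """Only the holder of the private exponent d can do this."""
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")

ciphertext = encrypt(b"trust me", E, N)  # sender uses the public key
print(decrypt(ciphertext, D, N))         # recipient uses the private key
```

The sender never needs `D`, and publishing `E` and `N` gives a snooper nothing that decrypts the message unless they can factor `N`.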
Ok great, now I know how to encrypt. But how can you trust a message actually came from me?
Why don’t we switch around the key usage? I want to prove I sent you a message and provide integrity along with the message. After all, someone at the end of the wire could be pretending to be someone else. So before I send you an encrypted message, you ask for me to verify the integrity of it. This time, I will sign the message with my private key. Remember, I am the only one who possesses my private key, so surely if I use it to sign my message it must come from me right?
When we sign a message here, what we really mean is we sign its hash. A hash is a one-way mathematical function that takes a variable-length input and produces a fixed-length output, also known as a message digest. So for example, if the message were to state “Salesforce Rocks!”, the hash would look like this:
I will calculate my digest and then encrypt it with my private key, thus creating what is known as a digital signature. Then when you receive the message and signature from me, you can use my public key to recover the digest I signed and check that it matches the digest you calculate yourself from the message. If the message was altered at any stage, the hash would change. Say, for example, our snooper doesn’t like capital letters and decides to intercept and change the message to “salesforce rocks!”, the fixed-length output would be completely different and look like this:
Then when you receive the message, it would be different from what was originally signed and the signed digest would not match the received digest. That means you couldn’t trust the sender or the integrity of the message itself, and non-repudiation (the sender not being able to deny having sent it) would not hold. The way a minor change to the original plaintext produces a completely different hash value is known as the avalanche effect.
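You can reproduce the hashing and signing steps above in Python. This is a sketch under a couple of assumptions: SHA-256 stands in for the hash (the post doesn't say which one was used), and the signature is textbook RSA over a toy key with no padding, so treat it as a teaching aid only. The function names are hypothetical.

```python
import hashlib

# Signer's toy RSA key pair (assumed values, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

def digest(message: str) -> str:
    """Fixed-length hex digest of a variable-length message."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

def sign(message: str) -> int:
    """Sign the digest (reduced to fit the toy modulus) with the private exponent."""
    return pow(int(digest(message), 16) % N, D, N)

def verify(message: str, signature: int) -> bool:
    """Recover the signed digest with the public exponent and compare."""
    return pow(signature, E, N) == int(digest(message), 16) % N

original = "Salesforce Rocks!"
tampered = "salesforce rocks!"
signature = sign(original)

print(digest(original))
print(digest(tampered))             # two letters changed, digest looks unrelated
print(verify(original, signature))
print(verify(tampered, signature))
```

Both digests are 64 hex characters long no matter how long the message is, and the signature only verifies against the exact message that was signed.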
We can be super secure in our transmission by having the sender sign the message with their private key and then encrypt it with the recipient's public key. With both in play, your communication channel is secure and verified. Asymmetric cryptography underpins most secure communication on the internet, helping ensure you can trust who is on the other end. Some examples of asymmetric algorithms would be RSA, DSA (which is used for signatures only), and Elliptic Curve.
Now, asymmetric encryption can be resource-intensive so we have what’s called “hybrid cryptography.” This time, the sender will generate a symmetric session key, otherwise known as a shared key, but before sending that key, they will take the recipient's public key and encrypt the session key. They then send the encrypted session key over the wire, and the recipient decrypts it with their private key. Voila, encrypted communication can take place using the session key (e.g. 128-bit AES), which allows both parties to perform faster symmetric encryption.
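Putting the two halves together, the hybrid flow above might look like the following Python sketch. It reuses the same toy assumptions as before: textbook RSA over small Mersenne primes to wrap the session key, and a SHA-256-based XOR stream standing in for AES. None of this is production cryptography, and all names are invented for the example.

```python
import hashlib
import secrets

# Recipient's toy RSA key pair (assumed values, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (stand-in for AES); the same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))

def sender_wrap(message: bytes) -> tuple:
    """Sender: fresh 128-bit session key, wrapped with the recipient's public key."""
    session_key = secrets.token_bytes(16)
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
    return wrapped_key, xor_stream(session_key, message)

def recipient_unwrap(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Recipient: recover the session key with the private key, then decrypt."""
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")
    return xor_stream(session_key, ciphertext)

wrapped_key, ciphertext = sender_wrap(b"bulk data goes over the fast symmetric path")
print(recipient_unwrap(wrapped_key, ciphertext))
```

The slow asymmetric operation only ever touches 16 bytes of session key, while the message itself, however large, goes through the cheap symmetric cipher.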
Cool, right? Only, how do I know someone's public key is actually their public key? Remember the key we requested from the recipient above so we could encrypt the message we sent to them? Welcome to the world of public key infrastructure (PKI) and certificate authorities (CA). Think of PKI as the invisible cryptographic backbone to your communications online, and CAs are essentially third-party cryptographic verification bodies which validate that a person's public key belongs to them. This takes place using digital certificates.
Just like when you complete a course or participate in an activity that upskills you, you receive a certificate from that educational body with a stamp saying “person A is hereby certified by this institution”, a digital certificate contains your public key and is issued by a trusted third party. So the CA is essentially saying “Yes, this public key belongs to Person A.”
Have you ever gone to a website and it states, “warning do not enter here, there is an issue with the certificate”? That’s because that entity does not have a certificate which can be validated by a trusted third party. There is a lot more to the secret sauce than what has been mentioned here but that is a high-level overview to give you some basic understanding.
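To show the shape of what a CA does, here is a deliberately simplified Python sketch. Real certificates are X.509 structures with validity dates, chains, and revocation; this toy version only captures the core idea that the CA signs a binding between a name and a public key. The `Certificate` class, the function names, and the toy CA key are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

# The CA's toy RSA key pair (assumed values, far too small for real use).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N, CA_E = CA_P * CA_Q, 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))

@dataclass(frozen=True)
class Certificate:
    subject: str     # who the certificate is for
    public_key: int  # the subject's public key (a plain number in this toy)
    signature: int   # the CA's signature over subject + public key

def _fingerprint(subject: str, public_key: int) -> int:
    """Hash the name/key binding and reduce it to fit the toy modulus."""
    data = f"{subject}|{public_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N

def issue(subject: str, public_key: int) -> Certificate:
    """CA side: sign the binding with the CA's private key."""
    return Certificate(subject, public_key, pow(_fingerprint(subject, public_key), CA_D, CA_N))

def is_valid(cert: Certificate) -> bool:
    """Client side: check the signature using only the CA's public key."""
    return pow(cert.signature, CA_E, CA_N) == _fingerprint(cert.subject, cert.public_key)

cert = issue("Person A", 123456789)
forged = Certificate("Person A", 987654321, cert.signature)  # attacker swaps the key
print(is_valid(cert), is_valid(forged))
```

The forged certificate fails because the CA's signature only matches the exact name and key it was issued for, which is the situation behind the browser warning described above.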
You Get a Cape, We Get a Cape
What does all of this have to do with you and Salesforce? Well, cryptography is what helps us keep your data safe and your communication channels secure. Amongst myriad security measures, Salesforce uses the mechanisms outlined above to ensure customers’ data is kept safe both at rest and in transit. Most importantly, we have ways of making sure keys are safe and secure: Hardware Security Modules (HSM). These are special pieces of cryptographic hardware designed for the storage of cryptographic keys. They also allow us to generate and use the keys on the devices themselves, rather than having the keys sit in a file on our servers, adding an extra layer of security. So next time you have bulk data lying about, make sure to encrypt it, or if you are communicating with someone on the wire about super secret confidential information, then use public key encryption to ensure you do so securely and with integrity. Again, don’t forget, cryptography saves lives!
If you are aspiring to a career in cybersecurity and want to know more about cryptography, I would highly recommend doing some in-depth study of PKI and cryptography. You can find some great resources on YouTube like these from Neso Academy and Pico Cetef (look up concepts like Kerckhoffs's principle), and check out a book called Serious Cryptography: A Practical Introduction to Modern Encryption for a deeper dive into how computational cryptography works, block ciphers versus stream ciphers, and more. It’s also worth looking at one of the original public key algorithms (though some would say it’s a symmetric key exchange using an asymmetric framework), the Diffie-Hellman Key Exchange.
Most cybersecurity courses will have modules on secure communication too. There is also the concept of steganography, which means hiding the message in plain sight (one could sneak all the works of Shakespeare into a picture of a dog). Steganography can be the enemy of data loss prevention (DLP). And my door is always open, so please drop me a message on LinkedIn if you want to know more. I’d love to chat!