Plaintext: Data and files that can be read by humans
Ciphertext: Encrypted files that are protected from viewing
Algorithm (or cipher): The encryption process
Key: A secret value used to encrypt or decrypt
Why is key length important? Longer key = more combinations = harder to crack the encryption by brute force (trying lots of combinations).
But! Longer keys need more computer power, memory, and network bandwidth.
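To see why key length matters, here is a rough sketch of the arithmetic. Each extra bit doubles the number of possible keys. The guess rate below is only an assumption for illustration, and the function names are made up for this example.

```python
# Count the possible keys for a given key length and estimate how long
# trying every key would take at an ASSUMED guess rate.
GUESSES_PER_SECOND = 10**12          # assumption: one trillion guesses per second
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def keyspace(bits):
    """Number of different keys for a key that is `bits` long."""
    return 2 ** bits

def years_to_search(bits, rate=GUESSES_PER_SECOND):
    """Years needed to try every key at `rate` guesses per second."""
    return keyspace(bits) / rate / SECONDS_PER_YEAR

for bits in (64, 128, 192, 256):
    print(f"{bits}-bit key: {keyspace(bits):.3e} keys, "
          f"about {years_to_search(bits):.3e} years to try them all")
```

Going from 128 to 256 bits does not double the work, it multiplies it by 2 to the power of 128.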
How does it work? One key (the same key) to encrypt and decrypt. This key is often a password, or is derived from a password.
Examples:
Your WiFi password is a symmetric key. (It's stored on your phone and in the Wireless Access Point).
Your BitLocker password to unlock your laptop.
Your iPhone is encrypted with your passcode or PIN.
What is it used for? Symmetric is used to protect almost all "bulk data". That means confidential data and files, databases, documents, and even entire devices (Full Disk Encryption). If you have something to hide, you use symmetric encryption. Stored data like this is called "data at rest". Symmetric encryption also protects bulk "data in transit" (like web traffic) once both sides have the same key.
Algorithm: AES (Advanced Encryption Standard) is the current standard. It supports 128, 192, and 256-bit keys.
Problem: Symmetric encryption is really fast (you can unlock your phone in 1 second). BUT if anyone else wants to access your data, you have to tell them the key.
We call this the "key distribution problem" - how to share the key securely? (The answer is combining symmetric with asymmetric. We will discuss this later).
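The "same key to encrypt and decrypt" idea can be sketched with a toy cipher. This is NOT AES and is not safe for real data. It is a minimal illustration that uses only the Python standard library, and the names `keystream` and `xor_cipher` are made up for this example.

```python
import hashlib
import secrets

def keystream(key, length):
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key, data):
    """Encrypt OR decrypt: XOR the data with the keystream from the key."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = secrets.token_bytes(32)                 # a 256-bit shared secret key
plaintext = b"Meet me at noon"
ciphertext = xor_cipher(key, plaintext)       # encrypt with the key
recovered = xor_cipher(key, ciphertext)       # decrypt with the SAME key
wrong = xor_cipher(secrets.token_bytes(32), ciphertext)  # a different key fails

print(ciphertext.hex())
print(recovered)
```

Anyone who needs to read the data must have the exact same `key`, which is the key distribution problem described above.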
How does it work? Two keys (usually files). One public, one private key. If one key encrypts, the other key decrypts. The private key must be kept secret.
Example:
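You use your private key to login to a secure server with SSH. The server has a copy of your public key. Your computer signs a challenge with the private key and the server checks that signature with the public key. If it checks out, the server knows it's you.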
What is it used for? Asymmetric is used for two things.
1) Proving who we are (authentication). We can login using our private key if someone has our public key. We can also "sign" messages using our private key. Other people can then check those messages came from us, using our public key.
2) Sending someone a symmetric key in a secure way. This is called key exchange. We will talk more about this soon.
Algorithms: RSA (2048-bit keys or larger) and ECC (Elliptic Curve Cryptography) (256-bit keys). What's the difference between RSA and ECC? ECC is more modern. It gives similar security to RSA with much smaller keys (a 256-bit ECC key is roughly comparable to a 3072-bit RSA key), so it needs less processing power and storage. So ECC is good for weaker devices, like mobile phones or smart devices (Internet of Things).
Problem: Asymmetric is very slow and it can't handle a lot of data. We can only use it for small tasks. Any "bulk data" like big files or internet traffic must use symmetric too.
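The "if one key encrypts, the other key decrypts" idea can be sketched with textbook RSA and tiny numbers. This is only a toy: real RSA uses huge primes and padding, and the helper name `apply_key` is made up for this example.

```python
# Toy RSA with tiny primes (needs Python 3.8+ for pow(e, -1, phi)).
p, q = 61, 53
n = p * q                     # 3233, shared by both keys
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent, the modular inverse of e

public_key = (e, n)           # safe to give to anyone
private_key = (d, n)          # must be kept secret

def apply_key(key, value):
    """Raise `value` to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                  # must be a number smaller than n

# Encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(public_key, message)
decrypted = apply_key(private_key, ciphertext)

# It also works the other way round (this is the basis of signing).
signed = apply_key(private_key, message)
checked = apply_key(public_key, signed)

print(ciphertext, decrypted, checked)
```

The message has to be a number smaller than `n`, which hints at why asymmetric encryption only suits small pieces of data.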
How does it work? You can put any data (text, files, images, or anything) into a hash algorithm. It will always give you the same fixed-length "digest" or code that represents that data. If you make any change to the data, even one character, you get a completely different hash.
Example:
File A -> Hash = d5b1ccd4
File A -> Hash again = d5b1ccd4
File A -> Hash again = d5b1ccd4
File A -> Hash again = d5b1ccd4
but...
File B -> Hash = 79f9a1eb (different contents)
What is it used for? Hashing is used for two things.
1) Safe password storage. Hashing is one way (in theory): you can't turn a hash back into the original data. A hash works like a fingerprint of the data. A system can store only the hash of each person's password and compare hashes at login, so a stolen database doesn't directly reveal the passwords.
2) Checking whether contents of files have changed (integrity checking).
Algorithms: SHA-256 is the modern standard. MD5 is insecure. Attackers can make two different files with the same MD5 hash (a "collision"), and MD5 password hashes can be cracked easily.
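Here is a short sketch of the File A / File B idea using SHA-256 from Python's standard `hashlib` module. The file contents and the helper name `sha256_hex` are made up for this example.

```python
import hashlib

def sha256_hex(data):
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

file_a = b"Transfer 100 to Alice"
file_b = b"Transfer 900 to Alice"   # only one character is different

print(sha256_hex(file_a))   # same input...
print(sha256_hex(file_a))   # ...always gives the same digest
print(sha256_hex(file_b))   # a tiny change gives a completely different digest

# Integrity check: compare the digest you calculated with the one you expected.
expected = sha256_hex(file_a)
file_unchanged = sha256_hex(file_a) == expected
file_changed = sha256_hex(file_b) != expected
```

The digest is always 64 hex characters (256 bits), no matter how big the input is.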
What is it used for? Digital signatures prove someone sent a message or communication. They provide authentication (proof of identity), integrity (proof the message wasn't changed), and non-repudiation (the sender can't deny sending the message later).
How does it work? It uses asymmetric encryption combined with hashing. To prove you sent a message, the message gets hashed first. Then you sign the hash with your private key, and the signature is sent along with the message. The receiver hashes the message themselves and uses your public key to check the signature against that hash. If it matches, this proves the message came from you (because only your private key could have made that signature) and that it wasn't changed.
Note: there is NO confidentiality. Signing does not encrypt the message. ANYONE can read the message, and anyone with your public key can check the signature.
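The hash-then-sign flow can be sketched with textbook RSA. This is only a toy under loud assumptions: the key is built from two well-known Mersenne primes, there is no padding, and the names `sign` and `verify` are made up for this example. Real systems use a proper library and much larger, random keys.

```python
import hashlib

# Toy RSA key pair (assumption: far too small and predictable for real use).
P = 2**127 - 1
Q = 2**107 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

def digest_as_int(message):
    """Hash the message and turn the digest into a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Sender: hash the message, then sign the hash with the private key."""
    return pow(digest_as_int(message), D, N)

def verify(message, signature):
    """Receiver: hash the message and check the signature with the public key."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"Pay Bob 10 dollars"
signature = sign(message)

print(verify(message, signature))                  # genuine message
print(verify(b"Pay Bob 99 dollars", signature))    # message was changed
```

Notice that `message` is sent in the clear. The signature only proves who sent it and that it wasn't changed.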
What is it used for? If we use symmetric encryption (for bulk data), we need a safe way to send people the key. We can do that with asymmetric encryption (which can only be used for small amounts of data).
Example:
You want to connect to a website with HTTPS. Your browser will use symmetric encryption to protect data. But to do this, your browser and the webserver both need the same symmetric key. We can use asymmetric encryption to help them agree on it.
Both your browser and the webserver make a key pair (a private key and a public key). They send each other their public keys. Each side then combines its own private key with the other side's public key, and both arrive at the same shared secret. The symmetric key comes from this shared secret, so the symmetric key itself is never sent across the network.
What is Perfect Forward Secrecy (PFS)?
There's one problem with key exchange. If the same long-term private key is used every time and an attacker steals it later, they can decrypt every conversation they recorded in the past.
PFS solves this. When your browser and the server make private keys for key exchange, they are temporary (these are called ephemeral keys). Once they have been used to agree on the symmetric key (the session key), the private keys get deleted.
The attacker can't go back and decrypt old conversations, because the ephemeral private keys no longer exist.
How do we do PFS?
We use the "Diffie-Hellman key exchange protocol" with ephemeral keys. You will often see this written as DHE (Diffie-Hellman Ephemeral) or ECDHE (Elliptic Curve Diffie-Hellman Ephemeral).
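A Diffie-Hellman exchange can be sketched with plain modular arithmetic. This is a toy: the prime and generator below are assumptions chosen to keep the example short, and real systems use standardised groups or elliptic curves (ECDHE). The function names are made up for this example.

```python
import hashlib
import secrets

P = 2**127 - 1    # toy prime modulus (assumption, too small for real use)
G = 3             # toy generator (assumption)

def make_ephemeral_keypair():
    """Make a temporary private key and the matching public value."""
    private = secrets.randbelow(P - 3) + 2    # random number from 2 to P-2
    public = pow(G, private, P)
    return private, public

def shared_secret(my_private, their_public):
    """Combine my private key with the other side's public value."""
    return pow(their_public, my_private, P)

# Each side makes a fresh key pair for this one session.
browser_private, browser_public = make_ephemeral_keypair()
server_private, server_public = make_ephemeral_keypair()

# Only the PUBLIC values cross the network.
browser_secret = shared_secret(browser_private, server_public)
server_secret = shared_secret(server_private, browser_public)

# Both sides turn the shared secret into the same symmetric session key.
browser_session_key = hashlib.sha256(browser_secret.to_bytes(16, "big")).digest()
server_session_key = hashlib.sha256(server_secret.to_bytes(16, "big")).digest()

print(browser_session_key == server_session_key)
```

For PFS, both private values would be thrown away as soon as the session key exists, so there is nothing left for an attacker to steal later.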
Remember that encryption needs processing power, and the keys need to be stored somewhere safe. We can get two devices to help with this:
Trusted Platform Module (TPM). This is a chip built into a laptop or PC that stores keys securely (phones have similar secure chips).
Hardware Security Module (HSM). This is a dedicated device (a network appliance or a plug-in card) that generates, stores, and uses encryption keys.
Both of these are "tamper resistant". It is very hard to get the keys out of them, and many HSMs will delete the keys if anyone tries to open them physically.
Salting: Hashing isn't enough to protect passwords. Think about this example. Two users choose the same password "cairo". Their hashes will be exactly the same. If a hacker captures the hashes, then guesses one password or uses social engineering to get it, the hacker now knows the other person's password too.
Over time, hackers build large databases of precomputed hashes for common passwords. These are called rainbow tables. To stop a rainbow table attack, we "salt" passwords. We add random data to the password before hashing. This means that every stored hash is different, even if people have the same password, so a precomputed table is useless.
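Salts must be random and never re-used (one new salt for each password). The salt doesn't need to be secret. It is stored next to the hash.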
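The "cairo" example can be sketched like this. A single round of SHA-256 is used only to keep the illustration small (real password storage would also use key stretching), and the helper name `hash_password` is made up for this example.

```python
import hashlib
import secrets

def hash_password(password, salt=None):
    """Hash a password with a random salt. Returns (salt, digest)."""
    if salt is None:
        salt = secrets.token_bytes(16)        # new random salt per password
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

# Without a salt, two users with the same password get the same hash.
unsalted_1 = hashlib.sha256(b"cairo").hexdigest()
unsalted_2 = hashlib.sha256(b"cairo").hexdigest()

# With a salt, the stored hashes are different.
salt_1, hash_1 = hash_password("cairo")
salt_2, hash_2 = hash_password("cairo")

# At login, re-use the stored salt to check the password.
_, login_attempt = hash_password("cairo", salt_1)

print(unsalted_1 == unsalted_2)    # identical without salt
print(hash_1 == hash_2)            # different with salt
print(login_attempt == hash_1)     # correct password still verifies
```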
Key stretching: If people use weak passwords, you can key stretch to make each guess more expensive. That means running the hash many times (thousands of rounds or more), usually with an algorithm made for this, such as PBKDF2, bcrypt, or Argon2. The hacker can still try to crack the hash, but it will take much longer to brute force. Hopefully, you can detect the attack in time.
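Key stretching can be sketched with PBKDF2, which is built into Python's `hashlib`. The iteration count below is only an illustrative assumption, and the helper names `stretch` and `check_password` are made up for this example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000    # assumption: pick a value that suits your hardware

def stretch(password, salt, iterations=ITERATIONS):
    """Run the password through PBKDF2-HMAC-SHA256 many times."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def check_password(password, salt, stored, iterations=ITERATIONS):
    """Recompute the stretched hash and compare it safely."""
    return hmac.compare_digest(stretch(password, salt, iterations), stored)

salt = secrets.token_bytes(16)
stored = stretch("cairo", salt)

print(check_password("cairo", salt, stored))     # correct password
print(check_password("cair0", salt, stored))     # wrong guess
```

Every guess the hacker makes now costs 100,000 hash operations instead of one.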
Blockchain: A public system ("decentralised") where records are grouped into blocks, and each block contains the hash of the block before it. The whole chain is called the "public ledger". If anyone changes or deletes old data in the ledger, the hashes stop matching, and everyone who has a copy can see and verify this. This means we can do cryptocurrency transactions or digital voting, and we can prove integrity with the public hashes.
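The "linked together by hashes" idea can be sketched as a simple hash chain. This toy leaves out everything else a real blockchain has (networking, consensus, signatures), and the function names and ledger entries are made up for this example.

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's full contents, including its link to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    """Append a block that stores the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": previous})

def is_valid(chain):
    """Check that every block still points at the real hash of the one before."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

ledger = []
for entry in ("Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"):
    add_block(ledger, entry)

valid_before = is_valid(ledger)

ledger[0]["data"] = "Alice pays Bob 500"   # someone tampers with old data
valid_after = is_valid(ledger)

print(valid_before, valid_after)
```

Changing one old block breaks the link in the next block, so the tampering is easy to spot.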
Quantum computers: A new type of computer that could break today's encryption in the future, if they become strong enough. Asymmetric algorithms like RSA and ECC are most at risk. To stop this, governments and standards bodies (such as NIST) are standardising new algorithms. The terms for these are "quantum-proof" or "quantum-resistant" algorithms. They can also be called "post-quantum cryptography".
Obfuscation: Obfuscation is not encryption, and there is no key. It makes computer code hard for humans to read. Once a program is complete, you can send it through a code obfuscation tool. This scrambles the code (for example, by renaming variables and removing structure), but the code will still work normally. It makes it harder for other companies to copy or reverse engineer the code, although a determined person can still work it out. For example, the JavaScript code of a public website can be obfuscated.
Steganography: This is hiding data within other data (images, audio, video). The files will still open normally but there is invisible data hidden inside. This is not encryption, it's just "hiding". If someone knows the data is there, they can extract it.
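One common way to hide data is to overwrite the least significant bit of each byte in the cover file. Here is a minimal sketch, assuming the cover is just a list of raw byte values standing in for image pixels. The function names `hide` and `extract` are made up for this example.

```python
def hide(cover, secret):
    """Hide `secret` in the lowest bit of each byte of `cover`."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover data is too small for this secret")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit    # replace only the lowest bit
    return bytes(stego)

def extract(stego, length):
    """Read `length` hidden bytes back out of the lowest bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(256))       # stand-in for pixel values
secret = b"hi"
stego = hide(cover, secret)

print(extract(stego, len(secret)))
print(max(abs(a - b) for a, b in zip(cover, stego)))   # biggest change per byte
```

Each byte changes by at most 1, which is why the picture still looks normal. There is no key here, so anyone who knows the method can extract the data.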
Digital certificate: A file containing the owner's identity and the owner's public key, digitally signed by a Certificate Authority (CA). It is not secret or encrypted.
Public Key Infrastructure (PKI): This is the system of CAs, digital certificates, and asymmetric encryption that is used to prove identity on the internet.
Example:
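How do you know a website is real and not operated by scammers? The website can buy a digital certificate from a Certificate Authority (CA). The CA signs the certificate with the CA's private key. When a customer visits, the website sends its certificate, and the customer's browser checks the CA's signature using the CA's public key (browsers come with a list of trusted CAs). The website also proves that it holds the private key matching the public key in the certificate (they are the only ones who have that private key).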
Certificate Authority (CA): A company that does identity checks and issues digital certificates.
How does it work? You make a key pair, then you contact a CA and send a "Certificate Signing Request" (CSR). This usually involves paying them. The CSR contains your public key (never your private key), and you give the CA proof of your company's identity. The CA checks this and then gives you a digital certificate that the CA has signed.
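The CA's job can be sketched as "sign the identity plus the public key". This is a toy under loud assumptions: the CA key is built from two well-known Mersenne primes, the certificate is a plain dictionary instead of a real X.509 file, and every name below (`issue_certificate`, `browser_accepts`, "Example Toy CA", "shop.example") is made up for this example.

```python
import hashlib
import json

# Toy CA key pair (assumption: far too small and predictable for real use).
CA_P, CA_Q = 2**127 - 1, 2**107 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                        # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))       # CA private exponent

def fingerprint(fields):
    """Hash the certificate fields into a number smaller than CA_N."""
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % CA_N

def issue_certificate(subject, subject_public_key):
    """CA: after identity checks, sign the fields with the CA's private key."""
    fields = {"subject": subject,
              "public_key": subject_public_key,
              "issuer": "Example Toy CA"}
    return {"fields": fields, "signature": pow(fingerprint(fields), CA_D, CA_N)}

def browser_accepts(cert):
    """Browser: check the CA's signature with the CA's public key."""
    return pow(cert["signature"], CA_E, CA_N) == fingerprint(cert["fields"])

cert = issue_certificate("shop.example", "site-public-key-placeholder")

# A scammer copies the certificate but swaps in their own site name.
forged = {"fields": dict(cert["fields"], subject="evil.example"),
          "signature": cert["signature"]}

print(browser_accepts(cert), browser_accepts(forged))
```

The website's private key never appears in this process. Only the public key goes into the certificate.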
The "chain of trust": The "root CA" sits at the top of the chain and is the most trusted CA. It usually keeps its private key offline on an air gapped server. The reason for this is, if the root CA (example companies: DigiCert or VeriSign) ever gets hacked, every digital certificate that depends on it can no longer be trusted.
The problem with this is, if you keep the private key offline, you can't use it to create new certificates every day. So the root CA uses its private key only rarely, to sign a certificate for an "intermediate CA". The intermediate CA has its own private key, which stays online. It uses that key to sign the digital certificates that customers buy. A browser checks each link in the chain: the website's certificate was signed by the intermediate CA, and the intermediate CA's certificate was signed by the root CA. The root CA's private key is still offline and safe.
CAs keep a list of certificates that have been revoked (cancelled before their expiry date). This is called the Certificate Revocation List (CRL). If a site's private key is stolen, for example, the CA can revoke their certificate. Then visitors will get a warning in their browser.
Instead of downloading the whole CRL, your browser can ask the CA about one certificate in real time by making a request. The protocol used to make this request is called Online Certificate Status Protocol (OCSP).
However, making an OCSP request for every visit uses extra network bandwidth and it's slow. To help with this, web servers can use OCSP Stapling. At regular intervals, the web server asks the CA for a signed, time-stamped OCSP response that proves its certificate is still valid (not revoked). Then it "staples" or attaches this proof to the TLS handshake with every visitor. The visitor's browser doesn't have to contact the CA because the webserver already did it for them.
Types of Certificates:
Examples of when digital certificates and PKI are used:
TLS 1.2 and 1.3 (HTTPS): Securing website connections.
Email Security: Signing and encrypting emails (S/MIME).
Code Signing: Verifying who produced software
VPN connections: Authenticating users and devices.