9 minutes
ICOM6045 Cryptography

Overview
Security threats (explain basically)
- content
- eavesdropping: the message would be listened by others when you communicate with your friends, using email or whatsapp
 - masquerading: identity stealing
 - message tampering: someone modify the message in the middle
 - replaying: receive the same message twice, it is dangerous in transactions, but not in instant message, etc
 
 - machine
- infiltration: receive marewares (can control PC or smart phones) from hackers
 
 - traffic
- traffic analysis: only interestes in how to communicate
 - denial of service: make others cannot access the services
 
 
Security services (implemented by security mechanisms)
- security policy
- from risk analysis, we can have some security policies to deal with
 - may not cover all possible risks, should have a reasonable trade-off between risks and available resources
 
 - ISO security architecture
- authentication
 - access control
 - data confidentiality
 - data integrity
 - nonrepudiation
 
 - basic computer security concepts (CIA)
- related to cryptography
- confidentiality: prevent unauthorized access
 - integrity: prevent unauthorized modification
 
 - related to access control
- availability: prevent unauthorized withholding of information or resources
 - authenticity: verify the origin of data
 - accountability: audit information and trace to the responsible party if there are security issues
 
 
 - related to cryptography
 
Encryption and Decryption (focus on classical ones)
Types (still use today)
- symmetric key encryption (single key cryptosystem)
- using the same encryption and decryption key
- e.g. AES, 3DES, RC4, pdf password
 
 - usage:
- file encryption
 - others with lots of data
 
 - rely on substitution and permutation, do not need to much computing
 
 - using the same encryption and decryption key
 public key encryption (two key cryptosystem)
- using different encryption and decryption keys
- public key (encryption key): everyone knows it
 - private key (decryption key): only I know
 - e.g. RSA (still popular today)
 
 - usage:
- communication
 - digital signature
 - others not cover lots of data
 
 - rely on complex math, need much computing
 
if you use the e-banking system or https, you should use both of them
- using different encryption and decryption keys
 
Monoalphabetic substitutions
- concept
- substitution: each letter is substituted by another one
 - permutation: reorder the elements of a series
 
 - caesar cipher
- each letter is translated to the letter a fixed number of letters after it in the alphabet
- Julius Caesar used a shift of 3
 
 - attack (short message)
- find the similarity of ciphertext (e.g. "wr" means "to" -> "wrr" means "too")
 - guess the pattern(the number of shifting)
 
 
 - each letter is translated to the letter a fixed number of letters after it in the alphabet
 - cryptanalysis (mainly for long message)
- count the frequency of the letter, find the difference according to the frequency of normal articles
 - weakness: the frequency of distribution reflects the distribution of the underlying alphabet
- make it stronger -> polyalphabetic substitution ciphers
- using combine or multiple distributions -> different tables
 - if you use 2 tables, it is not safe (if someone can guess the tables)
 - 26 tables may be safer, but it is difficult to maintain
- solution (vigenere cipher): use a keyword (not be encrypted, but use different method or channel), letters in the key will be used to select a particular permutation (e.g. juliet, which needs to be repeated)
 
 
 
 - make it stronger -> polyalphabetic substitution ciphers
 
 index of coincidence
a measure of the variation between frequencies in a distribution (n is the number of ciphertext)
\[IC=\sum_{i=a}^{i=z}\frac{Freq(i)\times(Freq(i)-1)}{n\times(n-1)}\]
IC ranges from 0.0384 (polyalphabetic substitution with perfectly flat distribution) to 0.068(monoalphabetic substitution from common English)
Alphabets IC 1 0.068 2 0.052 3 0.047 4 0.044 5 0.044 10 0.041 large 0.038 the value of IC
- close to or above 0.068 -> monoalphabetic substitution
 - close to or below 0.038 -> polyalphabetic substitution
 
kasiski method: determine when a pattern of encrypting permutations has repeated to predict the number of alphabets used for substitutions
- search for repeated sequence of characters
 - calculate the distance between each two of them
 - the estimated key length is the common divisor
 
analyze polyalphabetic cipher
- use the kasiski method to predict likely numbers of enciphering alphabets
 - if no numbers emerge fairly regularly, the encryption is not probably not a polyalphabetic substitution
 - divide the ciphertext into serval sequences, according to the key length
 - calculate IC for each sequence
 
The "perfect" substitution cipher
- use infinite or a large nonrepeating keys
- problems
- absolute synchronization between sender and receiver, difficult to distribute
 - need an unlimited number of keys
 
 
 - problems
 vernam cipher
- use random number as the key
 e.g.
plaintext V E R N A M C I P H E R encoded 21 4 17 13 0 12 2 8 15 7 4 17 random 76 48 16 82 33 3 58 11 60 5 48 88 sum 97 52 33 95 33 15 60 19 75 12 52 105 mod 26 19 0 7 17 18 15 8 19 23 12 0 1 ciphertext T A H R S P I T X M A B possible attack
- random number generator
 - most common type of random number generator: linear congruential random number generator
 - seed: the initial value \(r_0\)
 successive random number \(r_{i+1}\)
\[r_{i+1}=(a\times r_i+b)\mod\ n\]
- if someone knows a, b and n, he or she can use the current random number to calculate the next random number
 
cracking the random number generator becomes solving a systems of equations
Permutations (transpositions)
- rearrange the symbols of a message and try to break established patterns
- columnar transposition: rearrangement of the characters of the plaintext into columns
- e.g. THIS IS A MESSAGE TO SHOW HOW A COLUMN TRANSPOSITION WORKS

 - cryptanalysis
- compute the letter frequencies: all letters appear with their normal frequencies implies that a transposition has been performed
 - break the text into columns by compare a block of ciphertext characters against characters successively farther away in the ciphertext
 
 
 - e.g. THIS IS A MESSAGE TO SHOW HOW A COLUMN TRANSPOSITION WORKS
 
 - columnar transposition: rearrangement of the characters of the plaintext into columns
 - complexity
- the execution time of the algorithm is proportional to the length of the message
 - store all characters
 - output cannot be generated until all characters have been read (not good for long message)
 
 - alternative method: permute the characters of the plaintext with a fixed period d
- e.g. d = 5, and the permutation is (2 4 5 1 3)

 
 - e.g. d = 5, and the permutation is (2 4 5 1 3)
 
Confusion and Diffusion
- confusion: change symbols
- bad confusion: caesar cipher
 - good confusion: polyalphabetic substitution with a long key
 
 - diffusion: change order
- bad diffusion: the substitution ciphers
 - good diffusion: the transposition ciphers
 
 - DES provides good confusion and diffusion
- substitution provides confusion
 - permutation provides diffusion
 
 - purpose of cryptography -> data confidentiality

 
Ciphers
- stream cipher: convert one symbol of plaintext immediately into a symbol of ciphertext, e.g. the substitution cipher
- advantages:
- speed of transformation
 - low error propagation: each symbol is separately encoded
 
 - disadvantages:
- low diffusion: all information of a symbol is contained in the symbol, subject to attack like frequency count, digram analysis, IC and Kasiski method
 - possible for malicious insertions and modifications
 
 
 - advantages:
 - block cipher: encrypt a group of plaintext symbol as one block, e.g. the columnar transposition cipher
- advantages:
- diffusion: information of plaintext is diffused into several ciphertext symbols
 - immunity to insertions: impossible to insert a single symbol in a block
 
 - disadvantages:
- slowness of encryption
 - error propagation: an error will affect all other characters in the same block
 
 
 - advantages:
 
Rotor Machine
- contains three rotors (key)
 - is implemented by polyalphabetic substitution (not fixed substitution)
- change one position, change the key
 - the number of keys is \({26}^3\)
 
 - procedure
- use original setting to do substitution
 - after one input, the next input may generate the rotation of the first rotor (one position)
 - if the first rotor rotates a cirle, jump to the second rotor
 - after one input, the next input may generate the rotation of the second rotor (one position)
 - until three rotors finish their rotation -> original status -> again and again
 
 - utility: enigma machine (by German, with a 256-element rotor, use all rotors twice for each letter)
 - example

 
Modern Cryptography
DES (data encryption standard)
- a block cipher with 56-bit key (64-bit including parity bits), 64-bit block
- the reason for 64-bit: 64-bit is 8 bytes, each character uses 1 byte in the computer, 8 characters means we can use ascii code to represent
 - parity bits: just like checksum, solving the nosie problem, but today we may not use this

 
 - most commonly use block cipher
- in 1990s, this method was not safe any more, then generate "triple DES" (1999) and AES (2001, use until now)
 - 3-DES
- still use today, but someone thinks out of date
 - use this method due to orginial DES hardware can be used (save money)

 
 
 - based on "Feistel" network structure
- means for both encryption and decryptionn use the same hardware (box)
 - DES use this structure for 16 times, but using different subkeys
- method, ciphertext, etc. are public, but you cannot decrpyt it (due to lots of substitution and permutation)
 
 - mathmatics
- \(L_{i+1}=R_i\)
 - \(R_{i+1}=L_i\bigoplus f(K_i,R_i)\)
 - inverse is the same
- \(R_i=L_{i+1}\)
 - \(L_i=R_{i+1}\bigoplus f(K_i,L_{i+1})\)
 
 - f function

- S-Box
- public
 - fixed (6 bits -> 4 bits)
 - design criteria (early undisclosed)
- the S-boxes were carefully tuned to increase resistance against differential cryptanalysis
 - no S-box is a linear or affine function of its input: the 4 output bits cannot be expressed as a system of linear equations of the 6 input bits
 - changing 1 bit in the input of an S-box results in changing at least 2 output bits: the S-boxes diffuse their information well throughout their outputs
 - the S-boxes were chosen to minimize the difference between the number of 1s and 0s when any single input bit is held constant: holding a single bit constant as a 0 or 1 and changing the bits around it should not lead to disproportionately many 0s or 1s in the output

 
 
 - P-Box
- public
 - fixed

 
 - the only secret is the key, without it, you cannot decrypt the ciphertext
- but it is not safe now, due to computing power (only \(2^{56}\) possible keys)
 
 
 - S-Box
 
 
 - designed to facilitate hardware implementation
 
AES (advanced encryption standard)
- mandates a block size of 128 bits and a choice of key size from 128, 192, 256 bits (called AES-128, AES-192, AES-256)
- using 128-bit key (16 bytes, 4 4-octet)

 
 - using 128-bit key (16 bytes, 4 4-octet)
 - diffusion and confusion
- diffusion: by permutations in ShiftRows and MixColumns
 - confusion: by substitutions (S-Box) in SubBytes

 
 
Hash
- cryptographic hash function
- compression: for any size of input x, the output length of y = h(x) is small
 - features
- efficiency: easy to compute h(x) for any input x
 - one-way: given any value y, it is computationally infeasible to find a value x such that h(x) = y
 - collision resistance
- given x and h(x), if is infeasible to find y ≠ x such that h(x) = h(y)
 - it is infeasible to find x and y, with y ≠ x, such that h(x) = h(y)
 
 
 - common hash functions: MD5, SHA-1, SHA-256
- basic structure of SHA-1
- split message into 512-bit blocks
 - the former result would be the next input
 - if the memory is spare, padding

 
 
 - basic structure of SHA-1
 
 - HMAC (hash message authentication code)
- today, we use digital signature to verfity identity
 - in old days, we uese HMAC to verify indentity

 - the use of a shared secret key that provides authentication in addition to integrity protection
 - features
- faster than encryption in software
 - use of any hash function is permitted in any export product from US
 - used in IPsec and TLS
