Version 1.5
Version 0.99.1 - please mail me with comments, corrections etc.
One round of IDEA looks like this :
X1 X2 X3 X4 | | | | V V V V Z1_1->Mul Z2_1->Add Z3_1->Add Z4_1->Mul | | | | | | | | +-------------|------>Xor<------------+ | | | | | | | | | | | | +--------|--->Xor<------|------------+ | | | | | | | | | | | | | | V V | | | | Z5_1->Mul-->Add | | | | | | | | | | | | | | | | V V | | | | Add<--Mul<-Z6_1 | | | | | | | | V | | | V | Xor<-----------|--------|-----+------>Xor | | V | | V | Xor<------+--------------|---------->Xor | | | | | +-----------\/----------+ | | ____________/\___________ | | | | | V V V V
And seven more of these rounds. and after those rounds the output transformation :
| | | | | +-----------\/----------+ | | ____________/\___________ | | | | | V V V V Z1_9->Mul Z2_9->Add Z3_9->Add Z4_9->Mul | | | | Y1 Y2 Y3 Y4
The 128 bit key is partitioned into 8 subblocks, that are directly used for the first 8 subkeys. (Keys Z1_1...Z1_6 Z2_1 Z2_2 ). The master key is then cyclicly rotated 25 positions to the left, and again partitioned into 8 blocks, which are used for the next 8 subkeys. This procedure is repeated until all 52 subkeys are generated.
Decryption with IDEA works exactly the same, only the subkeys are different. The subkeys for decryption are derived as follows from the encryption subkeys :
round 1 Z1_9^-1 -Z2_9 -Z3_9 Z4_9^-1 Z5_8 Z6_8 round 2 Z1_8^-1 -Z3_8 -Z2_8 Z4_8^-1 Z5_7 Z6_7 round 3 Z1_7^-1 -Z3_7 -Z2_7 Z4_7^-1 Z5_6 Z6_6 ... round 8 Z1_2^-1 -Z3_2 -Z2_2 Z4_2^-1 Z5_1 Z6_1 output Z1_1^-1 -Z2_1 -Z3_1 Z4_1^-1
The Zn_m denote the same subkeys used in the encryption. The ^-1 means the multiplicative inverse (modulo 2^16+1), and the - is simply the negative of a subkey.
There are 60x60x24x365 = 3.15x10^7 seconds in a year.
With 10^38 keys, it would take this hypothetical computer 3x10^38/(10^9 x 3x10^7) = 10^22 years to find one key.
If every person on earth had one such computer, it would 'only' take 10^22 / 5x10^9 = 2x10^12 years. This is hundreds of times longer than the age of the earth (4.5x10^9 years). So we can conclude that IDEA is quite safe from a brute-force attack.
With public-key cryptography the whole situation is different. Here there are two keys, one public key and one private or secret key. The public key can be used to encrypt messages, but once the messages are encrypted with the public key, they can only be decrypted with the secret key.
Thus the public key can be transmitted in any manner, since it doesn't have to be secret. The only thing that must be kept secret is the private key, and that is easier since this key never has to be transmitted to anyone.
Public-key cryptography was invented in 1976 by Whitfield Diffie and Martin Hellman , and independently by Ralph Merkle. The basis of any public-key algorithm is a mathematical function that can easily be calculated, but that is very hard to reverse without some extra knowledge. The RSA public-key algorithm was invented in 1978 by Rivest, Shamir and Adleman, and thus derives its name from their initials. RSA relies on the difficulty of factoring a large number into into its prime components. Remember, a prime is a number that can only divided by 1 and by itself. If a number is not prime, but composite, it can be uniquely factored. For example, 55=5x11. However, when the number to be factored is not 55, but a few hundred digits long, this is a very hard problem.
C = M^e mod n
.
C is now the encrypted message.
If we wish to decode the message, we calculate :
M = C^d mod n
.
Example:
p=5, q =11, n=55, (p-1)(q-1) = 40. e must be between 3 and 40, and no common
factors with 40.
We choose e = 7.
de = 1 mod (p-1)(q-1)
.
d = 23, because 23*7 = 161 = 1 mod 40
Encryption:
M = 2 => C = 2^7 mod 55 = 18
Decryption: M = C^d = 18 ^ 23 mod 55
= 18 ^1 * 18^2 * 18^4* 18^16 mod 55
doing the modulo-operation on each of the numbers
= 18 * 49 * 36 * 26 mod 55 = 2
Our public key consists of the numbers (e, n) and our secret key is (d, n).
The trick is that given p and q, d can easily be calculated, but without knowing p and q, d cannot be found.
The various operations that have to be done (modular exponentiation, modular inverses etc) can be done quite efficiently.
The Prime Number Theorem states that there are on the order of N/lnN primes below the number N. With N 2^256, there are 6.5*10^74 primes below 2^256 (1.1*10^77). If we wish to know the number of primes between 2^256 and 2^255, we have to subtract the number of primes below 2^255 . There are about 3.2*10^74 primes below 2^255. And thus, there are about 3.3*10^74 primes of exactly 256 bits. As the number of particles in the Universe if estimated at around 10^80, it is obvious that running out of primes, or listing all the primes is totally out of the question even for 512 bit keys.
Finding such primes is not a very hard problem. This is a good thing, because if it was using PGP would be impossible ! To find such a prime, we just generate a random number of appropriate size, and test if it is a prime. If it isn't we try again until we succeed. There are several very fast algorithms that can test if a number is prime very quickly. The fastest ones are probabilistic. That is, they will never declare a prime number composite, but they will with a certain chance declare a composite number prime. But if we test the same number again and again against different bases, the chance that it is composite and yet not found by any of these test gets very small. There do exist exact primality tests (that prove for certain if a number is prime), but they take a lot more work. PGP first does a trial division with all primes up to 8191, and then uses the Fermat-test with bases 2, 3, 5, 7.
It has not been proven that breaking RSA is as hard as factoring n. Therefore, it just might be that there is some method for breaking RSA that does not require the factorization of a large number. And second, factoring itself has not been proven to be a truly hard problem. I use "hard problem" here in the mathematical sense. If time required to solve a problem grows as a linear function of the input, or as a polynomial function, a problem is considered tractable. For example, counting n objects takes in principle just n operations. But mean the workload rises as an exponential function of the number of inputs (like 2^n, or 10^n or worse) than the problem is considered intractable, because exponential functions grow so fast.
Just for those who wish to plug in numbers, here is the
heuristic runtime formula (not proven to be correct) for the
General Number Field Sieve:
exp[(1.932+o(1))*(log n)^1/3* loglog(n)^(2/3)]
Log is the natural logarithm, and exp(x) = e^x = 2.718..^x
Here's a table from Applied Cryptography, referenced with an unpublished paper (as of Feb. 1995) by Andrew Odlyzko "Progress in Integer Factorization and Discrete Logarithms"
Mips years required to factor a number with the GNFS:
Bits | Mips-years |
---|---|
512 | 30,000 |
768 | 2*10^8 |
1024 | 3*10^11 |
1280 | 1*10^14 |
1536 | 3*10^16 |
2048 | 3*10^20 |
In another table, making generous assumptions about the increase in computer power and assuming that factoring algorithms will be as fast the Special Number Field Sieve (which today can only be used on special numbers), Bruce Schneier gives the following keylenghts against various opponents :
Year | vs Individual | vs Corporation | vs Government |
---|---|---|---|
1995 | 768 | 1280 | 1536 |
2000 | 1024 | 1280 | 1536 |
2005 | 1280 | 1536 | 2048 |
2010 | 1280 | 1536 | 2048 |
2015 | 1536 | 2048 | 2048 |
Please note that there is lots of assumptions and handwaving in these numbers. Very few of the factorizations experts dare to make predictions beyond a few years.
Also keep in mind that there are, especially for Governments, other methods for obtaining your keys than spending billions of dollars and waiting one or more years for the factorization of your key. Burglary and electronic snooping (Tempest) are much more effective and much cheaper.
This attack is of little consequence for PGP, because anybody who can measure the time it takes your CPU to process a message with this accuracy, can also just copy the key from memory. The attack is no more dangerous than a virus or trojan designed to intercept your passphrase and secret key.
More information about factoring and the two main algorithms used for large numbers:
A block Lanczos algorithm for finding dependencies over GF(2) - Peter L. Montgomery:
A simple checksum, that adds up all the all the bytes in a file and then truncates the result to last n digits, or a CRC (Cyclical Redundancy Check) are examples of hashes, but they are not one-way. In order to be one-way, the functions must also have the following properties :
In some applications, one-wayness as defined above is not enough. The hash function must also be collision-resistant. That is, it must be hard to find two random M and M' that have the same hash-value.
In order to satisfy all the requirement, a hash value must have a certain length. It is obvious that as soon as we have documents that are longer than the output of the hash function, collisions must occur. This is called the pigeon-hole principle. If our hash function can output 4 distinct values, and we hash 5 documents, at least two documents will have the same hash value.
A related problem is posed by the 'birthday-attack'. This method is called so because it based on the often unknown fact, that if there are 23 people in the room, there's a 50% chance that two of them will have the same birthday.
This follows from the following calculation.
The chance that two people do not share their birthdays, is 364/365 The chance that three people ... is 364/365 * 363/365.
And so on, until we are certain that among 366 people at least two will the same birthday. But with 364/365 * 363/365 * ... * 343/365 ~= 0.5, we already have a 50/50 chance of two people NOT having the same birthday, and thus also a 50/50 chance of them sharing it.
It's the same with hashes. If we have a hash function with a length of n bits, we have a good chance of finding two messages with the same hash value if we try hashing 2^0.5*n random documents.
First 4 variables are initialized :
A=0x01234567 B=0x89abcdef C=0xfedcab98 D=0x76543210
These four variables are now copied into the variables a, b, c, d and the main loop begins. This loop is run as long as there are message blocks.
+-------------------------+ | Message Block | +-------------------------+ / | | \ / | | \ / | | \ / | | \ / | | \ / | | \ +-------+ +-------+ +-------+ +-------+ A-+---| round |--->| round |--->| round |--->| round |->Add-------------A B-|+--| |--->| |--->| |--->| |---|->Add---------B C-||+-| 1 |--->| 2 |--->| 3 |--->| 4 |---|---|->Add-----C D-|||+| |--->| |--->| |--->| |---|---|---|->Add-D ||||+-------+ +-------+ +-------+ +-------+ | | | | +|||---------------------------------------------------+ | | | +||-------------------------------------------------------+ | | +|-----------------------------------------------------------+ | +---------------------------------------------------------------+
The final MD5-hash is are the A, B, C, D after the last message block is processed.
In the rounds there are 4 nonlinear functions :
F(X, Y, Z) = (X And Y) Or ( (Not X) And Z) G(X, Y, Z) = (X And Z) Or ( Y And (Not Z)) H(X, Y, Z) = X Xor Y Xor Z I(X, Y, Z) = Y Xor (X Or (Not Z))
One round looks like this :
+------------------------------------------------------------------+ | | | +----+ | | | | | +->| a |-------------------------+ | | | | | +----+ | M_j t_i | | | | | | | | b |+------------------------|-----|-----|-----------------+ | | | \ | | | | | +----+ \ __________ | | | | | | | +->/Nonlinear \ V V | V | | c |----->| Function |---->Add-->Add-->Add-->Rotateleft->Add-+ | | +->\__________/ +----+ / | | / | d |+ | | +----+
Now if we write this as :
FF(a,b,c,d,M_j,s,t_i) denotes a = b+((a+F(b,c,d) + M_j + t_i <<<s) GG(a,b,c,d,M_j,s,t_i) denotes a = b+((a+G(b,c,d) + M_j + t_i <<<s) HH(a,b,c,d,M_j,s,t_i) denotes a = b+((a+H(b,c,d) + M_j + t_i <<<s) II(a,b,c,d,M_j,s,t_i) denotes a = b+((a+I(b,c,d) + M_j + t_i <<<s)
(<<<s is a rotating the bits to left over s positions)
/* Round 1 */ FF (a, b, c, d, M_0, 7, 0xd76aa478); /* 1 */ FF (d, a, b, c, M_1, 12, 0xe8c7b756); /* 2 */ FF (c, d, a, b, M_2, 17, 0x242070db); /* 3 */ FF (b, c, d, a, M_3, 22, 0xc1bdceee); /* 4 */ FF (a, b, c, d, M_4, 7, 0xf57c0faf); /* 5 */ FF (d, a, b, c, M_5, 12, 0x4787c62a); /* 6 */ FF (c, d, a, b, M_6, 17, 0xa8304613); /* 7 */ FF (b, c, d, a, M_7, 22, 0xfd469501); /* 8 */ FF (a, b, c, d, M_8, 7, 0x698098d8); /* 9 */ FF (d, a, b, c, M_9, 12, 0x8b44f7af); /* 10 */ FF (c, d, a, b, M_10,17, 0xffff5bb1); /* 11 */ FF (b, c, d, a, M_11,22, 0x895cd7be); /* 12 */ FF (a, b, c, d, M_12, 7, 0x6b901122); /* 13 */ FF (d, a, b, c, M_13,12, 0xfd987193); /* 14 */ FF (c, d, a, b, M_14,17, 0xa679438e); /* 15 */ FF (b, c, d, a, M_15,22, 0x49b40821); /* 16 */ /* Round 2 */ GG (a, b, c, d, M_1, 5, 0xf61e2562); /* 17 */ GG (d, a, b, c, M_6, 9, 0xc040b340); /* 18 */ GG (c, d, a, b, M_11,14, 0x265e5a51); /* 19 */ GG (b, c, d, a, M_0, 20, 0xe9b6c7aa); /* 20 */ GG (a, b, c, d, M_5, 5, 0xd62f105d); /* 21 */ GG (d, a, b, c, M_10, 9, 0x2441453); /* 22 */ GG (c, d, a, b, M_15,14, 0xd8a1e681); /* 23 */ GG (b, c, d, a, M_4, 20, 0xe7d3fbc8); /* 24 */ GG (a, b, c, d, M_9, 5, 0x21e1cde6); /* 25 */ GG (d, a, b, c, M_14, 9, 0xc33707d6); /* 26 */ GG (c, d, a, b, M_3, 14, 0xf4d50d87); /* 27 */ GG (b, c, d, a, M_8, 20, 0x455a14ed); /* 28 */ GG (a, b, c, d, M_13, 5, 0xa9e3e905); /* 29 */ GG (d, a, b, c, M_2, 9, 0xfcefa3f8); /* 30 */ GG (c, d, a, b, M_7, 14, 0x676f02d9); /* 31 */ GG (b, c, d, a, M_12,20, 0x8d2a4c8a); /* 32 */ /* Round 3 */ HH (a, b, c, d, M_5, 4, 0xfffa3942); /* 33 */ HH (d, a, b, c, M_8, 11, 0x8771f681); /* 34 */ HH (c, d, a, b, M_11,16, 0x6d9d6122); /* 35 */ HH (b, c, d, a, M_14,23, 0xfde5380c); /* 36 */ HH (a, b, c, d, M_1, 4, 0xa4beea44); /* 37 */ HH (d, a, b, c, M_4, 11, 0x4bdecfa9); /* 38 */ HH (c, d, a, b, M_7, 16, 0xf6bb4b60); /* 39 */ HH (b, c, d, a, M_10,23, 0xbebfbc70); /* 40 */ HH (a, b, c, d, M_13, 4, 0x289b7ec6); /* 41 */ HH (d, a, b, c, M_0, 11, 0xeaa127fa); /* 42 */ HH (c, d, a, b, M_3, 16, 0xd4ef3085); /* 43 */ HH (b, c, d, a, M_6, 23, 0x4881d05); /* 44 */ HH (a, b, c, d, M_9, 4, 0xd9d4d039); /* 45 */ HH (d, a, b, c, M_12,11, 0xe6db99e5); /* 46 */ HH (c, d, a, b, M_15,16, 0x1fa27cf8); /* 47 */ HH (b, c, d, a, M_2, 23, 0xc4ac5665); /* 48 */ /* Round 4 */ II (a, b, c, d, M_0, 6, 0xf4292244); /* 49 */ II (d, a, b, c, M_7, 10, 0x432aff97); /* 50 */ II (c, d, a, b, M_14,15, 0xab9423a7); /* 51 */ II (b, c, d, a, M_5, 21, 0xfc93a039); /* 52 */ II (a, b, c, d, M_12, 6, 0x655b59c3); /* 53 */ II (d, a, b, c, M_3, 10, 0x8f0ccc92); /* 54 */ II (c, d, a, b, M_10,15, 0xffeff47d); /* 55 */ II (b, c, d, a, M_1, 21, 0x85845dd1); /* 56 */ II (a, b, c, d, M_8, 6, 0x6fa87e4f); /* 57 */ II (d, a, b, c, M_15,10, 0xfe2ce6e0); /* 58 */ II (c, d, a, b, M_6, 15, 0xa3014314); /* 59 */ II (b, c, d, a, M_13,21, 0x4e0811a1); /* 60 */ II (a, b, c, d, M_4, 6, 0xf7537e82); /* 61 */ II (d, a, b, c, M_11,10, 0xbd3af235); /* 62 */ II (c, d, a, b, M_2, 15, 0x2ad7d2bb); /* 63 */ II (b, c, d, a, M_9, 21, 0xeb86d391); /* 64 */
The constant t_i are the integer parts of 2^32* abs(sin(i)), with i in radians.
Finally, as the last message block is processed, the numbers a, b, c, d form the MD5 hash for the file that has been processed.
Are you confused now? Well, at least the bits in the message are, and that's what it was all about.
[ Table of Contents | About this FAQ | Glossary ]