
Blowfish

Kevin Allison Keith Feldman Ethan Mick

Introduction
Blowfish is a Feistel network block cipher with a 64-bit block size and a variable key size of up to 448 bits. The Blowfish algorithm is unencumbered by patents and is free for anyone to use in any situation. Blowfish consists of two parts: key expansion and data encryption. During the key-expansion stage, the input key is converted into several subkey arrays totaling 4168 bytes. There is the P-array, which holds eighteen 32-bit entries, and the S-boxes, which are four arrays of 256 32-bit entries each. All of these boxes are initialized with a fixed string, the hexadecimal digits of pi (less the leading 3). After this initialization, the first 32 bits of the key are XORed with P1 (the first 32-bit entry in the P-array). The second 32 bits of the key are XORed with P2, and so on, until all 448, or fewer, key bits have been XORed. If the key runs out first, cycle through the key bits by returning to the beginning of the key, until the entire P-array has been XORed with the key.
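The key-XOR step described above can be sketched in C++ as follows. This is a minimal sketch, not the project's code: the function name xorKeyIntoPArray is hypothetical, the key is assumed to be a whole number of bytes, and key bytes are assumed to be packed most significant byte first.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// XOR the key into the 18-entry P-array, 32 bits at a time.
// Assumption: key bytes are packed big-endian into each 32-bit word.
// The key position wraps around, so short keys are reused cyclically.
void xorKeyIntoPArray(std::array<std::uint32_t, 18>& pArray,
                      const std::vector<std::uint8_t>& key) {
    assert(!key.empty() && key.size() <= 56);  // at most 448 bits
    std::size_t pos = 0;
    for (std::uint32_t& entry : pArray) {
        std::uint32_t word = 0;
        for (int b = 0; b < 4; ++b) {
            word = (word << 8) | key[pos];
            pos = (pos + 1) % key.size();  // return to the start of the key
        }
        entry ^= word;
    }
}
```

With a 6-byte key 01 02 03 04 05 06, the words XORed into P1, P2, and P3 would be 0x01020304, 0x05060102, and 0x03040506, which shows the wrap-around.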

[Figure: Key-Expansion. The key, split into 32-bit words (the last one 10 bits), is lined up against the 32-bit entries of the P-array.]

(XORing key bits again once the key has been traversed once)

Encrypt the all-zero string using the Blowfish algorithm with the modified P-array above to get a 64-bit block. Replace P1 with the first 32 bits of the output and P2 with the second 32 bits. Use the 64-bit output as input back into the Blowfish cipher to get a new 64-bit block. Replace the next two values in the P-array with this block. Repeat for all the values in the P-array and then all the S-boxes in order.
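The replacement loop above can be sketched as follows. The names PArray, SBoxes, replaceSubkeys, and encryptBlock are hypothetical; encryptBlock stands in for the cipher itself and is assumed to encrypt the two 32-bit halves in place using whatever P-array and S-box values are current at the time of the call.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using PArray = std::array<std::uint32_t, 18>;
using SBoxes = std::array<std::array<std::uint32_t, 256>, 4>;

// Overwrite every P-array and S-box entry with successive cipher outputs.
// encryptBlock(left, right) is a hypothetical callable that encrypts the
// 64-bit block (left, right) in place with the current subkeys.
template <typename EncryptFn>
void replaceSubkeys(PArray& p, SBoxes& s, EncryptFn encryptBlock) {
    std::uint32_t left = 0, right = 0;  // start from the all-zero block
    for (std::size_t i = 0; i < p.size(); i += 2) {
        encryptBlock(left, right);      // output feeds the next encryption
        p[i] = left;
        p[i + 1] = right;
    }
    for (auto& box : s) {
        for (std::size_t i = 0; i < box.size(); i += 2) {
            encryptBlock(left, right);
            box[i] = left;
            box[i + 1] = right;
        }
    }
}
```

Each encryption fills two 32-bit entries, so filling all 18 + 4 × 256 = 1042 entries takes 521 encryptions.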

[Figure: Key-Expansion. The 64-bit Blowfish output is split into two 32-bit halves that replace two entries of the P-array.]
Saturday, May 12, 12


(The second 64-bit block is dropped into the P-array)

The Blowfish algorithm is now ready for encryption. The encryption is a simple Feistel network of 16 rounds. For the input x of 64 bits, do:

    Divide x into two 32-bit halves: xL, xR
    For i = 1 to 16:
        xL = xL XOR Pi
        xR = F(xL) XOR xR
        Swap xL and xR
    Next i
    Swap xL and xR (Undo the last swap.)
    xR = xR XOR P17
    xL = xL XOR P18
    Recombine xL and xR

[Figure: Data-Encryption. The 64-bit input is split into two 32-bit halves; each round XORs Pi into one half and passes it through F. This is done 16 times.]
(The 16 rounds)

The F function is:

    F(xL) = ((S1,a + S2,b mod 2^32) XOR S3,c) + S4,d mod 2^32

where a, b, c, and d are the four 8-bit quarters of xL.
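A minimal C++ sketch of the F function follows. The names SBoxes and feistelF are hypothetical, and the sketch assumes that a is the most significant byte of xL and d the least significant. Unsigned 32-bit arithmetic wraps around, which gives the addition modulo 2^32 without an explicit mod.

```cpp
#include <array>
#include <cstdint>

using SBoxes = std::array<std::array<std::uint32_t, 256>, 4>;

// F(xL) = ((S1[a] + S2[b]) XOR S3[c]) + S4[d], additions modulo 2^32.
// Assumption: a is the top byte of xL and d is the bottom byte.
std::uint32_t feistelF(const SBoxes& s, std::uint32_t xL) {
    const std::uint8_t a = static_cast<std::uint8_t>(xL >> 24);
    const std::uint8_t b = static_cast<std::uint8_t>(xL >> 16);
    const std::uint8_t c = static_cast<std::uint8_t>(xL >> 8);
    const std::uint8_t d = static_cast<std::uint8_t>(xL);
    const std::uint32_t sum = s[0][a] + s[1][b];  // wraps modulo 2^32
    return (sum ^ s[2][c]) + s[3][d];             // wraps modulo 2^32
}
```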

[Figure: The F function. The 32-bit input is split into four 8-bit values that index S1, S2, S3, and S4; the four 32-bit outputs are combined by addition modulo 2^32 and XOR into one 32-bit result.]
(The F function)

Decryption is the same as encryption, except that the P-array entries are used in reverse order.
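The relationship between encryption and decryption can be sketched with one routine that runs the 16 rounds over whatever subkey order it is given. All names here are hypothetical, and roundF is a stand-in callable for the F function; the round trip holds for any roundF, which is a general property of Feistel networks.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <utility>

using PArray = std::array<std::uint32_t, 18>;

// Run the 16-round network with the subkeys in the order given.
// roundF is a hypothetical callable standing in for the F function.
template <typename RoundFn>
void runNetwork(const PArray& p, RoundFn roundF,
                std::uint32_t& xL, std::uint32_t& xR) {
    for (int i = 0; i < 16; ++i) {
        xL ^= p[i];
        xR ^= roundF(xL);
        std::swap(xL, xR);
    }
    std::swap(xL, xR);  // undo the last swap
    xR ^= p[16];
    xL ^= p[17];
}

template <typename RoundFn>
void encryptBlock(const PArray& p, RoundFn roundF,
                  std::uint32_t& xL, std::uint32_t& xR) {
    runNetwork(p, roundF, xL, xR);
}

// Decryption is the same network with the P-array reversed.
template <typename RoundFn>
void decryptBlock(const PArray& p, RoundFn roundF,
                  std::uint32_t& xL, std::uint32_t& xR) {
    PArray reversed = p;
    std::reverse(reversed.begin(), reversed.end());
    runNetwork(reversed, roundF, xL, xR);
}
```

Copying and reversing the P-array on every call keeps the sketch short; an implementation concerned with speed would more likely store the reversed order once.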

Output
Some example input and output of the Blowfish algorithm.

    $ ./Blowfish 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Key: 0 0 0 0 0 0 0 0
    Plaintext: 0 0 0 0 0 0 0 0
    Ciphertext: 4e f9 97 45 61 98 dd 78

    $ ./Blowfish FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
    Key: ff ff ff ff ff ff ff ff
    Plaintext: ff ff ff ff ff ff ff ff
    Ciphertext: 51 86 6f d5 b8 5e cb 8a

    $ ./Blowfish 37 D0 6B B5 16 CB 75 46 16 4D 5E 40 4F 27 52 32
    Key: 37 d0 6b b5 16 cb 75 46
    Plaintext: 16 4d 5e 40 4f 27 52 32
    Ciphertext: 5f 99 d0 4f 5b 16 39 69

Some example input and output for the file encryption program.

[Other program IO here]

Design
Our implementation of the Blowfish algorithm was written in C++. This language was chosen to avoid the hassle Java creates when dealing with bytes and casting between types. In C++, stdint.h provides integer types of exactly the required size, which made debugging and writing the code easier. The original design followed the specification, especially in the encryption section. The encryption method takes in an array of bytes (8 bytes for a 64-bit block), breaks it apart into two 32-bit integers, and then performs the computation in the Feistel network loop 16 times.
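Breaking the 8-byte block into two 32-bit integers might look like the sketch below. The helper names packWord, splitBlock, and joinBlock are hypothetical, and big-endian byte order is an assumption rather than something the text above states.

```cpp
#include <array>
#include <cstdint>

// Pack four bytes, most significant first, into one 32-bit word.
// Assumption: big-endian packing.
std::uint32_t packWord(const std::uint8_t* bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}

// Split an 8-byte block into its left and right 32-bit halves.
void splitBlock(const std::array<std::uint8_t, 8>& block,
                std::uint32_t& xL, std::uint32_t& xR) {
    xL = packWord(block.data());
    xR = packWord(block.data() + 4);
}

// Recombine the two halves into an 8-byte block.
std::array<std::uint8_t, 8> joinBlock(std::uint32_t xL, std::uint32_t xR) {
    std::array<std::uint8_t, 8> block{};
    for (int i = 0; i < 4; ++i) {
        block[i] = static_cast<std::uint8_t>(xL >> (24 - 8 * i));
        block[4 + i] = static_cast<std::uint8_t>(xR >> (24 - 8 * i));
    }
    return block;
}
```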

(The encryption loop)

However, for key generation our implementation actually calculated the digits of Pi using the Bailey-Borwein-Plouffe formula. Every time the key is set, the implementation recomputes the digits of Pi with this formula and then sets the P-array and S-boxes. While accurate, setting the key took a very long time.
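For reference, a digit-extraction routine based on the Bailey-Borwein-Plouffe formula can be sketched as below. This is not the project's code: the names powMod16, bbpSeries, and piHexDigit are hypothetical, and the sketch relies on double precision, which is enough to read one hexadecimal digit per call at the positions needed here but is not a rigorous guarantee for every position.

```cpp
#include <cmath>
#include <cstdint>

// 16^exponent mod modulus, by binary exponentiation on integers.
static std::uint64_t powMod16(std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    std::uint64_t base = 16 % modulus;
    while (exponent > 0) {
        if (exponent & 1) result = (result * base) % modulus;
        base = (base * base) % modulus;
        exponent >>= 1;
    }
    return result;
}

// Fractional part of the sum over k of 16^(d-k) / (8k + j).
static double bbpSeries(int j, int d) {
    double sum = 0.0;
    for (int k = 0; k < d; ++k) {  // terms with a non-negative exponent
        const std::uint64_t m = static_cast<std::uint64_t>(8 * k + j);
        sum += static_cast<double>(powMod16(static_cast<std::uint64_t>(d - k), m)) /
               static_cast<double>(m);
        sum -= std::floor(sum);
    }
    for (int k = d; k <= d + 100; ++k) {  // rapidly shrinking tail
        const double term = std::pow(16.0, static_cast<double>(d - k)) / (8 * k + j);
        if (term < 1e-17) break;
        sum += term;
        sum -= std::floor(sum);
    }
    return sum;
}

// Hexadecimal digit of pi at 0-based position d after the point.
int piHexDigit(int d) {
    double x = 4.0 * bbpSeries(1, d) - 2.0 * bbpSeries(4, d) -
               bbpSeries(5, d) - bbpSeries(6, d);
    x -= std::floor(x);
    return static_cast<int>(16.0 * x);
}
```

Every digit repeats the modular exponentiation for each term of four series, and the number of terms grows with the digit position, which is consistent with key setup becoming slow when thousands of digits are recomputed.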

Profiling
The raw dump of gprof is:

    % Time   Cumulative Seconds   Self Seconds   name
    84.75    4.08                 4.08           BinaryExp
    12.08    4.66                 0.58           series
     3.02    4.80                 0.15           std::pow
     0.31    4.82                 0.02           BlockCipher
     0       4.82                 0              F
     0       4.82                 0              pack32BitWord
     0       4.82                 0              computeHexPi
     0       4.82                 0              encrypt
     0       4.82                 0              blockSize
     0       4.82                 0              keySize
     0       4.82                 0              std::operator&
     0       4.82                 0              std::setf
     0       4.82                 0              std::operator&=
     0       4.82                 0              std::operator~
     0       4.82                 0              std::operator|=
     0       4.82                 0              std::operator|
     0       4.82                 0              print_uint8_hex
     0       4.82                 0              global constructors
     0       4.82                 0              static_init
     0       4.82                 0              setKey
     0       4.82                 0              BlowFish

    Calls: 17355552 0 4168 17354511 1 8368 2088 1042 523 63 20 12 6 6 6 6 6 3 1 1 1 1 0 0

    Total S/call: 0.02 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.80 0.02

Image of callgrind information.

Analysis
The gprof profiling tool gave a less in-depth look than callgrind, but it was an excellent way to get started. The tool said that 84.75% of the program time was being spent in the BinaryExp() method. This method does binary exponentiation and takes more computing power the further along Pi we are trying to calculate. The total time spent in this method was 4.08 seconds, an absurd amount.

The next method, series, took 12.08% of the time and is also used in the calculation of Pi. The method also uses the C pow function, which ranks third in the gprof analysis. Finally, ranked at number 5, we have the F function, which is used in encryption. The encryption method is down at number 8, with very little time being used in it.

The callgrind stack trace confirms this, while also going deeper into the library calls. The binaryExp method calls fmod, which ranks first. Next is the binaryExp method itself, confirming gprof's information. Looking through the ranks, the profilers agree on which methods are consuming the most time. From here, the best choice is to look at the Pi generation. Why are we generating Pi? We need to set the P-array and S-boxes to the string every time a new key is set. But it doesn't make sense to generate it every time, because the string itself never changes. In order to speed up the algorithm, we can generate the static digits of Pi once and keep them in an external file which can be referenced. The boxes can be set to these static digits without generation, speeding up the code. With all of the Pi generation code gone, the highest-costing methods should be the F function and the encryption function.
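One way to produce such an external file is a small generator that prints the precomputed words as a C++ array. The sketch below is an illustration under assumptions: the function name makePiHeader and the array name kPiWords are hypothetical, and the caller is assumed to supply the 32-bit words of Pi.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Build the text of a header that defines the words as a constant array.
// kPiWords is a hypothetical name chosen for this sketch.
std::string makePiHeader(const std::vector<std::uint32_t>& words) {
    std::string out = "#pragma once\n#include <cstdint>\n\n";
    out += "static const std::uint32_t kPiWords[" +
           std::to_string(words.size()) + "] = {\n";
    char buf[16];
    for (std::uint32_t w : words) {
        // Zero-padded to eight hex digits so every entry has the same width.
        std::snprintf(buf, sizeof buf, "0x%08x", static_cast<unsigned>(w));
        out += "    ";
        out += buf;
        out += ",\n";
    }
    out += "};\n";
    return out;
}
```

The returned string would be written to a file once, ahead of time, and the cipher would then include that file instead of computing any digits at key setup.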

Re-Design
We noticed that the majority of time was spent calculating the first 8336 hexadecimal digits of Pi. To combat this, we removed the Pi generation code from the main Blowfish file and made it its own executable. When this was run, it output a syntactically correct header file that had the values of Pi defined in a class called HexPi. This class could then be accessed by including the header file in the main Blowfish code, which led to a large decrease in execution time.

    Pi[0] = 0x243f6a88;
    Pi[1] = 0x85a308d3;
    Pi[2] = 0x13198a2e;
    Pi[3] = 0x3707344;
    ...

The array holds 1042 sections of Pi, which can be accessed during the key-expansion part of the algorithm. The digits of Pi can simply be assigned by:

    pArray[i] = hexPi.Pi[i];

Once the Pi generation has been separated out,

Profiling (Again)
After re-designing our code, we ran gprof and kcachegrind again.

    % Time   Cumulative Seconds   Self Seconds   Calls   Total S/call   name
    0        0.00                 0.00           8352    0.00           F
    0        0.00                 0.00           2086    0.00           pack32BitWord
    0        0.00                 0.00           522     0.00           encrypt
    0        0.00                 0.00           21      0.00           keySize
    0        0.00                 0.00           13      0.00           BlockSize
    0        0.00                 0.00           12      0.00           operator&
    0        0.00                 0.00           6       0.00           std::setf
    0        0.00                 0.00           6       0.00           operator&=
    0        0.00                 0.00           6       0.00           operator~
    0        0.00                 0.00           6       0.00           operator|=
    0        0.00                 0.00           6       0.00           operator|
    0        0.00                 0.00           3       0.00           print_uint8_hex
    0        0.00                 0.00           1       0.00           BlockCipher
    0        0.00                 0.00           1       0.00           HexPi
    0        0.00                 0.00           1       0.00           setKey
    0        0.00                 0.00           1       0.00           BlowFish

[Figure: kcachegrind output]

Again, kcachegrind goes into more detail and examines some of the library calls. However, callgrind says that the encrypt method is more expensive than the pack32BitWord method, whereas gprof has those two reversed. In both cases, the two are near the top of the lists.

Analysis
The gprof analysis was less useful than we first hoped. It says the algorithm is faster, but its timings are not fine-grained enough to show how much faster. However, it is safe to say the changes we made did speed up the program.

Callgrind goes into more detail. The first method that is ours is the F function, which is called only in the encrypt method. It uses each of the four 8-bit values as an index into the S-boxes and then runs the arithmetic. This method is rather efficient and short; in order to make it faster we could unroll the S-boxes, but with each one being 256 entries long, and the array indexing being extremely convenient, it didn't seem worth it. Secondly, the encrypt method follows the specification, but in order to speed it up, we unrolled the loop that runs through the Feistel network. We also unrolled the P-array to work with this, editing the key-expansion parts as well. Making these changes, as seen, drastically sped up the algorithm.
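To illustrate what unrolling the Feistel loop can look like, here is a sketch that writes out the 16 rounds and removes the per-round swap by alternating the roles of the two halves. It is an assumption about the general technique, not the project's actual code; encryptUnrolled is a hypothetical name and roundF is a stand-in callable for the F function.

```cpp
#include <array>
#include <cstdint>

using PArray = std::array<std::uint32_t, 18>;

// The 16 rounds written out with no loop counter and no swaps.
// roundF is a hypothetical callable standing in for the F function.
template <typename RoundFn>
void encryptUnrolled(const PArray& p, RoundFn roundF,
                     std::uint32_t& xL, std::uint32_t& xR) {
    std::uint32_t l = xL ^ p[0];
    std::uint32_t r = xR;
    r ^= roundF(l) ^ p[1];   l ^= roundF(r) ^ p[2];
    r ^= roundF(l) ^ p[3];   l ^= roundF(r) ^ p[4];
    r ^= roundF(l) ^ p[5];   l ^= roundF(r) ^ p[6];
    r ^= roundF(l) ^ p[7];   l ^= roundF(r) ^ p[8];
    r ^= roundF(l) ^ p[9];   l ^= roundF(r) ^ p[10];
    r ^= roundF(l) ^ p[11];  l ^= roundF(r) ^ p[12];
    r ^= roundF(l) ^ p[13];  l ^= roundF(r) ^ p[14];
    r ^= roundF(l) ^ p[15];  l ^= roundF(r) ^ p[16];
    r ^= p[17];
    xL = r;  // the halves come out crossed, which accounts for
    xR = l;  // the final "undo the last swap" step of the loop form
}
```

Each statement folds the next round's subkey into the half that will feed F in the following round, so the result matches the looped form while skipping 16 swaps and the loop bookkeeping.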

Developer Manual
The Blowfish project is hosted on GitHub, here: https://github.com/Wayfarer247/Blowfish482. It can be cloned using Git.

    $ git clone https://github.com/Wayfarer247/Blowfish482

The repository has two branches: master, which is the old branch and does not have any speed improvements, and speed-improvements, which has the faster version of the algorithm. To switch between the two:

    $ git checkout master

or

    $ git checkout speed-improvements

Once a branch has been chosen, build the project:

    $ make

The makefile in the directory compiles the project and makes three executables. Blowfish, the main program, can be run directly. Profile runs the encryption algorithm with an all-zero key and all-zero plaintext N times and prints out the total running time of the program. Crypter reads a file in, encrypts the file with the provided key, and then saves the encrypted file.
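A timing loop of the kind Profile performs could be sketched as follows. The function name timeIterations is hypothetical, encryptOnce stands in for one all-zero encryption, and the use of std::chrono::steady_clock is an assumption about how the time might be measured.

```cpp
#include <chrono>
#include <cstdint>

// Call encryptOnce `iterations` times and return the elapsed
// wall-clock time in seconds. encryptOnce is a hypothetical callable.
template <typename Fn>
double timeIterations(std::uint64_t iterations, Fn encryptOnce) {
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        encryptOnce();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Running many iterations matters because a single encryption finishes too quickly for a coarse timer to register.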

User Manual
Our Blowfish implementation is completely command-line based. The program will probably not run on Windows without gcc (g++) installed. Once it has been compiled (see Developer Manual), it can be run.

    $ ./Blowfish

This will run the program if you are on the master branch. Speed Improvements, the final branch, needs the key and plaintext passed in via command-line arguments.

    $ ./Blowfish 00 00 00 00 00 00 00 00 11 11 11 11 11 11 11 11

In this case, the first sixteen zeroes are the key and the sixteen ones are the plaintext, both entered as hex.

To use Profile, do:

    $ Profile <Number of Iterations>
    e.g. Profile 100000

To use Crypter, do:

    $ Crypter <Input File> <Output File> <Key bits>
    e.g. (Key Size: 8 bytes) Crypter pt.txt ct.txt 00 00 00 00 00 00 00 00
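Turning hex arguments such as the ones above into bytes could be done along these lines. The function name parseHexBytes is hypothetical and the error handling is an assumption; the sketch only shows one reasonable way to validate each token.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Parse tokens such as "00", "ff", or "6B" into bytes.
// Throws std::invalid_argument for anything that is not a hex byte.
std::vector<std::uint8_t> parseHexBytes(const std::vector<std::string>& tokens) {
    std::vector<std::uint8_t> bytes;
    for (const std::string& token : tokens) {
        std::size_t used = 0;
        unsigned long value = 0;
        try {
            value = std::stoul(token, &used, 16);
        } catch (const std::exception&) {
            throw std::invalid_argument("not a hex byte: " + token);
        }
        if (used != token.size() || value > 0xff) {
            throw std::invalid_argument("not a hex byte: " + token);
        }
        bytes.push_back(static_cast<std::uint8_t>(value));
    }
    return bytes;
}
```

In a main function, the tokens would come from argv, with the first eight bytes taken as the key and the next eight as the plaintext.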

Discussion
The choice of using C++ over Java had interesting repercussions. It was chosen because of how C++ handles bit operations and how it stores bytes, integers, and long variables. These variables are stored in memory with the sizes we expect, so running operations on them did not involve any Java witchcraft. However, it did bring up some issues with pointers and how variables are stored. We finally got all these issues ironed out, but it took quite a bit longer and probably would have gone faster had it been done in Java, which we are more comfortable with.

When we started writing the algorithm, we divided up the parts to work on, but didn't spend much time designing how the algorithm overall would work. Because of this, the key-expansion part generated Pi from scratch, rather than simply using a static variable. While it worked, the effort required to generate Pi could have been better used elsewhere. Of course, this did mean that we learned how to generate Pi effectively. If we had done some planning and design beforehand, we also could have written the algorithm in a way that made future optimization easier. Since we didn't, the algorithm was a good representation of an unoptimized algorithm, but the work required to optimize it later was much greater.

Future Work
To further improve the Blowfish algorithm, we could unroll all the loops to remove the overhead of the loop counters and their associated variables. Further, parts of the program, such as the F function, could be rewritten in a lower-level language, such as C or Assembly, instead of C++. This would decrease the running time of the algorithm at the cost of the implementation's portability. We could also add a decryption function to allow for the decryption of the encrypted bytes using a given key.

Work
Ethan Mick: Presentation 1, Paper, Presentation 2, Code maintenance

Kevin Allison: Pi Generation, Static Pi Generation, Key Expansion, Paper, Code maintenance

Keith Feldman: Encryption Method, Main Method, Usage, Speed-Improvements, Code maintenance

References
Schneier, Bruce. "Description of a New Variable-Length Key, 64-Bit Block Cipher (Blowfish)." Blowfish Paper. 1993. Web. 18 Mar. 2012. <http://www.schneier.com/paper-blowfish-fse.html>.

Morgan, Mike. "Blowfish Bug." Schneier on Security. 8 July 1996. Web. 18 Mar. 2012. <http://www.schneier.com/blowfish-bug.txt>.

Schneier, Bruce. "The Blowfish Encryption Algorithm." Blowfish. 1993. Web. 18 Mar. 2012. <http://www.schneier.com/blowfish.html>.

"Standard Cryptographic Algorithm Naming." Zetnet Meta Refresh. Web. 18 Mar. 2012. <http://www.users.zetnet.co.uk/hopwood/crypto/scan/cs.html>.
