OPTIMIZATION
NAFEES HAIDER
This optimization task consists of creating custom-built assembly code that performs better than the compiler's output. We then measure the execution time of a function in Microsoft Visual Studio (32-bit) and on Linux (64-bit). In the time-measurement part, we compare the time difference between optimized and unoptimized code. We then repeat this process over different array sizes, ranging from 10 all the way to 10000000.
We can approach this task in two ways: we can either create a main file, a function file, and a header file, or we can create just two files and declare the function prototype in the main file. Here, the files are Array.cpp for the main file and function.cpp for the function file. The program compiles and runs successfully. Now we look into its disassembly and see what is going on.
Figure 2 The program starts by creating the stack frame, reserving space in memory
Figure 3 The size of the array is moved into register eax so that it can be used for further calculations
Figure 4 The value is pushed into memory, which was unnecessary; we will see why later when we perform the optimization
After this, the function is called, which clears out the values stored in the data segment of the stack frame. The next step is to generate the .asm file (assembly code file) for the function file. To generate it, we follow the procedure described next.
Right-click the project and open <Project Name> Properties; a new window will pop up. On the left side, directly under the C/C++ option in the menu, we see Output Files, which contains the Assembler Output setting. After that, click Apply and close the window, then start compiling the program. Make sure that the function.cpp file is open so that the compiler produces the function.asm file for it. Once the program finishes compiling, we can see the function.asm file. After that, we have to import that file into our current project. We can do this by right-clicking Source Files in the Solution Explorer menu, then clicking Add, then Existing Item. A window will pop up; add the function.asm file there, then click to open it and see the assembly code.
The code above is the unoptimized version of the function that clears the array using an index. The part of interest runs from line 42 to line 60. Here we can see that the immediate value is stored to the memory address at the base pointer, which is the variable _i$. After that, we jump to the label named $LN4@ClearUsing:. Here, we move the value zero that was stored in memory back into a register, which is then compared with the value stored in the size variable. If they are not equal, execution keeps moving to the next statement. When it moves on, line 54 is redundant and does the same thing as line 50. After that, it stores the address of the array into register ecx and performs a calculation to move to the next memory location. The idea is simple: once it moves to the next memory location, it replaces the content there with zero, which in effect clears it out. The steps above repeat 10 times, as long as the values in register eax and the variable _size$[ebp] are not the same. Once they are the same, it jumps to the exit of the function.
This is the part we will actually optimize. We can edit this file directly from Visual Studio and save the optimized version in a different location under a different name. The optimized code is shown next.
In the unoptimized version, we can intuitively see that many things can be eliminated, such as redundant lines and the transfer of a variable between memory and a register. Now let us take a look at our optimized code. We directly assign the immediate value zero to register eax. Then we assign the value of the size variable to register ebx and initialize the address of the array in register ecx. After that, we jump to the label $LN4@ClearUsing:, where we compare the contents of registers eax and ebx. If they are not the same, execution moves to the next line, which replaces the content of the current array location with zero. This process repeats for every array location. After that, the values of registers eax and ebx are equal; at this point, we jump to the label $LN1@ClearUsing:, which is basically the exit of the function. Notice the number of steps, and how much redundancy was avoided: this saves a lot of time.
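The optimized loop just described can be sketched in MASM-style pseudocode (an illustrative sketch only: the label and variable names follow the ones described above, but the exact instructions in the report's figure may differ):

```asm
        mov   eax, 0                      ; index starts at zero
        mov   ebx, DWORD PTR _size$[ebp]  ; size kept in a register
        mov   ecx, OFFSET _array          ; base address of the array
$LN4@ClearUsing:
        cmp   eax, ebx                    ; done when index == size
        je    $LN1@ClearUsing
        mov   DWORD PTR [ecx+eax*4], 0    ; clear the current element
        inc   eax                         ; next index, no memory traffic
        jmp   $LN4@ClearUsing
$LN1@ClearUsing:                          ; exit of the function
```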
Next, create a new project that does not contain the function file, because we are going to use the optimized function.asm file to perform the operation. After the optimization has been performed, the next step is to import the optimized .asm file into the new project we just created. Note that this file is not going to link and compile all by itself; we have to link it by following some additional steps. Right-click the function.asm file and go to its Properties. Under General, choose Custom Build Tool for Item Type. Click Apply; the window will disappear and reappear within a second, and this time there is another section under Configuration Properties. Underneath it, click General; there we see an empty Command Line and Outputs. We have to type two commands in order to link the file and make it work. For Command Line we type ml -c "-
Click Apply and then click OK. Now make sure that there is no function file, so that the entire execution of the function comes from the .asm file. Test the result. Once the test verifies that there are no errors, we can start implementing the next phase: the running-time analysis.
Next, we are going to do it through pointers. Pointers are generally a lot faster than indexes for multiple reasons. First, we are not copying the entire array into the function and then returning it to main again; the pointer acquires the address of the array. This way, when we set the value the pointer refers to equal to zero, the content at the address pointed to by the pointer is replaced with zero. Basically, there is no redundancy, because the extra transfers are avoided.
We start writing the code for clearing the array using pointers. The procedure is the same: we create a new project and use the code shown in figure 9.
As we can see, we declare the size, which is going to change later in the running-time analysis. The global array is declared; although we could have declared it inside main, since everything is processed through pointers, global scope does not matter at this point. We also declare the prototype before main, and the void function is defined below the main function. And just to reassure ourselves that the array is being cleared, I added a for loop to display all the values of the array after the clear function has run. The result should be all zeros.
So what is happening in this file is that we declare the size in the main file and then call the clear-using-pointers function. It takes the address of the globally defined array, and inside the function a loop runs 10 times; on each iteration it adds one to the offset and replaces the content with zero. The loop keeps running until it has performed 10 iterations, hence clearing out the entire array.
Figure 10 We start by reserving space in memory, as shown in the memory window; the next step is to store 10 in the variable named size
Figure 11 Storing the value 10 in the memory location in little-endian fashion
Figure 12 Jumping into the clear-using-pointer function; here we set eax to four, which later helps us jump to the next memory location
Figure 13 Moving the content of the variable size to register eax, as well as the address of the array to register ecx, then loading the memory location of the first element of the array, accessed through register edx
Figure 14 Moving the address of the first element of the array to the pointer, and from the pointer to register eax; after that, moving the immediate value zero to the content of the address stored in register eax
Figure 15 Repeating these steps over and over to set all the contents to zero, basically erasing everything stored in those memory locations
Figure 16 Exiting the function
Now we have to optimize this code and make the running time as fast as possible. First we generate its .asm file; figure 17 below shows the .asm code for the unoptimized function. Before optimizing, we have to comment out two lines, otherwise the program will give an error.
We start by initializing the stack frame. Then we store the immediate value 4 into memory. Register ecx is going to hold the zero-offset address of the array; temporarily, register ecx has nothing inside, so it will be zero. The next step is to load the address of the array into register ecx and jump to the label $LN2@ClearUsing:, where we move the content of the variable size into register eax and move the address of the array into register ecx. We load the memory location using the formula mentioned above and compare the address in pointer p with the address stored in register edx. If they are not equal, we move on to the next statement. There, we move the address stored in the pointer, which is the address of the array, into register eax, and then replace the content at that address with zero. We go back to the label $LN2@ClearUsing: and repeat these steps over and over until we have gone through all the contents of the array and replaced them with zero. After that, the function goes to the label $LN1@ClearUsing:, which is basically the exit of the function.
The code below in figure 19 shows the optimized version of the clear-using-pointers function.
Figure 19 Optimized function: Clear using Pointers
The size n is going to be 10, 100, 1000, 10000, 100000, 1000000, and 10000000.
As the size increases, the time increases, which is expected. However, we can see the difference between the running time of the optimized and unoptimized versions of the same function. The running time of the unoptimized code is larger, and as we increase the size of the array, the running time of the optimized code is nearly half that of the unoptimized code.
The measurements shown in the tables below were done as follows. Each method was run five times at the same size and the results averaged, in order to be precise. This step was repeated for all the array sizes mentioned above. The same method was also used for clear-using-pointers and its optimized version. Based on intuition, we can say that clearing an array using a pointer should be faster than clearing it using an index. Likewise, the optimized version of clearing the array using a pointer should be faster than the optimized version using an index, since its original code is faster too.
[Tables: running times in μs for the unoptimized and optimized clear-using-index code at each size; only one pair of values survives extraction: 128.80 μs (unoptimized) vs 120.71 μs (optimized).]
[Figure: time in μs (0 to 25000) versus size of n (0 to 10000000).]
As we can see, the running-time complexity is linear. However, the graph shows a significant difference between the running times of the unoptimized and optimized code. The result in the graph is expected, as mentioned above: the unoptimized code is a lot slower, and even among the unoptimized versions the clear-using-pointer code is much faster, while the optimized code is faster than the unoptimized in every case.
Next, we repeat the process on Linux, clearing the array using indexes and pointers with the gcc compiler. Since we are working on optimizing the code, we have to use some special commands in the terminal to generate the .asm file, which we will later modify into the optimized version and then link with the main file to perform the operation. The following commands help us generate, link, and compile the assembly file.
The generated assembly file can be read in the gtext software, and we can edit it there to make it optimized. Save the file and use the second command from the table above to link this assembly file with the main file. Notice that this time we removed the function from the main.c file, just like we did in Visual Studio, but it still works because the assembly file contains the function that clears the array.
The procedure is analogous to the one in Microsoft Visual Studio. The generated assembly code of the unoptimized clear-using-index function is compared here with the optimized version. In the unoptimized version of the code we have the variable -4(%rbp), which handles the addition for moving on to the next memory location. There are a lot of memory accesses that need to be eliminated.
The way we tackled this problem is that we noticed many transitions between registers and memory in the unoptimized version of the code. We optimized it by reducing the number of transfers and keeping the calculations within registers as much as possible. Here we can see that the local variables -28(%rbp) and -24(%rbp) are assigned to registers; the first one (-28(%rbp)) holds the size of the array, and the second one (-24(%rbp)) holds the other value used by the loop. The unoptimized and optimized code for clear-using-pointers is given below.
In the code above, we can see all the unnecessary memory accesses. The transfers between registers and memory increase the amount of time it takes the program to finish executing, so we need to eliminate them. We reduced the number of memory accesses and kept most of the calculations among the stack pointer, the base pointer, register rax, and register rdx. This is similar to clear-using-index, except this time we are using addresses instead of an actual variable, which further reduces the run time.
The tables below show the running-time analysis for the clear-using-index and clear-using-pointers functions, as well as their optimized versions. The times shown are in microseconds. For accuracy, we take the running time of each one five times, followed by their average. The analysis shows the expected results: based on intuition, clear-using-index is the slowest, and clear-using-pointers is faster because it eliminates the steps of copying the array into memory and then setting up the offset based on the next address.
Table 2 For Size = 10
Index      Index (opt.)   Pointer    Pointer (opt.)
Time(μs)   Time(μs)       Time(μs)   Time(μs)
1          1              0          1
1          0              1          0
1          1              1          1
1          1              1          0
0          0              0          0
For Size = 100
Index      Index (opt.)   Pointer    Pointer (opt.)
Time(μs)   Time(μs)       Time(μs)   Time(μs)
1          0              1          1
1          1              1          0
1          1              1          0
1          1              1          1
1          0              0          0
For Size = 1000
Index      Index (opt.)   Pointer    Pointer (opt.)
Time(μs)   Time(μs)       Time(μs)   Time(μs)
3          1              2          0
2          1              2          1
2          1              2          1
2          1              1          1
2          1              2          1
For Size = 10000
Index      Index (opt.)   Pointer    Pointer (opt.)
Time(μs)   Time(μs)       Time(μs)   Time(μs)
51         14             28         10
23         13             26         9
23         13             26         9
24         14             27         11
24         13             26         10
[Tables for sizes 100000, 1000000, and 10000000: the measured values did not survive extraction.]
[Figure: time in μs (0 to 30000) versus size of N (0 to 10000000) for all four methods.]
All measurements are written in microseconds; that is why the numbers appear to be huge, but they really are not.
Optimization like this is used in many applications in order to make calculations faster. There are various settings in which we can apply it; the one we are going to optimize here is the dot product. The dot product plays a vital role, especially in calculating the magnitude of a vector. The code is shown in figures 24, 25, and 26.
In the code above, we include all the libraries, followed by the header file "Header.h". After that we declare two global arrays, both of size 10 at this point; later, the size will change for the running-time analysis. Then we declare main, where the custom size of the array is going to be 10 for now. On the next line I call the dot product function and display the output at the same time to verify the result. It is a good habit to verify the result in order to make sure that I have the right code.
This is the header file in which we declared the prototype of dot product.
This is the function file that performs the dot product. The function takes the two arrays and their size as arguments. It then declares the variable "sum", in which the result is going to be stored. We run a for loop in which each element of array1 is multiplied by the corresponding element of array2, and the product is added to the previous result stored in sum. Next we perform the running-time analysis of the dot product code; later we will see the difference in the time it takes each version to finish. The analysis follows the same pattern we used for clearing the arrays above: we take the average of five timed runs, and this step is repeated for all sizes of n.
For Size = 10
Dot Product Time(μs)   Optimized Time(μs)
189.88                 120.60
176.48                 95.22
131.43                 95.79
139.99                 97.79
129.44                 102.92
Average = 153.44       Average = 102.46

For Size = 100
Dot Product Time(μs)   Optimized Time(μs)
126.30                 100.36
188.45                 107.48
137.42                 128.01
165.07                 92.37
175.23                 95.22
Average = 158.49       Average = 104.69

For Size = 1000
Dot Product Time(μs)   Optimized Time(μs)
208.98                 142.27
137.70                 128.01
235.50                 166.79
258.30                 96.94
182.18                 137.42
Average = 200.53       Average = 134.29

For Size = 10000
Dot Product Time(μs)   Optimized Time(μs)
223.81                 127.44
177.05                 167.07
223.52                 123.16
209.27                 148.87
169.64                 124.88
Average = 209.66       Average = 138.28

For Size = 100000
Dot Product Time(μs)   Optimized Time(μs)
734.71                 435.92
520.60                 424.80
817.11                 324.73
560.51                 310.48
509.19                 351.53
Average = 628.42       Average = 369.49

For Size = 1000000
Dot Product Time(μs)   Optimized Time(μs)
4453.88                3125.58
4463.00                2388.31
4022.80                2222.95
4023.94                2995.29
3777.04                2480.11
Average = 4148.13      Average = 2642.45

Table 7 For Size = 10000000
Dot Product Time(μs)   Optimized Time(μs)
45006.36               24689.64
42639.44               24944.80
45186.55               24165.90
40640.30               25527.27
40752.63               24554.78
Average = 42845.06     Average = 24776.48
[Figure: dot product running time in μs (0 to 30000) versus size of n (0 to 10000000), unoptimized vs optimized.]
The advantage of intrinsic code is that it performs calculations much faster. In our case, we are going to see that intrinsic functions may be faster, but due to certain limitations they might end up slower overall.
In the code shown in figure 27 above, we can see how the intrinsic function works. We start by declaring a universal size variable so that in the future we do not have to change the value of size everywhere, which could consume a lot of time. After that I declared two full-size floating-point arrays. We also create two additional arrays that store only eight floating-point numbers each, because the intrinsic function _mm256_dp_ps can only compute a dot product within a 128-bit lane, which means it computes the dot product of only four elements at a time. So we first declare a for loop that copies eight floating-point numbers into the arrays of size eight. After that we load these floating-point values into two 256-bit variables, which we can think of as arrays at this point. Next we create another 256-bit intrinsic variable named result. Into result we store the dot product of intrinsic array one and intrinsic array two, using the command _mm256_dp_ps (dp for dot product and ps for packed single-precision floating point). Then we create a floating-point pointer to which we pass the contents of result. The last step is to compute the actual value, and we do it by creating a floating-point variable named value. Value stores the sum of the previous values plus the sum of the dot products of the two 128-bit lanes. And finally, to verify the answer, we display it.
[Figure: intrinsic dot product running time in μs (0 to 120000) versus size (0 to 10000000).]
Conclusion
This assignment was by far the most interesting one of all. I have finally learned how to reduce the time consumption of the same algorithm by directly editing its assembly code. The more we optimize the assembly and reduce the memory accesses, the faster the program runs. The run-time analysis suggests that as the size of the array increases, the optimized code becomes more efficient, by up to roughly 50%. There was another thing to learn about: in C/C++ there is something I had never heard of before, intrinsic functions. For performing calculations, intrinsic functions can be much faster than regular functions because they are designed to work that way; most of their functionality is based on performing mathematical calculations. Most of all, I also got the chance to create and test an intrinsic function to see its actual performance.