Note: For any issues regarding the assignment, drop an email at <14mseesrehman@seecs.edu.pk>
Assignment # 6
Problem 1: (MATLAB problem)
So far, we have covered a number of concepts in stochastic systems theory. Now it is time to put
those concepts to use. For this assignment, you are required to analyze the data in the
Drones Dataset Excel file using the tools you have studied so far. The data presents a
number of statistics on all the drone strikes in Pakistan up to 2013. For more information on the
terms and abbreviations used in the file, refer to the notes section of the same Excel file.
SUBMISSION REQUIREMENTS:
Using the dataset, produce at least 5 Analysis Solutions. You may compute variance,
covariance, correlation, (conditional) expectations, (conditional and joint) probabilities,
(conditional) PMFs, and any other statistical tool you deem appropriate for the analysis.
Use as many as possible.
Steps:
1. Pick any two columns from the dataset (the pair you consider best suited for analysis).
2. Apply a minimum of four analysis tools (i.e. computing marginal, conditional, and joint
probabilities and other statistical parameters) to the data values from this pair of columns.
For more information on the analysis tools to use, consult the sample case.
3. Repeat steps 1 and 2 until you have done them at least 5 times in total (i.e. you have to do this
for 5 different pairs of columns).
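The joint and conditional quantities named in step 2 can be estimated from relative frequencies. The following is a minimal sketch in Python (the assignment itself asks for MATLAB, so treat it only as an outline of the computation). The lists `x` and `y` hold hypothetical values, not entries from the Drones Dataset, and the function names are illustrative.

```python
from collections import Counter

# Hypothetical paired observations (x_i, y_i); NOT taken from the Drones Dataset.
x = [1, 2, 2, 1, 3, 2]
y = [0, 4, 4, 2, 6, 2]

def joint_pmf(xs, ys):
    """Estimate P(X = x, Y = y) as the relative frequency of each (x, y) pair."""
    n = len(xs)
    counts = Counter(zip(xs, ys))
    return {pair: c / n for pair, c in counts.items()}

def conditional_pmf_y_given_x(joint, x_value):
    """P(Y = y | X = x_value) = P(X = x_value, Y = y) / P(X = x_value)."""
    # Marginal P(X = x_value): sum the joint PMF over all y.
    p_x = sum(p for (xv, _), p in joint.items() if xv == x_value)
    return {yv: p / p_x for (xv, yv), p in joint.items() if xv == x_value}

def conditional_expectation(cond_pmf):
    """E[Y | X = x] = sum over y of y * P(Y = y | X = x)."""
    return sum(yv * p for yv, p in cond_pmf.items())

p_xy = joint_pmf(x, y)
p_y_given_x2 = conditional_pmf_y_given_x(p_xy, 2)
e_y_given_x2 = conditional_expectation(p_y_given_x2)
```

Here `p_y_given_x2` is the conditional PMF of Y given X = 2, and `e_y_given_x2` is the corresponding conditional expectation.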
A sample Analysis Solution is given below. You are required to produce 5 such cases.
NOTE: The sample case given in the assignment was done for the data from the 2012 Drones
dataset, so its graphs and values may differ slightly (by a value or two) from those you obtain.
Analysis Solution (Sample Case):
Step 1: Picked column S (No. of Missiles fired) and column V (Number of people reported killed)
This case compares the number of missiles fired in a single drone strike with the number of people
killed in that strike.
RV X = No. of Missiles fired (column S)
RV Y = Number of people reported killed (column V)
Step 2: Applying the following Analysis Tools on the data values from the chosen pair of
columns:
Analysis Tool 1 (Finding the Marginal Probabilities of X and Y):
Using the data in each column, we compute the probability of each outcome of X and Y, and then use it to
plot the individual PMFs.
PMF of X
PMF of Y
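The marginal PMF computation described for Analysis Tool 1 amounts to counting how often each value occurs in a column and dividing by the number of rows. A minimal Python sketch follows (again only an outline, since the assignment asks for MATLAB). The list `missiles` contains hypothetical values standing in for column S, and `empirical_pmf` is an illustrative name.

```python
from collections import Counter

def empirical_pmf(values):
    """Return {outcome: relative frequency}, with outcomes in ascending order."""
    n = len(values)
    counts = Counter(values)
    return {v: counts[v] / n for v in sorted(counts)}

# Hypothetical stand-in for one column; NOT taken from the Drones Dataset.
missiles = [1, 2, 2, 1, 3, 2]

pmf_x = empirical_pmf(missiles)
for value, prob in pmf_x.items():
    print(f"P(X = {value}) = {prob:.3f}")
```

The resulting outcome/probability pairs are what a stem or bar plot of the PMF would display.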
Comment on the value of the Correlation Coefficient. What does a lower/higher value indicate?
Entropy Background:
To measure the dependence between two variables X and Y, we often use the correlation between X
and Y. Another tool, widely used in information theory, is the mutual information I(X;Y). Mutual
information measures how much information X and Y have in common.
Mutual Information is given by,
I(X;Y) = H(X) + H(Y) - H(X,Y)
where H(X) = entropy of X, H(Y) = entropy of Y, and H(X,Y) = joint entropy of X and Y.
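The entropies can be estimated from the empirical PMFs using H = -sum of p * log2(p), which gives values in bits, and mutual information then follows from the identity I(X;Y) = H(X) + H(Y) - H(X,Y). A minimal Python sketch, with illustrative function names and hypothetical inputs (not from the Drones Dataset):

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical entropy in bits: -sum(p * log2(p)) over observed outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with H(X,Y) computed over (x, y) pairs."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical examples of two extreme cases.
identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])    # Y is a copy of X
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # all pairs equally likely
```

In the first case the mutual information equals H(X), since Y carries exactly the same information as X. In the second it is zero, since knowing X says nothing about Y.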