
1. If the variance of a variable/column is 0, what does it mean? Can we use that variable for our analysis?


By definition, variance is the expectation of the squared deviation of a
random variable from its mean. Informally, it measures how far a set of
(random) numbers is spread out from its mean.

Thus, from the statistical point of view, a zero variance means that the
squared deviation of every value from the mean is zero. The distance between
each value and the mean is therefore also zero: every value equals the mean.
Plotted against the observation index, the points all lie on one horizontal
straight line.

In other words, if the variance is zero, it can be concluded that
all observations are equal.

For example: the variance of the observations 8,8,8,8,8 is zero.

Zero variance also means that no value deviates from the others. Every value
is the same, so the mean, median and mode are all equal.

Because the variable is constant, it cannot distinguish one observation from
another and carries no useful information. So, we cannot use that variable
for our analysis.
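
As a minimal sketch of this check, the snippet below flags constant columns using Python's standard `statistics.pvariance`. The table layout (a dict of column name to list of values) and the helper name `zero_variance_columns` are assumptions made for illustration; the second column reuses the column A observations that appear later in this document.

```python
from statistics import pvariance

# Hypothetical toy table: column name -> list of values.
columns = {
    "constant": [8, 8, 8, 8, 8],                     # the example from the text
    "column_a": [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5],   # varies, so variance > 0
}

def zero_variance_columns(table):
    """Return the names of columns whose population variance is exactly 0."""
    return [name for name, values in table.items() if pvariance(values) == 0]

print(zero_variance_columns(columns))  # ['constant']
```

A column reported by this check holds a single repeated value and would normally be dropped before analysis.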

2. Calculate mean, median, mode, variance and standard deviation for column A

Mean is a measure of central tendency. It is the sum of all observations divided
by the number of observations, and is often referred to as the average of a
data set.

The observations for column A are 7,6,7,7,8,5,8,7,7,5,5

The total number of observations is 11.

According to the formula for the mean:

Mean = (ΣX) / N = (7+6+7+7+8+5+8+7+7+5+5) / 11

= 72 / 11 ≈ 6.545 (6.54 when truncated to two decimals)
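
The arithmetic above can be reproduced with a short Python sketch. The variable names are illustrative only; `statistics.mean` from the standard library is used as a cross-check.

```python
from statistics import mean

column_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

total = sum(column_a)        # sum of all observations
n = len(column_a)            # number of observations
manual_mean = total / n      # Mean = (sum of X) / N

print(total, n, round(manual_mean, 4))  # 72 11 6.5455

# Cross-check against the standard library.
assert abs(manual_mean - mean(column_a)) < 1e-12
```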


Median is simply another name for the 50th percentile. It is the value in the
middle of the sorted data: at most half of the observations are smaller than the
median and at most half are larger.

To calculate the median, we first arrange the observations in ascending order:

A (ascending order): 5,5,5,6,7,7,7,7,7,8,8


So, the median is the value in the middle of the sorted list of observations.

With 11 observations, the middle value is the 6th one, which is 7.

So, the median is 7.

Moreover, if the number of observations is even, the median is the average of
the two middle values in the sorted list of observations.
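
Both cases (odd and even number of observations) can be sketched in a few lines of Python. The function name `median_of` and the four-element list used for the even case are hypothetical and chosen only for illustration.

```python
def median_of(values):
    """Median of a non-empty list: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

column_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]
print(median_of(column_a))        # 7
print(median_of([5, 6, 7, 8]))    # 6.5  (hypothetical even-length list)
```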


Mode is another measure of central tendency. The mode is the value that
occurs most frequently in the given set.

So, for observation column A: (7,6,7,7,8,5,8,7,7,5,5)

The value 7 occurs 5 times, more often than any other value (5 occurs 3 times,
8 occurs twice and 6 occurs once).

So, the mode of observation column A = 7
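
The frequency count behind this result can be sketched with `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

column_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

counts = Counter(column_a)                        # value -> number of occurrences
mode_value, mode_count = counts.most_common(1)[0] # most frequent value and its count

print(mode_value, mode_count)  # 7 5
```

If several values tie for the highest frequency, the data set has more than one mode, and `most_common(1)` returns only one of them.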

Variance measures how far a set of numbers is spread out from its mean. It is
used in descriptive statistics, statistical inference, hypothesis testing, goodness of
fit and Monte Carlo sampling, among many others.


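Variance of a population (N values with mean μ) is given by

σ² = Σ (xi − μ)² / N

Variance of a sample (n values with sample mean x̄) is given by

s² = Σ (xi − x̄)² / (n − 1)

In words, the population variance is the sum of the squared differences between
the mean and each element, divided by the number of elements:

Variance = Σ (Average of the elements of the set − Element of set)² / N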

So, the mean (average) of the elements in observation column A is:

Mean = (ΣX) / N = (7+6+7+7+8+5+8+7+7+5+5) / 11 = 72/11 = 6.54 (truncated to two decimals)

Variance

= [(6.54-7)² + (6.54-6)² + (6.54-7)² + (6.54-7)² + (6.54-8)² + (6.54-5)² +
(6.54-8)² + (6.54-7)² + (6.54-7)² + (6.54-5)² + (6.54-5)² ] / 11
= [(-0.46)² + (0.54)² + (-0.46)² + (-0.46)² + (-1.46)² + (1.54)² + (-1.46)² +
(-0.46)² + (-0.46)² + (1.54)² + (1.54)² ] / 11
= (0.2116 + 0.2916 + 0.2116 + 0.2116 + 2.1316 + 2.3716 + 2.1316 + 0.2116 +
0.2116 + 2.3716 + 2.3716 ) / 11

= 12.7276 / 11

≈ 1.1570

So the variance of the elements in observation column A is approximately 1.1570
(dividing by N = 11, that is, treating the values as a population).
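
As a cross-check, the same calculation can be sketched in Python at full precision, alongside the sample variant that divides by n − 1. The variable names are illustrative; `pvariance` and `variance` are the standard-library population and sample variance functions.

```python
from statistics import pvariance, variance

column_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]
n = len(column_a)
mean_a = sum(column_a) / n

# Squared deviation of every element from the mean.
squared_deviations = [(x - mean_a) ** 2 for x in column_a]

population_variance = sum(squared_deviations) / n        # divide by N
sample_variance = sum(squared_deviations) / (n - 1)      # divide by n - 1

print(round(population_variance, 4), round(sample_variance, 4))  # 1.157 1.2727

# Cross-check against the standard library.
assert abs(population_variance - pvariance(column_a)) < 1e-9
assert abs(sample_variance - variance(column_a)) < 1e-9
```

The sample variance is slightly larger because it divides the same sum by 10 instead of 11.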

Standard Deviation tells us how much the data deviate from the mean. It is
the square root of the variance. A low standard deviation indicates that the data points
tend to be close to the mean, while a high standard deviation indicates that the
data points are spread out over a wider range of values.

Standard deviation of a population is given by:

σ = sqrt( Σ (xi − μ)² / N )

Standard deviation of a sample is given by:

s = sqrt( Σ (xi − x̄)² / (n − 1) )

Here the observations in column A are treated as a complete population, so the
population formula applies.




Standard Deviation = sqrt (Variance)

= sqrt (1.1570)

= 1.0756
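
A short Python sketch, assuming the same column A observations, computes both the population and the sample standard deviation with the standard library. Because it keeps full precision for the mean, the last digit can differ slightly from a hand calculation that uses rounded intermediate values.

```python
import math
from statistics import pstdev, pvariance, stdev

column_a = [7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5]

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(pvariance(column_a))   # population formula (divide by N)
sample_sd = stdev(column_a)                      # sample formula (divide by n - 1)

print(round(population_sd, 3), round(sample_sd, 3))  # 1.076 1.128

# pstdev computes the population standard deviation directly.
assert abs(population_sd - pstdev(column_a)) < 1e-9
```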

3. In a group of 12 scores, the largest score is increased by 36 points. What effect will this have on the mean of the scores?


If

x = {x1, x2, . . . , xn}

then the mean is given by

x̄ = (1/n) Σ xi   (sum over i = 1, ..., n)

Increasing any one value by some amount a leads to

x̄′ = (1/n) (Σ xi + a) = x̄ + a/n

In other words, the mean is increased by a/n.

In this case,

a = 36 and n = 12

Therefore, the increase to the mean is

36 / 12 = 3

So, in a group of 12 scores, if the largest score is increased by 36 points,
the mean increases by three.
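
The result can be checked numerically with a small Python sketch. The twelve scores below are hypothetical, invented purely for illustration; the shift of a/n holds for any set of 12 scores.

```python
# Hypothetical set of 12 scores (not from the original question).
scores = [50, 55, 60, 62, 65, 70, 72, 75, 80, 85, 90, 100]

old_mean = sum(scores) / len(scores)

# Increase only the largest score by 36 points.
new_scores = scores.copy()
largest_index = new_scores.index(max(new_scores))
new_scores[largest_index] += 36

new_mean = sum(new_scores) / len(new_scores)

print(old_mean, new_mean, new_mean - old_mean)  # 72.0 75.0 3.0
```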


4. Explain the difference between Data (Singular) and Data (Plural) with examples.


Data (singular) : The value of the variable associated with one element of a
population or sample. This value may be a number, a word, or a symbol.

In other words, the value associated with a single cell is called Data (singular).

Data (plural) : The set of values collected for the variable from each of the
elements belonging to the sample. In other words, all the values
representing a single variable is called Data(plural).


For Example :

Consider the following data representation:

Mpg    Cyl   Dis     Hp
21     6     160     110
21     6     160     110
22.8   4     108     93
21.4   6     258     110
18.7   8     360     175
18.1   6     225     105
14.3   8     360     245
24.4   4     146.7   62

In this data representation table:

Data (singular) is the value of a variable represented by a single cell. So, in
the Dis column, the value 160 in the first row is a Data (singular) value.

Data (plural) is the set of all values recorded for one variable in the data
representation. So, all the values in the Dis column (160, 160, 108, 258, 360,
225, 360, 146.7) together represent Data (plural).
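
The distinction can be sketched in Python by storing the table above as a dict of columns. This storage layout and the variable names are assumptions for illustration; the values are copied from the table.

```python
# The table above, stored column by column.
table = {
    "Mpg": [21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4],
    "Cyl": [6, 6, 4, 6, 8, 6, 8, 4],
    "Dis": [160, 160, 108, 258, 360, 225, 360, 146.7],
    "Hp":  [110, 110, 93, 110, 175, 105, 245, 62],
}

# Data (singular): the value in one cell, here the first row of the Dis column.
single_value = table["Dis"][0]

# Data (plural): every value collected for the variable Dis.
all_values = table["Dis"]

print(single_value)      # 160
print(len(all_values))   # 8
```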


5. How does inferential statistics help in making decisions?

Inferential statistics use a random sample of data taken from a population
to describe and make inferences about the population. Inferential statistics
are valuable when examination of each member of an entire population is
not convenient or possible. For example, measuring the diameter of each
nail that is manufactured in a mill is impractical. Instead, you can measure the
diameters of a representative random sample of nails and use the
information from the sample to make generalisations about the diameters of
all of the nails.


In short, a random sample is first drawn from the population, and that sample
is then used to describe and make inferences about the whole population.
Decisions about the population, such as whether the nails from the mill have
acceptable diameters, can therefore be based on the sample rather than on an
examination of every member.
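
The nail example can be illustrated with a small simulation. Everything numeric here is an assumption made for the sketch: the population is simulated (100,000 diameters drawn around a hypothetical 3.0 mm target), and the sample size of 200 is arbitrary. It only shows that the mean of a random sample lands close to the population mean.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated nail diameters in millimetres.
population = [random.gauss(3.0, 0.05) for _ in range(100_000)]

# Measure only a random sample instead of every nail.
sample = random.sample(population, 200)

sample_mean = mean(sample)           # estimate obtained from the sample
population_mean = mean(population)   # normally unknown in practice

print(f"sample mean: {sample_mean:.3f} mm")
print(f"population mean: {population_mean:.3f} mm")
```

In a real setting only the sample mean would be available, and it would serve as the estimate on which a decision about the whole batch is based.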


Submitted By:

Akash Kumar Dubey

Email : akashdubey826@gmail.com

Phone : +91 84 274 90461
