
MITS6002 Business Analytics

Assignment 3

Q1. Carefully read the “CommBank Retail Business Insights Report FY18” provided as an
attachment and answer the questions below.

i. Comment on the insights report based on the overall features; including the
quality of visualisations, presentability, and the information provided.

The “CommBank Retail Business Insights Report FY18” is based on a well-established
quantitative survey of 2,473 business owners, managers and decision makers,
supported by 16 in-depth interviews.
The report covers responses from 262 retailers across sectors including
homewares and hardware, and other retail such as clothing, footwear, food and
liquor.
In August 2017 and October 2017, the survey was conducted on behalf of
Commonwealth Bank by DBM Consultants. Participants were drawn from businesses
with an annual turnover of more than 500,000.
The CommBank Innovation Index measures 15 core elements of innovation across
entrepreneurial behaviour and management capability.
The results are combined into a single numerical indicator that ranks businesses
on a scale of -100 to +100.

ii. List the key information you derive from this insights report and explain how
they will be useful in decision making.

A key insight from the report “CommBank Retail Business Insights Report FY18”
is that competitive pressure in the Australian commercial sector is increasing
retailers' desire to improve their business performance and to measure and
balance performance growth. Knowing this helps decision makers encourage the
adoption of innovative ideas among retailers.
iii. Write an abstract (one paragraph) summarising the insights report.

Summary:
According to the “CommBank Retail Business Insights Report FY18”, rather than
adopting innovation at a steady rate, many retailers are lagging behind the
national average.
The Australian retail sector continues to respond to constant competitive
pressure and is determined to drive organisational efficiency and to maintain
or improve business performance.
Retailers are gradually embracing innovation to improve customer experience,
lead in technology, and make the most of the opportunities available. Many
retailers are making significant changes and investing resources in these
areas, with some reporting financial and other clear benefits that translate
into growth.
In most cases, retailers expect to realise short-term returns from investing in
innovation, with one in two anticipating a payback period of less than six
months.

iv. Suggest improvements to this insights report.


The report “CommBank Retail Business Insights Report FY18” gives only limited
insight into the enterprise capabilities and retailer behaviours that support
innovation.
Customer feedback is the most important way to improve a business, and every
retailer should look to innovate in response to direct customer feedback or
observed changes in customer behaviour.
Q2. Regression analysis is a commonly used technique to find relationships among
variables. Answer the below questions based on regression analysis.

i. Provide an example where regression analysis can be effectively used.


An example where regression analysis can be used is predicting the price of a
house in the upcoming months.

ii. Collect height and weight data from 10 friends/relatives of yours and complete
the below table. Every student in class should have a unique set of values.

No.  Height  Weight
1    6.2     78
2    5.6     65
3    6.3     79
4    5.2     58
5    5.5     62
6    5.7     66
7    6.1     74
8    5.3     60
9    5.8     69
10   6.0     70
iii. Draw a scatterplot based on above data. Based on your plot comment on the
relationship between height and weight.

Relationship:
There is a strong positive correlation between height and weight.
As height increases, weight also increases, which is called a positive
correlation.
The correlation is very strong because almost all the points lie close to the
linear regression line.

iv. Compute the equation of the regression line.


Input Data :
Data set x = 6.2, 5.6, 6.3, 5.2, 5.5, 5.7, 6.1, 5.3, 5.8, 6.0
Data set y = 78, 65, 79, 58, 62, 66, 74, 60, 69, 70
Total number of elements = 10

Xmean = ( 6.2 + 5.6 + 6.3 + 5.2 + 5.5 + 5.7 + 6.1 + 5.3 + 5.8 + 6.0 ) / 10
Xmean  = 57.7 / 10
Xmean = 5.77

Ymean = ( 78 + 65 + 79 + 58 + 62 + 66 + 74 + 60 + 69 + 70 ) / 10
Ymean = 681 / 10
Ymean = 68.1

∑ y = 78 + 65 + 79 + 58 + 62 + 66 + 74 + 60 + 69 + 70
∑ y = 681

∑ x² = ( 6.2 )² + ( 5.6 )² + ( 6.3 )² + ( 5.2 )² + ( 5.5 )² + ( 5.7 )² + ( 6.1 )² + ( 5.3 )²
+ ( 5.8 )² + ( 6.0 )²
∑ x² = 38.44 + 31.36 + 39.69 + 27.04 + 30.25 + 32.49 + 37.21 + 28.09 + 33.64 + 36
∑ x² = 334.21

∑ x = 6.2 + 5.6 + 6.3 + 5.2 + 5.5 + 5.7 + 6.1 + 5.3 + 5.8 + 6.0
∑ x = 57.7

∑ xy = ( 6.2 x 78 ) + ( 5.6 x 65 ) + ( 6.3 x 79 ) + ( 5.2 x 58 ) + ( 5.5 x 62 ) + ( 5.7 x 66 )
+ ( 6.1 x 74 ) + ( 5.3 x 60 ) + ( 5.8 x 69 ) + ( 6.0 x 70 )
∑ xy = 483.6 + 364 + 497.7 + 301.6 + 341 + 376.2 + 451.4 + 318 + 400.2 + 420
∑ xy = 3953.7

Intercept Formula:
Intercept = { ( ∑ y ) ( ∑ x² ) - ( ∑ x ) ( ∑ xy ) } / { n ( ∑ x² ) - ( ∑ x )² }

Intercept = { ( 681 x 334.21 ) - ( 57.7 x 3953.7 ) } / { ( 10 x 334.21 ) - ( 57.7 )² }
Intercept = ( 227597.01 - 228128.49 ) / ( 3342.1 - 3329.29 )
Intercept = ( -531.48 ) / ( 12.81 )
Intercept = -41.4895

Slope formula:
Slope = { n ( ∑ xy ) - ( ∑ x ) ( ∑ y ) } / { n(∑x²) - (∑x)² }

Slope = { 10 ( 3953.7 ) - ( 57.7 x 681 ) } / {(10 x 334.21) - (57.7)² }


Slope = ( 39537 - 39293.7 ) / ( 3342.1 - 3329.29 )
Slope = ( 243.3 ) / ( 12.81 )
Slope = 18.993

Regression equation: y = ( Slope ) x + Intercept

Regression equation: y = 18.993 x - 41.4895
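
The hand calculation above can be cross-checked with a short script. This is a minimal sketch that applies the same sum-based slope and intercept formulas to the ten height/weight pairs; the function name `fit_line` is an assumption introduced here for illustration.

```python
# Height/weight pairs copied from the table in part ii.
x = [6.2, 5.6, 6.3, 5.2, 5.5, 5.7, 6.1, 5.3, 5.8, 6.0]
y = [78, 65, 79, 58, 62, 66, 74, 60, 69, 70]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line via the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    denom = n * sum_x2 - sum_x ** 2  # shared denominator of both formulas
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return slope, intercept


slope, intercept = fit_line(x, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.4f}")
```

Rounded to the same precision, the script should agree with the slope of 18.993 and intercept of -41.4895 derived by hand.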

v. Calculate the R² value and comment on the goodness of the fit.


Input Data :
Data set x = 6.2, 5.6, 6.3, 5.2, 5.5, 5.7, 6.1, 5.3, 5.8, 6.0
Data set y = 78, 65, 79, 58, 62, 66, 74, 60, 69, 70
Total number of elements = 10

Correlation = { ( 6.2 - 5.77 ) ( 78 - 68.1 ) + ( 5.6 - 5.77 ) ( 65 - 68.1 ) + ( 6.3 - 5.77 ) ( 79 - 68.1 )
+ ( 5.2 - 5.77 ) ( 58 - 68.1 ) + ( 5.5 - 5.77 ) ( 62 - 68.1 ) + ( 5.7 - 5.77 ) ( 66 - 68.1 )
+ ( 6.1 - 5.77 ) ( 74 - 68.1 ) + ( 5.3 - 5.77 ) ( 60 - 68.1 ) + ( 5.8 - 5.77 ) ( 69 - 68.1 )
+ ( 6.0 - 5.77 ) ( 70 - 68.1 ) } / { ( 10 - 1 ) x 0.3773 x 7.2641 }

Correlation = { ( 0.43 x 9.9 ) + ( -0.17 x -3.1 ) + ( 0.53 x 10.9 ) + ( -0.57 x -10.1 ) + ( -0.27 x -6.1 )
+ ( -0.07 x -2.1 ) + ( 0.33 x 5.9 ) + ( -0.47 x -8.1 ) + ( 0.03 x 0.9 ) + ( 0.23 x 1.9 ) }
/ { 9 x 0.3773 x 7.2641 }

Correlation = { 4.257 + 0.527 + 5.777 + 5.757 + 1.647 + 0.147 + 1.947 + 3.807 + 0.027 + 0.437 }
/ 24.6647

Correlation = 24.33 / 24.6647

Correlation = 0.9864

R² = (Correlation)²
R² = (0.9864)²
R² = 0.973

Goodness of the fit:


The value of R² lies between 0 and 1.
A value of 0.973 indicates that about 97.3% of the variation in the response
variable (weight) is explained by the predictor variable (height), so the
regression line fits the data very well.
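
As a second route to the same number, R² can be computed directly as 1 - SS_res / SS_tot instead of squaring the correlation. The sketch below assumes the same ten data pairs; the helper name `r_squared` is introduced here for illustration only.

```python
# Height/weight pairs copied from the table in part ii.
x = [6.2, 5.6, 6.3, 5.2, 5.5, 5.7, 6.1, 5.3, 5.8, 6.0]
y = [78, 65, 79, 58, 62, 66, 74, 60, 69, 70]


def r_squared(xs, ys):
    """Fit a least-squares line and return 1 - SS_res / SS_tot."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys))
    sxx = sum((a - x_mean) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares: variation the line fails to explain.
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(xs, ys))
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((b - y_mean) ** 2 for b in ys)
    return 1 - ss_res / ss_tot


r2 = r_squared(x, y)
print(round(r2, 3))
```

For simple linear regression with an intercept, this quantity equals the squared correlation, so it should match 0.973 to three decimals.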
vi. Use an analytics tool of your choice to calculate the values for iv, and v.
Compare them with your answer.

Using Excel:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.986430426
R Square            0.973044985
Adjusted R Square   0.969675608
Standard Error      1.264957349
Observations        10

ANOVA
             df   SS            MS            F             Significance F
Regression   1    462.0990632   462.0990632   288.7907794   1.46E-07
Residual     8    12.80093677   1.600117096
Total        9    474.9

               Coefficients   Standard Error   t Stat         P-value       Lower 95%   Upper 95%
Intercept      -41.48946136   6.461168067      -6.421356158   0.000204433   -56.38894   -26.589981
X Variable 1   18.99297424    1.117638407      16.99384534    1.45933E-07   16.4157     21.570253

Regression equation: y = 18.993 x - 41.4895
R² = 0.973
Both methods give almost the same values for the regression equation and the
R² value.
Q3. Classification and regression are commonly used processes in business analytics.

i. Briefly explain the difference between classification and prediction.

Classification: The task of predicting one or more class labels based on the
given features of a problem. We use classification to identify the category
that an observation belongs to.
Prediction: The overall goal of supervised machine learning. It can take the
form of regression, classification, or filling in missing values in the data.
When prediction is contrasted with classification, the interest is not in
class labels but in estimating missing or unavailable data values.
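
One way to see the contrast is that a classifier returns a label from a fixed set, while a numeric predictor returns a value. The sketch below reuses the regression equation from Q2 for the numeric case; the `tall`/`short` labels and the 5.8 threshold are hypothetical and chosen only for illustration.

```python
def predict_weight(height):
    """Numeric prediction: estimate an unknown weight from the Q2 regression line."""
    return 18.993 * height - 41.4895


def classify_height(height, threshold=5.8):
    """Classification: assign one of two labels (threshold is a made-up value)."""
    return "tall" if height >= threshold else "short"


estimated = predict_weight(6.0)   # a continuous value
label = classify_height(6.0)      # a category
print(round(estimated, 2), label)
```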

ii. Give examples for classification methods you know.


a. Neural Networks: Used to classify objects in images, such as a person's
face.
b. Logistic Regression: Used to predict whether a bank customer will buy a
financial product or not.
c. Decision Trees: A supervised learning technique in which decisions are
represented by the nodes and outcomes are represented by the leaves.
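
The node-and-leaf structure of a decision tree can be sketched as nested conditions. The example borrows the bank-product scenario from the logistic regression item; the features (`age`, `has_savings_account`, `monthly_income`) and all thresholds are hypothetical, not taken from any real model.

```python
def will_buy(age, has_savings_account, monthly_income):
    """Toy decision tree: each `if` is a decision node, each `return` is a leaf."""
    if has_savings_account:              # decision node 1
        if monthly_income >= 4000:       # decision node 2
            return "buy"                 # leaf
        return "no buy"                  # leaf
    if age < 30:                         # decision node 3
        return "buy"                     # leaf
    return "no buy"                      # leaf


print(will_buy(age=45, has_savings_account=True, monthly_income=5000))
```

In practice the splits would be learned from labelled training data rather than written by hand.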

iii. The following diagram shows a neural network with one hidden layer.
Write down the algebraic equation for y1 in terms of input values i1, i2 and
weights w.

Hidden layer 1:

z(1) = W(1) X + b(1)
a(1) = z(1)

z(1) = vectorized output of layer 1
W(1) = vectorized weights assigned to the neurons of the hidden layer, i.e. w1,
w2, w3 and w4
X = vectorized input features, i.e. i1 and i2
b(1) = vectorized bias assigned to the neurons in the hidden layer, i.e. b1 and b2
a(1) = vectorized activation of layer 1, taken here as a linear function

Output layer 2:

z(2) = W(2) a(1) + b(2)
a(2) = z(2)

Calculation at output layer:

z(2) = W(2) * [ W(1) X + b(1) ] + b(2)
z(2) = [ W(2) * W(1) ] * X + [ W(2) * b(1) + b(2) ]

Let
W = W(2) * W(1)
b = W(2) * b(1) + b(2)

Final output: z(2) = W * X + b
It is a linear function of the inputs.
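
The collapse of two linear layers into a single linear function can be checked numerically. The sketch below assumes 2 inputs, 2 hidden neurons and 1 output; every weight, bias and input value is hypothetical, since the diagram's actual values are not given.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def add(u, v):
    """Element-wise vector addition."""
    return [p + q for p, q in zip(u, v)]


# Hypothetical parameters and inputs.
W1 = [[0.5, -1.0], [2.0, 0.25]]   # hidden-layer weights
b1 = [0.1, -0.2]                  # hidden-layer biases
W2 = [[1.5, -0.5]]                # output-layer weights
b2 = [0.3]                        # output-layer bias
X = [1.0, 2.0]                    # inputs i1, i2

# Layer-by-layer evaluation with linear activations.
two_layer = add(matvec(W2, add(matvec(W1, X), b1)), b2)

# Collapsed form: W = W(2) W(1), b = W(2) b(1) + b(2).
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
collapsed = add(matvec(W, X), b)

print(two_layer, collapsed)
```

Both evaluations give the same output, which illustrates why a network with only linear activations cannot represent anything beyond a single linear map.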

Briefly explain how neural networks are used for classification.


Neural networks can be used in search problems:
i. They can be used to compare two or more documents.
Neural networks can be used in anomaly detection:
ii. If suspicious and highly unusual activity is triggered in a system or
computer software, neural networks can be used to monitor and flag such
activity.

iv. Give at least three examples of how clustering can be used in business
analytics. In your answer explain how each business case could be
addressed using clustering.
a. Retail businesses:
Clustering can be used to group customers by shopping behaviour and to
analyse how to retain customers.

b. Insurance industry:
Clustering is mostly used in fraud detection, in identifying the risk
associated with selling a product to a customer, and in analysing
how the company can retain its customers.

c. Banking sector:
Clustering is used to group types of customers in order to sell them
products effectively and to analyse how profitable each customer is to
the bank.
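
Grouping customers in this way can be sketched with a small one-dimensional k-means. The spend figures and starting centres below are hypothetical, and the function names `assign` and `kmeans_1d` are introduced here for illustration. A real analysis would normally use several features and a library implementation.

```python
def assign(values, centers):
    """Put each value into the group of its nearest centre."""
    groups = [[] for _ in centers]
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, centers, iterations=20):
    """Basic k-means on one feature: alternate assignment and centre update."""
    centers = list(centers)
    for _ in range(iterations):
        groups = assign(values, centers)
        # New centre = mean of its group; keep the old centre if a group is empty.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:   # converged
            break
        centers = new_centers
    return centers, assign(values, centers)


# Hypothetical annual spend per customer.
spend = [120, 150, 130, 900, 950, 1000, 5000, 5200]
centers, groups = kmeans_1d(spend, centers=[100, 1000, 5000])
print(centers)
print(groups)
```

Each resulting group (low, medium and high spenders in this toy data) could then be targeted with different products or retention offers.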
