
How Small Simple Models Can Yield Big Insights

Summary:
Big data arrives as a continuous stream, so we need some prior knowledge of it: an up-front analysis and a clear focus on where to look and what we need before we can achieve our goals. In short, we need a strategy to extract decision-relevant information from a dataset.
The speaker compares fetching data to fishing in a big ocean: it takes the right strategy and simple data models. He argues that amid today's big data trends, small simple models are too heavily discounted and given far too little importance.
Virus and R0: R0 comes from demography, where it is the mean number of baby girls a newly born baby girl will have in her lifetime. Applied to the spread of disease, R0 is the average number of new infections caused by one infected person: if R0 > 1 the epidemic grows, and if R0 < 1 it dies out. Consider a case with two scenarios:

- 50% chance that a person causes 4 more infections (this might start off the epidemic, and each of those four can spread it to 4 others)
- 50% chance that a person causes no infections (the chain might die off)

Note that the average here is R0 = 0.5 x 4 = 2 > 1, yet the chain still has at least a 50% chance of dying off immediately.

Outliers should never be ignored, as they can create events that have a major impact on the statistics.
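This two-scenario case can be sketched as a small branching-process simulation (the generation cap and trial count are illustrative assumptions, not from the talk):

```python
import random

def outbreak_dies_out(rng, max_generations=20, cap=2000):
    """One branching-process outbreak: each infected person causes
    4 new infections with probability 0.5, otherwise 0.
    Returns True if the chain goes extinct within the horizon."""
    infected = 1
    for _ in range(max_generations):
        if infected == 0:
            return True            # the chain died off
        if infected > cap:
            return False           # treat as a full-blown epidemic
        # each current case independently produces 4 or 0 new cases
        infected = sum(4 for _ in range(infected) if rng.random() < 0.5)
    return infected == 0

rng = random.Random(42)
trials = 4000
extinct = sum(outbreak_dies_out(rng) for _ in range(trials))
print(f"R0 = 0.5*4 = 2, yet estimated extinction probability "
      f"is {extinct / trials:.3f}")
# theory: the extinction probability q solves q = 0.5 + 0.5*q**4, q ≈ 0.543
```

Even with an average of 2 new infections per case, more than half of the simulated chains die off, which is exactly why the average alone is deceiving.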
Inference:

- Averages can be deceiving
- Treating a distribution as its average value can lead to incorrect inferences
- Averages as experienced by one population can be different from those experienced by another
- Ignoring outliers can be a big issue
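The second and third points can be illustrated with the classic class-size example (the numbers are made up for illustration):

```python
# Hypothetical class sizes: most classes are small, one is huge.
sizes = [10, 10, 10, 10, 160]

# Average class size as seen by the administration (one number per class):
dean_avg = sum(sizes) / len(sizes)

# Average class size as experienced by a student: each class is
# weighted by the number of students actually sitting in it.
student_avg = sum(s * s for s in sizes) / sum(sizes)

print(dean_avg, student_avg)  # → 40.0 130.0
```

The same distribution yields an average of 40 for one population (the administration) and 130 for another (the students), so treating the distribution as a single average misleads.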

The square root law: if the distance travelled by a police car is read off a graphical representation, it can appear greater than it actually is; the square root law helps us get the correct figure. If A is the service area and N is the number of police cars, then the mean travel distance D is directly proportional to the square root of A/N, the area covered per police car. In big data we might need to look at this kind of spatial data.
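A quick Monte Carlo sketch can check the D proportional to sqrt(A/N) scaling (the area, car counts, and incident count below are arbitrary assumptions):

```python
import math
import random

def mean_nearest_distance(area, n_cars, incidents=2000, seed=0):
    """Monte Carlo estimate of the mean distance from a random incident
    to the nearest of n_cars cars placed uniformly in a square region."""
    rng = random.Random(seed)
    side = math.sqrt(area)
    total = 0.0
    for _ in range(incidents):
        cars = [(rng.uniform(0, side), rng.uniform(0, side))
                for _ in range(n_cars)]
        ix, iy = rng.uniform(0, side), rng.uniform(0, side)
        total += min(math.hypot(cx - ix, cy - iy) for cx, cy in cars)
    return total / incidents

# D ∝ sqrt(A / N): quadrupling the number of cars should roughly
# halve the mean travel distance.
d4 = mean_nearest_distance(area=100.0, n_cars=4)
d16 = mean_nearest_distance(area=100.0, n_cars=16)
print(f"N=4: {d4:.2f}  N=16: {d16:.2f}  ratio: {d4 / d16:.2f}")
```

The ratio comes out close to sqrt(16/4) = 2, as the square root law predicts.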
Queues: a basic queuing system has arriving customers, waiting customers, a service facility, and departing customers. Basic engineering and mathematics effectively help in managing such a queuing system. Today there are large numbers of transaction files, and big data helps us analyse the queues.
Little's Law helps us analyse a queuing system with the equation L = lambda x W, where:

- L: time-averaged number of customers in the system
- Lambda: average rate of arrivals
- W: mean time a customer spends in the system
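Little's Law can be checked on a tiny made-up event log of arrival and departure times:

```python
# Each tuple is (arrival_time, departure_time) for one customer,
# in minutes; illustrative numbers, not from the talk.
log = [(0.0, 2.0), (1.0, 4.0), (3.0, 5.0), (4.5, 6.0), (5.0, 8.0)]

T = max(d for _, d in log)                 # observation window [0, T]
lam = len(log) / T                         # average arrival rate
W = sum(d - a for a, d in log) / len(log)  # mean time in system
# time-averaged number in system, from total customer-minutes
L = sum(d - a for a, d in log) / T

print(lam * W, L)  # the two agree: Little's Law
```

When the observation window starts and ends with an empty system, the identity L = lambda x W holds exactly, as this log shows.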
M/M/k queuing: in this model, as the fraction of time the servers are busy approaches 100%, the queuing delay explodes, so managers must leave the servers some idle time or face a queuing explosion. If many servers are busy 95% of the time, it is critical to have a good estimate of lambda, which we can get from big data analysis: a difference of just 5% in utilization can push the queuing delay far higher and distort the whole system. (If everything were deterministic, there would be no queuing delay at all.)
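The delay explosion is easiest to see in the single-server M/M/1 case, where the mean time in system is W = 1 / (mu - lambda); the rates below are illustrative assumptions:

```python
def mm1_time_in_system(lam, mu):
    """Mean time in system for an M/M/1 queue: W = 1 / (mu - lam).
    Valid only when utilization lam/mu is below 1."""
    assert lam < mu, "queue is unstable when lam >= mu"
    return 1.0 / (mu - lam)

mu = 1.0  # the server handles 1 customer per minute on average
for lam in (0.80, 0.90, 0.95, 0.99):
    w = mm1_time_in_system(lam, mu)
    print(f"utilization {lam:.0%}: mean time in system {w:.1f} min")
```

Going from 90% to 95% utilization, a difference of only 5%, doubles the mean time in system, and near 100% it blows up; this is why an accurate estimate of lambda matters so much.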

- Performance degrades as the arrival rate increases or the mean service time increases.
- Performance degrades as the variance of the time between arrivals increases or the variance of the service time increases.
- There are queues everywhere, and it is important to manage them in order to maintain a good, optimal service level.
Case Study: A Personal Big Data, Small Model Experience
How can we tell whether people were in a queue or not? The queuing had to be inferred, so the signature of a queue was important. That signature is the gap between the end of one session and the insertion of the next card: during a queued period, the next card is inserted almost immediately after the previous session ends.
Queue Inference Engine:
An observation: a customer who arrived when there was no queue experienced no queuing delay, while the others, who arrived when there was a queue, did. So there are customers who initiated a busy period and those who did not.
People come to the ATM according to a Poisson process. Imposing this on a huge dataset, an algorithm for the customer queuing was derived, and it was done by big data analysis. The bank can also report the number of times a customer waited in a queue, and this again can be done by big data analysis.
As per Simchi-Levi: the QIE could not have been derived if the small models had not been synced with recursive big data thinking.
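The busy-period observation can be sketched with a toy single-ATM simulation (the arrival rate, fixed service time, and customer count are assumptions, and this is only the observation itself, not the actual QIE algorithm):

```python
import random

rng = random.Random(7)

# Poisson arrivals (exponential inter-arrival times) at a single ATM.
arrival_rate = 1.0   # customers per minute (assumed)
service_time = 0.8   # minutes per transaction (assumed, fixed)

arrivals, t = [], 0.0
for _ in range(30):
    t += rng.expovariate(arrival_rate)
    arrivals.append(t)

# FIFO single server: service starts at the arrival time or when the
# previous customer finishes, whichever is later.
starts, free_at = [], 0.0
for a in arrivals:
    s = max(a, free_at)
    starts.append(s)
    free_at = s + service_time

# A customer whose service starts the moment they arrive found the ATM
# idle: they initiated a busy period. Everyone else waited in a queue.
initiators = [i for i, (a, s) in enumerate(zip(arrivals, starts)) if s == a]
waited = len(arrivals) - len(initiators)
print(f"{len(initiators)} customers initiated busy periods, {waited} waited")
```

From transaction timestamps alone, this separates the customers who started a busy period from those who experienced a queue, which is the starting point for inferring queue statistics from ATM data.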

Reference: https://www.youtube.com/watch?v=3lijsLLCndM
