
Advanced Statistical Inference
Psychology UN1660

Professor Greg Jensen
Also Starring: Raymond Crookes

(Or: Handsome Bayes Modeling School)

What Are Statistics For?

Welcome!
Disclaimer: The aim of this course is not to teach you
the statistical skill set that is currently dominant in
the field.
o Instead, I will advocate for a direction that I hope the field is going: multi-level model construction and model comparison, built on Bayesian principles.
If you are here, it is by choice. I do not intend to
waste your time.
o Wherever possible, I want this course to develop
skills and understanding that will be useful to you
in a career in the sciences.
o By the end of the semester, you will all feel more confident building and interpreting statistical models.

Read The Friendly Manual


Some have commented that the textbook in my
introductory stats course felt optional. That will not
be the case in this course.
o In order to stay on top of the material in this class,
you must do the assigned reading in advance. My
lectures are intended to clarify the readings and
put them in a psychological context.
Assignments will also include instructions, and these
should be read carefully before undertaking an
analysis.
o You will learn the most in this course by doing, so the effort you put into early assignments will pay itself back.

Theory vs. Evidence


Scientific knowledge can broadly be considered to
come in two varieties: theories and evidence.
o A scientific theory is a framework proposing mechanisms in nature. It is speculative: Perhaps this is how the world works.
o Scientific evidence is the data collected from experiments, as well as descriptions of widely validated phenomena.
Evidence displaces common sense in favor of robust
theories.
o The idea that gravity bends both space and time
is weird, but all available experimental evidence is
consistent with it.

But What Are Theories, Really?
It's easy to invoke the word "theory," but it's much harder to get specific about what a theory is. Most theories outside of physics are vague (many deliberately so).
o For example, the theory of working memory is a general idea of a temporary form of memory, associated with our immediate conscious experience. Despite having this idea, we still don't know how information flows into and out of working memory from the senses and from long-term memory.
o Typically, multiple theories claim to partially explain the same phenomena.

Everybody Gets Their Own Facts


Scientific evidence is also remarkably murky. Almost all scientific measurements are operationalized. That is, we often can't measure directly, so we rely on indirect indicators.
o It's clear that IQ measures something, but it doesn't correspond as well with intelligence as was originally hoped.
o Other measures, like "preference" or "reaction time," are even more vague, because differences in those measures could arise from so many possible influences.
The public and scientists commonly disagree over how such evidence should be interpreted.

Take 'Em On To The Bridge


The bridge between theory and evidence is the scientific model.
o Models can make specific predictions about
measurable outcomes (e.g. How fast should
reaction times be?).
o Models can also make specific predictions about
the noise or uncertainty of outcomes (e.g. How
varied should reaction times be, and what
distribution should they follow?).
Things that are vague in a theory must be specified
in a model. This yields testable predictions, linking
theory with experiment.
o If the results of an experiment contradict those predictions, the model (and perhaps the theory behind it) must be revised.

Models Without Theories


We may also build models without having any theory
in mind.
o All good science must begin with a description of
the data. This is why we use descriptive
statistics.
Models do more than merely describe. They also
predict.
o Typically, a model describes a complete probability distribution that gives the relative probability of every possible outcome.
This makes a model a much more powerful form of description than mere descriptive statistics, because it can assign probabilities to outcomes that have not yet been observed.

Model Engineering

A quantitative model consists of four basic parts: constants, variables, operations, and parameters.
o A constant is something we are entirely certain of, such as the value of π, or how many seconds are in a minute.
o A variable is something we must measure in the world that we think may change, or that is uncertain. Usually, these represent the phenomena in nature that we wish to study.
o An operation specifies a mathematical relationship in the model. Familiar operations will include comparisons (e.g. "=") and arithmetic (e.g. "+"), but we will also consider operations like "is a random sample from."
o A parameter is an unknown quantity in the model whose value we estimate from the data.
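As a rough R illustration (the values and names are hypothetical, not from the slides):

    secs_per_min <- 60                # constant: known with certainty
    times <- c(11.8, 12.1, 12.4)      # variable: measured, uncertain
    avg <- mean(times)                # operation: arithmetic on a variable
    # Parameters are the unknown quantities a model estimates from data,
    # e.g. the mean and standard deviation of the process generating 'times'.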

The New Normal


Let's consider our old friend, the normal distribution, presented in a new way. Rather than a distribution, instead think of it as a process for generating random samples:

    x ~ Normal(μ, σ)

Read aloud: "The outcome variable x is a random sample from a normal distribution with a mean of μ and a standard deviation of σ."
(Here x is a variable, "~" is an operation, and μ and σ are parameters.)
Same As The Old Normal

The normal distribution is a generic model that describes many phenomena in nature, but it is not a theory. It doesn't tell us why x is best described by μ and σ.
Consider, however, the following model:

    y ~ Normal(α + βx, σ)

While perhaps unfamiliar, this is the simple linear regression model. It predicts values of the variable y in terms of the variable x, given a slope β, an intercept α, and residual variability σ.
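A brief sketch in R, simulating data from this model and then recovering the parameters with lm() (all values illustrative):

    set.seed(1)
    alpha <- 2; beta <- 0.5; sigma <- 1                    # illustrative values
    x <- runif(100, 0, 10)                                 # predictor variable
    y <- rnorm(100, mean = alpha + beta * x, sd = sigma)   # y ~ Normal(a + bx, s)
    coef(lm(y ~ x))                                        # estimates of alpha and beta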

I Was Promised Bayesian Stats

One of the most confusing things to an outsider about the field of statistics is that the same math has many interpretations.

Classical experimental texts interpret this model using the frequentist philosophy. Under that framework, we might ask whether β is significantly different from zero, using a hypothesis test.
o It turns out that you can build any hypothesis test as a special case of the Bayesian framework.
We will instead take a Bayesian view of the problem. We first specify some prior belief about β, then update it using the data.
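A minimal sketch of that prior-then-update logic in R, using grid approximation on a coin-flip example of my own (not from the slides):

    theta <- seq(0, 1, length.out = 101)               # candidate parameter values
    prior <- dbeta(theta, 2, 2)                        # prior belief about theta
    likelihood <- dbinom(7, size = 10, prob = theta)   # data: 7 heads in 10 flips
    posterior <- prior * likelihood / sum(prior * likelihood)  # Bayes' rule, normalized
    plot(theta, posterior, type = "l")                 # updated belief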

Where We're Going, We Won't Need p
We will do this by building models using only our
assumptions.
o For example: Suppose participants belong to three
groups. We are curious how the means of those
groups compare.
o What hypothesis test does classical stats often recommend? ANOVA!
o What are the assumptions of ANOVA? Each group has its own mean, all groups are normally distributed, and all have the same variance.
o (But what if each group should instead have its own variance? Then we need Welch's ANOVA.)

Where We're Going, We Won't Need p
Those who took my intro class may (vaguely) recall that these two hypothesis tests each required a totally different set of equations.
o When framed using the notation below, however, we can see how similar they are:

    ANOVA:          y ~ Normal(μ[group], σ)
    Welch's ANOVA:  y ~ Normal(μ[group], σ[group])

By instead learning to build models from the ground up, we won't need to learn a new test for every problem. Instead, we will only need to answer the question, "What are your assumptions?"
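Base R happens to implement both tests; a quick sketch with simulated data (my values, not the course's) shows the single assumption that separates them:

    set.seed(2)
    g <- factor(rep(c("A", "B", "C"), each = 20))      # three groups
    y <- rnorm(60, mean = rep(c(5, 6, 7), each = 20),
               sd   = rep(c(1, 1, 3), each = 20))      # unequal group variances
    oneway.test(y ~ g, var.equal = TRUE)    # classic ANOVA: one shared variance
    oneway.test(y ~ g, var.equal = FALSE)   # Welch's ANOVA: per-group variances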

Model Comparison
There are many models that could have given rise to
a particular dataset, but some models do a better job
than others.
o That's a simple enough idea, but it gets tricky when we try to measure how well a model does its job.
o If we can agree on a yardstick for measuring how good each model is, this will allow us both to compare models to find the best one, and to combine models to cover our bets.
Model comparison is essential, because it provides a
way to compare different theories, or different
variations of a theory.

Information Criteria
Our yardstick this semester will be information
criteria.
o These metrics estimate how well the current
model will do at predicting future observations.
The goal of this approach is to find the middle ground
between underfitting and overfitting.
o A model that is underfit is one that is too vague
to capture the important features of the data.
A null hypothesis typically underfits the data.
o A model that is overfit is one that is too complicated. Overfitting causes a model to fit the current data well, but to fit future data poorly. This can be thought of as a model that misses the forest for the trees.
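A small R illustration of this trade-off (my own simulated example, using AIC as one concrete information criterion):

    set.seed(3)
    x <- runif(100); y <- 2 + 3 * x + rnorm(100)   # simulated data
    m1 <- lm(y ~ 1)            # null model: underfits
    m2 <- lm(y ~ x)            # matches the generating process
    m3 <- lm(y ~ poly(x, 9))   # 9th-degree polynomial: overfits
    AIC(m1, m2, m3)            # lower values predict future data better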

Multi-Level Models

Imagine that three sprinters are in training. Each time a sprinter runs a race, they cross the finish line in t seconds.
o Times vary for several reasons. Each sprinter might in general be a little faster or slower. However, each race will also go a little differently.
o We need to measure two kinds of uncertainty: how sprinters differ from one another, and how each sprinter's individual performance varies.

[Model diagram: a group level describing the population of sprinters, and a subject level describing each sprinter's individual race times.]

Multi-Level Models

This is an example of a multi-level model.
o Such a model simultaneously makes several inferences.
o The first is about the population of sprinters. This model describes the population of sprinters as a normal distribution with a mean of μ and a standard deviation of σ.
o The model also makes inferences about the specific performance of each sprinter.


Multi-Level Models

Multi-level models are fast becoming a new gold standard, because they permit many interesting experiments that are impossible with simpler methods.
o Using classical methods, obtaining estimates for all of the parameters in a multi-level model was nightmarishly hard.
o We will instead lean on a new generation of tools that will let us bypass the ugliest of the math.

Despite these tools, multi-level models are still challenging to think about. Consider the following questions:
o What distributions should we expect μ and σ to follow?
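As a generative sketch of the two-level sprinter model in R (all numbers are illustrative assumptions):

    set.seed(4)
    mu <- 12; sigma_group <- 0.5      # group level: population of sprinters
    sigma_race <- 0.3                 # subject level: race-to-race variability
    sprinter_means <- rnorm(3, mu, sigma_group)   # three sprinters' typical times
    races <- lapply(sprinter_means,
                    function(m) rnorm(10, mean = m, sd = sigma_race))  # 10 races each
    sapply(races, mean)               # estimates of each sprinter's typical time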

A Plan For The Semester


Our first order of business is getting acquainted with
R.
o We will be using R exclusively for our analyses, so
get started using it immediately. Get everything
installed ASAP.
Next week, we'll build simple models. These will introduce all the moving parts we will later need to build complex models.
In Lecture 12, we'll dig into information theory to better understand how we can compare models.
In Lecture 16, we'll grapple with MCMC, which will let us fit models whose math would otherwise be intractable.
Summary
Our mission this semester is to build models.
o These will let us compare theory to experiment in
a more precise and explicit way than is standard
in the field.
The building blocks for our models will arise from a
Bayesian approach to probability.
o We will describe our uncertainty in terms of
probability distributions, and data can update
those distributions.
As our appreciation of probability deepens, we will be
able to take on more complicated problems.
o Information criteria will let us compare models to one another.