
Psyc100, Chapter 4, Lecture 1


Learning - any process through which experience at one time can alter an individual’s behavior at a future time

E.g. Habituation

A reduction in response to a stimulus after repeated presentations, e.g. the startle response


Training procedure

Neutral stimulus elicits reflexive response through being paired with another stimulus that already elicits this reflexive response


Reflex = Stimulus-response sequence mediated by CNS

Response = action that automatically follows an event (the stimulus)


9/27/2010


Classical Conditioning

Operant Conditioning

Different Perspectives of learning

Behavioural

Cognitive

Ethological perspective


The attempt to understand observable activity in terms of observable stimuli and observable responses

John B. Watson (1913)

“Give me a dozen healthy infants and my own specified world to bring them up in and I’ll guarantee to take any one at random and produce teachers, lawyers …”

B.F. Skinner (1938)


NEUTRAL STIMULUS (Bell) will elicit NO REACTION

UNCONDITIONED STIMULUS (Food) will elicit a REFLEX ACTION

UNCONDITIONED STIMULUS (Food) + NEUTRAL STIMULUS (Bell) will elicit a REFLEX ACTION

CONDITIONED STIMULUS (Bell) will elicit a CONDITIONED RESPONSE

Neutral stimulus: does not normally elicit a response or reflex action by itself. Examples: a bell ringing, a color, a furry object.

Unconditioned stimulus: always elicits a reflex action (an unconditioned response). Examples: food, a blast of air, a noise.

Unconditioned response: a naturally occurring response to an unconditioned stimulus. Examples: salivation at the smell of food, eye blinks at a blast of air, the startle reaction in babies.

Conditioned stimulus: the stimulus that was originally neutral becomes conditioned after it has been paired with the unconditioned stimulus. It will eventually elicit the reflex response by itself (now called the conditioned response).
Learning Experience: Stimulus A (the word ball) is paired with Stimulus B (sight of a ball), which elicits Thought of B (mental image of a ball)

After Learning: Stimulus A (the word ball) elicits Thought of B (mental image of a ball)
Conditioning Procedure: Neutral stimulus (Bell) is paired with Unconditioned stimulus (Food), which elicits Unconditioned response (Salivation)

After Conditioning: Conditioned stimulus (Bell) elicits Conditioned response (Salivation)
UCS (drug) elicits UCR (nausea)

CS (waiting room) paired with UCS (drug) elicits UCR (nausea)

CS (waiting room) elicits CR (nausea)

Extinction

Without food the bell elicited less and less saliva

But extinction does not return the animal to its previous naïve state…

Spontaneous recovery

The passage of time following extinction partially renews the conditioned reflex

[Figure: Strength of CR (weak to strong) against time. Phases in order: acquisition (CS+UCS), extinction (CS alone), pause, spontaneous recovery of CR, extinction (CS alone).]

After conditioning, stimuli that resemble the CS will elicit the response even when they have never been paired before.

Depends on the degree of similarity between new stimuli & conditioned stimuli.

The further a new tone was from the original tone, the less the dog salivated.
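The dependence on similarity described above can be sketched as a toy generalisation gradient. This is only an illustration under stated assumptions: the Gaussian shape of the fall-off, the function name `generalised_response`, the `width` parameter and the tone frequencies are all hypothetical and do not come from the lecture, which says only that the response weakens as the new tone moves away from the original.

```python
import math

def generalised_response(test_tone_hz, trained_tone_hz, peak=1.0, width=200.0):
    """Toy response strength to a test tone after conditioning to trained_tone_hz.

    ASSUMPTION: a Gaussian fall-off with distance. The lecture only states
    that the response gets weaker as the new tone gets further away.
    """
    distance = test_tone_hz - trained_tone_hz
    return peak * math.exp(-(distance ** 2) / (2 * width ** 2))

trained = 1000.0  # hypothetical conditioned tone (Hz)
for tone in (1000.0, 1100.0, 1300.0, 1600.0):
    strength = generalised_response(tone, trained)
    print(f"{tone:6.0f} Hz -> relative response {strength:.2f}")
```

The printed values fall as the test tone moves away from the trained tone, which is the qualitative pattern the slide describes.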


Generalisation occurs with both physically and semantically similar stimuli (Razran, 1939)

Paired words with lemon juice squirts

Style, urn, freeze, surf = Salivate

Generalised to fashion, vase, chill and wave

Did not generalise to homophones or orthographically similar words (e.g. Serf, stile, etc)

These associations must be encoded deeply (i.e. Semantic rather than surface forms)


The conditioned response is not truly lost during extinction, but is inhibited.

Eye-blink reflex studies (i.e. tone + puff of air) in rabbits have shown that conditioning and extinction involve different sets of neurons

Neurons involved in conditioning excite neurons that control eye-blinks.

Neurons involved in extinction inhibit neurons that control eye-blinks


Generalisation can be abolished if the response to one stimulus is reinforced while the response to the other is extinguished

E.g. Conditioning to black square is generalised to grey square.

Grey square extinguished (no pairings)

Eventually conditioned the dog to discriminate a black square from a grey that was (almost) imperceptibly different from the black.

Allows investigations of sensory capacities


John B. Watson and Little Albert

first psychologist to explain human behavior in terms of Pavlovian conditioning

Fear is seen not as a feeling but as observable behavior: catching breath, stiffening the body, turning away

This emotion can be thought of as a reflex and is therefore amenable to scientific investigation through CC

UCS (loud noise) elicits UCR (fear)

CS (rat) paired with UCS (loud noise) elicits UCR (fear)

CS (rat) elicits CR (fear)

Stimulus similar to rat (such as rabbit) elicits conditioned fear (generalization)

We don’t just react to stimuli, we often behave in ways that produce certain changes

Actions that result in a particular goal are known as operant responses

[Figure: Thorndike's puzzle box. Situation: stimuli inside of puzzle box. Responses: scratch at bars, push at ceiling, dig at floor, howl, etc., press lever. After many trials in box: press lever.]

E.L. Thorndike (1898)

Deprived cats of food

Placed cats in puzzle box

On first trial cat engaged in many different behaviors until accidentally opening box

After 20 to 30 trials the cat could open the box almost immediately after entering it.

Learning = Trial and Error


Responses that produce satisfying effects in particular situations become more likely to occur again in that situation, and responses that produce discomforting effects become less likely to occur again

Pavlovian conditioning: Animal = passive agent

Thorndike: Animal = active agent that emits behaviour of its own accord
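The trial-and-error account above can be sketched as a small simulation. This is a rough illustration, not Thorndike's procedure: the response names are taken from the puzzle-box figure, but the function `run_puzzle_box`, the equal starting weights and the `boost` value are hypothetical assumptions. Only the strengthening half of the law of effect is modelled (the response that opens the box becomes more likely); weakening by discomforting effects is left out.

```python
import random

def run_puzzle_box(trials=30, seed=0, boost=1.0):
    """Return how many responses the simulated cat emitted on each trial.

    ASSUMPTION: responses are drawn in proportion to a weight, and only the
    response that opens the box ("press lever") has its weight increased.
    """
    rng = random.Random(seed)
    weights = {
        "scratch at bars": 1.0,
        "push at ceiling": 1.0,
        "dig at floor": 1.0,
        "howl": 1.0,
        "press lever": 1.0,
    }
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            names = list(weights)
            choice = rng.choices(names, weights=[weights[n] for n in names])[0]
            if choice == "press lever":
                # Satisfying effect (box opens): strengthen this response.
                weights[choice] += boost
                break
        attempts_per_trial.append(attempts)
    return attempts_per_trial

history = run_puzzle_box()
print("responses needed, first 5 trials:", history[:5])
print("responses needed, last 5 trials: ", history[-5:])
```

Averaged over many simulated cats, the number of responses needed to escape drops across trials, which mirrors the slide's point that the animal actively emits behaviour and the successful response is gradually selected.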


Consequences

positive and negative reinforcement

positive and negative punishment


Skinner did not like the term “satisfying”

Used the term “reinforcer” for any event that follows a behavior AND strengthens the behavior

Invented better apparatus: the Skinner box

Animals could be kept in the box for the whole duration of the experimental session whilst multiple conditioning trials could take place


Punished behavior is not forgotten, it is suppressed; the behavior returns when punishment is no longer present

Causes increased aggression: shows that aggression is a way to cope with problems

Creates fear that can generalize to undesirable behaviors, e.g., fear of school

Does not necessarily guide toward desired behavior; reinforcement tells you what to do, punishment only tells you what not to do

Punishment teaches how to avoid punishment


Allows complex behaviours to be conditioned

The process reinforces responses that come gradually closer to the desired one until the final response is achieved


E.g. The grading system in musical instrument learning

Reinforcement stays consistent (certificate etc.) while the required behaviour becomes more complex


Continuous: 1 to 1 ratio, a prize every time

Ratio:

fixed: 1 to ?, a prize every ? time

variable: ? to ?, maybe a prize, maybe not!

Interval:

fixed: announced examination

variable: pop quiz
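The continuous and ratio schedules listed above can be sketched as follows. This is a minimal illustration under assumptions: the function names, the ratio of 5 and the seed are hypothetical (the slide leaves the ratio values as "?"), and the variable-ratio rule used here (each response pays off with probability 1/mean_ratio) is one simple way to get "maybe a prize, maybe not". Interval schedules, which depend on elapsed time rather than response count, are not modelled.

```python
import random

def continuous(n_responses):
    """Continuous schedule: every response is reinforced (1 to 1)."""
    return [True] * n_responses

def fixed_ratio(n_responses, ratio):
    """Fixed ratio: every `ratio`-th response is reinforced."""
    return [(i % ratio) == 0 for i in range(1, n_responses + 1)]

def variable_ratio(n_responses, mean_ratio, seed=0):
    """Variable ratio (ASSUMED rule): each response is reinforced with
    probability 1/mean_ratio, so responses per prize vary around mean_ratio."""
    rng = random.Random(seed)
    return [rng.random() < 1.0 / mean_ratio for _ in range(n_responses)]

def show(name, outcomes):
    # '*' marks a reinforced response, '.' an unreinforced one.
    print(f"{name:18s}", "".join("*" if r else "." for r in outcomes))

show("continuous", continuous(20))
show("fixed ratio 5", fixed_ratio(20, 5))
show("variable ratio 5", variable_ratio(20, 5))
```

On the fixed schedule the prizes fall at predictable positions, while on the variable schedule they do not, so the learner cannot tell from the pattern when the next prize is due.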


Continuous Reinforcement

reinforcing the desired response each time it occurs

learning occurs rapidly

extinction occurs rapidly

Partial Reinforcement

reinforcing a response only part of the time

learning occurs slowly

resistance to extinction


Ratio Schedules = higher response rates: the more you respond, the faster you reach the set threshold and receive the reinforcer

Variable Schedules = greater resistance to extinction: you are never sure when a reward will be presented, so you don’t know when is a good time to give up; it could always be “just about to pay out” (fruit machines)


Discriminative stimuli indicate when a reinforcer is potentially available

E.g. a lever press will only result in a food pellet when a red light is illuminated

Red light = discriminative stimulus

E.g. waiting for the “ready” green light to flash on a camera before taking a picture

E.g. waiting until people are in a good mood before asking them a favour


Primary Reinforcer

An innate reinforcer

Satisfies a biological need

Secondary Reinforcer

A conditioned reinforcer

An event that gains its reinforcing power through its association with a primary reinforcer


The most famous example was Azrin's token economy in a psychiatric hospital

Tokens could be exchanged for cosmetics, candy, cigarettes, clothing, bedside tables, use of the TV, stereo, & sleeping late

Tokens were obtained by attending work and therapy as well as for good grooming, appropriate meal time behaviors, and minor housekeeping chores

Result: a reduction in bizarre behaviors and an increase in normal behaviors and social skills


Disadvantages of Tokens

Tokens are not usually seen in classrooms, nor are food snacks or other unusual back-up reinforcers

Tokens, except for money and grades, are unavailable in the normal environment

In some environments, people will use unauthorized means such as force/theft to obtain tokens


Tokens are secondary reinforcers

Reinforcing value related to previous learning experiences

E.g. Work > Money > Buy Food

Chimpanzees will work for tokens and save them for later use when the grape vending machine is removed from the enclosure

Clicker training


Advantages of Tokens

Potent reinforcers: big changes in behavior

Bridge the delays between target responses and back-up reinforcers

Backed up by a variety of items and hence are less subject to satiation.

Administration does little to disrupt ongoing behavior

Can be used with many individuals all with different back up reinforcer preferences

Can be accumulated towards valuable goals


CLASSICAL (Pavlov)

Stimulus precedes the response and elicits it

Elicited responses

Learning as a result of association

OPERANT (Skinner)

Stimulus follows the response and strengthens it

Emitted responses

Learning as a result of consequences


In groups

Think of some real-world (human or non-human animal) examples
