
LECTURE: 10

LEARNING (contd.)

Operant Conditioning
Reinforcement: positive and negative
Shaping Behavior
Other Types of Learning

OPERANT CONDITIONING

Definition: Learning in which the consequences of behavior lead to changes
in the probability of its occurrence.
The term is derived from the word "operate": when our behavior operates on
the outside world, it produces consequences for us, and those consequences
determine whether we will continue to engage in that behavior.
Operant conditioning was described by the American psychologist Edward
Thorndike (1911). Thorndike was interested in the question of animal
intelligence, which he investigated using an apparatus he called a puzzle
box. A hungry cat was placed in the box, food was placed outside, and the
cat's efforts to escape were observed. With each trial the cat became more
efficient at opening the door of the box.
Based on these observations, Thorndike formulated the law of effect,
which states that the consequences of a response determine whether the
response will be performed in the future.

POSITIVE AND NEGATIVE
REINFORCEMENT

Positive and negative reinforcement are two
ways in which the desirable and undesirable
consequences of our behavior influence our future
behavior.

POSITIVE REINFORCEMENT

Definition: Any consequence of behavior that leads to an increase in the
probability of its occurrence.
Many of our actions cause something to happen, and the consequences of
our actions often influence our future actions. Positive reinforcement has
occurred when a consequence of behavior leads to an increase in the
probability that we will engage in that behavior in the future.
In positive reinforcement the consequences of the behavior are positive, so
the behavior is engaged in more frequently.
Case Study.
Positive reinforcement does not occur only when it is intentionally arranged;
the natural consequences of our behavior can be reinforcing as well. For
example, we learn that some ways of interacting with our friends just
naturally lead to happier relationships, and that is positively reinforcing.

TWO IMPORTANT ISSUES IN THE
USE OF POSITIVE REINFORCEMENT

1. Timing: The positive reinforcer must be given within a
   short time following the response, or learning will
   progress very slowly, if at all. The greater the delay
   between the response and the reinforcer, the slower the
   learning. This phenomenon has been referred to as the
   principle of delay of reinforcement.

2. Consistency in the delivery of reinforcement: For
   learning to take place, the individual providing positive
   reinforcement must consistently give it after every (or
   nearly every) response. After some learning has taken
   place, it is not always necessary, or even desirable, to
   reinforce every response. Consistency of reinforcement
   is essential at the beginning of the learning process.

PRIMARY AND SECONDARY
REINFORCEMENT

Positive reinforcers are sometimes inborn and at other times
acquired through learning.
There are two types of positive reinforcers, primary and secondary.
Primary Reinforcers: These are innately reinforcing and do not have to be
acquired through learning. Food, water, warmth, physical activity, and
sexual gratification are all examples of primary reinforcers.
Secondary Reinforcers: These play an important part in operant
conditioning and are learned through classical conditioning. A neutral
stimulus can be turned into a secondary reinforcer by pairing it
repeatedly with a primary reinforcer, for example by praising a dog when
giving it food. After enough pairings of these two stimuli, the praise will
become a secondary reinforcer and will be effective in reinforcing the
dog's behavior.

SCHEDULES OF POSITIVE
REINFORCEMENT

In addition to continuous reinforcement, psychologists have
described four types of schedules of reinforcement and have
shown us the effects of each on behavior.

1. Fixed Ratio: On a fixed ratio schedule of reinforcement,
   the reinforcer is given only after a specified number of
   responses. If sewing machine operators were given a pay
   slip for every six dresses that were sewn, the schedule of
   reinforcement would be called a fixed ratio schedule.

2. Variable Ratio: On a variable ratio schedule of
   reinforcement, the reinforcer is obtained only after a
   varying number of responses have been made. These
   schedules produce very high rates of responding, and the
   learning is rather permanent. Gambling is an example.

CONTD.
3. Fixed Interval: Here, the schedule of reinforcement is not based
   on the number of responses but on the passage of time. The term
   fixed interval is used when the first response that occurs after a
   predetermined period of time is reinforced. This produces a pattern
   of behavior in which very few responses are made until the fixed
   interval of time approaches, and then the rate of responding
   increases rapidly. For example, politicians start visiting their home
   constituencies as elections approach. Since the visits are reinforced
   by votes, the rate of visits rises dramatically before an election.

4. Variable Interval: A schedule of reinforcement in which the
   first response made after a variable amount of time is reinforced.
   Like the variable ratio schedule, the variable interval schedule
   produces high rates of steady responding. Although it is not a
   good schedule for initial learning, it produces highly stable
   performance when the response has already been partially learned
   through continuous reinforcement, e.g. fishing.
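
The four schedules can be written out as simple rules for deciding whether a given response earns a reinforcer. The sketch below is only an illustration, not part of the lecture: the function names, the use of seconds as the time unit, and the uniform random draws used for the "variable" schedules are all assumptions made for the example.

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response (e.g. a pay slip per six dresses)."""
    count = 0
    def on_response(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return on_response

def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses averaging mean_n.
    The uniform draw is an assumption made for this sketch."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)
    def on_response(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response

def fixed_interval(seconds):
    """Reinforce the first response made after `seconds` have elapsed."""
    available_at = seconds
    def on_response(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + seconds
            return True
        return False
    return on_response

def variable_interval(mean_seconds, rng):
    """Reinforce the first response made after a varying amount of time."""
    available_at = rng.uniform(0, 2 * mean_seconds)
    def on_response(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return on_response

def count_reinforcers(schedule, response_times):
    """Feed a sequence of response times to a schedule and count reinforcers."""
    return sum(1 for t in response_times if schedule(t))
```

For instance, `count_reinforcers(fixed_ratio(6), range(1, 61))` counts one reinforcer per six responses, while `count_reinforcers(fixed_interval(10), range(1, 61))` depends only on elapsed time: responding faster within an interval earns nothing extra, which is consistent with the low response rates early in each fixed interval.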

SHAPING

A strategy of positively reinforcing behaviors that are
successively more similar to desired behaviors.
In many situations, the response that one wants to reinforce
never occurs. What is needed in this case is to
reinforce responses that are progressively more similar to the
response that one finally wants to reinforce (the
target response).
In doing so, one gradually increases the probability of the
target response and can then reinforce it when it occurs.
This is called shaping, or the method of successive
approximations, because the target response is shaped out of
behaviors that successively approximate it.
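
The logic of successive approximations can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions, not a model from the lecture: behavior is reduced to a single number, the learner's responses vary randomly around its current habit, and a reinforced response pulls the habit part of the way toward itself. The names `train`, `habit`, `criterion`, `spread`, and `step` are hypothetical.

```python
import random

def train(target, start, trials, rng, shaping=True, spread=1.0, step=0.5):
    """Toy model: return the learner's final habit and the number of reinforcers.

    With shaping, any response closer to the target than the current
    criterion is reinforced, and the criterion is then tightened.
    Without shaping, only responses very near the target are reinforced.
    """
    habit = start
    # Start with a lenient criterion when shaping, a strict one otherwise.
    criterion = max(abs(target - start), spread) if shaping else spread
    reinforcers = 0
    for _ in range(trials):
        response = rng.gauss(habit, spread)   # responses vary around the habit
        error = abs(target - response)
        if error < criterion:
            reinforcers += 1
            habit += step * (response - habit)  # reinforced response becomes more typical
            if shaping:
                # Demand a closer approximation next time, but never
                # more precision than ordinary variability allows.
                criterion = max(error, spread)
    return habit, reinforcers
```

Calling `train(10.0, 0.0, 2000, random.Random(0))` lets the habit drift toward the target through a chain of reinforced approximations, whereas the same call with `shaping=False` waits for a target response that, starting so far away, practically never occurs.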

NEGATIVE REINFORCEMENT

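Negative reinforcement occurs when (1) a
behavior is followed by the removal or the
avoidance of a negative event and (2) the
probability that the behavior will occur in the
future increases as a result.
An example is a parent who smacks a child to stop him
from putting his fingers into an electrical socket:
the alarming behavior stops, so the parent's smacking
is negatively reinforced. (For the child, the smack is
punishment, not negative reinforcement.)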

TWO TYPES OF CONDITIONING ARE
BASED ON NEGATIVE REINFORCEMENT:

Escape Conditioning: Operant conditioning in
which the behavior is reinforced because it
causes a negative event to cease (a form of
negative reinforcement).
Avoidance Conditioning: Operant conditioning
in which the behavior is reinforced because it
prevents something negative from happening (a
form of negative reinforcement).

CONTRASTING CLASSICAL AND
OPERANT CONDITIONING

Classical and operant conditioning differ from each other in three primary ways:

1. Classical conditioning involves an association between two stimuli, such as a tone
   and food. In contrast, operant conditioning involves an association between a response
   and the resulting consequences, such as studying hard and getting an A.

2. Classical conditioning usually involves reflexive, involuntary behaviors that are
   controlled by the spinal cord or autonomic nervous system. These include fear
   responses, salivation, and other involuntary behaviors. Operant conditioning, on the
   other hand, usually involves more complicated voluntary behaviors, which are
   mediated by the somatic nervous system.

3. The most important difference concerns the way in which the stimulus that makes
   conditioning happen is presented (the unconditioned stimulus, or UCS, in
   classical conditioning, or the reinforcing stimulus in operant conditioning). In
   classical conditioning the UCS is paired with the conditioned stimulus (CS)
   independent of the individual's behavior. The individual does not have to do anything
   for either the CS or the UCS to be presented. In operant conditioning, however, the
   reinforcing consequence occurs only if the response being conditioned has just been
   emitted; that is, the reinforcing consequence is contingent on the occurrence of the
   response.

PUNISHMENT

Definition: A negative consequence of a behavior, which leads to
a decrease in the frequency of the behavior that produces it.
Dangers of Punishment: Five dangers inherent in punishment
are:

1. The use of punishment is often reinforcing to the punisher. For
   example, if a parent spanks a child who has been whining and
   the spanking stops the child from whining, the parent is
   reinforced for spanking through negative reinforcement. This
   unfortunately may mean that the frequency of spankings, and
   perhaps their intensity, increases, thereby increasing not only
   the amount of physical pain the child endures but also the
   danger of child abuse.

2. Punishment often has a generalized inhibiting effect on the
   individual. Repeatedly spanking a child for talking back to you
   may lead the child to quit talking to you altogether.

CONTD.
3. We commonly react to physical punishment by learning to dislike the
   person who inflicts the pain, and perhaps by reacting aggressively
   toward that person. Sometimes an individual takes out his or her
   resentment on someone else if it is not possible to react directly against
   the person who inflicted the pain. Thus punishment may solve one
   problem only to create a worse one, namely aggression.

4. What we think is punishment is not always effective in punishing the
   behavior. In particular, most teachers and parents think that criticism
   will punish the behavior at which it is aimed. However, in many
   settings, especially homes and classrooms filled with young children,
   criticism is often a positive reinforcer that increases the rate of
   whatever behavior the criticism follows. This has been called the
   criticism trap. For example, some teachers and parents see a behavior
   they do not like and criticize it to get rid of it. But children's actions are
   sometimes reinforced by the attention they receive when criticized. In
   this way criticism reinforces rather than punishes the behavior, and the
   criticized behavior increases in frequency.

CONTD.
5. Even when punishment is effective in
   suppressing an inappropriate behavior, it does
   not teach the individual how to act more
   appropriately. Punishment used by itself may
   be self-defeating: it may suppress one
   inappropriate behavior only for it to be replaced by
   another. It is not until appropriate
   behaviors are taught to the individual to
   replace the inappropriate ones that any
   progress can be made.

GUIDELINES FOR THE USE OF
PUNISHMENT

1. Do not use physical punishment. Taking away TV
   time from a 10-year-old or giving a 4-year-old a
   time-out in a chair in the corner for 3 minutes is
   more effective than spanking, and certainly more
   humane. Indeed, physical punishment can backfire
   and cause children to behave worse rather than
   better.

2. Punish the inappropriate behavior immediately.
   Immediately using a firm voice to tell a child that
   it is bad to let go of your hand when walking on the
   sidewalk might punish that behavior, but waiting
   5 minutes to do so will be much less effective.

CONTD.
3. Make sure that you positively reinforce
   appropriate behavior to take the place of the
   inappropriate behavior you are trying to
   eliminate. Punishment is not effective in the long
   run unless you are also reinforcing appropriate
   behaviors.

4. Make it clear to the individual what behavior you
   are punishing and remove all threat of
   punishment as soon as that behavior stops. Do
   not punish people; punish specific behaviors. And
   stop punishing when the inappropriate behavior
   ceases.

CONTD.
5. Do not mix punishment with rewards for the same
   behavior. For example, do not punish a child for
   fighting and then apologetically hug and kiss the
   child you have just punished. Mixtures of this sort
   are confusing and lead to inefficient learning.

6. Once you have begun to punish, do not back down.
   In other words, do not reinforce begging, pleading,
   or other inappropriate behavior by letting the
   individual out of the punishment. Doing so both nullifies
   the punishment and reinforces the begging and
   pleading through negative reinforcement.

MODELING: LEARNING BY
WATCHING OTHERS

Definition: Learning based on observation of the behavior of another.
One of the most important contributions of Stanford University
psychologist Albert Bandura has been to emphasize that people learn not
only through classical and operant conditioning but also by observing
the behaviors of others. Bandura calls this modeling. For example, in
countries where grasshoppers are considered a delicacy, people learn
to eat them partly by watching other people enjoy themselves while
eating grasshoppers.
In Bandura's view a great deal of cognitive learning takes place
through watching, before there is any chance for the behavior to
occur and be reinforced. But we can learn more than skills
through modeling. Bandura has suggested that modeling can also
remind us of appropriate behavior in a given situation and reduce our
inhibitions concerning certain behaviors that we see others engaging
in.

BIOLOGICAL FACTORS IN
LEARNING

Although learning is a powerful force that shapes our
lives, our biological characteristics place limits on it,
making us better prepared to learn some things
than others. The ability of humans to learn
from experience is therefore not limitless; it is influenced
in a number of ways by biological factors.
For example, it appears that people are biologically
prepared to learn some kinds of fears more than others.
It is far easier to classically condition a fear of things
that have some intrinsic association with
danger (snakes, heights, blood, etc.) using electric shock as
the UCS than it is to condition a fear of truly neutral
things such as a lunch box or keys.
