
Learning Theory 2: Trial and Error Learning

Trial and error learning involves trying alternative possibilities until the desired outcome is achieved. An organism continues to explore its environment until it discovers a response that allows it to reach its goal.

Edward Thorndike
Thorndike created a puzzle box and placed a hungry cat in it. The cat could see out and even stick its paws out between the wooden slats. The only way for the cat to escape was through a door, which could be opened by pressing a lever inside the box. Thorndike placed a piece of fish outside the box, just out of reach. If the cat wanted to eat the fish, it had to work out how to get out of the box through the door.

Thorndike observed the cat's behaviour and recorded how long it took the cat to press the lever and exit the box. When the cat successfully hit the lever (accidentally at first), it could escape and eat the fish (a pleasing consequence). The cat was placed in the box again and the process was repeated over a number of trials until the cat pressed the lever as soon as it was placed in the box. Through trial and error, the cat learned how to free itself from the box and obtain the fish. Thorndike explained this result with the law of effect.

Law of effect: a behaviour that is accompanied or closely followed by a satisfying consequence is more likely to recur (strengthened), and a behaviour that is accompanied or closely followed by an annoying consequence or discomfort is less likely to recur (weakened).

YouTube link: Thorndike's puzzle box http://www.youtube.com/watch?v=BDujDOLre-8
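The strengthening half of the law of effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Thorndike's procedure: the simulated cat picks among a handful of actions at random, and each escape multiplies the weight of the lever press. The function name simulate_puzzle_box, the number of actions, and the boost factor are all hypothetical choices made for illustration, and the weakening of behaviour by annoying consequences is not modelled.

```python
import random


def simulate_puzzle_box(trials=20, n_actions=5, lever=0, boost=2.0, seed=0):
    """Return how many attempts the simulated cat needs on each trial.

    All parameters are illustrative assumptions, not values from Thorndike.
    """
    rng = random.Random(seed)
    weights = [1.0] * n_actions  # every action starts equally likely
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            # Pick an action at random, in proportion to its current weight.
            action = rng.choices(range(n_actions), weights=weights)[0]
            if action == lever:
                # Satisfying consequence (escape and fish): strengthen the response.
                weights[lever] *= boost
                break
        attempts_per_trial.append(attempts)
    return attempts_per_trial


if __name__ == "__main__":
    print(simulate_puzzle_box())
```

Because the lever press gains weight after every success, the number of attempts per trial tends to shrink towards one over repeated trials, which mirrors the cat pressing the lever as soon as it is placed in the box.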

Example of Trial and Error Learning: Matchstick puzzles http://www.redheads.com.au/games.php

Learning Theory 3: Operant Conditioning


Operant conditioning: a learning process in which the likelihood of a behaviour being repeated is determined by the consequences of that behaviour.

Operant: a response (or set of responses) that occurs and acts (operates) on the environment to produce some kind of effect.

Operant conditioning is based on Thorndike's law of effect. An organism will tend to repeat a behaviour (operant) that has desirable consequences, or that enables it to avoid undesirable consequences. Organisms will tend not to repeat a behaviour that has undesirable consequences.

Operant conditioning is also known as instrumental learning.

Instrumental learning: the process through which an organism learns the association between behaviours and their consequences.

B. F. Skinner
B. F. Skinner was an American behavioural psychologist. Skinner's work on operant conditioning was pivotal in shaping what we know about such techniques today. Using reinforcement, he taught pigeons to dance, play ping-pong and bowl a ball in a mini bowling alley.

YouTube link: Pigeon ping-pong http://www.youtube.com/watch?v=vGazyH6fQQ4

Skinner created an apparatus for studying operant conditioning. This conditioning chamber is known as the Skinner box. Skinner conducted many experiments with the Skinner box, most notably with rats. Hungry rats were placed in the box and learnt to press a lever to receive food.

YouTube link: Skinner box http://www.youtube.com/watch?v=PQtDTdDr8vs

Three-phase model of Operant Conditioning


1. The stimulus (S) that precedes an operant response

2. The operant response (R) to the stimulus

3. The consequence (C) of the operant response

Stimulus (S) --> Response (R) --> Consequence (C)

Elements of Operant Conditioning


Reinforcement: occurs when a stimulus strengthens the likelihood of the response that it follows.

Reinforcer: the stimulus that does the strengthening; also known as a reward.

Schedule of reinforcement: a program for giving reinforcement, specifically the frequency and manner in which a desired response is reinforced.

Schedules of reinforcement:
Continuous
Partial: Fixed-Ratio, Fixed-Interval, Variable-Ratio, Variable-Interval

Continuous reinforcement: the reinforcer is provided immediately after every correct response.

Partial reinforcement: the process of reinforcing some correct responses, but not all of them. There are four basic schedules of partial reinforcement.

Ratio = number
Interval = time
Fixed = set
Variable = unpredictable

Fixed-ratio schedule: the reinforcer is given after a set, unvarying number of desired responses have been made.

Variable-ratio schedule: the reinforcer is given after an unpredictable number of correct responses.

Fixed-interval schedule: the reinforcer is given after a specific period of time has elapsed since the previous reinforcer, provided the correct response has been made.

Variable-interval schedule: the reinforcer is given after irregular periods of time have passed, provided the correct response has been made.

A variable-ratio schedule of reinforcement is the most resistant to extinction and is employed by gambling businesses (poker machines).

YouTube link: Schedules of reinforcement http://www.youtube.com/watch?v=I_ctJqjlrHA
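The four partial schedules can be sketched as small Python functions that decide, for each correct response, whether a reinforcer is delivered. This is an illustrative sketch only: the function names are hypothetical, the variable-ratio schedule is approximated by reinforcing each response with probability 1/mean_n, and the variable-interval schedule draws each waiting time uniformly between 0 and twice the mean. Each schedule returns a function that takes the time of a correct response (in any consistent unit) and returns True if that response is reinforced.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response."""
    count = 0

    def respond(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (mean_n on average).

    Assumption: each response is reinforced with probability 1/mean_n.
    """
    def respond(now):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(period):
    """Reinforce the first response made once `period` has elapsed
    since the previous reinforcer."""
    last = 0.0

    def respond(now):
        nonlocal last
        if now - last >= period:
            last = now
            return True
        return False

    return respond


def variable_interval(mean_period, rng):
    """Reinforce the first response after an irregular waiting time.

    Assumption: waits are drawn uniformly from 0 to 2 * mean_period.
    """
    last = 0.0
    wait = rng.uniform(0, 2 * mean_period)

    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return respond


if __name__ == "__main__":
    fr = fixed_ratio(10)
    print(sum(fr(t) for t in range(1, 31)))  # 3 reinforcers for 30 responses

    fi = fixed_interval(120)
    print([fi(t) for t in (60, 119, 120, 130, 240)])
    # [False, False, True, False, True]
```

Note that on the interval schedules extra responses before the time has elapsed earn nothing, whereas on the ratio schedules every response counts towards the next reinforcer.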

Types of reinforcement

Positive reinforcement: a stimulus that strengthens or increases the frequency or likelihood of a desired response by providing a satisfying consequence (reward).

Negative reinforcement: the removal or avoidance of an unpleasant stimulus, which strengthens a response by increasing the likelihood that it will be repeated.

Punishment: the delivery of an unpleasant consequence following a response, or the removal of a pleasant consequence following a response. There are two types of punishment: positive punishment and negative punishment (response cost).

Positive punishment: the presentation or introduction of a stimulus that decreases the likelihood of a response occurring again (giving something).

Negative punishment: the removal of a stimulus, thereby decreasing the likelihood of a response occurring again (taking something away).

Response cost: the removal of any valued stimulus, whether or not that stimulus causes the behaviour.

Giving something: positive reinforcer, positive punishment
Taking away: negative reinforcer, negative punishment (response cost)

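Consequences of a response, by type of event (pleasant or unpleasant) and whether the event is presented or removed after the response:

Pleasant event presented: Positive reinforcement. Positive event follows response. Increases desirable behaviour.

Unpleasant event presented: Punishment. Discomfort follows response. Decreases undesirable behaviour.

Pleasant event removed: Punishment (response cost). Positive state removed after response. Decreases undesirable behaviour.

Unpleasant event removed: Negative reinforcement. Discomfort removed by response. Increases desirable behaviour.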
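The four combinations of a pleasant or unpleasant event being presented or removed can also be written as a small lookup. The sketch below is illustrative only; the function names classify_consequence and effect_on_behaviour are hypothetical, and the labels follow the definitions given in these notes.

```python
def classify_consequence(event, change):
    """Name the element of operant conditioning.

    event:  'pleasant' or 'unpleasant'
    change: 'presented' or 'removed' (what happens after the response)
    """
    table = {
        ("pleasant", "presented"): "positive reinforcement",
        ("unpleasant", "removed"): "negative reinforcement",
        ("unpleasant", "presented"): "positive punishment",
        ("pleasant", "removed"): "negative punishment (response cost)",
    }
    return table[(event, change)]


def effect_on_behaviour(label):
    """Reinforcement increases a behaviour; punishment decreases it."""
    return "increases" if "reinforcement" in label else "decreases"


if __name__ == "__main__":
    # Skinner box: a hungry rat presses the lever and food is presented.
    label = classify_consequence("pleasant", "presented")
    print(label, "->", effect_on_behaviour(label))
    # positive reinforcement -> increases
```

To classify a scenario, decide first whether the event is pleasant or unpleasant for the organism, then whether it is given or taken away after the response.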

The Simpsons: Duffless


Learning
1. In Lisa's first experiment, how did the hamster demonstrate insight learning?

2. In Lisa's second experiment, could the hamster's behaviour be considered operant conditioning? Why/why not?

3a) Find an example of classical conditioning in the episode. What was it?

b) What was the:
UCS:
UCR:
NS:
CS:
CR:

Research Methods
Suppose Marge was conducting an experiment on Homer when he chose not to drink Duff. Answer the following questions.

1. What experimental design did Marge use?

2. The IV in her study was ______________________________________________. The DV in her study was _________________________________________.

3. Write a hypothesis, because Marge can't: she doesn't have enough fingers.

4. What are possible extraneous variables?

Operant Conditioning Worksheet 1


Tick the appropriate column to indicate which schedule of reinforcement is being used in each situation.

Columns: Fixed Interval, Fixed Ratio, Variable Interval, Variable Ratio

Louis is given 10 per cent commission on every computer he sells at work.
A fisherman is catching fish off a pier.
A surfer is riding the waves at Bells Beach.
A student is given $100 by his parents if he achieves A's on his report card after every semester at school.
Bob is playing the poker machines at the casino.
Jennifer receives free gifts after every $1000 she spends on her credit card.
Liz presses the button for the pedestrian lights at the intersection.
A rat is reinforced after every 10th time it presses a lever.
A rat is reinforced after, on average, every 10th time it presses a lever.
A person checks to see whether a load of washing in the machine is complete.
A salesperson receives a bonus for every four perfume bottles they sell.
A pigeon is reinforced for its first peck after a light comes on every two minutes.
Claire repeatedly dials a busy telephone line.
Gustav checks to see whether the chicken he is roasting is cooked.
Tim hands in a weekly report for the school newspaper.

Operant Conditioning Worksheet 2


Tick the appropriate column to indicate which element of operant conditioning is being used in each situation.

Columns: Positive Reinforcement, Negative Reinforcement, Punishment, Response Cost

A rat quickly learns to press a bar to stop an electric shock being administered through the floor of its cage.
Nadia receives pocket money for helping around the house.
Sarah crashes her parents' car into the garage after being told not to drive the car, so she is grounded for a month.
Peter is fined $200 for speeding.
Sam is talking in class and not doing any work, so the teacher walks to his desk and stands behind him. Sam stops talking and starts to work.
Georgia receives an A+ on her Psychology SAC after studying hard.
Sophie's parents chastise her when she eats with her fingers at the dining table.
Fido's owner puts a collar on him that releases an unpleasant sound that only dogs can hear, every time Fido barks.
I have a headache, so I take an aspirin and the headache goes away.
Brian does his homework to stop his parents nagging him.
The judge tells Mr Axe that he is to spend 30 years in jail for the murder he committed.
Ari earns a bonus at work for selling more computers than every other salesperson.
Sonia rubs some After sun moisturiser on her sunburn and it stops itching. The next time she is sunburned, she applies the After sun.
Eric fails to return home from a party until after his curfew, so his parents take away his mobile phone and car for a week.
Mrs Garcia gives all of her students free time on the computers when they complete their work.
Marvin is sent a bill for his mobile phone, but he forgets to pay it, so the company sends him another bill, including a late fee.

