
Week 3: User Testing I (Preparation)
Sandra Uitdenbogerd
Topics
•Ethics in Research
–Conducting a test

•Recruiting the ‘Right’ Users


•Screeners
•Writing Questions
•Writing tasks and task scenarios

RMIT University©2014 CS&IT 2


Research ethics
Electric Shock Experiment
Experimenter – gave the participant a low
voltage electric shock to demonstrate.
Participant told that they couldn’t stop. Also
told they wouldn’t be responsible.

1. Please continue
2. The experiment requires that you continue
3. It is absolutely essential that you continue
4. You have no other choice, you must go on.

‘Teacher’ (participant) – told to apply electric
shock to the ‘Learner’ every time they got an
answer wrong. Learner said they had a heart
condition. Told to keep increasing the voltage.

‘Learner’ (actor) – called out and sounded
increasingly like they were in pain. Banged on
the wall and said pain in chest. Stopped all
sound after 135 volts.

From Wikipedia (Milgram, Yale University, 1963)



AOL Query Logs
•20 million search queries (3 months)
•650,000 AOL customers (US)
•Customers’ names replaced by a number
–Preserve anonymity
•Example queries:
–2178 - “foods to avoid when breast feeding”
–3482401 - “calorie counting”
–3483689 - “Time after time”
–3505202 - “depression and medical leave”
–7268042 - “fear that spouse contemplating cheating”

Adapted from: Dr. Bradley Malin Lecture Notes for Vanderbilt School of Medicine. "Data Privacy in Biomedicine" presented 17 January, 2008



Anonymity?
•Barbaro, M., & Zeller Jr, T. (2006, August
9). A Face Is Exposed for AOL Searcher
No. 4417749. The New York Times.
–http://www.nytimes.com/2006/08/09/technology/09aol.html



Ethics in user tests?
•Pressures on a user
–Performance anxiety
–Feels like an intelligence test
–Comparing self with other subjects
–Feeling stupid in front of observers
–Competing with other subjects



What to do?
•Does your experiment involve human
subjects (directly or indirectly)?
–Research must be approved by ethics board

•RMIT
–CHEAN – College Human Ethics Advisory
Network



Government Regulation
•National Statement on Ethical Conduct in
Human Research
–Must be used for research by institutions
funded by the government
–Sets national standards for everyone else
–Defines “human research” as research
conducted with or about people, or their data or
tissue. It includes taking part in surveys,
interviews and focus groups, and being
observed by researchers.



Values
•Research merit and integrity (justifiable)
•Justice (fair)
•Beneficence (more benefit than risk)
•Respect (for human beings, privacy,
cultural sensitivities)



Boards will ask
• Are participants identifiable or re-identifiable?
• Is some form of deception involved?
• Are participants aged less than 18 years?
• Are participants cognitively or emotionally
impaired?
• Do participants belong to a cultural/minority
group?
–Do participants consider themselves to be
Aboriginal or Torres Strait Islander people?



Boards will ask
• Does the procedure used in the research
involve any experimental manipulation or
include the presentation of any stimulus other
than question-asking?
• Do the questions asked include personally
sensitive and/or culturally sensitive issues?
• Is there a power-dependency relationship
between researcher(s) and participant(s) e.g.
the doctor/patient or teacher/student
relationship?



Boards will ask
• Selection of tasks and participants
• Time and location of test
• Use of participants’ personal information
• Presentation of results
• Data: will
–it be stored in a secure location?
–it be stored for 5 years after publication of research
findings?
–only the researchers have access to it?



Treat users with respect
• Time
–Don’t waste it
• Comfort
–Make the user comfortable
• Informed consent
–Inform the user as fully as possible
• Privacy
–Preserve the user’s privacy
• Control
–The user can stop at any time



Research ethics – conducting a test
Before a Test
• Time
– Pilot-test all materials and tasks
• Comfort
– “We’re testing the system; we’re not testing you.”
– “Any difficulties you encounter are the system’s fault. We need your help to
find these problems.”
• Privacy
– “Your test results will be completely confidential.”
• Information
– Brief about purpose of study
– Inform about audio taping, videotaping, other observers
– Answer any questions beforehand (unless biasing)
• Control
– “You can stop at any time.”



During the Test
• Time
– Eliminate unnecessary tasks
• Comfort
– Calm, relaxed atmosphere
– Take breaks in long session
– Never act disappointed
– Give tasks one at a time
– First task should be easy, for an early success experience
• Privacy
– User’s boss shouldn’t be watching
• Information
– Answer questions (again, where they won’t bias)
• Control
– User can give up a task and go on to the next
– User can quit entirely



After the Test
•Comfort
–Say what they’ve helped you do
•Information
–Answer questions that you had to defer to avoid
biasing the experiment
•Privacy
–Don’t publish user-identifying information
–Don’t show video or audio without user’s
permission



What would you do?
•You are testing a Web 2.0 site which
requires users to sign up using an email
address.
•Your housemate was reluctant to be a
participant for your assignment. She
agreed, but now it’s been 15 minutes and
she’s received a phone call from her
boyfriend. She loses interest in your study
and she says she wants to stop.



What would you do?
• You tested several friends using your Web 2.0 site.
When catching up with one of your friends they ask
to know who did the worst on the tests.
• You accidentally left the video camera on at the
end of the session, and recorded co-workers who
observed the session talking about the participant.
• For your assignment you’ve chosen a Web 2.0 site
that competes with a product your
company is producing. Can you show the results to
your company?



What would you do?
•Last year you took UE as a student and
conducted usability testing on your friend.
This year you are lecturing UE and you
want to show your students a clip from the
test.



Recruiting the “right” users
Recruiting users
•Determine
–defining characteristics
–number of users

•Consider how to recruit users



Determine characteristics
•Specify users with respect to
demographics, web use, general computer
use, domain, specific application; e.g.
–New/returning users
–Novice/advanced
–Age & gender
–Specializations
–Online habits and experience
–Interests or activities



Who would you recruit?
•Myki ticket machines
•Government information website for
Centrelink payments
•Victorian State Library website
•An interactive Lego building web
application



How many users?
•Landauer-Nielsen model
–Every tested user finds a fraction L of usability
problems
– Average L=31%, standard deviation 12%
–If user tests are independent, then n users will
find a fraction
–Percent of usability problems found = $1 - (1 - L)^n$
–So on average 5 users will find 84% of the problems
–Note that this means that half of the time, ≤84% of
problems will be found



Landauer-Nielsen Model
L 0.31

n % probs found
1 31.0%
2 52.4%
3 67.1%
4 77.3%
5 84.4%
6 89.2%
7 92.6%
8 94.9%
9 96.5%
10 97.6%
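
The table above can be reproduced directly from the model. The following is a minimal sketch, assuming independent users and a single average L; the function names are illustrative and not part of the lecture material.

```python
def fraction_found(n_users: int, L: float = 0.31) -> float:
    """Expected fraction of usability problems found by n independent users.

    Implements the Landauer-Nielsen formula 1 - (1 - L)**n.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= L <= 1.0:
        raise ValueError("L must be between 0 and 1")
    return 1.0 - (1.0 - L) ** n_users


def problems_found_table(max_users: int = 10, L: float = 0.31):
    """Return (n, fraction found) pairs for n = 1..max_users."""
    return [(n, fraction_found(n, L)) for n in range(1, max_users + 1)]


if __name__ == "__main__":
    # Print the same two columns as the slide: n and % of problems found.
    for n, frac in problems_found_table():
        print(f"{n:2d}  {frac:6.1%}")
```

Changing the `L` argument shows how quickly the "5 users" figure shifts when each user finds a smaller share of the problems.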



Always Consider Spread!

[Figure: standard deviation diagram. Credit: Mwtoews, own work, based (in concept) on figure by Jeremy Kemp, 2005-02-09. File:Standard deviation diagram.svg, created 7 April 2007. CC BY 2.5]



Issues with the 5 users figure
• L may be much smaller than 31%
–Spool & Schroeder (2001) study of a CD-purchasing
web site found L=8%, so 5 users only
find about 34% of problems (L=8% is within 2 standard
deviations of L=31%)
• L may vary from problem to problem
–Different problems have different probabilities of
being found, caused by
–Individual differences
–Interface diversity
–Task complexity
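
To see what a smaller L means in practice, the model can be inverted to ask how many users are needed to reach a target coverage. This is a rough sketch under the same independence assumption; `users_needed` is a hypothetical helper name, and it treats L as one fixed average even though the slide notes that L varies from problem to problem.

```python
import math


def users_needed(target: float, L: float) -> int:
    """Smallest n such that 1 - (1 - L)**n >= target."""
    if not 0.0 < L < 1.0:
        raise ValueError("L must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    # Solve (1 - L)**n <= 1 - target for n.
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - L))
    # Guard against floating-point rounding at the boundary.
    while 1.0 - (1.0 - L) ** n < target:
        n += 1
    return max(n, 1)


if __name__ == "__main__":
    for L in (0.31, 0.08):
        print(f"L={L:.2f}: {users_needed(0.84, L)} users for 84% coverage")
```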



How to Predict the No. of Users Needed

•After each extra user, estimate L based on
the number of issues found.
•e.g. For 2 users:
–L2 = 2 – Found(2)/Found(1)
–Where Found(2) is the number of issues found
by 2 users
–And Found(1) is the average number of issues found
by 1 user
–This follows from the model: Found(2)/Found(1) = (1 − (1 − L)^2)/L = 2 − L
–Total issues N = Found(1)/L
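
The two-user estimate can be sketched in a few lines. The function names and the example counts below are made up for illustration; only the two formulas come from the slide.

```python
def estimate_L(found_1: float, found_2: float) -> float:
    """Estimate L after two users: L2 = 2 - Found(2)/Found(1).

    found_1: average number of issues found by a single user.
    found_2: number of distinct issues found by the two users together.
    """
    if found_1 <= 0:
        raise ValueError("found_1 must be positive")
    return 2.0 - found_2 / found_1


def estimate_total_issues(found_1: float, L: float) -> float:
    """Estimate the total number of issues: N = Found(1)/L."""
    if L <= 0:
        raise ValueError("L must be positive")
    return found_1 / L


if __name__ == "__main__":
    # Hypothetical counts: each user found 6.2 issues on average,
    # and together the two users found about 10.5 distinct issues.
    L2 = estimate_L(6.2, 10.478)
    print(f"Estimated L: {L2:.2f}")
    print(f"Estimated total issues: {estimate_total_issues(6.2, L2):.1f}")
```

With real data the estimate can fall outside the range 0 to 1 (for example when the second user finds nothing the first missed and then some are discounted), which is one sign that more users are needed before trusting it.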



Accuracy of L Estimates
•The estimates of L and the total number of
issues are very inaccurate until the number
of users is ~5 (s.d. ~11%)
•It may be best to start with L estimates from
similar applications with similar users, if
available.



The Politics of Magic No. 5
•Nielsen and others emphasise that 5 users
is all you need for testing. Why?
–“Discount usability” – encourage usability
testing at any budget size
–The most information is learnt from the first
user
–The cost benefit ratio for user testing was
highest at about 3 users for a medium-large
software project (based on 1993 figures)
–And…

Which is better?
•Using 5 users to find 84% problems with
each of three design iterations
•Using 15 users to find 99% of problems
with one design iteration



Finding participants
• In-house (DIY) recruiting
–Customer lists
–Advertisements
–Outside organizations
–Friends and family
–Colleagues
• Outsourcing: market research companies
–Average cost per user is about $100
–Consumer/student about $80
–High-paid professional about $160
–Cost could come down…



Finding participants
•New possibility – crowd sourcing
–Mechanical Turk
•Methodologies still being worked out
•Unlikely to be good for sites needing
experience
–Zuccon, G., Leelanupab, T., Whiting, S., Yilmaz, E.,
Jose, J., & Azzopardi, L. (2013). Crowdsourcing
interactions: using crowdsourcing for evaluating
interactive information retrieval systems. Information
Retrieval, 16(2), pp 267-305.



Participant incentives
• Internal participants
–Cash is not appropriate
–Gift certificates: cafeteria, cinema voucher
–Promotional gifts: T-shirt, mug, etc.
• Outside participants
–Products: something the company makes
–Cash: $30/hr to $120/hr (crowdsourcing much cheaper)
–Vouchers
–Charitable donations
–Keep the hardware
–Some professions cannot accept cash: repay travel
costs



Recruiting tips
•Avoid power users
–They skew results
•If recruiting within your company
–Don’t let users’ managers observe
–Be careful if you include new employees
•Follow up with scheduled participants
–Send a clear and detailed confirmation letter
–Make a confirmation call the day before
–Consider scheduling extra users or “floaters”



Create a screener
•Used by recruiters to select (screen) for the
right users
•Write questions that will address the
required user characteristics
–Include all required user criteria
–Number of participants needed in each category
–Clear inclusion and exclusion criteria
–Be as specific as possible
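
One way to picture a screener is as a small set of explicit rules plus a quota per user category. The sketch below is only an illustration: every criterion, threshold, category name and class name in it is hypothetical and would be replaced by the characteristics defined for the actual study.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    age: int
    hours_online_per_week: float
    works_in_web_design: bool


@dataclass
class Screener:
    # Hypothetical inclusion and exclusion thresholds.
    min_age: int = 18
    min_hours_online: float = 5.0
    advanced_hours: float = 20.0
    # Number of participants needed in each category.
    quotas: dict = field(default_factory=lambda: {"novice": 3, "advanced": 2})
    accepted: dict = field(default_factory=lambda: {"novice": [], "advanced": []})

    def screen(self, c: Candidate) -> str:
        """Apply exclusion criteria first, then check the category quota."""
        if c.age < self.min_age:
            return "excluded: below minimum age"
        if c.works_in_web_design:
            return "excluded: works in web design"
        if c.hours_online_per_week < self.min_hours_online:
            return "excluded: too little web use"
        category = (
            "advanced"
            if c.hours_online_per_week >= self.advanced_hours
            else "novice"
        )
        if len(self.accepted[category]) >= self.quotas[category]:
            return f"not needed: {category} quota is full"
        self.accepted[category].append(c.name)
        return f"accepted: {category}"


if __name__ == "__main__":
    screener = Screener()
    print(screener.screen(Candidate("P1", 34, 12.0, False)))
    print(screener.screen(Candidate("P2", 29, 40.0, True)))
```

Writing the rules this explicitly is a quick check that the inclusion and exclusion criteria are specific enough for a recruiter to apply without guessing.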



Screener questions
• Typically 20 questions
• Clear and specific, no jargon, exact dates,
quantities, times
• Questions should not lead
–Leading: “Are you bothered by the excessive lag times on the
Web?”
–Better: “Are there things on the Web that regularly bother you?
If so, what?”
• Every question should have a purpose
• Start with questions that screen out the most
people.



Questions
•Bad
–List your favourite Web sites
–Do you check online news every time you surf the Web?
–What do you love most about Wired News?
•Good
–Write a list of the Web addresses (URLs) of sites that you go to often, or that you really like. Write up to 10 Web addresses.
–How often do you check online news?
– Several times a day
– Once a day
–Which of these features in Wired News are important to you? (check all that apply)
– Number of stories on a given topic
– Number of different topics covered



Screening script
•Give reason for the session
–Feedback to help us improve the product
–Not a sales call

•Give length, date, location of session


•Offer incentives up front
•Explain video/audio taping, if it will occur
•Go through screening questions if person is
interested

Writing questionnaires
“Essentially, a questionnaire is a user interface in its own right, and
one should use usability engineering principles to ensure that the
respondents will interpret it correctly” Jakob Nielsen (Usability
Engineering, pg. 212). Thanks to Andrian Radic (2013 student)
Define the objective
• Establish the purpose of the questionnaire
–what information is sought?
–how would you analyze the results?
–what would you do with your analysis?

• Participant Burden: Do not ask questions whose answers
you will not use!

Objective: “to identify points of user dissatisfaction with the
interface and how these negatively affect the software’s
performance”



Styles of questions
•Open-ended questions
–Asks for unprompted opinions
–Good for general subjective information
–but difficult to analyse rigorously

“Can you suggest any improvements to
the interface?”



Open question: pros/cons
•Greater in-depth information
–Respondents can express themselves freely,
but if they have trouble doing so information is
lost

•Analysis is more difficult
–Respondent free to say anything
–Potential for bias in analysis



Styles of questions
•Closed questions
–Provide alternative specific answers
–Avoid hard to interpret responses
–Can be easily analysed

Do you use computers at work:
O often O sometimes O rarely
vs
In your typical work day, do you use computers:
O over 4 hrs a day
O between 2 and 4 hrs daily
O between 1 and 2 hrs daily
O less than 1 hr a day



Styles of questions
•Likert / Scalar
–ask user to judge a specific statement on a
numeric scale
–scale usually corresponds with agreement or
disagreement with a statement

Characters on the computer screen are:
hard to read easy to read
1 2 3 4 5
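
Responses on a scale like this are straightforward to summarise. The sketch below is one possible approach, assuming a 1 to 5 scale and reporting the median and the count per scale point, which are common summaries for ordinal data; the function name and the sample responses are invented for illustration.

```python
from collections import Counter
from statistics import median


def summarise_likert(responses, low: int = 1, high: int = 5) -> dict:
    """Summarise responses to one scalar question.

    Returns the number of responses, the median, and a count for
    every point on the scale (including points nobody chose).
    """
    if not responses:
        raise ValueError("no responses to summarise")
    if any(r < low or r > high for r in responses):
        raise ValueError("response outside the scale")
    counts = Counter(responses)
    return {
        "n": len(responses),
        "median": median(responses),
        "counts": {point: counts.get(point, 0) for point in range(low, high + 1)},
    }


if __name__ == "__main__":
    # Made-up answers to "Characters on the computer screen are:"
    # where 1 = hard to read and 5 = easy to read.
    print(summarise_likert([1, 2, 4, 4, 5]))
```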



Styles of questions
•Multi-choice
–respondent offered a choice of explicit
responses
How do you most often get help with
the system? (tick one)
O on-line manual
O paper manual
O ask a colleague

Which types of software have you
used? (tick all that apply)
O word processor
O data base
O spreadsheet
O compiler



Styles of questions
•Ranked
–respondent places an ordering on items in a list
–useful to indicate a user’s preferences
–forced choice

Rank the usefulness of these
methods of issuing a command
(1 most useful, 2 next most
useful..., 0 if not used)
____ command line
____ menu selection
____ control key accelerator



Closed question: pros/cons
•The information lacks depth and variety
•Because possible answers are restricted
–Data is easy to analyse
–Respondent may tick options without reflection
–Chance of investigator bias



Styles of questions
• Combining open-ended and closed
questions
–gets specific response, but allows room for
user’s opinion

It is easy to recover from mistakes
disagree 1 2 3 4 5 agree
comment: the undo facility is helpful



Writing Questions
•Phrasing
–be aware that some words have positive or
negative connotations

•Avoid
–Double negative
–Embarrassing Questions
–Hypothetical Questions



Writing questions
•Prestige bias
–People answer questions in a way that makes them
look or feel better



Question sequence
•Start with general questions, and slowly
scope down to main issues
•Helps respondent to build up understanding
of area of interest
•Provides opportunity for spontaneous
comments, before the respondent is prompted by more
detailed questions



Question sequence
•From detailed issue to wider scope
•First ask detailed questions and then
broaden questions to more global issues:
•This allows respondent to have considered
several detailed issues, before providing
more general opinions



Sequence is important
• Asking about parts of your website will skew
all following answers.
–Did you think the navigation on the website was
good?
–Calling attention to the navigation with a question that is
fishing for a “yes” answer
–What did you think of the navigation on the
website?
–It is at least honest, but still artificially calls attention to it.
–List three good things about the website and three
bad things.
–Maybe the navigation will turn up in one or the other list.



Writing tasks and task scenarios
Task types
•Directed (we will use these)
–Specific / answer oriented

•Other Types of Tasks
–First impression
–good for home pages
– e.g. Give 5 adjectives that describe this website
–Exploratory
–Open-ended / research oriented
– e.g. Use the website and see if you would invest in this company
– e.g. Compare two websites and choose one



Good task characteristics
• Kuniavsky describes in terms of end goals
–Specific
–Doable
–In realistic sequence
–Domain neutral
–A reasonable length
• Dumas et al:
–Short
–In the user’s words, not the product’s
–Unambiguous - so all participants will understand it
–As realistic as possible



Writing tasks
• Make it realistic
–Something that people would actually do
–Avoid humorous tasks, silly names
• Make it scenario-based
–Avoid having micro-steps
• Keep it neutral and unbiased
–No marketing language or internal terminology
–Use language people understand
• Don't give clues or hints
–Avoid wording used on the site
• Work with development team



How many tasks?
• For a 90-minute session, task time is 60-70 minutes
• Estimate time for each task
–Allow more time than what it takes you
–Consider setting time limits on open-ended tasks
• Prioritize tasks
–Those you want all users to attempt
• Randomize/rotate tasks
–Tasks performed at the end tend to have better
performance because of the learning effect
• Prepare additional tasks
–If a user finishes early



Scenarios contextualise tasks
•Scenarios help participants understand
what you want them to do
•Remove some artificiality from the test
•Give the user a goal



Task Tips
•Define goals before writing tasks
•Have one task per sheet
–Hand this to user to refer to during task

•Avoid setting impossible tasks
–Frustrates users: they doubt other tasks

•Conduct a trial run



Let’s review
Review
•Ethics in Research
–Conducting a test

•Recruiting the ‘Right’ Users


•Screeners
•Writing Questions
•Writing tasks and task scenarios



Related Reading
•Nielsen and Landauer, 1993 paper
•Kuniavsky, Chapter 6 (Screeners,
Questionnaires)
•Kuniavsky, Chapter 11 (Usability tests,
tasks)

