Lecture 14 — Heuristic Evaluation - Why and How | HCI Course | Stanford University
Summary
TL;DR: The video introduces heuristic evaluation, a technique for identifying usability issues in software design. Created by Jakob Nielsen, it has experts use a set of principles to critique a design, and it can be applied at any stage of the design process. The method is cost-effective, quick to perform, and good at catching severe problems, though it may generate false positives. The video explains the evaluation process, the importance of using multiple evaluators, and how to use the findings to improve a design, emphasizing the technique's value alongside other feedback methods.
Takeaways
- 🔍 Heuristic evaluation is a critique-based approach for identifying usability issues in software design, often using a set of principles or heuristics.
- 👥 It is valuable to receive peer critique at various stages of the design process, such as before user testing, redesigning, and before software release.
- 📈 Heuristic evaluation was created by Jakob Nielsen and is a cost-effective method for finding usability problems, with a high benefit-cost ratio.
- 📝 The technique involves having evaluators independently assess the design against a set of heuristics and then discuss their findings collectively.
- 🛠 It can be applied to both working user interfaces and sketches, making it compatible with rapid prototyping and low-fidelity designs.
- 🔑 Nielsen's 10 heuristics serve as a good starting point, but they can be customized or expanded based on the specific needs of the system being evaluated.
- 🤔 The process begins with setting clear goals, even if the findings might be unexpected, and involves giving evaluators tasks to perform with the design.
- 📊 Using multiple evaluators helps in finding a wider range of problems, with each evaluator potentially identifying unique issues.
- 📈 The benefit of each additional evaluator diminishes, with a general recommendation of 3-5 evaluators for cost-effective results.
- 🚫 Heuristic evaluation might generate false positives that wouldn't occur in real user testing, which is why it's important to combine it with other methods.
- 📝 After evaluation, a debrief session with the design team is crucial for discussing the findings, estimating fix efforts, and brainstorming improvements.
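The benefit-cost claim above reduces to simple arithmetic. A quick sketch using the figures quoted in the lecture for one of Nielsen's case studies (these dollar amounts are specific to that study, not general constants):

```python
# Back-of-the-envelope benefit-cost estimate from the case study
# quoted in the lecture (values are study-specific, not a general rule).
benefit = 500_000   # estimated value of the problems found
cost = 10_500       # estimated cost of running the heuristic evaluation

ratio = benefit / cost
print(f"benefit-cost ratio ~ {ratio:.0f}x")  # roughly the 48x cited
```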
Q & A
What is heuristic evaluation?
-Heuristic evaluation is a technique for finding usability problems in a design, where evaluators use a set of principles or heuristics to identify issues in the user interface.
Who created heuristic evaluation?
-Heuristic evaluation was created by Jakob Nielsen and colleagues about 20 years ago.
Why is heuristic evaluation valuable?
-Heuristic evaluation is valuable because it allows for quick feedback on a design with a high return on investment, and it can be used with both working user interfaces and sketches.
What are some ideal stages to conduct heuristic evaluation in the design process?
-Heuristic evaluation can be particularly valuable before user testing, before redesigning an application, when needing data to convince stakeholders, and before releasing software for final refinements.
What is the purpose of using multiple evaluators in heuristic evaluation?
-Using multiple evaluators helps to find a wider range of problems due to the diversity of perspectives, which can increase the effectiveness of the evaluation.
What is the recommended number of evaluators for heuristic evaluation according to Jakob Nielsen?
-Jakob Nielsen suggests that three to five evaluators tend to work well for heuristic evaluation, balancing the cost and the number of problems found.
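The diminishing returns behind this rule of thumb follow the model Nielsen and Landauer published: the proportion of problems found by n independent evaluators is 1 - (1 - L)^n, where L is the chance that a single evaluator catches a given problem. A minimal sketch, assuming the average L ≈ 0.31 they reported (the exact value varies by interface):

```python
def proportion_found(evaluators: int, hit_rate: float = 0.31) -> float:
    """Expected fraction of all usability problems found by a group of
    independent evaluators: 1 - (1 - L)^n (Nielsen & Landauer's model).
    hit_rate (L) varies by interface; 0.31 is an average they reported."""
    return 1 - (1 - hit_rate) ** evaluators

for n in (1, 3, 5, 10):
    print(f"{n:2d} evaluators -> {proportion_found(n):.0%} of problems")
```

With these defaults, five evaluators find roughly 84% of the problems, which is why adding evaluators beyond three to five stops paying for itself.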
How does heuristic evaluation compare to user testing in terms of speed and interpretation?
-Heuristic evaluation is often faster than user testing as it requires less setup and the results are pre-interpreted, providing direct feedback on problems and solutions.
What are some potential drawbacks of heuristic evaluation compared to user testing?
-Heuristic evaluation might generate false positives that wouldn't occur in a real user environment, whereas user testing is more accurate but can be more time-consuming and resource-intensive.
What is the significance of severity ratings in heuristic evaluation?
-Severity ratings help prioritize which problems to fix first by considering the frequency, impact, and pervasiveness of the issues found during the evaluation.
How should evaluators report the problems they find during heuristic evaluation?
-Evaluators should report problems specifically, relating them to one of the design heuristics, and provide detailed descriptions to help the design team understand and address the issues efficiently.
What is the final step in the heuristic evaluation process after identifying and rating problems?
-The final step is to debrief with the design team to discuss the findings, estimate the effort required to fix issues, and brainstorm future design improvements.
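The aggregation step described above can be sketched as follows. The issue names and individual ratings are hypothetical, and taking the mean is just one reasonable way to combine per-evaluator ratings into a group rating:

```python
from statistics import mean

# Hypothetical findings: each evaluator independently rates each problem
# on Nielsen's 0-4 severity scale; the group rating here is the mean.
ratings = {
    "weight cannot be edited after entry": [3, 2, 3],
    "inconsistent button labels":          [1, 2, 1],
    "no undo for delete":                  [4, 4, 3],
}

# Aggregate report, most severe first, to prioritize fixes in the debrief.
report = sorted(
    ((mean(r), issue) for issue, r in ratings.items()),
    reverse=True,
)
for severity, issue in report:
    print(f"{severity:.1f}  {issue}")
```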
Outlines
🔍 Introduction to Heuristic Evaluation
This paragraph introduces the concept of heuristic evaluation, a method of software evaluation that involves experts using a set of principles or heuristics to identify usability issues in a design. It contrasts this with empirical methods like user testing and formal methods that involve predictive modeling of user behavior. The speaker emphasizes the value of peer critique at various stages of the design process, such as before user testing to refine the interface and after to guide redesign efforts. The paragraph also highlights the importance of having a clear goal in any evaluation process and introduces heuristic evaluation as a technique created by Jakob Nielsen, which is cost-effective and can be applied at any stage of interface design, including with low-fidelity prototypes.
📚 Heuristic Evaluation Process and Benefits
The second paragraph delves into the specifics of the heuristic evaluation process, explaining how evaluators independently assess the design against a set of heuristics and then discuss their findings collectively. It underscores the value of having multiple evaluators to capture a wide range of usability issues, using a graph adapted from Jakob Nielsen's work to illustrate the diminishing returns of adding evaluators. The speaker also discusses the cost-effectiveness of heuristic evaluation, comparing it to user testing in terms of speed and interpretation of results, and notes that while heuristic evaluation can generate false positives, it is a valuable method for identifying severe problems quickly.
👥 Multi-Evaluator Approach and Severity Ratings
This paragraph discusses the rationale behind using multiple evaluators in the heuristic evaluation process, noting that different evaluators will find different problems, and no single evaluator will identify every issue. It explains the process of assigning severity ratings to identified problems, both individually and then collectively, to prioritize fixes. The speaker also touches on the importance of providing evaluators with realistic background information and training, as well as the need for specificity when listing problems and the consideration of missing elements in the design.
🛠️ Severity Rating System and Debriefing
The final paragraph outlines the severity rating system created by Nielsen, which ranges from zero to four, with zero indicating no usability problem and four signifying a critical issue. It describes how evaluators consider the frequency, impact, and pervasiveness of a problem when assigning a severity rating. The paragraph concludes with the importance of a debriefing session with the design team to discuss the findings, estimate the effort required to fix issues, and brainstorm future design improvements. This session is presented as an opportunity to address problems efficiently and to keep all stakeholders informed and engaged.
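As a reference, Nielsen's 0-4 scale from the paragraph above can be written down directly, along with the report structure the lecture's example uses (scale labels paraphrased; the example finding's wording and heuristic are hypothetical):

```python
# Nielsen's severity scale as described in the lecture (labels paraphrased).
SEVERITY_SCALE = {
    0: "not actually a usability problem",
    1: "cosmetic; fix only if extra time is available",
    2: "minor usability problem; low priority",
    3: "major usability problem; important to fix",
    4: "usability catastrophe; must be fixed before release",
}

# A finding is reported the way the lecture's example is structured:
# the issue, a severity rating, the heuristic violated, and a description.
finding = {
    "issue": "weight cannot be edited after entry",  # hypothetical wording
    "severity": 2,
    "heuristic": "user control and freedom",         # assumed heuristic
    "description": "once a weight is entered there is no way to change it",
}
print(SEVERITY_SCALE[finding["severity"]])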
Keywords
💡Heuristic Evaluation
💡Empirical Methods
💡Formal Methods
💡Simulation
💡Peer Critique
💡User Testing
💡Heuristics
💡Cost-Benefit Ratio
💡False Positives
💡Severity Rating
💡Debrief Session
Highlights
Introduction to heuristic evaluation as a technique for evaluating software usability.
Comparison of empirical methods, formal methods, simulation, and critique-based approaches in software evaluation.
The value of peer critique in the design process for improving software designs.
The optimal stages for incorporating peer critique in the design process.
The importance of having a clear goal in evaluation, even when outcomes may be unexpected.
Heuristic evaluation created by Jakob Nielsen, focusing on finding usability problems in design.
The efficiency and cost-effectiveness of heuristic evaluation compared to other methods.
How heuristic evaluation works with paper prototypes and low-fidelity techniques.
Nielsen's 10 heuristics as a foundation for evaluating user interface design.
The process of heuristic evaluation involving independent review and group discussion.
The benefits of using multiple evaluators and the 'wisdom of crowds' effect.
Jakob Nielsen's rule of thumb for the number of evaluators in heuristic evaluation.
The cost-benefit analysis of heuristic evaluation versus user testing.
The steps involved in conducting a heuristic evaluation, including evaluator training and scenario setup.
The importance of specificity in listing problems found during heuristic evaluation.
Assigning severity ratings to usability issues and the factors considered in this process.
Debriefing with the design team to discuss and address the findings from heuristic evaluation.
The role of heuristic evaluation in the broader context of user interface design and testing.
Transcripts
in this video we're going to introduce a
technique called heuristic evaluation as
we talked about at the beginning of the
course there's lots of different ways to
evaluate software one that you may be
most familiar with is empirical
methods where at some level of formality
you have actual people trying out your
software it's also possible to have
formal methods where you're building a
model of how people behave in a
particular situation and that enables
you to predict how different user
interfaces will work or if you can't
build a closed-form formal model you can
also try out your interface with
simulation and have automated tests that
can detect usability bugs and effective
designs this works especially well for
low level stuff it's harder to do for
higher level stuff and what we're going
to talk about today is critique based
approaches where people are giving you
feedback directly based on their
expertise or a set of heuristics as any
of you who have ever taken an art or
design class know peer critique can be an
incredibly effective form of feedback
and it can help you make your designs
even better you can get peer critique
really at any stage of a design process
but I'd like to highlight a couple that
I think it can be particularly valuable
first it's really valuable to get peer
critique before user testing because
that helps you not waste your users on
stuff that's just going to get picked up
automatically you want to be able to
focus the valuable resources of user
testing on stuff that other people
wouldn't be able to pick up on the rich
qualitative feedback that peer critique
provides can also be really valuable
before redesigning your application
because what it can do is it can show
you what parts of your app you probably
want to keep and what are other parts
that are more problematic and deserve
redesign third sometimes you know there
are problems and you need data to be
able to convince other stakeholders to
make the changes and peer critique can
be a great way especially if it's
structured to be able to get the
feedback that you need to make the
changes that you know need to happen and
lastly this kind of structured peer
critique can be really valuable before
releasing software because it helps you
do a final sanding of the entire
design and smooth out any rough
edges as with most types of evaluation
it's usually helpful to begin with a
clear goal even if what you ultimately
learn is completely unexpected and so
what we're going to talk about today is
a particular technique called heuristic
evaluation heuristic evaluation was
created by Jakob Nielsen and colleagues
about 20 years ago now and the goal of
heuristic evaluation is to be able to
find usability problems in a design I
first learned about heuristic evaluation
when I TA'd James Landay's intro HCI
course and I've been using it and
teaching it ever since it's a really
valuable technique because it lets you
get feedback really quickly and it's a
high bang for the buck strategy and the
slides that I have here are based off
James's slides for this course and the
materials are all available on Jakob
Nielsen's website the basic idea of
heuristic evaluation is that you're
going to provide a set of people often
other stakeholders on the design team or
outside design experts with a set of
heuristics or principles and they're
going to use those to look for problems
in your design each of them is first
going to do this independently so
they'll walk through a variety of tasks
using your design to look for these bugs
and you'll see you know that different
evaluators are gonna find different
problems and then they're gonna
communicate and talk together only at
the end afterwards at the end of the
process they're going to get back
together and talk about what they found
and this independent first gather
afterwards is how you get a wisdom of
crowds benefit in having multiple
evaluators and one reason that we're
talking about this early in the class is
that it's a technique that you can use
either on a working user interface or on
sketches of user interfaces and so
heuristic evaluation works really well
in conjunction with paper prototypes and
other rapid low fidelity techniques that
you may be using to get your design
ideas out quick and fast
here's Nielsen's 10 heuristics and
they're a pretty darn good set that said
there's nothing magic about these
heuristics they do a pretty good job of
covering many of the problems that you'll see in
many user interfaces but you can add on
any that you want and get rid of any
that aren't appropriate for your system
we're going to go over the content of
these ten heuristics in the next couple
lectures and in this lecture I'd like to
introduce the process that you're going
to use with these heuristics so here's
what you're gonna have your evaluators
do give them a couple of tasks to use
your design for and have them do each
task stepping through carefully several
times when they're doing this they're
going to keep the list of usability
principles as a reminder of things to
pay attention to
now which principles will you use I
think Nielsen's ten heuristics are a
fantastic start and you can augment
those with anything else that's relevant
for your domain so if you have
particular design goals that you would
like your design to achieve include
those in the list or if you have
particular goals that you've set up from
competitive analysis of designs that are
out there already that's great too or if
there are things that you've seen your
or other designs excel at those are
important goals too and can be included
in your list of heuristics and then
obviously the important part is that
you're going to take what you've learned
from these evaluators and use those
violations of your heuristics as a way
of fixing problems and redesigning let's
talk a little bit more about why you
might want to have multiple evaluators
rather than just one the graph on the
slide is adapted from jakob nielsen's
work on heuristic evaluation and what
you see is each black square is a bug
that a particular evaluator found an
individual evaluator represents a row of
this matrix and there's about twenty
evaluators in this set the columns
represent the problems and what you can
see is that there's some problems that
were found by relatively few evaluators
and other stuff which almost everybody
found so we're going to call the stuff
on the right the easy problems and the
stuff on the left hard problems and so
in aggregate what we can say is that no
evaluator found every problem and some
evaluators found more than others and so
there are better and worse people to do
so why not have lots of evaluators well
as you add more evaluators they do find
more problems but it kind of tapers off
over time you lose that benefit
eventually and so from a cost-benefit
perspective it just stops making sense
after a certain point so where's the
peak of this curve it's of course going
to depend on the user interface that
you're working with how much you're
paying people how much time is involved
all sorts of factors Jakob Nielsen's
rule of thumb for these kinds of user
interfaces and heuristic evaluation is
that three to five people tends to work
pretty well and that's been my
experience too and I think that
definitely one of the reasons that
people use heuristic evaluation is
because it can be an extremely cost
effective way of finding problems in one
study that Jakob Nielsen ran he
estimated that the cost of the problems
found with heuristic evaluation were
$500,000 and the cost of performing it
was just over ten thousand dollars and
so he estimates a 48 fold benefit cost
ratio for this particular user interface
obviously these numbers are
back-of-the-envelope and your mileage
will vary you can think about how to
estimate the benefit that you get from
something like this if you have an
in-house software tool using something
like productivity increases that if
you're making an expense reporting
system or other in-house system that
will make people's time more efficiently
used that's a big usability win and if
you've got software that you're making
available on the open market you can
think about the benefit from sales or
other measures like that one thing that
we can get from that graph is that
evaluators are more likely to find
severe problems and that's good news and
so with a relatively small number of
people
you're pretty likely to stumble across
the most important stuff however as we
saw with just one person in this
particular case you know even the best
evaluator found only about 1/3 of the
problems in the system and so that's why
ganging up a number of evaluators say 5
is going to get you most of the benefit
that you'll be able to
if we compare heuristic evaluation and
user testing one of the things
that we see is that heuristic evaluation
can often be a lot faster it takes just
an hour or two per evaluator and
the mechanics of getting a user test up
and running can take longer not even
accounting for the fact that you may
have to build software also the
heuristic evaluation results come
pre-interpreted because your evaluators are
directly providing you with problems and
things to fix and so it saves you the
time of having to infer from the
usability tests what might be the
problem or solution now conversely
experts walking through your system can
generate false positives that wouldn't
actually happen in a real environment
and this indeed does happen and so user
testing is sort of by definition going
to be more accurate at the end of the
day I think it's valuable to alternate
methods all of the different techniques
that you'll learn about in this class
for getting feedback can each be
valuable and that cycling through them
you can often get the the benefits of
each and that can be because with
heuristic evaluation and user testing you'll
find different problems and by running a
heuristic evaluation or something like that early in the
design process you avoid wasting real
users that you may bring in later on so
now that we've seen the benefits what
are the steps the first thing to do is
to get all of your evaluators up to
speed on what the story is behind your
software any necessary domain knowledge
they might need and tell them about the
scenario that you're gonna have them
step through then obviously you have the
evaluation phase where people are
working through the interface afterwards
each person is going to assign severity
ratings and you do this individually
first and then you're going to aggregate
those into a group severity rating and
produce an aggregate report out of that
and finally once you've got this
aggregated report you can share that
with the design team and the design
team can discuss what to
do with that doing this kind of expert
review can be really taxing and so for
each of the scenarios that you lay out
in your design it can be valuable to
have the evaluator go through that
scenario twice the first time they'll
just get a sense of it and the second
time they can focus on more specific
elements if you've got some walk-up-and-use
system like a ticket machine
somewhere then you may want to not give
people any background information at all
because if you've got people that are
just getting off the bus or the train
and they walk up to your machine without
any prior information that's the
experience you want your evaluators to
have on the other hand if you're gonna
have a genomic system or other expert
user interface you'll want to make sure
that whatever training you would give to
real users
you're gonna give to your evaluators as
well in other words whatever the
background is it should be realistic
when your evaluators are walking through
your interface it's going to be
important to produce a list of very
specific problems and explain those
problems with regard to one of the
design heuristics you don't want people
to just be like I don't like it and in
order to make these
results maximally useful for the design team you'll want
to list each one of these separately so
that they can be dealt with efficiently
separate listings can also help you
avoid listing the same repeated problem
over and over again if there's a
repeated element on every single screen
you don't want to list it at every
single screen you want to list it once
so that it can be fixed once and these
problems can be very detailed like the
name of something is confusing or it can
be something that has to do more with
the flow of the user interface or the
architecture of the user experience and
that's not specifically tied to an
interface element your evaluators may
also find that something's missing that
ought to be there and this can be
sometimes ambiguous with early
prototypes like paper prototypes and so
you'll want to clarify whether the user
interface is something that you believe
to be complete or whether there are
intentionally elements missing ahead of time
and of course sometimes there are
features that are going to be obviously
there that are implied by the user
interface and so mellow out and relax on
those after your evaluators have gone
through the interface they can each
independently assign a severity rating
to all of the problems that they found
and that's going to enable you to
allocate resources to fix those problems
it can also help give you feedback about
how well you're doing in terms of the
usability of your system in general and
give you a kind of benchmark of your
efforts in this vein the severity
measure that your evaluators are going
to come up with is going to combine
several things it's going to combine the
frequency the impact and the
pervasiveness of the problem that
they're seeing on the screen so
something that is only in one place
maybe a less big deal than something
that shows up throughout the entire user
interface similarly there are going to be
some things like misaligned text which
may be inelegant but aren't a
deal-killer in terms of your software
and here's the severity rating system
that Nielsen created you can obviously
use anything that you want it ranges
from zero to four where zero is at the
end of the day your evaluators decide
it's not actually a usability problem
all the way up to it being something
really catastrophic that has to get
fixed right away and here's an example
of a particular problem that our TA Robbie
found when he was taking CS 147 as a
student he walked through somebody's
mobile interface that had a weight entry
element to it and he realized that once
you'd entered your weight there was no
way to edit it after the fact so that's
kind of clunky you wish you could fix it
maybe not a disaster and so what you see
here is he's listed the issue he's given
it a severity rating
he's got the heuristic that it violates
and then he describes exactly what the
problem is and finally after all your
evaluators have gone through the
interface listed their problems and
combined them in terms of the severity
and importance you'll want to debrief
with the design team
this is a nice chance to be able to
discuss general issues in the user
interface and qualitative feedback and
it gives you a chance to go through each
of these line items and suggest
improvements on how you can address
these problems in this debrief session
it can be valuable for the development
team to estimate the amount of effort
that it would take to fix one of these
problems so for example if you've got
something that is 1 on your severity
scale not too big a deal it might have
something to do with wording and it's
dirt simple to fix that tells you go
ahead and fix it conversely you may have
something which is a catastrophe which
takes a lot more effort but its
importance will lead you to fix it and
there's other things where the
importance relative to the cost involved
just don't make sense to deal with right
now
and this debrief session can be a great
way to brainstorm future design ideas
especially while you've got all the
stakeholders in the room and the ideas
about what the issues are with the user
interface are fresh in their minds in
the next two videos we'll go through
Nielsen's 10 heuristics and talk more
about what they mean