Usability Testing: Top 8 Tips for Designing Usability Tests

If you already have a prototype document and you want to conduct a usability test, and you’re eager to learn how to make the most of your opportunity to learn from your users, then this document is for you.

08 Nov 2002; by Dennis G. Jerz
06 Jun 2010 — last updated
11 Apr 2011 — formatting updates
This document is intended to help beginners design questions for a good usability testing session.

Usability Testing is Not Opinion Gathering.

If you show a draft of your project to some volunteers and ask questions like, “Is my document well-organized?” and “What do you think of my table of contents?” you are likely to get at least some useful feedback. But if your volunteers have a personal relationship with you, and they know you have worked hard on this project and you’d very much like their approval, their responses will not be very helpful.

Even if you tell your test users that you want their critiques, most people are socialized to say nice things. The point of conducting a usability test is to find out in advance what problems will bother your end users. You want your test users to complain when they are lost or frustrated, and you want to be there when they make mistakes while trying to use whatever it is you are testing, so that you know exactly what parts of your project need the most attention.

Watch your testers try to use your document. Don’t just ask questions.

I’ve worked hard on this project; can you tell me what you do and don’t like?
  • Please help me meet my deadline — I’ll feel so affirmed and flattered.
  • If I see you get stuck, or hesitate, I will tell you what to do, so we can get through this testing faster.
  • If you start making any serious suggestions, my face will cloud over with a look of pain, since I don’t want to hear anything but praise.
Remember on The Andy Griffith Show, when Aunt Bee gave Andy a sample of her new apple pie recipe, and it was awful but he couldn’t bear to tell her, so she ended up being completely humiliated by the judges at the county fair? (Well, maybe it was pickles and maybe it was Alice on The Brady Bunch, but you get the idea.) Most volunteer testers are nice people who won’t want to say anything negative about your work. But you want them to find flaws in your prototype, so that you can fix them before the deadline.
I’m trying to determine the strengths and weaknesses of this document.
  • I will read you some questions and keep track of how long it takes you to answer each one. Just do whatever comes naturally.
  • If you should run into problems, don’t feel bad — I’m testing the document, not you. Feel free to talk out loud and tell me whatever comes to mind.
  • If you have any suggestions, I’ll make sure the author of the document sees them.
  • (After the test is over.) I really learned a lot from watching you — thank you!
You’ll get better results if the testers don’t know you are the person who designed the document they are testing. With your eyes open and your mouth shut, watch and listen to your testers as they encounter your work-in-progress. Since you’ve left plenty of time in the production cycle (right?), you are ready and eager to solicit their input and act on what they say.

In a good usability test, your testers will use your document to do whatever your real users want to do. Rather than simply ask your testers to “look at” your document and tell you what they think, come up with a short list of definite tasks — finding a bit of information, collecting and comparing information from different locations, making judgments about the content, etc.

1. Keep the Test Short

If the test is long, few subjects will bother with it. I once had a student who mailed out a thousand questionnaires, and got zero responses back.

Try out a longer test on a small number of subjects, and then cut those sections that don’t result in useful data. Ask a more experienced tester to glance at your questions before you actually conduct the test. (It’s no fun to find out, at a late stage in your project, that you’ve gathered no useful information.)

2. Plan to Quantify Your Results.

When gathering data, it’s easy to ask questions like “Did you think the navigation was clear?” You might tabulate the “yes” and “no” answers, but how will you quantify responses to a general question like “What did you think about the table of contents?”

When you do ask general opinion questions, use a seven-point Likert scale, on which subjects rate their own responses. For example:

Respond to the following statement: This document is clear.

  1. Disagree Strongly
  2. Disagree
  3. Disagree Somewhat
  4. No Opinion
  5. Agree Somewhat
  6. Agree
  7. Agree Strongly
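
Once each answer is coded as a number from 1 to 7, averaging a round of responses takes only a few lines. The sketch below is one possible way to do it; the response lists are hypothetical data invented for illustration (ten subjects per test), and the names `responses_by_test` and `average_score` are my own.

```python
from statistics import mean

# The seven-point scale from the example above.
LIKERT_LABELS = {
    1: "Disagree Strongly",
    2: "Disagree",
    3: "Disagree Somewhat",
    4: "No Opinion",
    5: "Agree Somewhat",
    6: "Agree",
    7: "Agree Strongly",
}

# Hypothetical responses to "This document is clear." (not real test data).
responses_by_test = {
    "Test 1": [5, 5, 6, 4, 5, 6, 5, 5, 6, 5],
    "Test 2": [6, 5, 6, 5, 6, 6, 5, 6, 6, 6],
    "Test 3": [6, 6, 7, 5, 6, 7, 6, 6, 6, 6],
}


def average_score(responses):
    """Return the mean rating, rounded to one decimal place."""
    for rating in responses:
        if rating not in LIKERT_LABELS:
            raise ValueError(f"Rating {rating!r} is not on the 1-7 scale")
    return round(mean(responses), 1)


averages = {name: average_score(r) for name, r in responses_by_test.items()}
for name, score in averages.items():
    print(f"{name}: {score}")
```

Rejecting out-of-range values up front is a small safeguard against data-entry slips, which are easy to make when you transcribe paper questionnaires.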

You now have some numerical data to work with, allowing you to identify trends such as the following:

In the first usability test, subjects reported an average score of 5.2 for question 1, indicating only a very slight agreement; this agreement strengthened to 5.7 in the second usability test, and 6.1 in the third.

Your readers probably won’t want to hunt through paragraphs like the one above; use a table to present your results efficiently. As you make changes to your document and re-test it, you can quantify the change in the usability factor.

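Table 1: Responses to Subjective Statements

| Statement | Test 1 | Test 2 | Improvement |
| --- | --- | --- | --- |
| 1: “This document is thorough.” | 4.2 | 5.0 | +19% |
| 2: “This document is easy to understand.” | 3.0 | 4.5 | +50% |
| 3: “The site was too complex.” | 4.0 | 3.0 | +25% * |
| Average | 3.73 | 4.17 | +31% |

* Note: Statement 3 is negatively worded, so a lower score on the second test corresponds to an increase in usability; its percentage has been inverted to reflect this.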
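
The "Improvement" figures in Table 1 can be reproduced with a short calculation. This is a sketch under one assumption of mine: that statement 3 is treated as "lower is better" (agreeing that the site is too complex is bad news), which is the only reading under which a drop from 4.0 to 3.0 counts as +25%. The function name `percent_improvement` is hypothetical.

```python
def percent_improvement(before, after, lower_is_better=False):
    """Percentage change relative to the first test.

    When a lower score is the better one, the sign is flipped so that
    a positive result always means usability went up.
    """
    change = (after - before) / before * 100
    return -change if lower_is_better else change


# (statement, Test 1 score, Test 2 score, lower_is_better)
table_1 = [
    ("This document is thorough.", 4.2, 5.0, False),
    ("This document is easy to understand.", 3.0, 4.5, False),
    ("The site was too complex.", 4.0, 3.0, True),  # assumption: reverse-scored
]

improvements = [
    percent_improvement(before, after, inverted)
    for _, before, after, inverted in table_1
]
average_improvement = sum(improvements) / len(improvements)

for (statement, *_), value in zip(table_1, improvements):
    print(f"{statement:40s} {value:+.0f}%")
print(f"{'Average':40s} {average_improvement:+.0f}%")
```

Note that the +31% in the "Average" row is the mean of the three per-statement percentages, not the percentage change between the two averaged scores.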

If you also tabulated your results for “Task Time” (how long it took users to perform routine tasks), “Task Errors” (how many mistakes users made when answering the task questions) and “Memory” (whether subjects could answer questions about what they had read or done), you could then calculate the average improvement for each section.

(Note that, for “Task Time” and “Task Errors”, a lower number constitutes a better score. My example below suggests one way you can make that concept clear to your reader.)

Table 2: Average Usability Change

| Criterion | Test 1 | Test 2 | Improvement |
| --- | --- | --- | --- |
| Subjective | 3.73 | 4.17 | +31% |
| Task Time (sec) | 120 | 95 | +21% * |
| Task Errors | 3.0 | 1.5 | +50% * |
| Memory Accuracy | 3.0 | 4.0 | +33% |
| Avg. Usability Change | | | +34% |

* Note: For these criteria, a lower score on the second usability test corresponds to an increase in usability; the percentage value in the “Improvement” column has been inverted to reflect this relationship.

3. Ask Subjects to Prioritize, Rank, or at Least List.

Instead of asking questions that your subjects can answer “Yes” or “No,” ask your subjects to prioritize, rank, or at least list.

For example:

  • What are the three things you noticed on the home page?
  • Now that you have read the brochure, what are three important facts that you remember about the content?

If many of your test subjects mention some part of your document that you hadn’t thought was very important, that may be a sign that your testers could teach you a lot about that section. If nobody mentions any details from a part of your project that you have put the most time in, then perhaps you need to reorganize your project (make that material more visible, or perhaps give up on that material and focus your energy on what your test users thought was important).

4. Sequence Your Subjective Questions from General to Specific.

A subjective question is one that asks for the tester’s personal opinion:

  • How likely are you to trust the webmaster?
  • Which section of the assembly instructions needs the most attention?

Finding out what users think they know can help designers predict trouble spots. For instance, if 80% of test subjects rated a set of instructions “easy to follow,” but only 40% were actually able to complete the task correctly, then the instructions are seriously flawed because they are giving testers a false sense of confidence.
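
To make that comparison concrete, here is a minimal sketch that sets what subjects said against what they actually did. The ten subjects and their results are hypothetical, chosen only to match the 80% and 40% figures above; `percent_true` and the variable names are my own.

```python
def percent_true(flags):
    """Percentage of True values in a list of booleans."""
    return 100 * sum(flags) / len(flags)


# One entry per subject, in the same order in both lists (hypothetical data).
rated_easy = [True, True, True, True, True, True, True, True, False, False]
completed_correctly = [True, True, False, True, False, False, True, False, False, False]

perceived = percent_true(rated_easy)         # share who called it "easy to follow"
actual = percent_true(completed_correctly)   # share who got the task right
confidence_gap = perceived - actual          # in percentage points

# Subjects who felt confident but still failed the task.
overconfident = sum(
    1 for easy, done in zip(rated_easy, completed_correctly) if easy and not done
)

print(f"Rated easy: {perceived:.0f}%  Completed: {actual:.0f}%  Gap: {confidence_gap:.0f} points")
print(f"Overconfident subjects: {overconfident}")
```

A large positive gap is the warning sign: the instructions feel easy without actually working.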

Begin with general questions that ask your testers to come up with details; then, ask the specific questions in which you tell testers what details to think about.

By simply mentioning something in a question, you call attention to it. It’s also possible to call attention to something indirectly — if I first asked you to think of a monkey, and then asked you to think of a fruit, you’d probably be more likely to think of a banana, and the resulting list of “most popular fruits” would be skewed.

Likewise, if you ask specific questions about parts of your document, all the answers that follow will be skewed.

Did you think the navigation on the website was good?
Calling attention to the navigation with a question that is fishing for a “yes” answer will not give you an accurate view of the usefulness of your document.
What did you think of the navigation on the website?
The above example is at least honest, because it doesn’t encourage the reader to praise the navigation, but it still artificially calls attention to it.
List three good things about the website and three bad things.
Maybe the navigation will turn up in one or the other list; maybe the testers will focus their attention on different things entirely.

The solution is to ask general questions first, and then ask specific questions.

5. Move Beyond Opinion

If you ask yes/no questions like “Did you like X?” then you are only measuring an opinion. As long as you also record things like how long it takes users to perform certain typical tasks, how accurate their answers are, and how much they remember a short time after they’ve used your document, then it’s perfectly fine to gather subjective information. For instance, if a user answers “Yes” to “Do you understand this document,” but misses too many memory recall questions, then we can hypothesize that the document gives the user a false sense of confidence.

There’s another problem with yes/no opinion questions like “Did you like the document?” The answers aren’t terribly useful. Even if my tax forms are designed really well, I’ll never like them.

You might be thinking, “Oh, so what I need to do is ask more specific questions like ‘Is this document well organized?’” But the average person probably isn’t trained to evaluate a document’s design.

User testing asks you to observe what people do, and measure their performance. You are the expert who will use the resulting data to make decisions about design and content.

User opinion is only one piece of the puzzle that you’ll have to assemble.

6. Ask Questions in Pairs

Asking users a question like “Could this part of the document be improved?” doesn’t get right at the core of the problem. Maybe that part of the document is great, but the subject can think of a brilliant plan to improve usability by 0.005%. In that case, the subject would answer “Yes, this passage could be improved,” but that would still not give you any useful information.

It’s often a good idea to ask users to make a snap judgment, and then ask them to explain their answer. Asking for the judgment without explanation will close off an avenue of communication; asking for a short essay won’t give you information you can easily quantify.

1) Respond to the following: In my free time, I am [blank] to read a book.
  1. very likely
  2. likely
  3. somewhat likely
  4. neither likely nor unlikely
  5. somewhat unlikely
  6. unlikely
  7. very unlikely

2) Why did you answer the way you did?

If your multiple-choice and short-answer questions work together, you get both

  • a number (in this case, a rating on a scale of 1-7) you can plug into a chart and
  • words (the response to a short-answer question that you can quote in your report)
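
One way to keep the two halves of a paired question together is to record them as a single unit. The sketch below assumes a hypothetical `PairedResponse` record and four invented answers to the book-reading question; it is only an illustration of how the number and the words can each feed a different part of your report.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PairedResponse:
    rating: int       # 1 = very likely ... 7 = very unlikely
    explanation: str  # the subject's short answer to "Why?"


# Hypothetical answers, invented for illustration.
responses = [
    PairedResponse(2, "I read on the bus most mornings."),
    PairedResponse(1, "Books are how I relax."),
    PairedResponse(5, "I'd rather watch a movie."),
    PairedResponse(3, ""),  # gave a rating but skipped the explanation
]

# The number you can plug into a chart.
average_rating = mean([r.rating for r in responses])

# The words you can quote in your report.
quotes = [r.explanation for r in responses if r.explanation]

print(f"Average rating: {average_rating:.2f}")
for quote in quotes:
    print(f'  "{quote}"')
```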

7. Plan a Series of Tests (with Different Testers)

What does it mean if a user got 100% of the questions right? Does that mean you have a perfect product, or did you get genius volunteers, or was your test too easy?

Rather than aiming for high usability scores, aim to show improvement in usability, as you use what you learned from your testers to improve your document.

Let’s imagine that Richard is designing a website, and Susan is designing a brochure. They both conduct usability tests.

Table 3: Usability Results for Richard and Susan

| Criterion | Richard’s Website | Susan’s Brochure |
| --- | --- | --- |
| Subjective | 2.17 | 4.6 |
| Task Time (sec) | 92 | 30 |
| Task Errors | 2.3 | 0.5 |
| Memory Accuracy | 1.7 | 5.2 |

In every category, Richard’s results are worse than Susan’s. Richard’s users gave lower subjective ratings, took longer to finish their tasks, made more errors, and remembered less.

Should Richard worry? Not at all.

A single usability test is meaningless.

What matters is not the scores that Richard or Susan get on their first test run. What matters is what Richard and Susan do with the information they gather:

  • Learn from the test results.
  • Revise your project (or at least the areas that need the most attention).
  • Retest, with different volunteers (5 users per test run is plenty) but the same questions.
  • Repeat (until the results level off, or you hit your deadline — whichever comes first).
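
"Until the results level off" can be turned into a simple rule of thumb. The sketch below treats the results as leveled off when the latest round improves on the previous one by less than some minimum percentage. The 5% threshold, the function name `has_leveled_off`, and the sample scores are all assumptions of mine, not figures from any real test.

```python
def has_leveled_off(scores, min_gain_pct=5.0):
    """Return True when the newest score beats the previous one by less
    than min_gain_pct percent. Assumes a higher score is better."""
    if len(scores) < 2:
        return False  # one test run gives nothing to compare against
    previous, latest = scores[-2], scores[-1]
    gain_pct = (latest - previous) / previous * 100
    return gain_pct < min_gain_pct


# Hypothetical average subjective scores from four successive test runs.
scores_so_far = [3.0, 3.9, 4.4, 4.5]

for run in range(1, len(scores_so_far) + 1):
    verdict = "leveled off" if has_leveled_off(scores_so_far[:run]) else "keep testing"
    print(f"After test {run}: {verdict}")
```

Pick the threshold to suit your project; with only five users per run, small differences between rounds may be noise rather than real change.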

There’s nothing magic about the scores that Richard and Susan started out with. The results of the first test are just an arbitrary starting point.

Imagine that Richard’s next test shows a 25% increase in usability, while Susan’s next test shows a 5% increase. Richard’s scores are still much lower than Susan’s, but that doesn’t matter. What matters is that Richard can show that his efforts have yielded specific, measurable results over time.

8. Plan Realistic Tasks

Your usability testing should emulate, as closely as possible, the environment in which your end users will interact with your document.

A simple scavenger hunt, in which you give your testers a list of things to find (“the ‘Comments’ page” and “the e-mail address for the help desk,” for example) and then time how fast they can find each one, is only marginally useful. (You could too easily improve their time on that task by making the specified information stand out more on the page, but that might make it harder to find the rest of the information on your site.)

When an oil rig explodes, the oil riggers don’t have someone hovering over their shoulder telling them what chapter they should turn to. Instead, people in a crisis situation will start with what they know: they have a problem. If they have reason to suspect that your document holds the answer to their problem, they’ll be motivated to read (and use) your document.

So, instead of announcing the solution to the problem and timing how long it takes them to find the page that contains the solution (which is a form of “teaching to the test”), you might instead state a problem, and watch how people use your document in order to solve it.

Thus, don’t tell them to look for the FAQ page, where you have conveniently placed the answer. Instead, just tell them what their problem is, and see what section they choose to turn to first.

Dennis G. Jerz
08 Nov 2002 — first posted
17 Nov 2002 — minor revisions
04 Dec 2002 — minor revisions
10 Mar 2008 — minor corrections
05 Jun 2010 — modest update


Case Study: A Pamphlet
If I were testing a pamphlet that described the major programs at my school, I would think of a handful of ways that I would expect a user to use that pamphlet.
  • Where would a person pick it up?
  • How likely would that person be to keep it for future reference?
  • How frequently will the pamphlet be updated?

I would first find about two volunteers (who represent the target audience as closely as possible — if the target audience is incoming freshmen, then the volunteers shouldn’t be graduating seniors), and watch them interact with the document I’m about to test. I’d then talk with them about their initial efforts to use the document to accomplish real tasks.

Based on what I learned, I would come up with maybe five specific tasks, ranging from very simple (what’s the phone number you call to get more information?) to complex (what’s the major that requires the largest number of non-English courses? [the answer to a question isn’t in one place, it requires the reader to flip back and forth and compare]); I would then find five subjects for the first “real” usability test.

I would time how long it took them either to find an answer or to give up. (I’d ask them to stop if they hadn’t found it in a predetermined amount of time.)

I’d then take the brochure back, and ask them to tell me what three things they remember about it.

I’d then ask them what they thought of the brochure in general (having them “agree” or “disagree” on a seven-point scale to questions such as “This brochure was helpful,” or “This brochure was complete.”), and invite them to make any suggestions.

After they’ve made their own general suggestions, I would ask specific questions about parts of the document I’m really interested in examining.

I would then consider all the ways I could possibly improve the document, and rank them according to how much effort it would take to implement them, and how important they are to the project. Obviously, I’d do the “easy and also important” fixes right away, and then consult with my client and/or supervisor about the rest.
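
That ranking step can be sketched in a few lines. Everything here is hypothetical: the `Fix` record, the 1-to-5 scales for importance and effort, the cutoffs for "easy and also important," and the sample fixes themselves.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    description: str
    importance: int  # 1 (minor) to 5 (critical); hypothetical scale
    effort: int      # 1 (trivial) to 5 (major rework); hypothetical scale


# Invented examples of possible improvements to the pamphlet.
candidate_fixes = [
    Fix("Replace the cover photo", importance=2, effort=2),
    Fix("Reorganize majors into a comparison chart", importance=5, effort=4),
    Fix("Enlarge the phone number on the back panel", importance=5, effort=1),
    Fix("Add a glossary of course codes", importance=3, effort=3),
]

# Most important first; among equally important fixes, easiest first.
ranked = sorted(candidate_fixes, key=lambda f: (-f.importance, f.effort))

# Assumed cutoffs: importance of 4 or more, effort of 2 or less.
do_now = [f for f in ranked if f.importance >= 4 and f.effort <= 2]
discuss_with_client = [f for f in ranked if f not in do_now]

print("Do right away:")
for fix in do_now:
    print(f"  {fix.description}")
print("Discuss with client or supervisor:")
for fix in discuss_with_client:
    print(f"  {fix.description}")
```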

–DGJ
