Usability Testing: What is it?

Better-written documents enable people to work with greater speed, recall, accuracy, and comfort. These qualities, when taken together, make up the usability factor. This article suggests some ways you might conduct tests to measure the usability of documents intended to convey information. It covers the kind of data you should collect, how many test subjects you need, and how you should treat those subjects.

The first rule of writing is "know your audience." But even the best planning cannot predict all possible user errors.

Computer: Press any key.

Homer: Aahh! Where's the 'any' key? I can't find the 'any' key!

Whether you are designing an entire corporate web site, composing a set of instructions, or writing a simple caption for a photo, test a prototype on some representative readers before you polish a final draft of what you think they want. Their insights may surprise you.

For starters, test your prototype on 5 people who approximate your intended audience. Have them use the prototype to perform a real task, while you observe and measure their performance. In the past, my students have asked subjects to follow instructions for making a pizza at a local restaurant, to assemble furniture, to search a website for specific pieces of information, or to rank their emotional responses to several proposed illustrations. Depending on the nature of the project, students timed the performance of their test subjects, counted the number of mistakes, administered memory quizzes, and/or examined qualitative responses.

Questions to Ask Yourself about Your Document: 

  • What real-world tasks will you ask your readers to perform?
  • How can you measure how useful your document is to the test subjects who rely on it to complete those tasks?

    Warning: Simply asking readers their opinions is a waste of time at this stage, unless you also find some objective way to measure whether their opinions are accurate.

    For example:

    1. If 80% of your testers agree that a particular document is "easy to understand"
    2. ...but only 40% correctly answered basic questions about the material covered in the document...
    3. ...then at least half of the testers who called it easy are overconfident -- not only are they misunderstanding the material, they don't even know that your document isn't helping them. (The sketch below shows one way to spot this gap.)
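
If you record both the stated opinion and the quiz result for each tester, this cross-check takes only a few lines. Here is a minimal sketch in Python; the data and field names are invented for illustration:

    # Each record pairs a tester's stated opinion with an objective quiz result.
    # All of the data here is hypothetical.
    responses = [
        {"said_easy": True,  "quiz_passed": True},
        {"said_easy": True,  "quiz_passed": False},
        {"said_easy": True,  "quiz_passed": False},
        {"said_easy": False, "quiz_passed": True},
        {"said_easy": True,  "quiz_passed": True},
    ]

    said_easy = [r for r in responses if r["said_easy"]]
    overconfident = [r for r in said_easy if not r["quiz_passed"]]

    print(f"called it easy:     {len(said_easy)} of {len(responses)}")
    print(f"...but failed quiz: {len(overconfident)} of {len(said_easy)}")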

If all of your test subjects can perform all the assigned tasks without any problems, then your test was probably too easy.  You should feel excited and happy when you notice a test subject struggling: the vast majority of usability problems are rather simple to fix (provide a different subheading, change the wording, add a picture, etc.).  Just be sure to leave enough time in the development cycle to actually fix the problems -- which generally involves yet another usability test, in order to provide evidence that the changes you made did in fact help your readers.

Example: Testing a Resume Folded Like a Brochure

The best way to test your document is to give it to a few people who represent the range of intended users, and watch what happens.

A student once gave me an unusual resume, in the form of a brochure, folded into thirds.

  1. Page one (the front cover) displayed her name and an attractive graphic.
  2. Page two (two sections of the page, visible when the brochure was partially unfolded) carried an employment objective.
  3. Page three (the back side of the previous two pages) gave the usual listing of education and experience, but arranged into three neat columns.

It was a clever, creative layout, with great content, but I thought it lacked focus. Here's why: when I first saw it, I glanced at the cover and went directly to the inside (page three) -- and therefore skipped the employment objective. Although I eventually did find it, I had already decided her resume lacked focus; a potential employer who didn't already know this student probably wouldn't have worked as hard as I did to figure out why the resume seemed confusing.

The non-standard design meant I had difficulty using it. Once I finally noticed and read the objective, the rest of the document made more sense.

Had I just done something silly, or was the design of the document at fault? 

Before deciding for sure, I conducted a little usability experiment of my own.... I showed the brochure to another faculty member, and watched what happened.  My volunteer glanced at the cover, and went straight to page three, just as I had done.  In fact, my test subject didn't even bother to put her reading glasses on until she had completely unfolded the page. 

While two test subjects is hardly a scientifically valid survey, it was enough to convince me (and the student) that the layout needed more revision before the resume was ready for the eyes of dozens or hundreds of potential employers.

What Data to Collect from Your Usability Tests

The information you gather from your user testing will vary greatly, depending on the purpose of your document.

  • You might treat last year's brochure as the "control" version, and have the testers compare it with one or more "experimental" redrafts.
  • You might have subjects take a multiple-choice test before they look at your document; tell them to read the document until they feel they have understood it completely, and then give them another multiple-choice test (which repeats all of the initial questions, but also adds some others). A scoring sketch follows this list.
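
Scoring a before-and-after design like the second bullet is straightforward. Here is a minimal sketch; the answer key and responses are invented:

    # Compare one tester's score before and after reading the document.
    # The questions and answers below are hypothetical.
    answer_key = {"q1": "b", "q2": "d", "q3": "a"}

    def score(answers):
        """Fraction of questions answered correctly."""
        return sum(answers.get(q) == a for q, a in answer_key.items()) / len(answer_key)

    pre  = {"q1": "b", "q2": "a", "q3": "c"}   # before reading the document
    post = {"q1": "b", "q2": "d", "q3": "a"}   # after reading the document

    print(f"pre-test:  {score(pre):.0%}")
    print(f"post-test: {score(post):.0%} (improvement: {score(post) - score(pre):+.0%})")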

Whatever you do, collect both quantitative and qualitative data.

Quantitative Data -- You measure something, such as accuracy, speed, and recall.  How many mistakes did your test subjects make while trying to do whatever your document is supposed to help them do?  How long did they take?  How much did they remember 15 minutes later?  24 hours later?  Consult with your client to determine what the most important tasks are, and/or what the most important information is.  Then, find out whether your test subjects can perform those tasks or remember that information.
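
Even a small script (or a spreadsheet) makes these numbers easy to compare across testers. A minimal sketch, with invented session data:

    # Summarize quantitative results across usability-test sessions.
    # All of the numbers here are hypothetical.
    sessions = [
        {"errors": 2, "minutes": 8.5, "recall_15min": 0.9, "recall_24hr": 0.6},
        {"errors": 5, "minutes": 12.0, "recall_15min": 0.7, "recall_24hr": 0.5},
        {"errors": 1, "minutes": 6.2, "recall_15min": 1.0, "recall_24hr": 0.8},
    ]

    for field in ("errors", "minutes", "recall_15min", "recall_24hr"):
        values = [s[field] for s in sessions]
        print(f"average {field}: {sum(values) / len(values):.2f}")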

Qualitative Data -- Your testers describe something.  (Note: it is possible for the researcher to generate qualitative data by recording his or her own subjective descriptions of what is going on; but I don't recommend that for most usability tests.  In general, you should ask your subjects to record their own opinions.)   You can gather qualitative data through both multiple-choice and open-ended questions.

  • Multiple-choice questions
    • "Was the introduction informative?"  (Yes/No)
    • "Rate your impression of the completeness of this document, on a scale of 0 (least complete) to 10 (most complete)."
    • "Respond to the following statement: "After reading passage Y, I am more likely to want to do X.'
      Choose one of the following: [disagree strongly / disagree / disagree somewhat / no response / agree somewhat / agree / agree strongly]"
  • Open-ended questions
    • "Do you have any suggestions for improvements?"
    • "What were the most interesting sections to you? Why?"
    • Note: You can make any multiple-choice question into an open-ended question simply by asking "Why?"
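
To tabulate answers to a fixed scale like the Likert item above, a simple tally is enough. A minimal sketch, with made-up responses:

    # Tally responses to a Likert-scale question.
    # The answers below are invented for illustration.
    from collections import Counter

    SCALE = ["disagree strongly", "disagree", "disagree somewhat",
             "no response", "agree somewhat", "agree", "agree strongly"]

    answers = ["agree", "agree somewhat", "disagree", "agree",
               "agree strongly", "no response", "agree"]

    tally = Counter(answers)
    for option in SCALE:
        print(f"{option:<18} {tally[option]}")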

How Many Testers?

One web guru says that five testers will catch about 80% of the big errors on a given web site.  You'll need to revise your document, and then run another test (with 5 additional people) in order to determine whether you have, in fact, fixed the errors.  Since you probably won't fix all the errors the first time, and since your second test will probably uncover new errors that you need to fix, a third test will probably be necessary as well.
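
That figure is usually attributed to Jakob Nielsen, whose model assumes each tester independently finds about 31% of the problems. The arithmetic is easy to check; the 31% rate below is Nielsen's published estimate, not a guarantee:

    # The widely cited model: n testers find roughly 1 - (1 - 0.31)^n
    # of the usability problems. 0.31 is Nielsen's estimate, not a law.
    RATE = 0.31

    def share_found(testers, rate=RATE):
        """Expected share of problems found by this many testers."""
        return 1 - (1 - rate) ** testers

    for n in (1, 3, 5, 10):
        print(f"{n:2d} testers -> about {share_found(n):.0%} of the problems")
    # Five testers find about 84% -- hence the common "test with five" advice.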

If a particular problem is causing more than usual trouble for your test subjects, and you are running out of time, you can try close-coupled testing:

Close-coupled testing lets you do all the testing you want in the allotted time and get a successful product out of it. The technique is simple. Run a test subject through the product, figure out what's wrong, change it, and repeat until everything works. Using this technique, I've gone through seven design iterations in three-and-a-half days, testing in the morning, changing the prototype at noon, testing in the afternoon, and making more elaborate changes at night.

Close-coupled testing works. Particularly in the early stages, you are not looking for subtleties. You're lucky if a user can actually get from one end of the application to the other. Problems that may have escaped your attention before a user tried the product now are glaringly obvious once the first user sits down. So fix them! And then move on to the next layer of problems.  (Bruce Tognazzini, "$1.98, Close-coupled Usability Testing")

If your first couple of testers don't teach you anything useful, you may need to redesign your test.  Keep careful notes -- this is an experiment.  When you present your findings, make some reasonable estimate regarding the reliability of your results (for instance, if all your test subjects were students in the same class, and the professor has a bee in his bonnet about why HTML frames suck, then the attitude of all your test subjects on that particular topic will probably not accurately reflect the attitudes of the general public).

How to Treat Your Test Subjects

According to the administration at many universities, conducting surveys and questionnaires falls under the category of "human subject research" if the research takes place outside the context of a typical homework assignment.  I am supposed to file paperwork testifying that I won't let any of my students abuse their volunteers.  With that in mind, you should remind your testers of the following:

  • They are volunteers.
  • They can stop at any time.
  • The object of inquiry is the document -- not the testers, their performance, or their intelligence.

You want your testers to feel free to speak their minds without fear of hurting your feelings -- even if their mistakes mean you will have to do more work.  You may think the test is a simple matter, and you may even be bored with it, but your testers might take it very seriously.  Testers have been known to run off in tears -- or at least feel unnecessary stress -- because they blame themselves for their failure to complete a task.

Conclusion

In the real world, there is never enough time or money to fix all the problems that crop up.  Weigh the costs of fixing the problem against the benefits to the user (or to your final grade, if you are working on a class project). Even if you are not a statistician, common sense should help you find some balance between the hype created by usability experts who want to drum up customers and the folly of managers who deny usability problems.


Online articles about usability generally focus (predictably) on the testing of web pages; nevertheless, you can adapt the same basic techniques to help you evaluate any technical document.