Formative vs. Summative Testing
Following on from the last lesson, where we briefly talked about using wireframes to visualize low-fidelity ideas, in this lesson we’ll learn about usability testing: the method of forming, refuting, or validating the theories that we hold about our design’s usability, which in turn helps us transition our design from low-fidelity to mid- and high-fidelity.
Earlier on in the series we explored the basic concepts of usability, which offer a decent base for us to begin forming theories. However, we can’t rely on these usability heuristics alone. We also need what’s called formative usability testing, where we base theories on user insights, and summative testing, where we confirm those theories.
Formative testing tells us what needs to be improved, whereas summative testing indicates whether or not these improvements were successful. Formative testing reveals insights that we can use to mock up our mid- and high-fidelity wireframes, whereas summative testing is often used to validate these wireframes once created.
Let’s take a look at the different types of usability testing.
Why Do Usability Testing?
Businesses that spend only 10% of their budget on usability improvements see, on average, a 135% increase in their desired metrics, so the ROI of usability is significant.
Tree Testing
During tree testing, users are shown a tree-like sitemap of an app or website’s navigation without the distraction of visual elements, and are asked where they would navigate to in order to find something. A question could be: “Where would you go to find a meal plan?” — which could yield different results depending on whether the nav category is labeled as Meal Plans, or else something more obscure like Downloads.
Interestingly, tree testing can be both formative and summative. Formatively, tree testing can reveal what’s wrong with our navigation labeling or structuring; summatively, it can confirm that a navigation redesign now offers optimal usability.
Suitable tree testing tools include UserZoom and Treejack by Optimal Workshop.
Open Card Sorting
Wait…if tree testing indicates an issue, how do we identify the solution? 🤔
During open card sorting, testers are asked to sort navigation items into appropriate categories, and the categories themselves are created by the tester. That’s right, sometimes the best way is to ask users directly, which offers rich insight into the user’s mental model.
While tools like OptimalSort are designed for card sorting, card sorting can also be conducted in the real world by having testers simply arrange cards, sticky notes, or something of the sort (pun not intended) into the structure they deem appropriate.
Alternatively, wireframe tools such as Whimsical also have card/sticky functionality.
Below: the user has created three navigation categories and sorted the navigation items into those categories, even using color to further sub-categorize them neatly.
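One common way to make sense of open card sort results is to count how often testers grouped the same two items together, whatever they named the category. The snippet below is a minimal sketch of that idea in Python; the card names, category names, and the `co_occurrence` helper are all hypothetical and only there for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical results: each tester's own categories mapped to the cards placed in them.
sorts = [
    {"Food": ["Meal Plans", "Recipes"], "Account": ["Pricing", "Subscribe"]},
    {"Plans": ["Meal Plans", "Pricing"], "Cooking": ["Recipes"], "Join": ["Subscribe"]},
    {"Eat": ["Meal Plans", "Recipes"], "Buy": ["Pricing", "Subscribe"]},
]


def co_occurrence(sorts):
    """Count how many testers placed each pair of cards in the same category."""
    pairs = Counter()
    for sort in sorts:
        for cards in sort.values():
            # Sort the cards so each pair is always stored in the same order.
            for a, b in combinations(sorted(cards), 2):
                pairs[(a, b)] += 1
    return pairs


pairs = co_occurrence(sorts)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n}/{len(sorts)} testers")
```

Pairs that most testers group together are good candidates for sharing a navigation category, which can then be named and confirmed with a closed card sort.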
Closed Card Sorting
A closed card sort test is a summative variation of open card sorting, where the categories are already named, and it’s simply a case of the tester confirming where each navigation item belongs. We can then follow up with tree testing again to make sure that users are finding the right things quickly. Wham bam, there’s your navigation.
Note: tree testing is also known as reverse card sorting.
Functional Salience Testing
While that concludes site structure and navigation, it certainly doesn’t conclude the overall information architecture. Before finally building our first mid-fidelity wireframes, we need to know which functions are the most important. Does the user care deeply about searching for things? Would they rather skip the sign up, or the onboarding?
Functional salience tests are used to decipher which functions are most important, which then helps us decide how best to design the content and UI. Instruct the tester to choose three functions from the tree map (also known as a sitemap) — for example, subscribe, pricing, and checkout could be three functions the user deems important.
With these insights in mind, we can then design the user flows and visual hierarchy in a way that adds extra emphasis to the chosen functions. We can depict this as what’s called a task or user flow map, a sort of upgrade to a sitemap that specifically highlights the various user flows. Again, Whimsical is a top tool for this.
Below: notice how the information architecture is evolving? We now have some fantastic insights — literally a map — telling us how our design should be wireframed.
Performance Testing
Performance testing is used to evaluate the effectiveness of task completion once the wireframes have been created using the insights acquired from formative testing.
Given a variety of example scenarios, users are asked to complete tasks as we watch, listen, and ask questions to help the user communicate their feedback as we take notes.
Remote user testing tools such as Lookback and UserTesting, which integrate with everyday screen design tools such as Marvel and InVision, are well suited for this. Also, new kid UserLook helps designers kickstart usability testing in seconds, and a huge advantage of UserLook is that it’s free if you source testers from your own userbase.
Below: a mockup (left), a video feed of the user testing it, a chat window, and finally, somewhere to take notes — here, somebody is running a usability test using Lookback.
Discreetly, we then assign the tester a score: 2 for Completed, 1 for Had Difficulty, and 0 for Fail. After all the tests are complete, we can use the scores and notes to measure success rate, task time, and overall user satisfaction. You might also find affinity mapping useful for clustering recurring feedback, which helps us determine which issues are high-priority; this works with all kinds of testing and research.
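The scoring scheme above can be turned into numbers with very little effort. The following Python sketch uses the lesson’s 2/1/0 scores, but the tasks, timings, and the `summarize` helper are hypothetical. It also assumes that success rate means “points earned out of points possible”, so Had Difficulty counts as half a success; that is one reasonable convention, not the only one.

```python
from statistics import mean

# Score values taken from the lesson: 2 = Completed, 1 = Had Difficulty, 0 = Fail.
COMPLETED, HAD_DIFFICULTY, FAIL = 2, 1, 0

# Hypothetical results: (task, score, seconds taken), one row per tester and task.
results = [
    ("Find a meal plan", COMPLETED, 18),
    ("Find a meal plan", HAD_DIFFICULTY, 41),
    ("Find a meal plan", COMPLETED, 22),
    ("Subscribe", FAIL, 90),
    ("Subscribe", HAD_DIFFICULTY, 64),
    ("Subscribe", COMPLETED, 35),
]


def summarize(results):
    """Group rows by task and compute a success rate and mean task time for each."""
    by_task = {}
    for task, score, seconds in results:
        by_task.setdefault(task, []).append((score, seconds))

    summary = {}
    for task, rows in by_task.items():
        scores = [score for score, _ in rows]
        summary[task] = {
            # Assumption: points earned divided by the maximum points possible.
            "success_rate": sum(scores) / (COMPLETED * len(scores)),
            "mean_time": mean(seconds for _, seconds in rows),
        }
    return summary


summary = summarize(results)
for task, stats in summary.items():
    print(f"{task}: {stats['success_rate']:.0%} success, {stats['mean_time']:.0f}s on average")
```

Tasks with a low success rate or a long average time are the ones to revisit first, ideally alongside the notes taken during the session.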
Tip: pay attention to how users overcome roadblocks, as this will reveal solutions.
5-Second Usability Testing
5-second usability tests are used to determine whether or not users can recall something from memory, such as the name of the brand, or what the app, website, or specific screen does. Most users are only willing to allocate 5 seconds to an app or website, so if the user for any reason becomes stalled, they will hit the back button.
5-second usability tests are simple but effective, and like many of the usability tests outlined in this lesson, they can also be carried out using user/usability testing tools.
Expectancy Testing
An expectancy test is an interesting one, as the tester can answer with some rather hilarious responses 😂 if the usability is bad enough. The approach here is to ask the usability tester what they think something means or does without interacting with it. If something doesn’t seem to be working, an expectancy test can reveal why — this can happen when trying to innovate new ideas or introduce unfamiliar design concepts.
Visual Affordance Testing
During a visual affordance test, users are first asked to circle the elements they believe to be clickable, and then to circle the elements they don’t believe to be clickable. With these insights we can fine-tune the clickability and tapability of interactive elements.
Below: screens copied from Whimsical to Freehand, where users can draw on them.
Brand Perception Testing
Brand testing helps us evaluate our brand message.
While this blurs the lines between UX and marketing, branding does influence our decision to interact with a business, especially when trust, security, and legalities are concerned. This test is used to identify the feelings aroused from a set of mockups, where users are asked to circle adjectives that best describe the brand, and from this we can then decipher if users are associating the brand with the desired attributes.
Often enough, this is done with a simple survey.
Perception tests can be both formative and summative depending on whether we’re showing users a mockup, or simply a moodboard with various abstract visual elements.
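If the adjective survey is run with more than a handful of testers, tallying the answers by hand gets tedious. Here is a small Python sketch of how the tally might be done; the adjectives, the desired attributes, and the `adjective_share` helper are made up for the example.

```python
from collections import Counter

# Hypothetical target attributes and survey responses (adjectives each tester circled).
desired = {"trustworthy", "friendly", "modern"}
responses = [
    ["trustworthy", "modern", "playful"],
    ["friendly", "modern"],
    ["cold", "modern", "trustworthy"],
    ["playful", "friendly"],
]


def adjective_share(responses):
    """Return the fraction of testers who circled each adjective."""
    # set() guards against a tester circling the same adjective twice.
    counts = Counter(adj for circled in responses for adj in set(circled))
    return {adj: n / len(responses) for adj, n in counts.items()}


share = adjective_share(responses)
for adj in sorted(desired):
    print(f"{adj}: {share.get(adj, 0):.0%} of testers")

# Adjectives that testers chose but that we were not aiming for.
off_brand = sorted(adj for adj in share if adj not in desired)
print("Off-brand adjectives chosen:", off_brand)
```

A desired attribute that few testers circle, or an off-brand adjective that many do, is a sign that the mockups are not yet conveying the intended brand message.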
Free Exploration Testing
A free exploration test is a summative test where the user has 5 minutes to explore freely while speaking aloud. From this we can identify any flaws not already identified.
Eye-Tracking
Interestingly, usability testing can also be conducted without users, using what’s known as eye-tracking software. But let’s set one thing straight: eye-tracking by itself is not effective, because without talking to users we’re missing context, intent, and emotion. That being said, eye-tracking can reveal enough insights to raise important questions, which we can then ask during some of the usability tests above.
Eye-tracking tools offer a number of useful features: heatmaps (which track where users move their mice), scrollmaps (which track where users scroll to and how long they stop to look around), and clickmaps (which track where users actually tap/click).
While eye-tracking software doesn’t reveal solutions, it can set us in the right direction.
Fullstory, Hotjar, and Crazy Egg are three terrific tools for eye-tracking.
Although you’ll want to test for usability often and repeatedly, when you’re sure that your wireframes are offering the best user experience, it’s time to mock up the final design using a UI design tool. When you’re ready, read our UI design tool comparison.
This guide forms part of our ebook, A Beginner’s Guide to Designing UX.