You Can't Make Good Decisions with Bad Data

I think a critical lesson of the Lean Startup movement is that you have to learn quickly.

The “quickly” part of that lesson can lead to a culture of “good enough.” Your features should be good enough to attract some early adopters. Your design should be good enough to be usable. Your code should be good enough to make your product functional.

While this might drive a lot of perfectionists nuts, I’m all for it. Good enough means that you can spend your time perfecting and polishing only the parts of your product that people care about, and that means a much better eventual experience for your users. It may also mean that you stay in business long enough to deliver that experience.

I think, though, that there’s one part of your product where the standard for “good enough” is a whole lot higher: Data. Data are different.

You Can’t Make Good Decisions With Bad Data

The most important reason to do good research is that it can keep you from destroying your startup. I’m not being hyperbolic here. Bad data can ruin your product.

Imagine for a moment an a/b testing system that randomly returned the wrong test winner 30% of the time. It would be tough to make decisions based on that information, wouldn’t it? How would you know if you were choosing the right experiment branch?
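That failure mode is easier to stumble into than it sounds: an underpowered test behaves exactly like a system that lies to you. Here's a minimal simulation of my own (the conversion rates and sample size are invented) showing how often a small test crowns the wrong branch:

    import random

    def observed_winner(p_a, p_b, visitors_per_branch):
        """Simulate one small A/B test and return the branch that merely *looks* better."""
        conversions_a = sum(random.random() < p_a for _ in range(visitors_per_branch))
        conversions_b = sum(random.random() < p_b for _ in range(visitors_per_branch))
        return "A" if conversions_a >= conversions_b else "B"

    # Branch B truly converts better (5.5% vs. 5.0%), but each branch gets only 200 visitors.
    trials = 10_000
    wrong = sum(observed_winner(0.05, 0.055, 200) == "A" for _ in range(trials))
    print(f"Picked the wrong winner in {wrong / trials:.0%} of simulated tests")

Run it and the wrong branch wins far more often than you'd hope, which is exactly the kind of bad data this section is about.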

Qualitative research can be just as bad. I can’t tell you how many founders have spent time and money talking to potential customers and then wondered why nobody used their product. Nine times out of ten, they were talking to the wrong people, asking the wrong questions, or using terrible interview techniques.

I had one person tell me, “bad data are better than no data,” but I strongly disagree here. After all, if I know I don’t have any data, I can go do some research and learn something.

But if I have some bad data, I think I already know the answers. Confirmation bias will make it even harder for me to unlearn that bad information. I’m going to stop looking and start acting on that information, and that may influence all of my product decisions.

If I “know” that all of my users are left-handed, I can spend an awful lot of time building and throwing out features for left-handed people before realizing that what I got wrong was the original premise. And, of course, that problem is made even worse if I’m not getting good information about how the features are actually performing.

You Have To Keep Doing It

Unlike any given feature or piece of code, collecting data is guaranteed to be part of your process for the life of your startup.

One of the best arguments for building minimum viable products and features is that you might just throw them out once you’ve learned something from them (like that nobody wants what you built).

This isn’t true of collecting data. Obviously you may change the way you collect data or the types of data you collect, but you’re going to keep doing it, because there’s simply no other way to make informed decisions.

Because this is something that you know is absolutely vital to your company, it’s worth getting it right early.

Data Collection Is Not a Mystery

Most of your product development is going to be a mystery. That’s the nature of startups.

You’ve got a new product in a new market, possibly with new technology. You have to do a lot of digging in order to figure out what you should be building. There’s no guidebook telling you exactly what features your revolutionary new product should have.

That’s not true of gathering data. There is a ton of useful, pertinent information about the right way to do both qualitative and quantitative research. There are workshops and courses you can take on how to not screw up user interviews. There are coaches you can hire to get you trained in gathering all sorts of data. There are tools you can drop in to help you do a/b testing and funnel tracking. There are blogs you can read written by people who have already made mistakes so that you don’t have to make the same ones. There is a book called Lean Analytics that pretty much lays it out for you.

You don’t have to take advantage of all of these things, but you also don’t have to start from scratch. Taking a little time to learn about the tools and methods already available to you gives you a huge head start.

Good Data Take Less Time Than Bad Data

Here’s the good news: good data actually take less time to collect than bad data. Sure, you may have to do a little bit of upfront research on the right tools and methods, but once you’ve got those down, you’re going to move a hell of a lot faster.

For example, customer development interviews go much more quickly when you’re asking the right questions of the right people. You don’t have to talk to nearly as many users when you know how to not lead them and to interpret their answers well. Observational and usability research becomes much simpler when you know what you’re looking for.

The same is true for quantitative data collection. Your a/b tests won’t seem nearly so random when you’re sure that the information in the system is correct. You won’t have to spend as much time figuring out what’s going on with your experiments if you trust your graphs.

Good Data Do Not Mean Complete Data

I do want to make one thing perfectly clear: the quest for good data should be more about avoiding bad data than it is about making sure you have every scrap of information available.

If you don’t have all the data, and you know you don’t have all the data, that’s fine. You can always go out and do more research and testing later. You just don’t want to put yourself into the situation where you have to unlearn things later.

You don’t have to have all the answers. You just have to make sure you don’t have any wrong answers. And you do that by setting the bar for “good enough” pretty damn high on your data collection skills.



Want more information like this? 


My new book, UX for Lean Startups, will help you learn how to do better qualitative and quantitative research. It also includes tons of tips and tricks for better, faster design. 

Combining Qualitative & Quantitative Research


Designers are infallible. At least, that’s the only conclusion that I can draw, considering how many of them flat out refuse to do any sort of qualitative or quantitative testing on their product. I have spoken with designers, founders, and product owners at companies of all sizes, and it always amazes me how many of them are so convinced that their product vision is perfect that they will come up with the most inventive excuses for not doing any sort of customer research or testing. 

Before I share some of these excuses with you, let’s take a look at the types of research I would expect these folks to be doing on their products and ideas.

Quantitative Research

When I say quantitative research in this context, I’m talking about a/b testing, product analytics, and metrics - things that tell you what is happening when users interact with your product. These are methods of finding out, after you’ve shipped a new product, feature, or change, exactly what your users are doing with it. 

Are people using the new feature once and then abandoning it? Are they not finding the new feature at all? Are they spending more money than users who don’t see the change? Are they more likely to sign up for a subscription or buy a premium offering? These are the types of questions that quantitative research can answer. 

For a simple example, if you were to design a new version of a landing page, you might run an a/b test of the new design against the old design. Half of your users would see each version, and you’d measure to see which design got you more registered users or qualified leads or sales or any other metric you cared about.
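As a rough sketch of what that readout looks like in code (the event-log format and the numbers here are assumptions of mine, not from any real system), counting conversions per variant is all there is to it:

    # Each logged event: (user_id, variant shown, did the visit convert?)
    events = [
        ("u1", "old", False), ("u2", "new", True),  ("u3", "old", True),
        ("u4", "new", True),  ("u5", "old", False), ("u6", "new", False),
    ]

    totals, conversions = {}, {}
    for _user_id, variant, converted in events:
        totals[variant] = totals.get(variant, 0) + 1
        conversions[variant] = conversions.get(variant, 0) + int(converted)

    for variant in totals:
        print(f"{variant}: {conversions[variant] / totals[variant]:.0%} of visitors converted")

In practice you'd pull the events from your analytics tool and check that the difference is statistically significant before declaring a winner.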

Qualitative Research

By qualitative testing, I mean the act of watching people use your product and talking to them about it. I don’t mean asking users what you should build. I just mean observing and listening to your users in order to better understand their behavior. 

You might do qualitative testing before building a new feature or product so that you can learn more about your potential users’ behaviors. What is their current workflow? What is their level of technical expertise? What products are they already using? You might also do it once your product is in the hands of users in order to understand why they’re behaving the way they are. Do they find something confusing? Are they getting lost or stuck at a particular point? Does the product not solve a critical problem for them? 

For example, you might find a few of your regular users and watch them with your product in order to understand why they’re spending less money since you shipped a new feature. You might give them a task in order to see if they could complete it or if they got stuck. You might interview them about their usage of the new feature in order to understand how they feel about it. 


Excuses, Excuses

While it may seem perfectly reasonable to want to know what your users are really doing and why they are doing it, a huge number of designers seem really resistant to performing these simple types of research or even listening to the results. I don’t know why they refuse to pay any attention to their users, but I can share some of the terrible excuses they’ve given me. 


A/B Testing is Only Good for Small Changes

I hear this one a lot. There seems to be a misconception that a/b testing is only useful for things like button color and that by doing a/b testing you’re only ever going to get small changes. The argument goes something like, “Well, we can only test very small things and so we will test our way to a local maximum without ever being able to really make an important change to our user experience.”

This is simply untrue.

You can a/b test anything. You can show two groups of users entirely different experiences and measure how each group behaves. You can hide whole features from users. You can change the entire checkout flow for half the people buying things from you. You can test a brand new registration or onboarding system. And, of course, you can test different button colors, if that is something that you are inclined to do.

The important thing to remember here is that a/b testing is a tool. Itʼs agnostic about what youʼre testing. If youʼre just testing small changes, youʼll only get small changes in your product. If, on the other hand, you test big things - major navigation changes, new features, new purchasing flows, completely different products - then youʼll get big changes. And, more importantly, you’ll know how they affected your users. 


Quantitative Testing Leads to a Confused Mess of an Interface

This is one of those arguments that has a grain of truth in it. It goes something like, “If we always just take the thing that converts best, we will end up with a confusing mess of an interface.”

Anybody who has looked at Amazonʼs product pages knows the sort of thing that a/b testing can lead to. They have a huge amount of information on each screen, and none of it seems particularly attractive. On the other hand, they rake in money.

Itʼs true that when youʼre doing lots of a/b testing on various features, you can wind up with a weird mishmash of things in your product that donʼt necessarily create a harmonious overall design. You can even wind up with features that, while they improve conversion on their own, end up hurting conversion when they’re combined. 

As an example, letʼs say youʼre testing a product detail page. You decide to run several a/b tests simultaneously for the following new features:
  • customer photos
  • comments
  • ratings
  • extended product details
  • shipping information
  • sale price
  • return info

Now, letʼs imagine that each one of those items, in its own a/b test, increases conversion by some small but statistically significant margin. That means you keep all of them. Now youʼve got a product detail page with a huge number of things on it. You might, rightly, worry that the page is becoming so overwhelming that youʼll start to lose conversions.

Again, this is not the fault of a/b testing – or in this case, a/b/c/d/e testing. This is the fault of a bad test. You see, itʼs not enough that you run an a/b test. You have to run a good a/b test. In this case, just because the addition of a particular feature to your product page improved conversions doesn’t mean that adding a dozen new features to your product page will increase your conversion. 

In this instance, you might be better off running several a/b tests serially. In other words, add a feature, test it, and then add another and test. This way you’ll be sure that every additional feature is actually improving your conversion. Alternatively, you could test a few different versions of the page with different combinations of features to see which converts best. 
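Here's a rough sketch of the serial version of that idea. The run_ab_test function is a stand-in for whatever experiment framework you actually use; here it just fakes a measured lift so the loop runs end to end:

    import random

    def run_ab_test(current_page, new_feature):
        """Placeholder for a real experiment: control page vs. page plus the new feature.
        Pretend it returns the measured lift in conversion (fake numbers here)."""
        return random.uniform(-0.02, 0.03)

    candidate_features = ["customer_photos", "comments", "ratings", "shipping_info"]
    page = []
    for feature in candidate_features:
        lift = run_ab_test(page, feature)
        if lift > 0:  # keep a feature only if it helps the page it will actually live on
            page.append(feature)
            print(f"kept {feature} ({lift:+.1%} on the combined page)")
        else:
            print(f"dropped {feature} ({lift:+.1%})")

The point of the structure is that each new feature is tested against the page as it actually exists, combination effects included, rather than against a control that no user will ever see.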


A/B Testing Takes Away the Need For Design

For some reason, people think that a/b testing means that you just randomly test whatever crazy shit pops into your head. They envision a world where engineers algorithmically generate feature ideas, build them all, and then just measure which one does best.

This is just absolute nonsense.

A/B testing only specifies that you need to test new designs against each other or against some sort of a control. It says absolutely zero about how you come up with those design ideas.

The best way to come up with great products is to go out and observe users and find problems that you can solve and then use good design processes to solve them. When you start doing testing, youʼre not changing anything at all about that process. Youʼre just making sure that you get metrics on how those changes affect real user behavior.

Letʼs imagine that youʼre building an online site to buy pet food. You come up with a fabulous landing page idea that involves some sort of talking sock puppet. You decide to create this puppet character based on your intimate knowledge of your user base and your sincere belief that what they are missing in their lives is a talking sock puppet. Itʼs a reasonable assumption.

Instead of just launching your wholly re-imagined landing page, complete with talking sock puppet video, you create your landing page and show it to only half of your users, while the rest of your users are stuck with their sad, sock puppet-less version of the site. Then you look to see which group of users bought more pet food. At no point did the testing process have anything to do with the design process. 

Itʼs really that simple. Nothing about a/b testing determines what youʼre going to test. A/B testing has literally nothing to do with the initial design and research process. 

Whatever youʼre testing, you still need somebody who is good at creating the experiences youʼre planning on testing against one another. A/B testing two crappy experiences does, in fact, lead to a final crappy experience. After all, if youʼre looking at two options that both suck, a/b testing is only going to determine which one sucks less.

Design is still incredibly important. It just becomes possible to measure designʼs impact with a/b testing.


There’s No Time to Usability Test

When I ask people whether they’ve done usability testing on prototypes of major changes to their products, I frequently get told that there simply wasn’t time. It often sounds something like, “Oh, we had this really tight deadline, and we couldn’t fit in a round of usability testing on a prototype because that would have added at least a week, and then we wouldn’t have been able to ship on time.” 

The fact is you don't have time NOT to usability test. As your development cycle gets farther along, major changes get more and more expensive to implement. If you're in an agile development environment, you can make updates based on user feedback quickly after a release, but in a more traditional environment, it can be a long time before you can correct a big mistake, and that spells slippage, higher costs, and angry development teams. Even in agile environments, it’s still faster to fix things before you write a lot of code than after you have pissed off customers who are wondering why you ruined an important feature that they were using. 

I know you have a deadline. I know it's probably slipped already. It's still a bad excuse for not getting customer feedback during the development process. You're just costing yourself time later. I’ve never known good usability testing to do anything other than save time in the long run on big projects.


Qualitative Research Doesn’t Work Because Users Don’t Know What They Want

This is possibly the most common argument against qualitative research, and it’s particularly frustrating, because part of the statement is quite true. Users aren’t particularly good at coming up with brilliant new ideas for what to build next. Fortunately, that doesn’t matter. 

Let’s make this perfectly clear. Qualitative research is NOT about asking people what they want. At no point do we say, “What should we build next?” and then relinquish control over our interfaces to our users. People who do this are NOT doing qualitative research. 

Qualitative research isn’t about asking people what they want and giving it to them. Qualitative research is about understanding the needs and behaviors of your users. It’s about really knowing what problem you’re solving and for whom.

Once you understand what your users are like and what they want to do with your product, it’s your job to come up with ways to make that happen. That’s the design part. That’s the part that’s your job.


It’s My Vision - Users Will Screw it Up

This can also be called the "But Steve Jobs doesn't listen to users..." excuse. 

The fact is, understanding what your users like and don't like about your product doesn't mean giving up on your vision. You don't need to make every single change suggested by your users. You don't need to sacrifice a coherent design to the whims of a user test. You don’t even need to keep a design just because it converts better in an a/b test. 

What you do need to do is understand exactly what is happening with your product and why. And you can only do that by gathering data. The data can help you make better decisions, but they don’t force you to do anything at all.


Design Isn’t About Metrics

This is the argument that infuriates me the most. I have literally heard people say things like, “Design can’t be measured, because design isnʼt about the bottom line. Itʼs all about the customer experience.”

Nope.

Wouldnʼt it be a better experience if everything on Amazon were free? Be honest! It totally would. 

Unfortunately, it would be a somewhat traumatic experience for the Amazon stockholders. You see, we donʼt always optimize for the absolute best user experience. We make tradeoffs. We aim for a fabulous user experience that also delivers fabulous profits.

While itʼs true that we donʼt want to just turn our user experience design over to short term revenue metrics, we can vastly improve user experience by seeing which improvements and features are most beneficial for both users and the company.

Design is not art. If you think that thereʼs some ideal design that is completely divorced from the effect itʼs having on your companyʼs bottom line, then youʼre an artist, not a designer. Design has a purpose and a goal, and those things can be measured.


So, What’s the Right Answer?

If you’re all out of excuses, there is something that you can do to vastly improve your product. You can use quantitative and qualitative data together. 

Use quantitative metrics to understand exactly what your users are doing. What features do they use? How much do they spend? Does changing something big have a big impact on real user behavior?

Use qualitative research to understand why your users do what they do. What problems are they trying to solve? Why are they dropping out of a particular task flow when they do? Why do they leave and never come back?

Let’s look at an example of how you might do this effectively. First, imagine that you have a payment flow in your product. Now, imagine that 80% of your users are not getting through that payment flow once they’ve started. Of course, you wouldn’t know that at all if you weren’t looking at your metrics. You also wouldn’t know that the majority of people are dropping out in one particular place in the flow.
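The funnel arithmetic behind that kind of discovery is simple. Here's a sketch with invented step names and counts; the overall completion rate and the step-to-step continuation rates fall straight out of the event counts your analytics tool already records:

    # Hypothetical counts of users reaching each step of the payment flow.
    funnel = [
        ("start_checkout", 10_000),
        ("enter_shipping",  7_400),
        ("enter_payment",   6_900),
        ("confirm_order",   2_000),  # the big drop happens right before this step
    ]

    started = funnel[0][1]
    print(f"Overall completion: {funnel[-1][1] / started:.0%}")
    for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
        print(f"{step} -> {next_step}: {next_count / count:.0%} continue")

With numbers like these, the metrics tell you that 80% of the people who start never finish and that almost all of the damage happens at one step, which is exactly the WHAT that the qualitative work then explains.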

Next, imagine that you want to know why so many people are getting stuck at that one place. You could do a very simple observational test where you watch four or five real users going through the payment flow in order to see if they get stuck in the same place. When they do, you could discuss with them what stopped them there. Did they need more information? Was there a bug? Did they get confused?

Once you have a hypothesis about what’s not working for people, you can make a change to your payment flow that you think will fix the problem. Neither qualitative nor quantitative research tells you what this change is. They just alert you that there’s a problem and give you some ideas about why that problem is happening. 

After you’ve made your change, you can run an a/b test of the old version against the new version. This will let you know whether your change was effective or if the problem still exists. This creates a fantastic feedback loop of information so that you can confirm whether your design instincts are functioning correctly and you’re actually solving user problems. 

As you can hopefully see from the example, nobody is saying that you have to be a slave to your data. Nobody is saying that you have to turn your product vision or development process over to an algorithm or a focus group. Nobody is saying that you can only make small changes. All I’m saying is that using quantitative and qualitative research correctly gives you insight into what your users are doing and why they are doing it. And that will be good for your designs, your product, and your business.



Fucking Ship It Already: Just Not to Everyone At Once

There is a pretty common fear that people have. They’re concerned that if they ship something that isn’t ready, they’ll get hammered and lose all their customers. Startups who have spent many painstaking months acquiring a small group of loyal customers are hesitant to lose those customers by shipping something bad.

I get it. It’s scary. Sorry, cupcake. Do it anyway.

First, your early adopters tend to be much more forgiving of a few misfires. They’re used to it. They’re early adopters. Yours is likely not the first product they’ve adopted early. If you’re feeling uncomfortable, go to the Wayback Machine and look at some first versions of products you use every day. When your eyes stop bleeding, come back and finish this post. I’ll wait.

Still nervous? That’s ok. The lucky thing is that you don’t have to ship your ridiculous first draft of a feature to absolutely everybody at once. Let’s look at a few strategies you can use to reduce the risk.

The Interactive Mockup

A prototype is the lowest-risk way you can get your big change, new feature, or possible pivot in front of real users without ruining your existing product. And you’d be surprised at how often it helps you find easy-to-fix problems before you ever write a line of “real code.”

If you don’t want to build an entire interactive prototype, try showing mockups, sketches, or wireframes of what you’re considering. The trick is that you have to show it to your real, current users.

Get on a screen share with some users and let them poke around the prototype. Whatever you do, never tell them why you made the changes or what the feature is supposed to be for or how awesome it is. You want the experience to be as close as possible to what it would be if you just released the feature into the wild and let the users discover it for themselves.

If your product involves any sort of user generated content, taking the time to include some of the tester’s own content can be extremely helpful. For example, if it’s a marketplace where you can buy and sell handmade stuff, having the user’s own items can make a mockup seem more familiar and orient the user more quickly.

Of course, if there’s sensitive financial data or anything private, make sure to get the user’s permission BEFORE you include that info in their interactive prototype. Otherwise, it’s just creepy.

The Opt In

Another method that works well is the Opt In. While early adopters tend to be somewhat forgiving of changes or new features, people who opt in to those changes are even more so.

Allowing people to opt in to new features means that you have a whole group of people who are not only accepting of change but actively seeking it out. That’s great for getting very early feedback while avoiding the occasional freakout from the small percentage of people who just enjoy screaming, “Things were better before!”

Here’s a fun thing you can learn from your opt in group: If people who explicitly ask to see your new feature hate your new feature, your new feature probably sucks.

The Opt Out

Of course, you don’t only want to test your new features or changes with people who are excited about change. You also want to test them with people who hate change, since they’re the ones who are going to scream loudest.

Once you’re pretty sure that your feature doesn’t suck, you can share it with more people. Just make sure to let them go back to the old way, and then measure the number of people who actually do switch back.

Is it a very vocal 1% that is voting with their opt out? You’re probably ok. Is half of your user base switching back in disgust? You may not have nailed that whole “making it not suck” thing.

The n% Rollout

Even with an opt out, if you’ve got a big enough user base, you can still limit the percentage of users who see the change. In fact, you really should be split testing this thing 50/50, but if you want to start with just 10% to make sure that you don’t have any major surprises, that’s a totally reasonable thing to do.
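Mechanically, an n% rollout is usually nothing more than a deterministic gate in front of the feature. A common sketch (the function and feature names are mine, not any particular library's) hashes the user id so the same person always gets the same experience:

    import hashlib

    def in_rollout(user_id: str, feature: str, percent: int) -> bool:
        """Deterministic n% gate: a given user always lands in the same bucket."""
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100 < percent

    # Start at 10%, then widen the gate as confidence grows.
    variant = "new_flow" if in_rollout("user_42", "new_checkout", percent=10) else "old_flow"
    print(variant)

Because the bucketing is deterministic, you can bump the percentage later without reshuffling who has already seen the feature.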

When you roll a new feature out to a small percentage of your users, just make sure that you know what sorts of things you’re looking for. This is a great strategy for seeing if your servers are going to keel over, for example.

It’s also nice for seeing if that small, randomly selected cohort behaves any differently from the group that doesn’t have the new feature. Is that cohort more likely to make a purchase? Are they more likely to set fire to their computers and swear never to use your terrible product ever again? These are both good things to know.

Do remember, however, that people on the internet talk about things. Kind of a lot. If you have any way at all for your users to be in contact with one another, people will find out that their friends are seeing something different. This can work for or against you. Just figure out who’s whining the loudest about being jealous of the other group, and you’ll know whether to continue the rollout. What you want to hear is, “Why don’t I have New New New New New Thing, yet?” and not “Just be thankful that they haven’t forced the hideous abomination on you. Then you will have to set your computer on fire.”

The New User Rollout

Of course, if you’re absolutely terrified of your current user base (and you’d be surprised at how many startups seem to be), you can always release the change only to new users.

This is nice, because you get two completely fresh cohorts where the only difference is whether or not they’ve seen the change. It’s a great way to do A/B testing.

On the other hand, if it’s something that’s supposed to improve things for retained users or users with lots of data, it can take a really long time to get enough information from this. After all, you need those new cohorts to turn into retained users before seeing any actual results, and that can take months.

Also, whether or not new users love your changes doesn’t always predict whether your old users will complain. Your power users may have invested a lot of time and energy into setting up your product just the way they want it, and making major changes that are better for new folks doesn’t always make them very happy.

In the end, you need to make the decision whether you’ll have enough happy new users to offset the possibly angry old ones. But you’ll probably need to make that decision about a million more times in the life of your startup, so get used to it.

So, are you ready to fucking ship it, already? Yes. Yes, you are. Just don't ship it to everybody all at once.


How Metrics Can Make You a Better Designer

I have another new article in Smashing Magazine's UX section: How Metrics Can Make You a Better Designer.

Here's a little sample:

Metrics can be a touchy subject in design. When I say things like, “Designers should embrace A/B testing” or “Metrics can improve design,” I often hear concerns.

Many designers tell me they feel that metrics displace creativity or create a paint-by-numbers scenario. They don’t want their training and intuition to be overruled by what a chart says a link color should be.

These are valid concerns, if your company thinks it can replace design with metrics. But if you use them correctly, metrics can vastly improve design and make you an even better designer.


Read the rest here >

Why Your Test Results Don't Add Up and What To Do About It

Check out my guest blog post for KISSmetrics: Why Website Test Results Don’t Always Add Up & What To Do About It!

Here's a little sample:

If you do enough A/B testing, I promise that you will eventually have some variation of this problem:

You run a test. You see a 10% increase in conversion. You run a different, unrelated test. You see a 20% increase in conversion. You roll both winning branches out to 100% of your customers. You donʼt see a 30% increase in conversion.

Why? In every world Iʼve ever inhabited, 10 plus 20 equals 30, right? Youʼve proven that both changes youʼve made are improvements. Why arenʼt you seeing the expected overall increase in conversions when you roll them both out?


Read the Rest at KISSmetrics.


Breaking the Rules: A UX Case Study

Recently, I was lucky enough to be featured in Smashing Magazine's brand new UX section! Smashing is already a fabulous resource for web design and coding, and I think it's going to be a great place to learn about user experience.

You should read my first article, Breaking the Rules: A UX Case Study.

Here's a little something to get you started:

I read a lot of design articles about best practices for improving the flow of sign-up forms. Most of these articles offer great advice, such as minimizing the number of steps, asking for as little information up front as possible, and providing clear feedback on the status of the user’s data.

If you’re creating a sign-up form, you could do worse than to follow all of these guidelines. On the other hand, you could do a lot better.

Design guidelines aren’t one size fits all. Sometimes you can improve a process by breaking a few rules. The trick is knowing which rules to break for a particular project.


Read the rest of the article!

Hypothesis Generation vs. Validation

A lot of people ask me what sort of research they should be doing on their products. There are a lot of factors that go into deciding which sort of information you should be getting from users, but it pretty much boils down to one question: “What do you want to learn?”

Today, I’m going to explore one of the many ways you can go about looking at this: Hypothesis Generation vs. Hypothesis Validation. Don’t worry, it’s not as complicated as I’ve made it sound.

What is Hypothesis Generation

In a nutshell, hypothesis generation is what helps you come up with new ideas for what you need to change. Sure, you can do this by sitting around in a room and brainstorming new features, but reaching out and learning from your users is a much faster way of getting the right data.

Imagine you were building a product to help people buy shoes online. Hypothesis generation might include things like:

  • Talking to people who buy shoes online to explore what their problems are
  • Talking to people who don’t buy shoes online to understand why
  • Watching people attempt to buy shoes both online and offline in order to understand what their problems really are rather than what they tell you they are
  • Watching people use your product to figure out if you’ve done anything particularly confusing that is keeping them from buying shoes from you

As you can see, you can do hypothesis generation at any point in the development of your product. For example, before you have any product at all, you need to do research to learn about your potential users’ habits and problems. Once you have a product, you need to do hypothesis generation to understand how people are using your product and what problems you’ve caused.

To be clear, the research itself does not generate hypotheses. YOU do that. The goal is not to just go out and have people tell you exactly what they want and then build it. The goal is to gain an understanding of your users or your product to help you think up clever ideas for what to build next.

Good hypothesis generation almost always involves qualitative research. At some point, you need to observe people or talk to people in order to understand them better.

However, you can sometimes use data mining or other metrics analysis to begin to generate a hypothesis. For example, you might look at your registration flow and notice a severe drop-off halfway through. This might give you a clue that you have some sort of user problem halfway through your registration process that you might want to look into with some qualitative research.

What is Hypothesis Validation

Hypothesis validation is different. In this case, you already have an idea of what is wrong, and you have an idea of how you might possibly fix it. You now have to go out and do some research to figure out if your assumptions and decisions were correct.

For our fictional shoe-buying product, hypothesis validation might look something like:

  • Standard usability testing on a proposed new purchase flow to see if it goes more smoothly than the old one
  • Showing mockups to people in a particular persona group to see if a proposed new feature appeals to that specific group of people
  • A/B testing of changes to see if a new feature improves purchase conversion

Hypothesis validation also almost always involves some sort of tangible thing that is getting tested. That thing could be anything from a wireframe to a prototype to an actual feature, but there’s something that you’re testing and getting concrete data about.

You can use both quantitative and qualitative data to validate a hypothesis, but you have to choose carefully to make sure you’re testing the right thing. In fact, sometimes a combination of the two is most effective. I’ve got some information on choosing the right type of test in my post Qual vs. Quant: When to Listen and When to Measure.

Types of Research

Why is this distinction between generation and validation important? Because figuring out whether you’re generating hypotheses or validating them is necessary for deciding which type of research you want to do.

Want to understand why nobody is registering for your site? Generate some hypotheses with observational testing of new users. Want to see if the mockups for your new registration flow are likely to improve matters? Validate your hypothesis with straight usability testing of a prototype.

These aren’t the only factors that go into determining the type of research necessary for your stage of product development, but they’re an important part of deciding how to learn from your users.


Designers Need to A/B Test Their Designs

The other day, I posted something I strongly believe on Twitter. A few people disagreed. I’d like to address the arguments, and I’d love to hear feedback and counter-arguments in the comments where you have more than 140 characters to tell me I’m wrong.

My original tweet was, “I don't trust designers who don't want their designs a/b tested. They're not interested in knowing if they were wrong.”

Here are some of the real responses that I got on Twitter, along with my longer-form replies.

“There’s a difference between A/B testing (public) and internally deciding. Design is also a matter of taste.”

I agree. There is a big difference between A/B testing in public and internally deciding. That’s why I’m such a huge fan of A/B testing. You can debate this stuff for weeks, and often it’s a huge waste of time.

When you’re debating design internally, what you should be asking is, “Which of these designs will be better for the business and users?” A/B testing tells you conclusively which side is right. Debate over!

Ok, there’s the small exception of short term vs. long term effects, which is addressed later, but in general, it’s more definitive than the opinion of the people in the room.

With regard to the “matter of taste,” that’s both true and false. Sure, different people like different designs. What you’re saying by refusing to A/B test your designs is that your taste as a designer should always trump that of the majority of your users. As long as you like your design, you don’t care whether users agree with you.

If you want your design aesthetic to override that of your users, you should be an artist. I love art. I even, very occasionally, buy some of it.

But I pay for products all the time, and I tend to buy products that I think are well designed, not necessarily ones where the designer thought they were well designed.

“If Apple had done A/B tests for the iPod in 2001 with a user-replaceable battery, that version would’ve likely won—initially.”

Honestly, it still might win. Is taking your iPod to the Apple store when the battery dies really a feature? No! It’s a design tradeoff. They couldn’t create something with the other design elements they wanted that still had a replaceable battery. That’s fine. 


But all other things about the iPod being totally equal, wouldn’t you buy the one where you could replace the battery yourself? I would. The key there is the phrase “totally equal.”

“Seeing far into the future of technology is not something consumers are particularly great at.”

I feel like the guy who made this argument was confusing A/B testing with bad qualitative testing or just asking users what they would like to see in a product.

This isn’t what A/B testing does. A/B testing measures actual user behavior right now. If I make this change, will they give me more money? It has literally nothing to do with asking users to figure out the future of technology.

“A/B testing has value but shouldn't be litmus test for designer or a design”

Really? What should be the litmus test for a designer or a design if not, “does this change or set of changes actually improve the key metrics of my company”?

In the end, isn’t that the litmus test for everybody in a company? Are you contributing to the profitability of the business in some way?

If you have some better way of figuring out if your design changes are actually improving real metrics, I’d love to hear about it. We can make THAT the litmus test for design.

“Data is valuable but must be interpreted. Doesn't "prove" wrongness or rightness. Designer still has judgment.”

I agree with the first sentence. Data certainly must be interpreted. I even agree that certain design changes may hurt certain metrics, and that can be ok if they’re improving other metrics or are shown to improve things in the long run.

But the only way to know if your overall design is actually making things better for your users is by scientifically testing it against a control.

If your overall design changes aren’t improving key metrics, where’s the judgement there? If you release something that is meant to increase the number of signups and it decreases the number of signups, I think that pretty effectively “proves wrongness.”

The great thing about A/B testing is that you know when this happens.

“Is it the designers fault, surely more appropriate to an IA? After all the IA should dictate the feel/flow.”

First off, I don’t work for companies that are big enough to draw a distinction between the two, but I’m sure there’s enough blame to go around.

Secondly, I think that everybody in an organization has the responsibility to improve key metrics. If you think that your work shouldn’t increase revenue, retention, or other numbers you want higher, why should you be employed?

Design of all kinds is important and can have a huge impact on company profitability. That impact can and should be measured. You don’t get a pass just because you’re not changing flow.

“A/B tests are a snapshot of current variables. They don’t embody nor convey a bigger strategy or long-term vision.”

Also, “That’s only an absolute truth you can rely on if you A/B test for the entire lifespan of the product, which defeats the point.”

These are excellent points, and they are a drawback of A/B testing. It’s sometimes tough to tell what the long term effects of a particular design change are going to be from A/B testing. Also, A/B testing doesn’t easily account for design changes that are a part of a larger design strategy.

In other words, sometimes you’re going to make changes that cause problems with your metrics in the short term, because you strongly believe that it’s going to improve things long term.

However, I believe that you address this by recognizing the potential for problems and designing a better test, not by refusing to A/B test at all.

Just because this particular tool isn’t perfect doesn’t mean we get to fall back on “trust the designers implicitly and never make them check their work.” That doesn’t work out so well sometimes either.

An Argument I Didn’t Hear

There’s one really good argument that I didn’t get, although some of the above tweets touched on it. Sometimes changes that individually test well don’t test well as a whole.

This is a really serious problem with A/B testing because you can wind up with Frankenstein-style interfaces. Each individual decision wins, but the combination is a giant mess.

Again, you don’t address this by not A/B testing. You address it by designing better tests and making sure that all of your combined decisions are still improving things.

How I Really Feel

Look, if I’m hiring for a company that wants to make money (and most of them do), I want my designers to understand how their changes actually affect my bottom line.

No matter how great a designer thinks his or her design is, if it hurts my revenue and retention or other key metrics, it’s a bad design for my company and my users.

Saying you’re against having your designs A/B tested sounds like you’re saying that you just don’t care whether what you’re changing works for users and the company. As a designer, you’re welcome to do that, but I’m not going to work with you.


Qual vs. Quant: When to Listen and When to Measure

I have written about qualitative vs quantitative research before, but I still get a lot of questions about it. To answer some of those questions, I want to do a bit of a deeper dive here and give a few examples to help startups answer the key question.

To be clear, that key question is “when should I use qualitative research, and when should I use quantitative research for the best results?” Another way of looking at this is, “when should I be listening to users, and when should I just be shipping code and looking at the metrics?”

The real answer is that you should do both constantly, but there are times when one is significantly more helpful than the other.

I will continue to repeat my cardinal rule: Quantitative research tells you WHAT your problem is. Qualitative research tells you WHY you have that problem.

Now, let’s look at what that actually means to you when you’re making product decisions.

A One Variable Change

When you’re trying to decide between qualitative and quantitative testing for any given change or feature, you need to figure out how many variables you’re changing.

Here’s a simple example: You have a product page with a buy button on it. You want to see if the buy button performs better if it’s higher on the page without really changing anything else. Which do you do? Qualitative or quantitative?

That’s right, I said this one was simple. There’s absolutely no reason to qualitatively test this before shipping it. Just get this in front of users and measure their actual rate of clicking on the button.

The fact is, with a change this small, users in a testing session or discussion aren’t going to be able to give you any decent information. Hell, they probably won’t even notice the difference. Qualitative feedback here is not going to be worth the time and money it takes to set up interviews, talk to users, and analyze the data.

More importantly, since you are only changing one variable, if user behavior changes, you already have a really good idea WHY it changed. It changed because the CTA button was in a better place. There’s nothing mysterious going on here.
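Once the change ships, the readout is just a comparison of click-through rates, plus a quick check that the difference isn't noise. A minimal sketch of a two-proportion z-test with made-up numbers:

    from math import sqrt

    def z_score(clicks_a, visitors_a, clicks_b, visitors_b):
        """Two-proportion z-test: did moving the buy button really change the click rate?"""
        p_a, p_b = clicks_a / visitors_a, clicks_b / visitors_b
        p_pool = (clicks_a + clicks_b) / (visitors_a + visitors_b)
        se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
        return (p_b - p_a) / se

    # Invented numbers: 4.0% vs. 4.6% click rate on 20,000 visitors per branch.
    z = z_score(800, 20_000, 920, 20_000)
    print(f"z = {z:.2f} ({'significant' if abs(z) > 1.96 else 'keep collecting data'} at ~95%)")

Most a/b testing tools do this math for you; the sketch is just to show that nothing about the readout requires a user interview.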

There’s an exception! In a few cases, you are going to ship a change that seems incredibly simple, and you are going to see an enormous and surprising change in your metrics (either positive or negative). If this happens, it’s worth running some observational tests with something like UserTesting.com where you just watch people using the feature both before and after the change to see if anything weird is happening. For example, you may have introduced a bug, or you may have made it so that the button is no longer visible to certain users.

A Multi-Variable or Flow Change

Another typical design change involves adding an entirely new feature, which may affect many different variables.

Here’s an example: You want to add a feature that allows people to connect with other users of your product. You’ll need to add several new pieces to your interface in order to allow users to do things like find people they know, find other interesting people they don’t know, manage their new connections, and get some value from the connections they’ve made.

Now, you could simply build the feature, ship it, and test to see how it did, much the way you made your single variable change. The problem is that you’ll have no idea WHY it succeeded or failed - especially failed.

Let’s assume that you ship it and find that it hurts retention. You can assume that it was a bad feature choice, but often I find that people don’t use new features not because they hate the concept, but because the features are badly implemented.

The best way to deal with this is to prevent it from happening in the first place. When you’re making large, multi-variable changes or really rearranging a process flow for something that already exists on your site, you’ll want to perform qualitative testing before you ever ship the product.

Specifically, the goal here is to do some standard usability testing with interactive prototypes, so that you can learn which bits are confusing (ps. yes, there are confusing bits, trust me!) and fix them before they ever get in front of users.

Sure, you’ll still do an a/b test once you’ve shipped it, but give that new feature the best possible chance to succeed by first making sure you’re not building something impossible to use.

Deciding What To Build Next

Look, whatever you take from this next part, please do not assume that I’m telling you that you should ask your users exactly what they want and then build that. Nobody thinks that’s the right way to build products, and I’m tired of arguing about it with people who don’t get UCD or Lean UX.

However, you can learn a huge amount from both quantitative and qualitative research when you’re deciding what to build next.

Here’s an example: You have a flourishing social commerce product with lots of users doing lots of things, but you also have 15 million ideas for what you should build next. You need to narrow that down a bit.

The key here is that you want to look at what your users are currently doing with your product and what they aren’t doing with it, and you should do that with both qualitative and quantitative data.

Qualitative Approaches:

  • Watch users with your product on a regular basis. See where they struggle, where they seem disappointed, or where they complain that they can’t do what they want. Those will all give you ideas for iterating on current features or adding new ones.
  • Talk to people who have stopped using your product. Find out what they thought they’d be getting when they started using it and why they stopped.
  • Watch new users with your product and ask them what they expected from the first 15 minutes using the product. If this doesn’t match what your product actually delivers, either fix the product or fix the first time user experience so that you’re fulfilling users’ expectations.

Quantitative Approaches:

  • Look at the features that are currently getting the most use by the highest value customers. Try to figure out if there’s a pattern there and then test other features that fit that pattern.
  • Try a “fake” test by adding a button or navigation element that represents the feature you’re thinking of adding, and then measure how many people actually click on it. Instead of implementing an entire system for making friends on your site, just add a button that allows people to Add a Friend, and then let them know that the feature isn’t quite ready yet while you tally up the percentage of people who are pressing the button.

Still Don’t Know Which Approach to Take?

What if your change falls between the cracks here? For example, maybe you’re not making a single variable change, but it’s not a huge change either. Or maybe you’re making a pretty straightforward visual design or messaging change that will touch a lot of places in the product but that doesn’t actually affect the user process too much.

As many rules as we try to make, there will still be judgement calls. The best strategy is to make sure that you’re always keeping track of your metrics and observing people using your product. That way, even if you don’t do exactly the right kind of research at exactly the right time, you’ll be much more likely to catch any problems before they hurt your business.


Lean UX - A Case Study

For those very, very few (ok, none) of you who read my blog but don't read Eric Ries's blog, Startup Lessons Learned, I have some exciting news for you. But first, why the hell aren't you reading Eric's blog? You really should. It's great.

I've written a guest post that now appears on the Startup Lessons Learned blog. It's a case study of a UX project I did with the lean startup Food on the Table.

If you're wondering whether design works well with lean startups, I answer that question in the post. Spoiler alert: The answer is 'yes'.

Testing Whether Your Users Will Buy

As you all know by now, I’m a huge proponent of qualitative user testing. I think it’s wonderful for learning about your users and product.

But it’s not a panacea. The fact is, there are many questions that qualitative testing either doesn’t answer well or for which qualitative testing isn’t the most efficient solution. I cover some of them in my A Faster Horse post.

The trick is knowing which questions you can answer by listening to your users and which questions need a different methodology.

Unfortunately, one of the most important questions people want answered isn’t particularly well suited to qualitative testing.

If I Build It, Will They Buy?

I get asked a lot whether users will buy a product if the team adds a specific feature. Sadly, I always have to answer, “I have no idea.”

The problem is, people are terrible at predicting their future behavior. Imagine if somebody were to ask you if you were going to buy a car this year. Now, for some of you, that answer is almost certainly yes, and for others it’s almost certainly no. But for most of us, the answer is, “it depends on the circumstances.”

For some, the addition of a new feature - say, an electric motor - might be the deciding factor, but for many the decision to buy a car depends on a lot of factors, most of which aren’t controlled by the car manufacturer: the economy, whether a current car breaks down, whether we win the lottery or land that job at Goldman Sachs, etc. There are other factors that are under the control of the car company but aren't related to the feature: maybe the new electric car is not the right size or isn't in our price range or isn't our style.

This is true for smaller purchases too. Can you absolutely answer whether or not you will eat a cookie this week? Unless you never eat cookies (I'm told these people exist), it’s probably not something you give a lot of thought to. If somebody were to ask you in a user study, your answer would be no better than a guess and would possibly even be biased by the simple act of having the question asked.

Admit it, a cookie sounds kind of good right now, doesn’t it?

There are other reasons why qualitative testing isn't great at predicting future behavior, but I'm not going to bore you with them. The fact is, it's just not the most efficient or effective method for answering the question, "If I build it, will they come?"

What Questions Can Qualitative Research Answer Well?

Qualitative research is phenomenal for telling you whether your users can do x. It tells you whether the feature makes sense to them and whether they can complete a given task successfully.


To a smaller extent, it can even tell you whether they are likely to enjoy performing the task, and can certainly tell you if they hate it. (Trust me, run a few user tests on a feature they hate. You'll know.)

This obviously has some effect on whether the user will do x, since they’re a lot more likely to do it if it isn’t annoying or difficult. But it’s really better at predicting the negative case (i.e., the user most likely won’t use this feature as you’re currently building it) than the positive one.

Sometimes qualitative research can also give you marginally useful feedback if your users are extremely likely or unlikely to make a purchase. For example, if you were to show them an interactive prototype with the new feature built into it, you might be able to make a decent judgement based on their immediate reactions if all of your participants were exceptionally excited or incredibly negative about a particular feature.

Unfortunately, this, in my experience, is the exception, rather than the rule. It’s rare that a participant in a study sees a new feature and shrieks with delight or recoils in horror. Although, to be fair, I’ve seen both.

What’s the Best Way to Answer This Question?

Luckily, this is a question that can be pretty effectively answered using quantitative data, even before you build a whole new feature. A lot of companies have had quite a bit of success with adding a “fake” feature or doing a landing page test.

For example, one client who wanted to know their expected purchase conversion rate before they did all the work to integrate purchasing methods and accept credit cards simply added a Buy button to each of their product pages. When a customer clicked the button, he was told that the feature was not quite ready, and the click was registered so that the company could tell how many people were showing a willingness to buy.

By measuring the number of people who thought they were making a commitment to purchase, the client was able to estimate more effectively the number of people who would actually purchase if given the option.
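The estimate itself is just arithmetic on the click counts. A sketch with invented numbers:

    # Hypothetical results of the fake Buy button ("fake door") test.
    product_page_views = 25_000   # visitors who saw the fake Buy button
    buy_button_clicks  = 1_150    # visitors who clicked, believing the purchase was real

    intent_rate = buy_button_clicks / product_page_views
    # Treat this as a rough ceiling: some clickers would still abandon a real checkout.
    print(f"Roughly {intent_rate:.1%} of product page visitors showed a willingness to buy")

That number won't be exact, but it's grounded in what people actually did rather than what they predicted they would do.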

The upshot is that the only really effective way to tell if users will do something is to set up a test and watch what they actually do, and that requires a more quantitative testing approach.

Are There Other Questions I Can’t Answer Qualitatively?

Yep. Lots of them. I’ll probably cover them at some point in the future if people are interested. Feel free to ask about other specific questions in the comments, and I’ll try to let you know what sorts of testing methods work best for answering them.


The Dangers of Metrics (Only) Driven Product Development

When I first started designing, it was a lot harder to know what I got right. Sure, we ran usability tests, and we looked generally at things like page counts and revenue before and after big redesigns, but it was still tough to know exactly what design changes were making the biggest difference. Everything changed once I started working with companies that made small, iterative design changes and a/b tested the results against specific metrics.

To be clear, not all the designers I know like working in this manner. After all, it's no fun being told that your big change was a failure because it didn't result in a statistically significant increase in revenue or retention. In fact, if you're a designer or a product owner and are required to improve certain metrics, it can sometimes be tempting to cheat a little.

This leads to a problem that I don't think we talk about enough: Metrics (Only) Driven Product Development.

What Is Metrics (Only) Driven Product Development?

Imagine that you work at a store, and your manager has noticed that when the store is busy, the store makes more money. The manager then tells you that your job is to make the store busier - that's your metric that you need to improve.

You have several options for improving your metric. You could:
  • Improve the quality of the shopping experience so that people who are already in the store want to stay longer
  • Offer more merchandise so that people find more things they want to buy
  • Advertise widely to try to attract more people into the store
  • Sell everything at half off
  • Remove several cash registers in order to make checking out take longer, which should increase the number of people in the store at a time, since it will take people longer to get out
  • Hire people to come hang out in the store
As you can see, all of the above would very likely improve the metric you were supposed to improve. They would all ensure that, for a while at least, the store was quite busy. However, some are significantly better for the overall health of the store than others.



The same thing happens all the time when designing products. If your assigned goal is to increase the number of active users, there are lots of different design changes you could make, but not all of them will be equally effective for improving the actual goal, which is probably increasing the number of people who use the product and generate revenue.

How Does This Happen?

I think the biggest reason that this happens is that people fixate on metrics without understanding the reason behind the numbers. Designers and product owners are then pressured to move a number that represents a particular metric rather than focusing on improving the product.

One company I talked with had this problem with acquisition. The person who was responsible for acquiring new customers was simply given a budget and told to get as many users as possible for that amount of money. Unfortunately, the users that were cheapest to acquire were the least likely to spend money on the site. If, instead of trying to maximize the number of users, he had concentrated on maximizing the number of users who were likely to spend money, he would have acquired fewer people and missed his metric, but he would have increased revenue.

Another company had a design problem. They wanted to redesign their Invite a Friend feature to encourage people to invite more friends. Unfortunately, the "most effective" method of getting people to invite friends was to forcibly spam users' Facebook feeds and make it easy for users to accidentally invite everybody in their address books. While this resulted in more invitations sent, it also vastly increased the number of unhappy customers and decreased the percentage of invitations that were accepted. It also caused the company to be banned from Facebook and put on the spam list of several ISPs. It sure improved that invitation metric, though.

How Should You Avoid It?

There are three ways to avoid this problem, and you should use all of them.

Make sure that you're measuring the right metric.

If you care about revenue (and you should), measure revenue. If you care about retention, measure retention. If you care about page views, you're probably doing something wrong.

Unfortunately, it can be difficult to immediately see the impact of a particular design change on things like revenue and retention, which makes it tempting to use substitutes for the important number.

If you are using a substitute - for example, if you're using something like "customers returning once" as a shorthand for "becoming an active customer" - make sure that the link is actually causal. In other words, if you can increase the number of people who come back once, make sure that that really does lead to an increase in people who become active customers.

Make sure that you're not gaming the metrics.

Paying people to come back to your site may result in more returning customers, but it doesn't necessarily result in more customers paying you. If you cut your prices in half, you may end up selling twice as many items, but you're not making any more money. Make sure that, if you're moving a metric, you understand the second (and third and nth) order effects of whatever change improved the metrics.

Make your customer experience better.

This may seem obvious, but pissing off your customers is a terrible long term strategy, even if it briefly moves a metric. On the other hand, improving your customers' overall experience leads to happy, contented customers who stick around and continue to pay for your service. Over time, that's going to improve all of your metrics.

And always remember, metrics are just shorthand for real customer behaviors that are important to your business. They are a tool to help you understand your product, not a goal to be met at any cost. 

Like the post? Follow me on Twitter.

Please Stop Annoying Your Users

Once upon a time, I worked with a company that was addicted to interstitials. Interstitials, for those of you who don’t know the term, are web pages or advertisements that show up before an expected  content page. For example, the user clicks a link or button and expects to be taken to a news article or to take some action, and instead she is shown a web page selling her something.

Like many damaging addictions, this one started out innocently enough. You see, the company had a freemium product, so they were constantly looking for ways to share the benefits of upgrading to the premium version in a way that flowed naturally within the product.

They had good luck with one interstitial that informed users of a useful new feature that required the user to upgrade. They had more good luck with another that asked the user to consider inviting some friends before continuing on with the product.

Then things got ugly.

Customers could no longer use the product for more than a few minutes without getting asked for money or to invite a friend or to view a video to earn points. Brand new users who didn’t even understand the value proposition of the free version were getting hassled to sign up for a monthly subscription.

Every time I tried to explain that this was driving users away, management explained, “But people buy things from these interstitials! They make us money! Besides, if people don’t want to see them, they can dismiss them.”

How This Affects Metrics

Of course, you know how this goes. Just looking at the metrics from each individual interstitial, it was pretty clear that people did buy things or invite friends or watch videos. Each interstitial did, in fact, make us some money. The problem was that overall the interstitials lost us customers and potential customers by driving away people who became annoyed.

The fact that the users could simply skip the interstitials didn’t seem to matter much. Sure people could click the cleverly hidden “skip” button – provided they could find it – but they had already been annoyed. Maybe just a little. Maybe only momentarily. But it was there. The product had annoyed them, and now they had a slightly more negative view of the company.

Here’s the important thing that the company had to learn: a mildly annoyed user does not necessarily leave immediately. She doesn’t typically call customer service to complain. She doesn’t write a nasty email. She just gets a little bit unhappy with the service. And the next time you do something to annoy her, she gets a little more unhappy with the service. And if you annoy her enough, THEN she leaves.

The real problem is that this kind of churn is often tricky to identify with metrics. It’s a combination of a lot of little things, not one big thing, that makes the user move on, so it doesn’t show up as a giant drop off in a particular place. It’s just a slow, gradual attrition of formerly happy customers as they get more and more pissed off and decide to go elsewhere.

If you fix each annoyance and A/B test it individually, you might not see a very impressive lift, because, of course, you still have dozens of other things that are annoying the user. But over time, when you’ve identified and fixed most of the annoyances, what you will see is higher retention and better word of mouth as your product stops vaguely irritating your users.

Some Key Offenders

I can’t tell you exactly what you’re doing that is slightly annoying your customers, but here are a few things that I’ve seen irritate people pretty consistently over the years:
  • Slowness
  • Too many interstitials
  • Not remembering information - for example, not maintaining items in a shopping cart or deleting the information that a user typed into a form if there is an error
  • Confusing or constantly changing navigation
  • Inconsistent look and feel, which can make it harder for users to quickly identify similar items on different screens
  • Hard to find or inappropriately placed call to action buttons
  • Bad or unresponsive customer service

It’s frankly not easy to fix all of these things, and it can be a leap of faith for companies who want every single change to show a measurable improvement in key metrics. But by making your product less annoying overall, you will end up with happier customers who stick around.

Like the post? Follow me on Twitter!

Also, come hear me speak on Wednesday, Sept. 29th, at Web 2.0 Expo New York. I’ll be talking about how to effectively combine qualitative research, quantitative analytics, and design vision in order to improve your products.

5 Mistakes People Make Analyzing Qualitative Data

My last blog post was about common mistakes that people make when analyzing quantitative data, such as you might get from multivariate testing or business metrics. Today I’d like to talk about the mistakes people make when analyzing and using qualitative data.

I’m a big proponent of using both qualitative and quantitative data, but I have to admit that qualitative feedback can be a challenge. Unlike a product funnel or a revenue graph, qualitative data can be messy and open ended, which makes it particularly tough to interpret.

For the purposes of this post, qualitative information is generated by the following types of activities:
  • Usability tests
  • Contextual Inquiries
  • Customer interviews
  • Open-ended survey questions (e.g., What do you like most/least about the product?)

Insisting on Too Large a Sample

With almost every new client, somebody questions how many people we need for a usability test “to get significant results.” Now, if you read my last post, you may be surprised to hear me say that you shouldn’t be going for statistical significance here. I prefer to run usability tests and contextual inquiries with around five participants. Of course, I prefer running tests iteratively, but that’s another blog post.

Analyzing the data from a qualitative test or even just reading through essay-type answers in surveys takes a lot longer per customer than running experiments in a funnel or looking at analytics and revenue graphs. You get severely diminishing returns from each extra hour you spend asking people the same questions and listening to their answers.

Here’s an example from a test I ran. The client wanted to know all the different pain points in their product so that they could make one big sweep toward the end of the development cycle to fix all the problems. Against my better judgment, we spent a full two weeks running sessions, complete with a moderator, observers, a lab, and all the other attendant costs of running a big test. The trouble was that we found a major problem in the first session that prevented the vast majority of participants from ever finding an entire section of the interface. Since this problem couldn’t be fixed before moving on to the rest of the sessions, we couldn’t actually test a huge portion of the product and had to come back to it later, anyway.

The Fix: Run small, iterative tests to generate a manageable amount of data. If you’re working on improving a particular part of your product or considering adding a new feature, do a quick batch of interviews with four or five people. Then, immediately address the biggest problems that you find. Once you’re done, run another test to find the problems that were being masked by the larger problems. Keep doing this until your product is perfect (i.e., forever). It’s faster, cheaper, and more immediately actionable than giant, statistically significant qualitative tests, and you will eventually find more issues with the same amount of testing time.

It’s also MUCH easier to pick out a few major problems from five hours of testing than it is to find dozens of different problems from dozens of hours of testing. In the end though, you’ll find more problems with the iterative approach.


Extrapolating From Too Small a Sample

I always do this, don’t I? Say one thing, and then immediately warn you not to go too far in the opposite direction. The thing is, I get really tired of running five-person tests and having a product owner only show up for one session and then go off and address whatever problems s/he saw during that one hour. One or two participants aren’t enough to really get a sense of the pattern of problems in your product.

Besides, I have this little rule of thumb I’ve developed for studies. No matter how great your screener or recruiter, on average for every 10 participants you schedule, one will be a no-show, one will be some sort of statistical outlier (intelligence, computer savvy…something), and one will be completely insane. If the product owner happens to show up only for one of the last two types, their perception of the product’s problems will be totally skewed.

I had one product where we interviewed ten people over the course of two tests. Nine of the ten people were wildly confused by the product, but one, who I swear was a ringer, nailed all the tasks in record time. Guess which session the product manager showed up for? Yeah.

The Fix: As the person making the decisions about what changes you should make in your product, you should be attending all or at least most of your user interview sessions, even if you’re not running them yourself. You should also be looking directly at all of your survey data, not just skimming it or reading a high level report. Honestly, if you’re the one making decisions about product direction, then you are the one who most benefits from listening to your users. If you’re not paying attention to the results, then the testing is really just a waste of time.

Look at all your data before drawing conclusions. I mean it.

Trying to Answer Specific Questions

Qualitative data is very bad at answering specific questions like “Which landing page will people like better?” or “How much will people pay for this?” What it’s great for is generating hypotheses that can then be tested with quantitative means.

In more than one test, I’ve had clients ask me to test various different images to use on landing pages to see which one was most appealing. I always explain that they’re better off just doing a split test to see which one does best, but sometimes they insist. Unfortunately, these sorts of preference differences are often very subtle. Since people are not making the decisions consciously, it's very hard for them to explain why they prefer one thing over another. We always end up with a lot of people trying to rationalize their choices, and I rarely trust the results.

The Fix: Use qualitative data to generate hypotheses that you then test quantitatively OR to find major problems in your interface. Don’t try to use qualitative data to get a definitive answer to questions about expected user preferences.

Ignoring Inconvenient Results

Because qualitative testing doesn’t generate hard numbers, it’s easy to let confirmation bias sneak into the analysis. While it might be tough to argue with “this registration flow generated 12% more paying customers than the other one,” it’s pretty easy to discount problems observed in user sessions.

I dealt with a particularly resistant product owner who had an excuse for every single participant’s struggles with the product. One was unusually stupid. Another just didn’t understand the task. Another actually understood it but was, for some reason, actively screwing with us. This went on and on while every single participant had the same problems over and over. Also, the discussion guide, which the product owner and everyone on the team had originally thought was perfectly fair, suddenly became wildly biased and the tasks were judged to be impossible. The problem couldn’t possibly have been with the product!

The Fix: If you are finding fault with all of the participants or the moderator or the questions or the survey, it’s time to get somebody neutral into the room to help determine what is biasing the results. Hint: it’s almost certainly you.

Remember, your customers, moderator, and test participants don’t have a stake in making your product seem worse than it is. You, however, may have an emotional stake in making it better than it actually is. Make sure you’re not ignoring results just because they’re not what you want to hear.

Not Acting on the Data

Why would you even bother to run a test if you’re not going to pay attention to the results? I mean, tests aren’t free. Even running surveys has an associated cost, since you’re interrupting what your user is doing to ask them to help you out. And yet, so many clients do exactly this.

One client I worked with wanted to set up a system where they ran tests every week. They wanted to have a constant stream of users and potential users coming in all the time so that they could stay in contact with their users. I thought this was a fantastic idea, and so I started bringing people in for them. Unfortunately, after a few months, people began to complain that they were hearing the same problems over and over again.

I explained that they were going to continue to hear the same problems over and over again until they fixed the problems. I gave them a list of the major issues that their current and new users were facing. Every once in a while, if I complained loudly enough, they would fix one of the easier problems, and unsurprisingly these changes always positively affected their metrics. And yet, it was always a struggle to get the results from the tests incorporated into the product. I eventually stopped running tests and told them that I would be happy to come back and start again as soon as they had addressed some of the major problems.

The Fix: This one should be simple. If you’re going to spend all that time and money generating data, you should act on the results.

I want your feedback!

Have you had problems interpreting or using your qualitative data, or do you have stories about people in your company who have? Please, share them in the comments section!

Want more? Follow me on Twitter!

Also, if your company is currently working on getting feedback from users, I’d love to hear more about what you are doing and what you’d like to be doing better. Please take this short survey!

5 Big Mistakes People Make When Analyzing User Data

I was trying to write a blog post the other day about getting various different types of user feedback, when I realized that something important was missing. It doesn’t do any good for me to go on and on about all the ways you can gather critical data if people don’t know how to analyze that data once you have it.

I would have thought that a lot of this stuff was obvious, but, judging from my experience working with many different companies, it’s not. All of the examples here are real mistakes I’ve seen made by smart, reasonable, employed people. A few identifying characteristics have been changed to protect the innocent, but in general they were product owners, managers, or director level folks.

This post only covers mistakes made in analyzing quantitative data. At some point in the future, I’ll put together a similar list of mistakes people make when analyzing their qualitative data.

For the purposes of this post, the quantitative data to which I’m referring is typically generated by the following types of activities:
  • Multivariate or A/B testing
  • Site analytics
  • Business metrics reports (sales, revenue, registration, etc.)
  • Large scale surveys

Ignoring Statistical Significance

I see this one all the time. It generally involves somebody saying something like, “We tested two different landing pages against each other. Out of six hundred views, one of them had three conversions and one had six. That means the second one is TWICE AS GOOD! We should switch to it immediately!”

Ok, I may be exaggerating a bit on the actual numbers, but too many people I’ve worked with just ignored the statistical significance of their data. They didn’t realize that even impressive-looking differences can be statistically insignificant, depending on the sample size.

The problem here is that statistically insignificant metrics can completely reverse themselves, so it’s important not to make changes based on results until you are reasonably certain that those results are predictable and repeatable.

The Fix: I was going to go into a long description of statistical significance and how to calculate it, but then I realized that, if you don’t know what it is, you shouldn’t be trying to make decisions based on quantitative data. There are online calculators that will help you figure out if any particular test result is statistically significant, but make sure that whoever is looking at your data understands basic statistical concepts before accepting their interpretation of data.

Also, a word of warning: testing several branches of changes can take a LOT larger sample size than a simple A/B test. If you're running an A/B/C/D/E test, make sure you understand the mathematical implications.
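
To make that concrete, here's a quick sketch of a two-proportion z-test using the made-up landing page numbers above (3 vs. 6 conversions out of 600 views each). It uses only the Python standard library; treat it as an illustration, not a substitute for understanding the statistics.

```python
# A quick two-proportion z-test sketch, using the invented numbers from above.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_z_test(conv_a=3, n_a=600, conv_b=6, n_b=600)
print(f"z = {z:.2f}, p = {p:.2f}")  # roughly z = 1.0, p = 0.32 - nowhere near significant
```

A p-value around 0.3 means a result like this would show up by chance roughly a third of the time, so "twice as good" is meaningless here. And if you are running that A/B/C/D/E test, remember that every extra comparison raises the odds of a false positive, so you'll need a correction (Bonferroni, for example) or a much larger sample.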

Short Term vs. Long Term Effects

Again, this seems so obvious that I feel weird stating it, but I’ve seen people get so excited over short term changes that they totally ignore the effects of their changes in a week or a month or a year. The best, but not only, example of this is when people try to judge the effect of certain types of sales promotions on revenue.

For example, I've often heard something along these lines, “When we ran the 50% off sale, our revenue SKYROCKETED!” Sure it did. What happened to your revenue after the sale ended? My guess is that it plummeted, since people had already stocked up on your product at 50% off.

The Fix: Does this mean you should never run a short term promotion of any sort? Of course not. What it does mean is that, when you are looking at the results of any sort of experiment or change, you should look at how it affects your metrics over time.

Forgetting the Goal of the Metrics

Sometimes people get so focused on the metrics that they forget the metrics are just shorthand for real world business goals. They can end up trying so hard to move a particular metric that they sacrifice the actual goal.

Here’s another real life example: One client decided that, since revenue was directly tied to people returning to their site after an initial visit, they were going to “encourage” people to come back for a second look. This was fine as far as it went, but after various tests they found that the most successful way to get people to return was to give them a gift every time they did.

The unsurprising result was that the people who just came back for the gift didn’t end up actually converting to paying customers. The company moved the “returning” metric without actually affecting the “revenue” metric, which had been the real goal in the first place. Additionally, they now had the added cost of supporting more non-paying users on the site, so it ended up costing them money.

The Fix: Don’t forget the actual business goals behind your metrics, and don’t get stuck on what Eric Ries calls Vanity Metrics. Remember to consider the secondary effects of your metrics. Increasing your traffic comes with certain costs, so make sure that you are getting something other than more traffic out of your traffic increase!

Combining Data from Multiple Tests

Sometimes you want to test different changes independently of one another, and that's often a good thing, since it can help you determine which change actually had an effect on a particular metric. However, this can be dangerous if used stupidly.

Consider this somewhat ridiculous thought experiment. Imagine you have a landing page that is gray with a light gray call to action button. Let's say you run two separate experiments. In one, you change the background color of the page to red so that you have a light gray button on a red background. In another test, you change the call to action to red so that you have a red button on a gray background. Let's say that both of these convert better than the original page. Since you've tested both of your elements separately, and they're both better, you decide to implement both changes, leaving you with...a red call to action button on a red page. This will almost certainly not go well.

The Fix: Make sure that, when you're combining the results from multiple tests, you still go back and test the final outcome against some control. In many cases, the whole is not the sum of its parts, and you can end up with an unholy mess if you don't use some common sense in interpreting data from various tests.

Not Understanding Why Your Metrics Changed

This one just makes me sad. I’ve been in lots of meetings with product owners who described changes in the data for which they were responsible. Notice I said “described” and not “explained.” Product owners would tell me, “revenue increased” or “retention went from 2 months to 1.5 months” or something along those lines. Obviously, my response was, “That’s interesting. Why did it happen?”

You’d be shocked at how many product owners not only didn’t know why their data was changing, but they didn’t have a plan for figuring it out. The problem is, they were generating tons of charts showing increases and decreases, but they never really understood why the changes were happening, so they couldn’t extrapolate from the experience to affect their metrics in a predictable way.

Even worse, sometimes they would make up hypotheses about why the metrics changed but not actually test them. For example, one product owner did a “Spend more than $10 and get a free gift” promo over a weekend. The weekend’s sales were slightly higher than the previous weekend’s sales, so she attributed that increase to the promotion. Unfortunately, a cursory look at the data showed that the percentage of people spending over $10 was no larger than it had been in previous weeks.

On the other hand, there had been far more people on the site than in previous weeks due to seasonality and an unrelated increase in traffic. Based on the numbers, it was extremely unlikely that it was the promotion that increased revenue, but she didn’t understand how to measure whether her changes actually made any difference.

The Fix: Say it with me, "Correlation does not equal causation!" Whenever possible test changes against a control so that you can accurately judge what effect they’re having on specific metrics. If that’s not possible, make sure that you understand ahead of time which changes you are LIKELY to see from a particular change and then judge whether that happened. For example, a successful “spend more than $10 promo” should most likely increase the percentage of orders over $10. 

Also, be aware of other changes within the company so that you can determine whether it was YOUR change that affected your metrics. Anything from a school holiday to an increased ad spend might affect your numbers, so you need to know what to expect.
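
As a concrete example, here is roughly what that sanity check on the "spend more than $10" promo might look like: compare the share of orders over $10 during the promo week against recent baseline weeks. The order amounts below are invented purely for illustration.

```python
# A sketch of the promo sanity check described above: did the "spend $10, get a
# gift" promo actually raise the share of orders over $10? All numbers invented.
def share_over_threshold(order_amounts, threshold=10.0):
    over = sum(1 for amount in order_amounts if amount > threshold)
    return over / len(order_amounts)

baseline_weeks = [[8.5, 12.0, 6.0, 9.0, 14.5], [7.0, 11.0, 9.5, 15.0, 8.0]]  # hypothetical
promo_week = [9.0, 12.5, 7.5, 8.0, 13.0, 9.5, 6.5]                           # hypothetical

baseline_share = sum(share_over_threshold(week) for week in baseline_weeks) / len(baseline_weeks)
promo_share = share_over_threshold(promo_week)

print(f"baseline share of orders > $10: {baseline_share:.0%}")
print(f"promo week share of orders > $10: {promo_share:.0%}")
# If the promo share isn't meaningfully higher (and on real volumes you'd also
# want a significance test), the revenue bump probably came from traffic, not the promo.
```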

I want your feedback!

Have you had problems interpreting your quantitative data, or do you have stories about people in your company who have? Please, share them in the comments section!

Also, if your company is currently working on getting feedback from users, I’d love to hear more about what you are doing and what you’d like to be doing better. Please take this short survey!

Why Your Customer Feedback is Useless

Here’s the scenario: You have a minimum viable product. You’re talking to your users about it. You’re asking them questions, and they’re answering. But for some reason, it’s just not turning into usable information.

You wonder what’s going on. You imagine that perhaps your users suck at giving good feedback or else they don’t have anything useful to say. Maybe, you think in a moment of hopeful delusion, your product is so perfect that it can’t be improved by customer feedback.

While these are all possibilities, the reality is that it’s probably not your customers’ fault. So, if you don’t seem to be getting any good data, what IS the problem? Probably one of the following things:

You’re asking the wrong questions

I’ve written before about asking customers the wrong questions, but in summary, customers are very good at giving you certain kinds of information and very bad at other kinds. For example, users are great at telling you about their problems. They can very easily tell you when something isn’t working or interesting or fun to use. What they suck at is telling you how to fix it.

Customers are great at:
  • Complaining about problems
  • Describing how they currently perform tasks
  • Saying whether or not they like a product
  • Showing you parts of a product that are particularly confusing
  • Comparing one product to another similar product
  • Explaining why they chose a particular method of doing something
Customers are bad at:
  • Predicting their future behavior
  • Predicting what other people will like
  • Predicting whether they’ll pay for something
  • Coming up with innovative solutions to their own or other people’s problems
  • Coming up with brand new ideas for what would make a product more appealing
To take advantage of users’ strengths, ask them things like, “Tell me about your most recent experience using the product and how that went for you.” Or ask them questions like, “What about [competitor product] do you particularly enjoy? What do you hate?” You can even ask questions like, “Of the following 5 features, which would you prefer?” You’ll want to be a lot more careful about listening to their answers to overly broad questions like, “What brand new feature would you like to see implemented?” or “What would make this product more fun to use?”


You’re asking the right questions the wrong way

It takes practice to ask good questions. Sometimes you’re too close to your product to be objective, and other times you don’t have very good moderation skills. Whatever the problem, you need to make sure that you’re being a good interviewer and not biasing the data.

One of the best ways to improve at this (aside from reading the above posts or spending years as a user researcher) is to have somebody objective and honest give you feedback on your interview skills. Get somebody else in the room who will watch you interview and tell you if you’re asking questions well. Make sure they read the above posts first though, so they’ll know what to look for.

You believe everything customers say

Why bother asking customers questions if you’re not going to listen to them? Well, you are going to listen to them, but you’re also going to verify their answers. Users lie. They don’t necessarily mean to, but they do, often because they simply don’t remember most of what typically happens when they use your product.

The easiest solution is to watch them using your product in addition to talking to them about it. Also, follow up with metrics whenever possible. Maybe they all say that they log into your website every day, but their customer history might tell a very different story. As a bonus, by watching people use your product and checking metrics, you will see your customers’ usage habits, which may vary significantly from what you expect.
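
Here's a trivial sketch of that kind of reality check, assuming you can pull each user's login dates out of your logs; the user id, dates, and four-week window below are all invented.

```python
# A tiny sketch of checking "I log in every day" against actual logs.
from datetime import date

login_dates = {
    "user-123": [date(2012, 3, 5), date(2012, 3, 12), date(2012, 3, 26)],  # hypothetical
}

def active_days_per_week(dates, weeks=4):
    # Unique active days divided by the number of weeks in the window.
    return len(set(dates)) / weeks

claimed = 7  # "I use it every day!"
actual = active_days_per_week(login_dates["user-123"])
print(f"claims {claimed} days/week, logs show about {actual:.1f} days/week")
```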

You think you already know the answers

You know that saying, “when all you have is a hammer, everything looks like a nail (or a human head, depending on how sick your friends are)?” When you already have a great idea for a feature in mind, everything a customer says may lead you to believe that the feature is a good idea, even when there might be a much better solution.

It doesn’t matter what people tell you if you think you already know what you’re going to hear. When asking for feedback, you need to stay as neutral as possible, and, if you can’t, get somebody else to do the interviewing for you. Also, having more people in the room always helps, since you can compare notes about what everybody else thought the user said.

You haven’t fixed any of their old problems yet

Sometimes, you just hear the same old problems over and over and over. Do you know why that is? It’s because you haven’t fixed some very big problems! I know this sounds obvious, but I’ve worked with enough companies who completely ignored their data to know that it bears repeating. If customers keep complaining loudly about the same things, YOU SHOULD FIX THOSE THINGS. Otherwise, you’ll soon need to get some new customers.

Once you’ve fixed the big problems that are really annoying your users, you’ll be able to have a lot better discussion with them about the other issues they may be having.

You can't turn information into action items

Ok, this one’s hard. I’ve written a bit about how to improve the ROI on your user research, but there’s more to be said about this topic. The real problem is that a user test can generate hours of video and pages of notes, and it can be tricky to distill that down into a task that can be put on your scrum board and implemented by your engineers.

Here are a few suggestions to improve this:
  • Hold more targeted interviews, tests, surveys, etc. For example, concentrate one series of customer development entirely on your Registration process or your Purchasing flow, and only gather data on that. By focusing, you will simply have less data to comb through, and you’ll be able to go deeper into all the problems with a particular system.
  • Have several people observe interviews in real time and jot down the 5 most important things they heard during the session. Then quickly discuss everybody’s lists after the session to see if there’s consensus. This eliminates the long, costly process of going back after a series of interviews and digging through all those notes and videos.
  • Use A/B testing wherever practical to answer concrete questions, like which content improves customer conversion rates or which page layout works better. This means you won’t have to spend time gathering and sifting through certain types of data, and you can focus on areas where qualitative data is more helpful. Unsurprisingly, I have already written about how to more successfully integrate your qualitative testing with A/B testing to maximize efficiency.

You’re not asking them anything!

Of course, the number one reason that people don’t get good data from their customers is that THEY’RE NOT ASKING FOR IT. All of the above techniques require that you be committed to connecting with your customers on a regular basis and getting their feedback and opinions to make your product better. If you’re not doing that, you’re ignoring a huge amount of nearly free information, and your product will likely suffer for it.

I want to hear from you!

Have you tried all of these things and still feel like you’re not getting the information you want? Is there a problem that I’ve missed that has kept you from getting good feedback? How did you learn how to do good customer development? Let me know in the comments.

Like what you read? Want to read more like it? You should really follow me on Twitter.

A/B and Qualitative User Testing

Recently, I worked with a company devoted to A/B testing. For those of you who aren't familiar with the practice, A/B testing (sometimes called bucket testing or multivariate testing) is the practice of creating multiple versions of a screen or feature and showing each version to a different set of users in production in order to find out which version produces better metrics. These metrics may include things like "which version of a new feature makes the company more money" or "which landing screen positively affects conversion." Overall, the goal of A/B testing is to allow you to make better product decisions based on the things that are important to your business by using statistically significant data.

Qualitative user testing, on the other hand, involves showing a product or prototype to a small number of people while observing and interviewing them. It produces a different sort of information, but the goal is still to help you make better product decisions based on user feedback.

Now, a big part of my job involves talking to users about products in qualitative tests, so you might imagine that I would hate A/B testing. After all, wouldn't something like that put somebody like me out of a job? Absolutely not! I love A/B testing. It's a phenomenal tool for making decisions about products. It is not the only tool, however. In fact, qualitative user research combined with A/B testing creates the most powerful system for informing design that I have ever seen. If you're not doing it yet, you probably should be.

A/B Testing

What It Does Well

A/B testing on its own is fantastic for certain things. It can help you:
  • Get statistically significant data on whether a proposed new feature or change significantly increases metrics that matter - numbers like revenue, retention, and customer acquisition
  • Understand more about what your customers are actually doing on your site
  • Make decisions about which features to cut and which to improve
  • Validate design decisions
  • See which small changes have surprisingly large effects on metrics
  • Get user feedback without actually interacting with users

For example, imagine that you are creating a new check out flow for your website. There is a request from your marketing department to include an extra screen that asks users for some demographic information. However, you feel that every additional step in a check out process represents a chance for users to drop out, which prevents purchases. By creating two flows in production, one with the extra screen and one without, and showing each flow to only half of your users, you can gather real data on how many purchases are completed by members of each group. This allows you to understand the exact impact on sales and helps you decide whether gathering the demographic information is really worth the cost.
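
Mechanically, a test like this just needs a stable way to assign each user to one of the two flows and a way to log checkout starts and completions per flow. Here's a minimal sketch; the arm names and the track function are hypothetical placeholders for whatever your analytics setup actually provides.

```python
# A minimal sketch of stable A/B assignment for the checkout example above.
import hashlib

ARMS = ("checkout_with_demographics", "checkout_without_demographics")

def assign_arm(user_id: str) -> str:
    # Hash the user id so assignment is stable across visits and roughly 50/50.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return ARMS[int(digest, 16) % len(ARMS)]

def track(event: str, user_id: str) -> None:
    # Stand-in for whatever analytics call you actually use.
    print(f"{event} user={user_id} arm={assign_arm(user_id)}")

# Log both events, then compare completion rates per arm once each has enough traffic.
track("checkout_started", "user-123")
track("purchase_completed", "user-123")
```

Hashing the user id keeps each person in the same arm on every visit, which matters if you want clean per-user completion rates.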

Even more appealing, you can get all this user feedback without ever talking to a single user. A/B testing is, by its nature, an engineering solution to a product design problem, which makes it very popular with small, engineering-driven startups. Once the various versions of the feature are released to users, almost anybody can look at the results and understand which option is doing better, so it can all be done without having to recruit or interview test participants.

Of course, A/B testing in production works best on things like web or mobile applications where you can not only show different interfaces to different customers, but where you can also easily switch all of your users to the winning interface without having to ship them a new box full of software or a new physical device. I wouldn't recommend trying it if you're designing, for example, a car.

What It Does Poorly

Now imagine that, instead of adding a single screen to an already existing check out flow, you are tasked with designing an entirely new check out flow that should maximize revenue and minimize the number of people who abandon their shopping carts. In creating the new flow, there are hundreds of design decisions you need to make, both small and large. How many screens should it have? How much up-selling and cross-selling should you do? At what point in the flow do you ask users for payment information? What should the screens look like? Should they have the standard header and footer, or should those be removed to minimize potential distractions for users when purchasing? And on and on and on...

These are all just a series of small decisions, so, in an ideal world, you'd be able to A/B test each one separately, right? Of course, in the real world, this could mean creating an A/B test with hundreds of different variations, each of which has to be shown to enough users to achieve statistical significance. Since you want to roll out your new check out process sometime before the next century, this may not be a particularly appealing option.
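
A quick back-of-the-envelope calculation shows why. The sketch below uses a standard two-proportion sample size formula with an invented 5% baseline conversion rate and a 10% relative lift, and assumes eight independent yes/no design decisions; none of these numbers come from a real checkout flow.

```python
# Back-of-the-envelope: why testing every checkout decision separately blows up.
from statistics import NormalDist

def sample_size_per_arm(p_base, p_variant, alpha=0.05, power=0.8):
    # Standard two-proportion sample size approximation.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ((z_alpha + z_power) ** 2 * variance) / (p_base - p_variant) ** 2

n_per_arm = sample_size_per_arm(p_base=0.05, p_variant=0.055)
binary_decisions = 8               # e.g. header/no header, up-sell/no up-sell...
variants = 2 ** binary_decisions   # 256 distinct checkout flows
print(f"~{n_per_arm:,.0f} users per arm, ~{n_per_arm * variants:,.0f} users total")
# Roughly 31,000 users per arm and about 8 million users in total - before you've
# even considered interactions between the individual decisions.
```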

A Bad Solution

Another option would be to fully implement several very different directions for the check out screens and test them all against one another. For example, let's say you implemented four different check out processes with the following features to test against one another:
  • Option 1: Yellow background, three screens, marketing questions, no up-selling, no cross-selling, header, no footer, help link
  • Option 2: Blue background, two screens, no marketing questions, up-selling, no cross-selling, header, footer, no help
  • Option 3: Orange background, four screens, marketing questions, up-selling, cross-selling, no header, footer, live chat help
  • Option 4: White background, one screen, no marketing questions, no up-selling, cross-selling, no header, no footer, live chat help

This might work in companies that have lots of bored engineers sitting around waiting to implement and test several different versions of the same code, most of which will eventually be thrown away. Frankly, I haven't run across a lot of those companies. But even if you did decide to devote the resources to building four different check out flows, the big problem is that, if you get a clear winner, you really don't have a very clear idea of WHY users preferred a particular version of the check out flow over the others. Sure, you can make educated guesses. Perhaps it was the particularly soothing shade of blue. Or maybe it was the fact that there weren't any marketing questions. Or maybe it was the aggressive up-selling. Or maybe that version just had the fewest bugs.

But the fact is, unless you figure out exactly which parts users actually liked and which they didn't like, it's impossible to know that you're really maximizing your revenue. It's also impossible to use those data to improve other parts of your site. After all, what if people HATE the soothing shade of blue, but they like everything else about the new check out process? Think of all the money you'll lose by not going with the yellow or orange or white. Think of all the time you'll waste by making everything else on your site that particular shade of blue, since you think that you've statistically proven that people love it!

What Qualitative Testing Does Well

Despite the many wonderful things about A/B testing, there are a few things that qualitative testing just does better.

Find the Best of All Worlds

Qualitative testing allows you to test wildly different versions of a feature against one another and understand what works best about each of them, thereby helping you develop a solution that has the best parts from all the different options. This is especially useful when designing complicated features that require many individual decisions, any one of which might have a significant impact on metrics. By observing users interacting with the different versions, you can begin to understand the pros and cons of each small piece of the design without having to run each one individually in its own A/B test.

Find Out WHY Users Are Leaving

While a good A/B test (or plain old analytics) can tell you which page a user is on when they abandon a check out flow, it can't tell you why they left. Did they get confused? Bored? Stuck? Distracted? Information like that helps you make better decisions about what exactly it is on the page that is causing people to leave, and watching people using your feature is the best way to gather that information.

Save Engineering Time and Iterate Faster

Generally, qualitative tests are run with rich, interactive wireframes rather than fully designed and tested code. This means that, instead of having your engineers code and test four different versions of the flow, you can have a designer create four different HTML prototypes in a fraction of the time. HTML prototypes are significantly faster to produce since:
  • They don't have to run in multiple browsers, just the one you're testing
  • They don't need any backend code
  • They frequently don't have a polished visual design (unless that's part of what you're testing)
And since making changes to a prototype doesn't require any engineering or QA time, you can iterate on the design much faster, allowing you to refine the design in hours or days rather than weeks or months.

How Do They Work Together?

Qualitative Testing Narrows Down What You Need to A/B Test

Qualitative testing will let you eliminate the obviously confusing stuff, confirm the obviously good stuff, and narrow down the set of features you want to A/B test to a more manageable size. There will still be questions that are best answered by statistics, but there will be a lot fewer of them.

Qualitative Testing Generates New Ideas for Features and Designs

While A/B testing helps you eliminate features or designs that clearly aren't working, it can't give you new ideas. Users can. If every user you interview gets stuck in the same place, you've identified a new problem to solve. If users are unenthusiastic about a particular feature, you can explore what's missing with them and let them suggest ways to make the product more engaging.

Talking to your users allows you to create a hypothesis that you can then validate with an A/B test. For example, maybe all of the users you interviewed about your check out flow got stuck selecting a shipment method. To address this, you might come up with ideas for a couple of new shipment flows that you can test in production once you've confirmed that they're less confusing with another quick qualitative test.

A/B Testing Creates a Feedback Loop for Researchers

A/B tests can also improve your qualitative testing process by providing statistical feedback to your researchers. I, as a researcher, am going to observe participants during tests in order to see what they like and dislike. I'm then going to make some educated guesses about how to improve the product based on my observations. When I get feedback about which recommendations are the most successful, it helps me learn more about what's important to users so I make better recommendations in the future.

Any Final Words?

Separately, both A/B testing and qualitative testing are great ways to learn more about your users and how they interact with your product. Combined, they are more than the sum of their parts. They form an incredibly powerful tool that can help you make good, user-centered product decisions more quickly and with more confidence than you have ever imagined.

Like the post? Follow me on Twitter!

This post originally appeared on the Sliced Bread Design blog.