You Can't Make Good Decisions with Bad Data

I think a critical lesson of the Lean Startup movement is that you have to learn quickly.

The “quickly” part of that lesson can lead to a culture of “good enough.” Your features should be good enough to attract some early adopters. Your design should be good enough to be usable. Your code should be good enough to make your product functional.

While this might drive a lot of perfectionists nuts, I’m all for it. Good enough means that you can spend your time perfecting and polishing only the parts of your product that people care about, and that means a much better eventual experience for your users. It may also mean that you stay in business long enough to deliver that experience.

I think though that there’s one part of your product where the standard for “good enough” is a whole lot higher: Data. Data are different.

You Can’t Make Good Decisions With Bad Data

The most important reason to do good research is that it can keep you from destroying your startup. I’m not being hyperbolic here. Bad data can ruin your product.

Imagine for a moment an a/b testing system that randomly returned the wrong test winner 30% of the time. It would be tough to make decisions based on that information, wouldn’t it? How would you know if you were choosing the right experiment branch?
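
To put rough numbers on that, here’s a quick simulation. It’s a sketch in plain Python with made-up figures: a testing system that reports the wrong winner 30% of the time, and a team making ten decisions in a row based on its reports.

    import random

    random.seed(42)
    TRIALS = 10_000    # simulated product teams
    DECISIONS = 10     # experiment-based decisions each team makes
    ERROR_RATE = 0.30  # the system reports the wrong winner 30% of the time

    flawless = sum(
        all(random.random() > ERROR_RATE for _ in range(DECISIONS))
        for _ in range(TRIALS)
    )
    print(f"Teams that got all {DECISIONS} decisions right: {flawless / TRIALS:.1%}")
    # Roughly 0.7 ** 10, or about 2.8%

With a 30% error rate, the chance that all ten of your decisions were based on correct results is under 3%. That’s what bad data does to a roadmap.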

Qualitative research can be just as bad. I can’t tell you how many founders have spent time and money talking to potential customers and then wondered why nobody used their product. Nine times out of ten, they were talking to the wrong people, asking the wrong questions, or using terrible interview techniques.

I had one person tell me, “bad data are better than no data,” but I strongly disagree here. After all, if I know I don’t have any data, I can go do some research and learn something.

But if I have some bad data, I think I already know the answers. Confirmation bias will make it even harder for me to unlearn that bad information. I’m going to stop looking and start acting on that information, and that may influence all of my product decisions.

If I “know” that all of my users are left-handed, I can spend an awful lot of time building and throwing out features for left-handed people before realizing that what I got wrong was the original premise. And, of course, that problem is made even worse if I’m not getting good information about how the features are actually performing.

You Have To Keep Doing It

Unlike any given feature or piece of code, collecting data is guaranteed to be part of your process for the life of your startup.

One of the best arguments for building minimum viable products and features is that you might just throw them out once you’ve learned something from them (like that nobody wants what you built).

This isn’t true of collecting data. Obviously you may change the way you collect data or the types of data you collect, but you’re going to keep doing it, because there’s simply no other way to make informed decisions.

Because this is something that you know is absolutely vital to your company, it’s worth getting it right early.

Data Collection Is Not a Mystery

Most of your product development is going to be a mystery. That’s the nature of startups.

You’ve got a new product in a new market, possibly with new technology. You have to do a lot of digging in order to figure out what you should be building. There’s no guide book telling you exactly what features your revolutionary new product should have.

That’s not true of gathering data. There is a ton of useful, pertinent information about the right way to do both qualitative and quantitative research. There are workshops and courses you can take on how to not screw up user interviews. There are coaches you can hire to get you trained in gathering all sorts of data. There are tools you can drop in to help you do a/b testing and funnel tracking. There are blogs you can read written by people who have already made mistakes so that you don’t have to make the same ones. There is a book called Lean Analytics that pretty much lays it out for you.

You don’t have to take advantage of all of these things, but you also don’t have to start from scratch. Taking a little time to learn about the tools and methods already available to you gives you a huge head start.

Good Data Take Less Time Than Bad Data

Here’s the good news: good data actually take less time to collect than bad data. Sure, you may have to do a little bit of upfront research on the right tools and methods, but once you’ve got those down, you’re going to move a hell of a lot faster.

For example, customer development interviews go much more quickly when you’re asking the right questions of the right people. You don’t have to talk to nearly as many users when you know how to not lead them and to interpret their answers well. Observational and usability research becomes much simpler when you know what you’re looking for.

The same is true for quantitative data collection. Your a/b tests won’t seem nearly so random when you’re sure that the information in the system is correct. You won’t have to spend as much time figuring out what’s going on with your experiments if you trust your graphs.

Good Data Do Not Mean Complete Data

I do want to make one thing perfectly clear: the quest for good data should be more about avoiding bad data than it is about making sure you have every scrap of information available.

If you don’t have all the data, and you know you don’t have all the data, that’s fine. You can always go out and do more research and testing later. You just don’t want to put yourself into the situation where you have to unlearn things later.

You don’t have to have all the answers. You just have to make sure you don’t have any wrong answers. And you do that by setting the bar for “good enough” pretty damn high on your data collection skills.



Want more information like this? 


My new book, UX for Lean Startups, will help you learn how to do better qualitative and quantitative research. It also includes tons of tips and tricks for better, faster design. 

Combining Qualitative & Quantitative Research


Designers are infallible. At least, that’s the only conclusion that I can draw, considering how many of them flat out refuse to do any sort of qualitative or quantitative testing on their product. I have spoken with designers, founders, and product owners at companies of all sizes, and it always amazes me how many of them are so convinced that their product vision is perfect that they will come up with the most inventive excuses for not doing any sort of customer research or testing. 

Before I share some of these excuses with you, let’s take a look at the types of research I would expect these folks to be doing on their products and ideas.

Quantitative Research

When I say quantitative research in this context, I’m talking about a/b testing, product analytics, and metrics - things that tell you what is happening when users interact with your product. These are methods of finding out, after you’ve shipped a new product, feature, or change, exactly what your users are doing with it. 

Are people using the new feature once and then abandoning it? Are they not finding the new feature at all? Are they spending more money than users who don’t see the change? Are they more likely to sign up for a subscription or buy a premium offering? These are the types of questions that quantitative research can answer. 

For a simple example, if you were to design a new version of a landing page, you might run an a/b test of the new design against the old design. Half of your users would see each version, and you’d measure to see which design got you more registered users or qualified leads or sales or any other metric you cared about.
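
Mechanically, there isn’t much to such a test. Here’s a minimal sketch in Python (the split logic and event format are hypothetical; in practice a drop-in testing tool handles this for you):

    import hashlib

    def variant_for(user_id: str) -> str:
        # Hash the user id so each user always sees the same version (50/50 split)
        bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
        return "new" if bucket < 50 else "old"

    def conversion_rates(events):
        # events: iterable of (user_id, converted) pairs, e.g. one per visit
        seen = {"old": 0, "new": 0}
        converted = {"old": 0, "new": 0}
        for user_id, did_convert in events:
            v = variant_for(user_id)
            seen[v] += 1
            converted[v] += int(did_convert)
        return {v: converted[v] / seen[v] for v in seen if seen[v]}

The hashing detail matters more than it looks: assignment has to be sticky, so a returning visitor doesn’t bounce between designs mid-test.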

Qualitative Research

By qualitative testing, I mean the act of watching people use your product and talking to them about it. I don’t mean asking users what you should build. I just mean observing and listening to your users in order to better understand their behavior. 

You might do qualitative testing before building a new feature or product so that you can learn more about your potential users’ behaviors. What is their current workflow? What is their level of technical expertise? What products are they already using? You might also do it once your product is in the hands of users in order to understand why they’re behaving the way they are. Do they find something confusing? Are they getting lost or stuck at a particular point? Does the product not solve a critical problem for them? 

For example, you might find a few of your regular users and watch them with your product in order to understand why they’re spending less money since you shipped a new feature. You might give them a task in order to see if they could complete it or if they got stuck. You might interview them about their usage of the new feature in order to understand how they feel about it. 


Excuses, Excuses

While it may seem perfectly reasonable to want to know what your users are really doing and why they are doing it, a huge number of designers seem really resistant to performing these simple types of research or even listening to the results. I don’t know why they refuse to pay any attention to their users, but I can share some of the terrible excuses they’ve given me. 


A/B Testing is Only Good for Small Changes

I hear this one a lot. There seems to be a misconception that a/b testing is only useful for things like button color and that by doing a/b testing you’re only ever going to get small changes. The argument goes something like, “Well, we can only test very small things and so we will test our way to a local maximum without ever being able to really make an important change to our user experience.”

This is simply untrue.

You can a/b test anything. You can show two groups of users entirely different experiences and measure how each group behaves. You can hide whole features from users. You can change the entire checkout flow for half the people buying things from you. You can test a brand new registration or onboarding system. And, of course, you can test different button colors, if that is something that you are inclined to do.

The important thing to remember here is that a/b testing is a tool. It’s agnostic about what you’re testing. If you’re just testing small changes, you’ll only get small changes in your product. If, on the other hand, you test big things - major navigation changes, new features, new purchasing flows, completely different products - then you’ll get big changes. And, more importantly, you’ll know how they affected your users.


Quantitative Testing Leads to a Confused Mess of an Interface

This is one of those arguments that has a grain of truth in it. It goes something like, “If we always just take the thing that converts best, we will end up with a confusing mess of an interface.”

Anybody who has looked at Amazon’s product pages knows the sort of thing that a/b testing can lead to. They have a huge amount of information on each screen, and none of it seems particularly attractive. On the other hand, they rake in money.

It’s true that when you’re doing lots of a/b testing on various features, you can wind up with a weird mishmash of things in your product that don’t necessarily create a harmonious overall design. You can even wind up with features that, while they improve conversion on their own, end up hurting conversion when they’re combined.

As an example, let’s say you’re testing a product detail page. You decide to run several a/b tests simultaneously for the following new features:

  • customer photos
  • comments
  • ratings
  • extended product details
  • shipping information
  • sale price
  • return info

Now, let’s imagine that each one of those items, in its own a/b test, increases conversion by some small, but statistically significant margin. That means you keep all of them. Now you’ve got a product detail page with a huge number of things on it. You might, rightly, worry that the page is becoming so overwhelming that you’ll start to lose conversions.

Again, this is not the fault of a/b testing – or in this case, a/b/c/d/e testing. This is the fault of a bad test. You see, it’s not enough that you run an a/b test. You have to run a good a/b test. In this case, just because the addition of a particular feature to your product page improved conversions doesn’t mean that adding a dozen new features to your product page will increase your conversion.

In this instance, you might be better off running several a/b tests serially. In other words, add a feature, test it, and then add another and test. This way you’ll be sure that every additional feature is actually improving your conversion. Alternatively, you could test a few different versions of the page with different combinations of features to see which converts best. 
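
One reason you can’t simply test everything at once: the number of combinations explodes. A quick sketch (hypothetical feature names) shows why serial tests or a handful of hand-picked combinations are usually the practical choice:

    from itertools import combinations

    FEATURES = ["photos", "comments", "ratings", "details",
                "shipping", "sale_price", "returns"]

    # One test arm per possible combination of features
    arms = [set(combo)
            for size in range(len(FEATURES) + 1)
            for combo in combinations(FEATURES, size)]

    print(f"{len(arms)} arms to test every combination")  # 2**7 = 128
    # Each arm needs enough traffic to reach statistical significance
    # on its own, which is why you test serially or pick a few
    # promising combinations by hand.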


A/B Testing Takes Away the Need For Design

For some reason, people think that a/b testing means that you just randomly test whatever crazy shit pops into your head. They envision a world where engineers algorithmically generate feature ideas, build them all, and then just measure which one does best.

This is just absolute nonsense.

A/B testing only specifies that you need to test new designs against each other or against some sort of a control. It says absolutely zero about how you come up with those design ideas.

The best way to come up with great products is to go out and observe users and find problems that you can solve and then use good design processes to solve them. When you start doing testing, you’re not changing anything at all about that process. You’re just making sure that you get metrics on how those changes affect real user behavior.

Let’s imagine that you’re building an online site to buy pet food. You come up with a fabulous landing page idea that involves some sort of talking sock puppet. You decide to create this puppet character based on your intimate knowledge of your user base and your sincere belief that what they are missing in their lives is a talking sock puppet. It’s a reasonable assumption.

Instead of just launching your wholly re-imagined landing page, complete with talking sock puppet video, you create your landing page and show it to only half of your users, while the rest of your users are stuck with their sad, sock puppet-less version of the site. Then you look to see which group of users bought more pet food. At no point did the testing process have anything to do with the design process. 

It’s really that simple. Nothing about a/b testing determines what you’re going to test. A/B testing has literally nothing to do with the initial design and research process.

Whatever you’re testing, you still need somebody who is good at creating the experiences you’re planning on testing against one another. A/B testing two crappy experiences does, in fact, lead to a final crappy experience. After all, if you’re looking at two options that both suck, a/b testing is only going to determine which one sucks less.

Design is still incredibly important. It just becomes possible to measure design’s impact with a/b testing.


There’s No Time to Usability Test

When I ask people whether they’ve done usability testing on prototypes of major changes to their products, I frequently get told that there simply wasn’t time. It often sounds something like, “Oh, we had this really tight deadline, and we couldn’t fit in a round of usability testing on a prototype because that would have added at least a week, and then we wouldn’t have been able to ship on time.” 

The fact is you don't have time NOT to usability test. As your development cycle gets farther along, major changes get more and more expensive to implement. If you're in an agile development environment, you can make updates based on user feedback quickly after a release, but in a more traditional environment, it can be a long time before you can correct a big mistake, and that spells slippage, higher costs, and angry development teams. Even in agile environments, it’s still faster to fix things before you write a lot of code than after you have pissed off customers who are wondering why you ruined an important feature that they were using. 

I know you have a deadline. I know it's probably slipped already. It's still a bad excuse for not getting customer feedback during the development process. You're just costing yourself time later. I’ve never known good usability testing to do anything other than save time in the long run on big projects.


Qualitative Research Doesn’t Work Because Users Don’t Know What They Want

This is possibly the most common argument against qualitative research, and it’s particularly frustrating, because part of the statement is quite true. Users aren’t particularly good at coming up with brilliant new ideas for what to build next. Fortunately, that doesn’t matter. 

Let’s make this perfectly clear. Qualitative research is NOT about asking people what they want. At no point do we say, “What should we build next?” and then relinquish control over our interfaces to our users. People who do this are NOT doing qualitative research. 

Qualitative research isn’t about asking people what they want and giving it to them. Qualitative research is about understanding the needs and behaviors of your users. It’s about really knowing what problem you’re solving and for whom.

Once you understand what your users are like and what they want to do with your product, it’s your job to come up with ways to make that happen. That’s the design part. That’s the part that’s your job.


It’s My Vision - Users Will Screw it Up

This can also be called the "But Steve Jobs doesn't listen to users..." excuse. 

The fact is, understanding what your users like and don't like about your product doesn't mean giving up on your vision. You don't need to make every single change suggested by your users. You don't need to sacrifice a coherent design to the whims of a user test. You don’t even need to keep a design just because it converts better in an a/b test. 

What you do need to do is understand exactly what is happening with your product and why. And you can only do that by gathering data. The data can help you make better decisions, but they don’t force you to do anything at all.


Design Isn’t About Metrics

This is the argument that infuriates me the most. I have literally heard people say things like, “Design can’t be measured, because design isn’t about the bottom line. It’s all about the customer experience.”

Nope.

Wouldn’t it be a better experience if everything on Amazon were free? Be honest! It totally would.

Unfortunately, it would be a somewhat traumatic experience for the Amazon stockholders. You see, we don’t always optimize for the absolute best user experience. We make tradeoffs. We aim for a fabulous user experience that also delivers fabulous profits.

While it’s true that we don’t want to just turn our user experience design over to short term revenue metrics, we can vastly improve user experience by seeing which improvements and features are most beneficial for both users and the company.

Design is not art. If you think that there’s some ideal design that is completely divorced from the effect it’s having on your company’s bottom line, then you’re an artist, not a designer. Design has a purpose and a goal, and those things can be measured.


So, What’s the Right Answer?

If you’re all out of excuses, there is something that you can do to vastly improve your product. You can use quantitative and qualitative data together. 

Use quantitative metrics to understand exactly what your users are doing. What features do they use? How much do they spend? Does changing something big have a big impact on real user behavior?

Use qualitative research to understand why your users do what they do. What problems are they trying to solve? Why are they dropping out of a particular task flow when they do? Why do they leave and never come back?

Let’s look at an example of how you might do this effectively. First, imagine that you have a payment flow in your product. Now, imagine that 80% of your users are not getting through that payment flow once they’ve started. Of course, you wouldn’t know that at all if you weren’t looking at your metrics. You also wouldn’t know that the majority of people are dropping out in one particular place in the flow.
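
Finding that drop-off point is a simple counting exercise once you’re logging the right events. Here’s a minimal sketch, with assumed step names standing in for whatever your analytics tool records:

    STEPS = ["start_payment", "shipping_info", "card_info", "confirm", "success"]

    def funnel_report(events):
        # events: iterable of (user_id, step_name) tuples from your analytics
        reached = {step: set() for step in STEPS}
        for user_id, step in events:
            if step in reached:
                reached[step].add(user_id)
        for prev, nxt in zip(STEPS, STEPS[1:]):
            if reached[prev]:
                rate = len(reached[nxt]) / len(reached[prev])
                print(f"{prev} -> {nxt}: {rate:.0%} continue")

The step with the ugliest number is where you point your qualitative research next.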

Next, imagine that you want to know why so many people are getting stuck at that one place. You could do a very simple observational test where you watch four or five real users going through the payment flow in order to see if they get stuck in the same place. When they do, you could discuss with them what stopped them there. Did they need more information? Was there a bug? Did they get confused?

Once you have a hypothesis about what’s not working for people, you can make a change to your payment flow that you think will fix the problem. Neither qualitative nor quantitative research tells you what this change is. They just alert you that there’s a problem and give you some ideas about why that problem is happening. 

After you’ve made your change, you can run an a/b test of the old version against the new version. This will let you know whether your change was effective or if the problem still exists. This creates a fantastic feedback loop of information so that you can confirm whether your design instincts are functioning correctly and you’re actually solving user problems. 

As you can hopefully see from the example, nobody is saying that you have to be a slave to your data. Nobody is saying that you have to turn your product vision or development process over to an algorithm or a focus group. Nobody is saying that you can only make small changes. All I’m saying is that using quantitative and qualitative research correctly gives you insight into what your users are doing and why they are doing it. And that will be good for your designs, your product, and your business.



Here's the Problem With Your Product

As I mentioned on Twitter, I often answer emails that people send me asking questions about UX. I enjoy it. It helps keep me in touch with what type of questions entrepreneurs are having about their products.

Whenever I mention that I'm happy to answer UX questions (for free, guys! Seriously. I have a book to procrastinate, after all.) I tend to get one particular question over and over. It is some variant on "How is the UX for my product/site?"

I'm publishing an answer that is very similar to one I recently sent to a nice entrepreneur who asked me this question. I'm doing this because I basically end up writing the exact same thing over and over when people ask me this question, and I'd love to get some different questions. Please note, unless you are building one of a fairly small number of products that I use on a regular basis, this answer applies to you.


I can't give you insight into your site, because I'm not the target customer. If you ask for my opinion, it's going to be mostly useless, because it really doesn't matter what I think about your product. It matters what your user thinks about your product.

It's like if somebody asked you about your opinion of their spaceship. Presumably you don't fly spaceships, so your opinion is almost certainly not going to be super relevant to interspace travel methods. You want feedback about spaceships, you ask an astronaut or an extraterrestrial (no, I do not have suggestions on recruiting for that study).

In order to get in touch with some of your users, I'd recommend that you do the following:

Figure out exactly what you are concerned about with your site or product. 

  • Do you want to know if new users understand the messaging?
  • Do you want to know how people are finding specific information or performing tasks?
  • Do you want to know the general behavior of people coming to your site?
  • Do you care about the experience of current users, new users, returning users, etc.?
  • Do you care about what the look and feel of your site is telling new people? 
  • Are you wondering why your revenue is too low?
  • Are you concerned that people aren't coming back? 
  • Do you want to encourage people to share more? 
  • Are you having trouble converting free users into paying users? 

Figure out which metrics you care about that you'd like to change, and do some validation around why they are what they are. 


For example, if your conversion is too low, you're going to need to figure out if people don't want what you're selling, don't understand what you're selling, or don't care enough to pay you for what you're selling.

Based on what you want to learn, you need to find some way of learning that. You can ask me for specific advice on those sorts of things. The more specific you are about the type of user and the type of thing you want to learn, the easier it is for me to suggest doing something.

You can also ask me for advice on things like what to do when you've found out that people aren't sharing because they don't understand how to do that. Or if you've learned that people aren't converting from free memberships because they're not understanding the value that they'd get from your paid product. In fact, I'm happy to give you advice about how to proceed with your UX for anything that is at this level of specificity.

There is no such thing as generic "UX". Your user experience only makes sense in the context of your particular users, what their behavior is, and what you want their behavior to be.

I Don't Know What's Wrong with Your Product

When I’m talking with startups, they frequently ask me all sorts of questions. I imagine that they’re probably really disappointed when I respond with a shrug.

You see, frequently they’re asking entirely the wrong question. And, more importantly, they’re asking the wrong person.

It is an unfortunate fact that many startups talk to people like me (or their investors or their advisors or “industry experts”) instead of talking to their users.

Now, obviously, if they just asked the users the sorts of questions they ask me, the users couldn’t answer them directly either. This is the wrong question part. But the fact is, if they were to ask the right questions, they’d have a much better chance of getting the answers from their users.

Let’s take a look at a few of the most common sorts of questions I get about UX and how we might get the answers directly from users.

What’s Wrong With My Product?

I often get people who just want “UX advice.” I suppose they’re looking for somebody to come in and say something like, “oh, you need to change your navigation options,” or “if only you made all of your buttons green.”

Regardless of what they’d like to hear, what they typically hear is, “I have no idea.” That’s quickly followed by the question, “Which of your metrics is not where you want it to be?” If they can answer that question, they are light years beyond most startups.

You see, the first step to figuring out what’s wrong with your product is to figure out, from a business perspective, some realistic goals for where you’d like your product to be right now. Obviously, “We’d like every person on the planet to pay us $100/month” is probably not a realistic goal for a three month old venture, but hey, aim high.

Once you know what you want your key metrics to be, you need to look at which of them aren’t meeting your expectations. Are you not acquiring new customers fast enough? Are not enough of them sticking around? Are too few of them paying you money? While “all of the above” is probably true, it’s also not actionable. Pick a favorite.

Now that you know which of your key metrics is failing you, you need to conduct the appropriate sort of research to figure out why it’s so low. Note: the appropriate sort of research does not involve sitting around a conference room brainstorming why abstract people you’ve never met might be behaving in a surprising way. The appropriate sort of research is also not asking an expert for generic ways to improve things like acquisition or retention, since these things vary wildly depending on your actual product and user base.

The appropriate sort of research depends largely on the sort of metric you want to move and the type of product/user you have. You will, without question, have to interact with current, potential, or former customers of your product. You may need to observe what people are doing. You may need to ask them why they tried your product and never came back. You may need to run usability tests on parts of your interface to see what is confusing people the most.

Feel free to ask people like me for help figuring out what sort of research you need to be doing. That’s the sort of things experts can do pretty effectively.

But if any expert tells you exactly what’s wrong with your product without considering your user base, your market, or your key metrics, either they’re lying to you or your problems are so incredibly obvious that you should have been able to figure them out for yourself.

What Feature Should I Build Next?

Let’s imagine for a moment that you have built a Honda Civic. Good for you! That’s a nice, practical car. Now, let’s imagine that you come to me and ask how you should change your Honda Civic to make more people want to buy it.

Well, I drive a Mini Cooper, so it’s very possible that I’ll tell you that you should make your Civic more adorable and have it handle better on curves. If, on the other hand, you ask somebody who drives a Ford F-150, they’ll probably tell you that you should make it tougher and increase the hauling capacity.

Do you see my point? I can’t tell you what feature you need to build next, because I almost certainly don’t use your product! To be fair, even the people who do use your product or might use your product in the future can’t just tell you what to build next.

What they can tell you is more about their lives. Do they frequently find themselves hauling lots of things around? Do they drive a lot of curvy mountain roads? Do they care about gas mileage? What about their other purchasing choices? Do they tend to buy very expensive luxury items? Do they care more about status or value?

You see, there is no single “right way” to design something. There are thousands of different features you could add to your product, and only the preferences and pains of your current and potential users can help you figure out what is right for you.

Should I Build an App or a Website or Something Else?

Another thing that people ask me a lot is whether they should be building an iPhone app, an iPad app, an Android app, a website, an installed desktop app, or some other thing.

That’s an excellent question...to do a little research on. After all, what platform you choose should have nothing to do with what’s popular or stylish or the most fun to design for. It should be entirely based on what works best for your product and market.

And don’t just go with the stereotypes. Just because it’s for teens doesn’t necessarily mean it’s got to be mobile, although that’s certainly something you should be considering. It matters where the product is most likely to be used and what sort of devices your market is most likely to have now and in the near future. It also depends on the complexity of your product. For example, I personally don’t want Photoshop on my phone, and I don’t want a check-in app on my computer.

Talk to your users and find out what sort of products they use and where they use them.

Are You Noticing a Pattern?

Experts are not oracles. You can’t use outside people as a shortcut to learning about your own product or your users. You need to go to the source for those things.

If you find yourself asking somebody for advice, first ask yourself if you’re asking the right question, and then ask yourself if you’re asking the right person.

And if anybody ever tells you definitively what you need to change about your product without first asking what your business goals are, who your users are, and what their needs are, you can bet that they’re probably wrong.


How Metrics Can Make You a Better Designer

I have another new article in Smashing Magazine's UX section: How Metrics Can Make You a Better Designer.

Here's a little sample:

Metrics can be a touchy subject in design. When I say things like, “Designers should embrace A/B testing” or “Metrics can improve design,” I often hear concerns.

Many designers tell me they feel that metrics displace creativity or create a paint-by-numbers scenario. They don’t want their training and intuition to be overruled by what a chart says a link color should be.

These are valid concerns, if your company thinks it can replace design with metrics. But if you use them correctly, metrics can vastly improve design and make you an even better designer.


Read the rest here >

Why Your Test Results Don't Add Up and What To Do About It

Check out my guest blog post for KISSmetrics: Why Website Test Results Don’t Always Add Up & What To Do About It!

Here's a little sample:

If you do enough A/B testing, I promise that you will eventually have some variation of this problem:

You run a test. You see a 10% increase in conversion. You run a different, unrelated test. You see a 20% increase in conversion. You roll both winning branches out to 100% of your customers. You don’t see a 30% increase in conversion.

Why? In every world I’ve ever inhabited, 10 plus 20 equals 30, right? You’ve proven that both changes you’ve made are improvements. Why aren’t you seeing the expected overall increase in conversions when you roll them both out?


Read the Rest at KISSmetrics.


Designers Need to A/B Test Their Designs

The other day, I posted something I strongly believe on Twitter. A few people disagreed. I’d like to address the arguments, and I’d love to hear feedback and counter-arguments in the comments where you have more than 140 characters to tell me I’m wrong.

My original tweet was, “I don't trust designers who don't want their designs a/b tested. They're not interested in knowing if they were wrong.”

Here are some of the real responses that I got on Twitter with my longer form response.

“There’s a difference between A/B testing (public) and internally deciding. Design is also a matter of taste.”

I agree. There is a big difference between A/B testing in public and internally deciding. That’s why I’m such a huge fan of A/B testing. You can debate this stuff for weeks, and often it’s a huge waste of time.

When you’re debating design internally, what you should be asking is “which of these designs will be better for the business and users.” A/B testing tells you conclusively which side is right. Debate over!

Ok, there’s the small exception of short term vs. long term effects, which is addressed later, but in general, it’s more definitive than the opinion of the people in the room.

With regard to the “matter of taste,” that’s both true and false. Sure, different people like different designs. What you’re saying by refusing to A/B test your designs is that your taste as a designer should always trump that of the majority of your users. As long as you like your design, you don’t care whether users agree with you.

If you want your design aesthetic to override that of your users, you should be an artist. I love art. I even, very occasionally, buy some of it.

But I pay for products all the time, and I tend to buy products that I think are well designed, not necessarily ones where the designer thought they were well designed.

“If Apple had done A/B tests for the iPod in 2001 with a user-replaceable battery, that version would’ve likely won—initially.”

Honestly, it still might win. Is taking your iPod to the Apple store when the battery dies really a feature? No! It’s a design tradeoff. They couldn’t create something with the other design elements they wanted that still had a replaceable battery. That’s fine. 


But all other things about the iPod being totally equal, wouldn’t you buy the one where you could replace the battery yourself? I would. The key there is the phrase “totally equal.”

“Seeing far into the future of technology is not something consumers are particularly great at.”

I feel like the guy who made this argument was confusing A/B testing with bad qualitative testing or just asking users what they would like to see in a product.

This isn’t what A/B testing does. A/B testing measures actual user behavior right now. If I make this change, will they give me more money? It has literally nothing to do with asking users to figure out the future of technology.

“A/B testing has value but shouldn't be litmus test for designer or a design”

Really? What should be the litmus test for a designer or a design if not, “does this change or set of changes actually improve the key metrics of my company”?

In the end, isn’t that the litmus test for everybody in a company? Are you contributing to the profitability of the business in some way?

If you have some better way of figuring out if your design changes are actually improving real metrics, I’d love to hear about it. We can make THAT the litmus test for design.

“Data is valuable but must be interpreted. Doesn't "prove" wrongness or rightness. Designer still has judgment.”

I agree with the first sentence. Data certainly must be interpreted. I even agree that certain design changes may hurt certain metrics, and that can be ok if they’re improving other metrics or are shown to improve things in the long run.

But the only way to know if your overall design is actually making things better for your users is by scientifically testing it against a control.

If your overall design changes aren’t improving key metrics, where’s the judgement there? If you release something that is meant to increase the number of signups and it decreases the number of signups, I think that pretty effectively “proves wrongness.”

The great thing about A/B testing is that you know when this happens.

“Is it the designers fault, surely more appropriate to an IA? After all the IA should dictate the feel/flow.”

First off, I don’t work for companies that are big enough to draw a distinction between the two, but I’m sure there’s enough blame to go around.

Secondly, I think that everybody in an organization has the responsibility to improve key metrics. If you think that your work shouldn’t increase revenue, retention, or other numbers you want higher, why should you be employed?

Design of all kinds is important and can have a huge impact on company profitability. That impact can and should be measured. You don’t get a pass just because you’re not changing flow.

“A/B tests are a snapshot of current variables. They don’t embody nor convey a bigger strategy or long-term vision.”

Also, “That’s only an absolute truth you can rely on if you A/B test for the entire lifespan of the product, which defeats the point.”

These are excellent points, and they are a drawback of A/B testing. It’s sometimes tough to tell what the long term effects of a particular design change are going to be from A/B testing. Also, A/B testing doesn’t easily account for design changes that are a part of a larger design strategy.

In other words, sometimes you’re going to make changes that cause problems with your metrics in the short term, because you strongly believe that it’s going to improve things long term.

However, I believe that you address this by recognizing the potential for problems and designing a better test, not by refusing to A/B test at all.

Just because this particular tool isn’t perfect doesn’t mean we get to fall back on “trust the designers implicitly and never make them check their work.” That doesn’t work out so well sometimes either.

An Argument I Didn’t Hear

There’s one really good argument that I didn’t get, although some of the above tweets touched on it. Sometimes changes that individually test well don’t test well as a whole.

This is a really serious problem with A/B testing because you can wind up with Frankenstein-style interfaces. Each individual decision wins, but the combination is a giant mess.

Again, you don’t address this by not A/B testing. You address it by designing better tests and making sure that all of your combined decisions are still improving things.

How I Really Feel

Look, if I’m hiring for a company that wants to make money (and most of them do), I want my designers to understand how their changes actually affect my bottom line.

No matter how great a designer thinks his or her design is, if it hurts my revenue and retention or other key metrics, it’s a bad design for my company and my users.

Saying you’re against having your designs A/B tested sounds like you’re saying that you just don’t care whether what you’re changing works for users and the company. As a designer, you’re welcome to do that, but I’m not going to work with you.


When Is a Design Done?

I was talking with a designer about Lean UX. I was explaining that one of the hallmarks of Lean UX is to get a good, but not complete version of a product or feature designed and built and then iterate on it later. She thought this sounded like an interesting approach, but then she asked, “When do you know you’re done?”

Figuring out when you’re “done” is tricky for any design or redesign project, unless you’re a consulting agency, of course, in which case the answer is, “when the client runs out of money.” But I realized that, in Lean UX, figuring out when you’re done is actually incredibly easy.

You’re done when your metrics tell you you’re done.

Let me explain. No product is ever actually “done.” There is always something you could do to improve it. However, projects can certainly be done. The trick is that you have to choose your projects correctly.

What’s the correct way to choose a Lean UX project? Every Lean UX project should be chosen based on a metric.

This may piss off a lot of designers who want to make wonderful, exciting, super cool designs just for the sake of design or user happiness, but when it comes down to it, unless you’re independently wealthy, every design change you make should move a number that is important to your business.

Now, it is a lucky break for those of us who care deeply about our users that improving the overall user experience of the product frequently improves some number that the business people care about. But not every single thing you can do to make a user happy has the same ROI for the business. And not every improvement makes the right people happy at the right time.

That’s why the UX projects you choose should be based on metrics.

Let me give you an example. Whoever it is at your startup who is in charge of running the business should have a pretty good idea of what your various metrics have to be in order for you to all retire and buy yachts. For example, your Activation number may have to be 20% and your Retention may have to be 70%. (Please note, I made these numbers up. Your metrics may vary.)

They pick these numbers because they know that having, for example, a 99% retention rate and a 1% activation rate may lead to retaining 3 incredibly happy users forever, which is suboptimal from a business perspective.

So, if your activation number is at 10%, your business folks may come to you and say, “we need to turn more of our acquired traffic into regular users because we have identified this as the most important problem to solve at this moment.” You respond, “Great! How many more do you need?” They explain that you need to get activation from 10% to 20%.

You will notice that the metrics are not driving your design decisions. Nor are they driving your feature requirements or any other product changes. They are simply telling you what your biggest business problem currently is.

Now, it’s up to you as a designer or product owner to figure out what is keeping the activation number low and then come up with some ideas of how to fix it. You do this with what I like to call “research and design” or alternately “that thing you are paid to do.”

You may have dozens of wonderful ideas for how to fix the problem, and you may love and believe in all of them. You may not, however, actually execute every single one of them.

This is where the Lean part comes in.

Ideally, you will design and execute as many of the fixes as necessary in order to move the number to where you want it to be. Maybe you’re awesome (or awesomely lucky), and you move that activation number on the first try with a very small bug fix.

Does that mean you never get to implement the super sweet, but somewhat complicated, feature that you know will make users incredibly happy and improve activation even more? No! Unfortunately, you may not get to implement it just yet.

You see, once you got your activation number to where it needed to be, it stopped being the most important problem to solve. Now, maybe you need to work on getting retention higher or improving revenue or referral.

On the flip side, maybe you redesign the first time user flow and improve activation, but not by enough. That means you should continue working on it. Figure out why your changes didn’t have as big of an impact as you thought they would, and then try some new things.

You’re not “finished” until your metrics are where you want them to be.

Why is this important? Startups have a ridiculous number of things to do, and they typically have limited resources. It can be incredibly difficult to prioritize when to keep working on a feature or an area of the product, and when to move on.

By setting the goals ahead of time based on metrics that are critical to the business, it becomes much easier to know when you’re “done,” and when you should keep optimizing or redesigning.


Like Lean UX but hate reading? I'll be on the UX panel at the Lean Startup track at SXSW. You should come see it and then say hi to me afterward.

Qual vs. Quant: When to Listen and When to Measure

I have written about qualitative vs quantitative research before, but I still get a lot of questions about it. To answer some of those questions, I want to do a bit of a deeper dive here and give a few examples to help startups answer the key question.

To be clear, that key question is “when should I use qualitative research, and when should I use quantitative research for the best results?” Another way of looking at this is, “when should I be listening to users, and when should I just be shipping code and looking at the metrics?”

The real answer is that you should do both constantly, but there are times when one is significantly more helpful than the other.

I will continue to repeat my cardinal rule: Quantitative research tells you WHAT your problem is. Qualitative research tells you WHY you have that problem.

Now, let’s look at what that actually means to you when you’re making product decisions.

A One-Variable Change

When you’re trying to decide between qualitative and quantitative testing for any given change or feature, you need to figure out how many variables you’re changing.

Here’s a simple example: You have a product page with a buy button on it. You want to see if the buy button performs better if it’s higher on the page without really changing anything else. Which do you do? Qualitative or quantitative?

That’s right, I said this one was simple. There’s absolutely no reason to qualitatively test this before shipping it. Just get this in front of users and measure their actual rate of clicking on the button.

The fact is, with a change this small, users in a testing session or discussion aren’t going to be able to give you any decent information. Hell, they probably won’t even notice the difference. Qualitative feedback here is not going to be worth the time and money it takes to set up interviews, talk to users, and analyze the data.

More importantly, since you are only changing one variable, if user behavior changes, you already have a really good idea WHY it changed. It changed because the CTA button was in a better place. There’s nothing mysterious going on here.
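
The only real work is deciding whether the difference you measured is signal or noise. A standard two-proportion z-test covers that; here’s a sketch with made-up numbers:

    from math import sqrt

    def z_score(conv_a, visitors_a, conv_b, visitors_b):
        # Two-proportion z-test: is the click-rate difference real or noise?
        p_a, p_b = conv_a / visitors_a, conv_b / visitors_b
        pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
        se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
        return (p_b - p_a) / se

    # Hypothetical: button low on the page (A) vs. high on the page (B)
    z = z_score(conv_a=120, visitors_a=2400, conv_b=156, visitors_b=2400)
    print(f"z = {z:.2f}")  # |z| above 1.96 is significant at the usual 95% level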

There’s an exception! In a few cases, you are going to ship a change that seems incredibly simple, and you are going to see an enormous and surprising change in your metrics (either positive or negative). If this happens, it’s worth running some observational tests with something like UserTesting.com where you just watch people using the feature both before and after the change to see if anything weird is happening. For example, you may have introduced a bug, or you may have made it so that the button is no longer visible to certain users.

A Multi-Variable or Flow Change

Another typical design change involves adding an entirely new feature, which may affect many different variables.

Here’s an example: You want to add a feature that allows people to connect with other users of your product. You’ll need to add several new pieces to your interface in order to allow users to do things like find people they know, find other interesting people they don’t know, manage their new connections, and get some value from the connections they’ve made.

Now, you could simply build the feature, ship it, and test to see how it did, much the way you made your single variable change. The problem is that you’ll have no idea WHY it succeeded or failed - especially failed.

Let’s assume that you ship it and find that it hurts retention. You can assume that it was a bad feature choice, but often I find that people avoid new features not because they hate the concept, but because the features are badly implemented.

The best way to deal with this is to prevent it from happening in the first place. When you’re making large, multi-variable changes or really rearranging a process flow for something that already exists on your site, you’ll want to perform qualitative testing before you ever ship the product.

Specifically, the goal here is to do some standard usability testing with interactive prototypes, so that you can learn which bits are confusing (ps. yes, there are confusing bits, trust me!) and fix them before they ever get in front of users.

Sure, you’ll still do an a/b test once you’ve shipped it, but give that new feature the best possible chance to succeed by first making sure you’re not building something impossible to use.

Deciding What To Build Next

Look, whatever you take from this next part, please do not assume that I’m telling you that you should ask your users exactly what they want and then build that. Nobody thinks that’s the right way to build products, and I’m tired of arguing about it with people who don’t get UCD or Lean UX.

However, you can learn a huge amount from both quantitative and qualitative research when you’re deciding what to build next.

Here’s an example: You have a flourishing social commerce product with lots of users doing lots of things, but you also have 15 million ideas for what you should build next. You need to narrow that down a bit.

The key here is that you want to look at what your users are currently doing with your product and what they aren’t doing with it, and you should do that with both qualitative and quantitative data.

Qualitative Approaches:

  • Watch users with your product on a regular basis. See where they struggle, where they seem disappointed, or where they complain that they can’t do what they want. Those will all give you ideas for iterating on current features or adding new ones.
  • Talk to people who have stopped using your product. Find out what they thought they’d be getting when they started using it and why they stopped.
  • Watch new users with your product and ask them what they expected from the first 15 minutes using the product. If this doesn’t match what your product actually delivers, either fix the product or fix the first time user experience so that you’re fulfilling users’ expectations.

Quantitative Approaches:

  • Look at the features that are currently getting the most use by the highest value customers. Try to figure out if there’s a pattern there and then test other features that fit that pattern.
  • Try a “fake” test by adding a button or navigation element that represents the feature you’re thinking of adding, and then measure how many people actually click on it. Instead of implementing an entire system for making friends on your site, just add a button that allows people to Add a Friend, and then let them know that the feature isn’t quite ready yet while you tally up the percentage of people who are pressing the button (there’s a sketch of this below the list).
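
Here’s roughly what that fake-door tally looks like, sketched with Flask and a hypothetical route (in practice you’d send the click to your analytics tool rather than keep a counter in memory):

    from flask import Flask

    app = Flask(__name__)
    clicks = {"add_friend": 0}  # stand-in for a real analytics event

    @app.route("/add-friend", methods=["POST"])
    def add_friend():
        # The feature doesn't exist yet; just record the interest
        clicks["add_friend"] += 1
        return "Friend requests are coming soon. Thanks for your interest!", 202

Divide those clicks by the number of people who saw the button and you have a demand signal before you’ve written the real feature.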

Still Don’t Know Which Approach to Take?

What if your change falls between the cracks here? For example, maybe you’re not making a single variable change, but it’s not a huge change either. Or maybe you’re making a pretty straightforward visual design or messaging change that will touch a lot of places in the product but that doesn’t actually affect the user process too much.

As many rules as we try to make, there will still be judgement calls. The best strategy is to make sure that you’re always keeping track of your metrics and observing people using your product. That way, even if you don’t do exactly the right kind of research at exactly the right time, you’ll be much more likely to catch any problems before they hurt your business.


Lean UX - A Case Study

For those very, very few (ok, none) of you who read my blog but don't read Eric Ries's blog, Startup Lessons Learned, I have some exciting news for you. But first, why the hell aren't you reading Eric's blog? You really should. It's great.

I've written a guest post that now appears on the Startup Lessons Learned blog. It's a case study of a UX project I did with the lean startup Food on the Table.

If you're wondering whether design works well with lean startups, I answer that question in the post. Spoiler alert: The answer is 'yes'.

The Dangers of Metrics (Only) Driven Product Development

When I first started designing, it was a lot harder to know what I got right. Sure, we ran usability tests, and we looked generally at things like page counts and revenue before and after big redesigns, but it was still tough to know exactly what design changes were making the biggest difference. Everything changed once I started working with companies that made small, iterative design changes and a/b tested the results against specific metrics.

To be clear, not all the designers I know like working in this manner. After all, it's no fun being told that your big change was a failure because it didn't result in a statistically significant increase in revenue or retention. In fact, if you're a designer or a product owner and are required to improve certain metrics, it can sometimes be tempting to cheat a little.

This leads to a problem that I don't think we talk about enough: Metrics (Only) Driven Product Development.

What Is Metrics (Only) Driven Product Development?

Imagine that you work at a store, and your manager has noticed that when the store is busy, the store makes more money. The manager then tells you that your job is to make the store busier - that's your metric that you need to improve.

You have several options for improving your metric. You could:
  • Improve the quality of the shopping experience so that people who are already in the store want to stay longer
  • Offer more merchandise so that people find more things they want to buy
  • Advertise widely to try to attract more people into the store
  • Sell everything at half off
  • Remove several cash registers in order to make checking out take longer, which should increase the number of people in the store at a time, since it will take people longer to get out
  • Hire people to come hang out in the store

As you can see, all of the above would very likely improve the metric you were supposed to improve. They would all ensure that, for a while at least, the store was quite busy. However, some are significantly better for the overall health of the store than others.



The same thing happens all the time when designing products. If your assigned goal is to increase the number of active users, there are lots of different design changes you could make, but not all of them will be equally effective for improving the actual goal, which is probably increasing the number of people who use the product and generate revenue.

How Does This Happen?

I think the biggest reason that this happens is that people fixate on metrics without understanding the reason behind the numbers. Designers and product owners are then pressured to move a number that represents a particular metric rather than focusing on improving the product.

One company I talked with had this problem with acquisition. The person who was responsible for acquiring new customers was simply given a budget and told to get as many users as possible for that amount of money. Unfortunately, the users that were cheapest to acquire were the least likely to spend money on the site. If, instead of trying to maximize the number of users, he had concentrated on maximizing the number of users who were likely to spend money, he would have acquired fewer people and missed his metric, but he would have increased revenue.
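To make that arithmetic concrete, here's a hypothetical sketch comparing acquisition channels by expected revenue instead of raw user count. Every number and channel name is invented for illustration; the point is only that the cheapest users can easily be the least valuable ones.

```python
# Compare acquisition channels by expected revenue, not raw user count.
# Every number and channel name here is invented for illustration.
channels = {
    # cpa = cost per acquired user, pay_rate = share who ever pay,
    # arpu = average revenue per paying user
    "cheap_ads": {"cpa": 0.50, "pay_rate": 0.01, "arpu": 20.00},
    "search":    {"cpa": 2.00, "pay_rate": 0.08, "arpu": 25.00},
}

budget = 10_000.00
for name, c in channels.items():
    users = budget / c["cpa"]                    # users acquired on this budget
    revenue = users * c["pay_rate"] * c["arpu"]  # expected revenue from them
    print(f"{name}: {users:,.0f} users, ${revenue:,.0f} expected revenue")

# cheap_ads "wins" on user count (20,000 vs 5,000) but loses on revenue
# ($4,000 vs $10,000) -- maximizing the user metric points the wrong way.
```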

Another company had a design problem. They wanted to redesign their Invite a Friend feature to encourage people to invite more friends. Unfortunately, the "most effective" method of getting people to invite friends was to forcibly spam users' Facebook feeds and make it easy for users to accidentally invite everybody in their address books. While this resulted in more invitations sent, it also vastly increased the number of unhappy customers and decreased the percentage of invitations that were accepted. It also caused the company to be banned from Facebook and put on the spam list of several ISPs. It sure improved that invitation metric, though.

How Should You Avoid It?

There are three ways to avoid this problem, and you should use all of them.

Make sure that you're measuring the right metric.

If you care about revenue (and you should), measure revenue. If you care about retention, measure retention. If you care about page views, you're probably doing something wrong.

Unfortunately, it can be difficult to immediately see the impact of a particular design change on things like revenue and retention, which makes it tempting to use substitutes for the important number.

If you are using a substitute - for example, if you're using something like "customers returning once" as a shorthand for "becoming an active customer" - make sure that the link is actually causal. In other words, if you can increase the number of people who come back once, make sure that that really does lead to an increase in the number of people who become active customers.
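Here's a minimal sketch of what that validation might look like, assuming you can run an experiment that nudges one group of users to return once. All of the records and field names are invented; the shape of the check is what matters.

```python
# Sketch: did moving the proxy metric also move the real metric?
# All records and field names below are invented for illustration.
def rate(group, key):
    return sum(u[key] for u in group) / len(group)

control = [  # saw the normal product
    {"returned_once": True,  "became_active": True},
    {"returned_once": False, "became_active": False},
    {"returned_once": True,  "became_active": False},
    {"returned_once": False, "became_active": False},
]
treatment = [  # got a "come back!" incentive
    {"returned_once": True,  "became_active": True},
    {"returned_once": True,  "became_active": False},
    {"returned_once": True,  "became_active": False},
    {"returned_once": False, "became_active": False},
]

for key in ("returned_once", "became_active"):
    print(f"{key}: {rate(control, key):.0%} -> {rate(treatment, key):.0%}")

# If "returned_once" rises but "became_active" doesn't budge, the link is
# correlation, not causation, and the proxy is not safe to optimize.
```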

Make sure that you're not gaming the metrics.

Paying people to come back to your site may result in more returning customers, but it doesn't necessarily result in more customers paying you. If you cut your prices in half, you may end up selling twice as many items, but you're not making any more money. Make sure that, if you're moving a metric, you understand the second (and third and nth) order effects of whatever change improved the metrics.

Make your customer experience better.

This may seem obvious, but pissing off your customers is a terrible long term strategy, even if it briefly moves a metric. On the other hand, improving your customers' overall experience leads to happy, contented customers who stick around and continue to pay for your service. Over time, that's going to improve all of your metrics.

And always remember, metrics are just shorthand for real customer behaviors that are important to your business. They are a tool to help you understand your product, not a goal to be met at any cost. 

Like the post? Follow me on Twitter.

When Talking to Customers Isn't Enough

A few weeks ago, I was talking to the head of a small company about next steps. His company had a product with many happy, paying customers, and he wanted to know what to do next to make his current customers happier and attract some new ones.

The company had already made a good start. They had done surveys of current users asking the standard questions like, "How disappointed would you be if you could no longer use this product?" They'd even surveyed current users about what new features they would like to see. The problem was, the happy customers couldn't think of anything else they wanted, and while the slightly less happy customers wanted some new features, there was no general consensus on one big, missing piece or fantastic new idea. So, the CEO wanted to know what to do next.

I believe this can be a problem with the "go out and talk to your customers" solution to product development. We can get so focused on talking to customers that we forget that it's not always the best way to figure out what to do next.

Observe Your Customers

Customers lie. They don't always mean to lie, but it often ends up that way. It's especially problematic when you ask people to explain how they currently use your product. Sometimes they forget parts of their process, or they don't even realize that they're doing certain things because it's all become so routine. Also, they tend to explain their process of using your product from start to finish, as if they weren't doing seven other things while trying to use your product. This can give you a totally skewed vision of what people are actually doing with your product.

What's the solution? Go out and watch them. Sit with them in their offices or homes and observe their real behavior. Most importantly, watch what they do immediately before and after they use your product.

I was talking with somebody who used to work for a marketplace that allows people to buy and sell products directly to each other. While observing users, her team noticed that, while people had very little trouble using the marketplace itself, the sellers spent a huge amount of time arranging shipping of the items. In fact, the shipping process took so much time that it limited the number of items they could list. By integrating with a shipping company to help customers print labels and schedule pickups, the company increased the number of items that could be listed, which increased revenue.

Why didn't users ask for that? Well, the customers had a particular way of doing things. They thought of the marketplace as a place where they could buy and sell things, not as a product that helped them mail things. They had another solution that helped them mail things, and they didn't know that there was a better way to do that until they were presented with it.

Another critical thing you can learn by watching people is the environment in which your product is being used.  In one study I conducted, I was watching people process payroll. When asked how they processed payroll, customers could easily explain all the steps they went through. However, when I sat down beside them and watched, I realized that it wasn't nearly that simple. Not a single person got through the process uninterrupted. Phones rang. Coworkers stopped by to ask questions. Information was missing and had to be hunted down. Bosses needed tasks performed immediately. What they had described as a quick, linear process actually happened in fits and starts and could take place over a day or two.

Were they lying when they described their experiences? Not intentionally. They weren't asked to describe everything that could possibly happen while processing payroll, and they probably couldn't have answered that question if I'd asked it, since the interruptions varied wildly depending on the day and the office. In the end, the observations helped make the product more tolerant of this working style by allowing people to save state, skip areas where they didn't have the right information, and easily track what had already been done and what was still pending.

Connect With People Who Were Almost Your Customers

Don't forget, there's another really important group of people out there: people who were almost your customers. For every one person who signs up for your service or converts to a paying customer, there are lots of people who took a look or maybe used a free trial and then decided not to pull the trigger. A great way to build your customer base is to figure out why they made that decision.

The company I mentioned at the beginning of the post had a perfect audience for this. They offered a one month free trial, which meant that they had information about people who used the product, saw exactly what they had to offer, and then chose not to become customers. Maybe they didn't convert because the product was lacking a key feature. Maybe they didn't understand how to use it. Maybe it didn't do what they expected it to from reading the description on the website. These are all totally fixable problems, but you can't fix them until you know what they are.

Let me just head off the inevitable criticism of this approach right now. I am not advocating that you listen to every single thing that your almost-customers ask for and start building those features immediately. That would be insane. What I am suggesting is that you listen to the reasons that they give for not using your product and then analyze the data for patterns. Are lots of people saying that the product didn't do what they expected? Maybe the problem is that your marketing materials are deceptive. Are they complaining that it didn't do what they wanted? Find out what they wanted to do and what they're currently using to do it. How you incorporate their feedback is up to you, but you can't respond to feedback unless you're asking for it.

Of course, non-customers can be a little harder to connect with than customers, and they do tend to be less likely than customers to allow you to come hang around their offices all day and watch them work. Starting with a survey or an email asking to interview them on the phone can get you lots of good information about what is keeping them from becoming customers. Once you've built a rapport, some of them might even let you come watch them use the product.

Take Another Look at Customer Data

Not all companies have the ability to collect extensive customer data, but if you do, you may not be taking full advantage of it. For example, companies often fail to figure out the difference between the sorts of people who do become customers and those who don't.

Is your product only being purchased by companies with fewer than five employees? If so, that may be a signal to increase your marketing efforts to small companies while decreasing your spend on advertising to larger ones. Are your customers disproportionately mothers or college students or left handed circus performers? If so, start connecting with people who fit that profile to see what they think of your product and whether it solves a particular need for them that you might not have known anything about.

Or, the difference could be based on behavior rather than demographics. For example, if you have a freemium model or a free trial period, you should be looking at the behavior leading up to a user converting to paid or abandoning the product.

One client I worked with created a huge model showing all the different behaviors of users to try to understand which behaviors were most likely to lead to higher revenue and retention. Once we knew that users who explored a particular part of the product in the first five minutes were most likely to pay us, we could start experimenting with what would happen if we emphasized that part of the product early. Once we found that people who went down a different path abandoned the product, we could study that particular flow and find out if we were unintentionally confusing people or driving them away.
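The first pass at that kind of model can be simple. Here's a hypothetical sketch that splits trial users by a single early behavior and compares conversion rates; the user records and field names are invented.

```python
# Sketch: compare conversion rates between users who did and didn't
# perform some early behavior. All data here is invented.
users = [
    {"explored_reports_early": True,  "converted": True},
    {"explored_reports_early": True,  "converted": True},
    {"explored_reports_early": True,  "converted": False},
    {"explored_reports_early": False, "converted": True},
    {"explored_reports_early": False, "converted": False},
    {"explored_reports_early": False, "converted": False},
]

def conversion_rate(group):
    return sum(u["converted"] for u in group) / len(group)

explorers = [u for u in users if u["explored_reports_early"]]
others = [u for u in users if not u["explored_reports_early"]]
print(f"explored early: {conversion_rate(explorers):.0%} converted")
print(f"didn't:         {conversion_rate(others):.0%} converted")

# A big gap here is a hypothesis, not proof. Test it: emphasize that part
# of the product for a random group of new users and measure conversion
# against a control.
```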

So Should I Still Talk To Customers?

Of course you should! Staying in constant contact with customers is vital to understanding your market and keeping people happy. It's just not always enough. If you feel like talking to customers has left you at a dead end or you want to get a perspective from somebody who isn't already a customer, give some of these alternate methods a try. You might be surprised at what you learn.

Like the post? Follow me on Twitter, please.

5 Mistakes People Make Analyzing Qualitative Data

My last blog post was about common mistakes that people make when analyzing quantitative data, such as you might get from multivariate testing or business metrics. Today I’d like to talk about the mistakes people make when analyzing and using qualitative data.

I’m a big proponent of using both qualitative and quantitative data, but I have to admit that qualitative feedback can be a challenge. Unlike a product funnel or a revenue graph, qualitative data can be messy and open ended, which makes it particularly tough to interpret.

For the purposes of this post, qualitative information is generated by the following types of activities:
  • Usability tests
  • Contextual Inquiries
  • Customer interviews
  • Open ended survey questions (i.e., What do you like most/least about the product?)

Insisting on Too Large a Sample

With almost every new client, somebody questions how many people we need for a usability test “to get significant results.” Now, if you read my last post, you may be surprised to hear me say that you shouldn’t be going for statistical significance here. I prefer to run usability tests and contextual inquiries with around five participants. Of course, I prefer running tests iteratively, but that’s another blog post.

Analyzing the data from a qualitative test or even just reading through essay-type answers in surveys takes a lot longer per customer than running experiments in a funnel or looking at analytics and revenue graphs. You get severely diminishing returns from each extra hour you spend asking people the same questions and listening to their answers.

Here’s an example from a test I ran. The customer wanted to know all the different pain points in their product so that they could make one big sweep toward the end of the development cycle to fix all the problems. Against my better judgment, we spent a full two weeks running sessions, complete with a moderator, observers, a lab, and all the other attendant costs of running a big test. The problem was that we found a major problem in the first session that prevented the vast majority of participants from ever finding an entire section of the interface. Since this problem couldn’t be fixed before moving on to the rest of the sessions, we couldn’t actually test a huge portion of the product and had to come back to it later, anyway.

The Fix: Run small, iterative tests to generate a manageable amount of data. If you’re working on improving a particular part of your product or considering adding a new feature, do a quick batch of interviews with four or five people. Then, immediately address the biggest problems that you find. Once you’re done, run another test to find the problems that were being masked by the larger problems. Keep doing this until your product is perfect (i.e., forever). It’s faster, cheaper, and more immediately actionable than giant, statistically significant qualitative tests, and you will eventually find more issues with the same amount of testing time.

It’s also MUCH easier to pick out a few major problems from five hours of testing than it is to find dozens of different problems from dozens of hours of testing.

Extrapolating From Too Small a Sample

I always do this, don’t I? Say one thing, and then immediately warn you not to go too far in the opposite direction. The thing is, I get really tired of running five person tests and having a product owner only show up for one session and then go off and address whatever problems s/he saw during that one hour. One or two participants aren’t enough to really get a sense of the pattern of problems in your product.

Besides, I have this little rule of thumb I’ve developed for studies. No matter how great your screener or recruiter, on average for every 10 participants you schedule, one will be a no-show, one will be some sort of statistical outlier (intelligence, computer savvy…something), and one will be completely insane. If the product owner happens to show up only for one of the last two types, their perception of the product’s problems will be totally skewed.

I had one product where we interviewed ten people over the course of two tests. Nine of the ten people were wildly confused by the product, but one, who I swear was a ringer, nailed all the tasks in record time. Guess which session the product manager showed up for? Yeah.

The Fix: As the person making the decisions about what changes you should make in your product, you should be attending all or at least most of your user interview sessions, even if you’re not running them yourself. You should also be looking directly at all of your survey data, not just skimming it or reading a high level report. Honestly, if you’re the one making decisions about product direction, then you are the one who most benefits from listening to your users. If you’re not paying attention to the results, then the testing is really just a waste of time.

Look at all your data before drawing conclusions. I mean it.

Trying to Answer Specific Questions

Qualitative data is very bad at answering specific questions like “Which landing page will people like better?” or “How much will people pay for this?” What it’s great for is generating hypotheses that can then be tested with quantitative means.

In more than one test, I’ve had clients ask me to test various different images to use on landing pages to see which one was most appealing. I always explain that they’re better off just doing a split test to see which one does best, but sometimes they insist. Unfortunately, these sorts of preference differences are often very subtle. Since people are not making the decisions consciously, it’s very hard for them to explain why they prefer one thing over another. We always end up getting a lot of people trying to rationalize why they liked something, and I rarely trust the results.

The Fix: Use qualitative data to generate hypotheses that you then test quantitatively OR to find major problems in your interface. Don’t try to use qualitative data to get a definitive answer to questions about expected user preferences.

Ignoring Inconvenient Results

Because qualitative testing doesn’t generate hard numbers, it’s easy to let confirmation bias sneak into the analysis. While it might be tough to argue with “this registration flow generated 12% more paying customers than the other one,” it’s pretty easy to discount problems observed in user sessions.

I dealt with a particularly resistant product owner who had an excuse for every single participant’s struggles with the product. One was unusually stupid. Another just didn’t understand the task. Another actually understood it but was, for some reason, actively screwing with us. This went on and on while every single participant had the same problems over and over. Also, the discussion guide, which the product owner and everyone on the team had originally thought was perfectly fair, suddenly became wildly biased and the tasks were judged to be impossible. The problem couldn’t possibly have been with the product!

The Fix: If you are finding fault with all of the participants or the moderator or the questions or the survey, it’s time to get somebody neutral into the room to help determine what is biasing the results. Hint: it’s almost certainly you.

Remember, your customers, moderator, and test participants don’t have a stake in making your product seem worse than it is. You, however, may have an emotional stake in making it better than it actually is. Make sure you’re not ignoring results just because they’re not what you want to hear.

Not Acting on the Data

Why would you even bother to run a test if you’re not going to pay attention to the results? I mean, tests aren’t free. Even running surveys has an associated cost, since you’re interrupting what your user is doing to ask them to help you out. And yet, so many clients do exactly this.

One client I worked with wanted to set up a system where they ran tests every week. They wanted to have a constant stream of users and potential users coming in all the time so that they could stay in contact with their users. I thought this was a fantastic idea, and so I started bringing people in for them. Unfortunately, after a few months, people began to complain that they were hearing the same problems over and over again.

I explained that they were going to continue to hear the same problems over and over again until they fixed the problems. I gave them a list of the major issues that their current and new users were facing. Every once in a while, if I complained loudly enough, they would fix one of the easier problems, and, unsurprisingly, these changes always positively affected their metrics. And yet, it was always a struggle to get the results from the tests incorporated into the product. I eventually stopped running tests and told them that I would be happy to come back and start again as soon as they had addressed some of the major problems.

The Fix: This one should be simple. If you’re going to spend all that time and money generating data, you should act on the results.

I want your feedback!

Have you had problems interpreting or using your qualitative data, or do you have stories about people in your company who have? Please, share them in the comments section!

Want more? Follow me on Twitter!

Also, if your company is currently working on getting feedback from users, I’d love to hear more about what you are doing and what you’d like to be doing better. Please take this short survey!

5 Big Mistakes People Make When Analyzing User Data

I was trying to write a blog post the other day about getting various different types of user feedback, when I realized that something important was missing. It doesn’t do any good for me to go on and on about all the ways you can gather critical data if people don’t know how to analyze that data once you have it.

I would have thought that a lot of this stuff was obvious, but, judging from my experience working with many different companies, it’s not. All of the examples here are real mistakes I’ve seen made by smart, reasonable, employed people. A few identifying characteristics have been changed to protect the innocent, but in general they were product owners, managers, or director level folks.

This post only covers mistakes made in analyzing quantitative data. At some point in the future, I’ll put together a similar list of mistakes people make when analyzing their qualitative data.

For the purposes of this post, the quantitative data to which I’m referring is typically generated by the following types of activities:
  • Multivariate or A/B testing
  • Site analytics
  • Business metrics reports (sales, revenue, registration, etc.)
  • Large scale surveys

Ignoring Statistical Significance

I see this one all the time. It generally involves somebody saying something like, “We tested two different landing pages against each other. Out of six hundred views, one of them had three conversions and one had six. That means the second one is TWICE AS GOOD! We should switch to it immediately!”

Ok, I may be exaggerating a bit on the actual numbers, but too many people I’ve worked with just ignored the statistical significance of their data. They didn’t realize that even a seemingly large difference can be statistically insignificant if the sample size is too small.

The problem here is that statistically insignificant metrics can completely reverse themselves, so it’s important not to make changes based on results until you are reasonably certain that those results are predictable and repeatable.

The Fix: I was going to go into a long description of statistical significance and how to calculate it, but then I realized that, if you don’t know what it is, you shouldn’t be trying to make decisions based on quantitative data. There are online calculators that will help you figure out if any particular test result is statistically significant, but make sure that whoever is looking at your data understands basic statistical concepts before accepting their interpretation of data.

Also, a word of warning: testing several branches of changes can take a LOT larger sample size than a simple A/B test. If you're running an A/B/C/D/E test, make sure you understand the mathematical implications.
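If you're curious what those calculators are doing, here's a minimal sketch using scipy's chi-squared test on the exaggerated numbers from the example above, assuming 600 views apiece. Treat it as an illustration, not a replacement for understanding the underlying statistics.

```python
# Significance check for a simple A/B test using a chi-squared test.
# Numbers are the exaggerated example from above: 3 vs 6 conversions
# out of 600 views each.
from scipy.stats import chi2_contingency

table = [
    [3, 597],   # page A: conversions, non-conversions
    [6, 594],   # page B: conversions, non-conversions
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"p-value: {p_value:.2f}")  # ~0.5 -- nowhere near significant

# "Twice as good" here is indistinguishable from random noise. And if
# you're running an A/B/C/D/E test, remember that checking many branches
# inflates your false-positive rate, so you'll need a correction (e.g.,
# Bonferroni) or a much larger sample.
```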

Short Term vs. Long Term Effects

Again, this seems so obvious that I feel weird stating it, but I’ve seen people get so excited over short term changes that they totally ignore the effects of their changes in a week or a month or a year. The best, but not only, example of this is when people try to judge the effect of certain types of sales promotions on revenue.

For example, I've often heard something along these lines, “When we ran the 50% off sale, our revenue SKYROCKETED!” Sure it did. What happened to your revenue after the sale ended? My guess is that it plummeted, since people had already stocked up on your product at 50% off.

The Fix: Does this mean you should never run a short term promotion of any sort? Of course not. What it does mean is that, when you are looking at the results of any sort of experiment or change, you should look at how it affects your metrics over time.

Forgetting the Goal of the Metrics

Sometimes people get so focused on the metrics that they forget the metrics are just shorthand for real world business goals. They can end up trying so hard to move a particular metric that they sacrifice the actual goal.

Here’s another real life example: One client decided that, since revenue was directly tied to people returning to their site after an initial visit, they were going to “encourage” people to come back for a second look. This was fine as far as it went, but after various tests they found that the most successful way to get people to return was to give them a gift every time they did.

The unsurprising result was that the people who just came back for the gift didn’t end up actually converting to paying customers. The company moved the “returning” metric without actually affecting the “revenue” metric, which had been the real goal in the first place. Additionally, they now had the added cost of supporting more non-paying users on the site, so it ended up costing them money.

The Fix: Don’t forget the actual business goals behind your metrics, and don’t get stuck on what Eric Ries calls Vanity Metrics. Remember to consider the secondary effects of your metrics. Increasing your traffic comes with certain costs, so make sure that you are getting something other than more traffic out of your traffic increase!

Combining Data from Multiple Tests

Sometimes you want to test different changes independently of one another, and that's often a good thing, since it can help you determine which change actually had an effect on a particular metric. However, this can be dangerous if used stupidly.

Consider this somewhat ridiculous thought experiment. Imagine you have a landing page that is gray with a light gray call to action button. Let's say you run two separate experiments. In one, you change the background color of the page to red so that you have a light gray button on a red background. In another test, you change the call to action to red so that you have a red button on a gray background. Let's say that both of these convert better than the original page. Since you've tested both of your elements separately, and they're both better, you decide to implement both changes, leaving you with...a red call to action button on a red page. This will almost certainly not go well.

The Fix: Make sure that, when you're combining the results from multiple tests, you still go back and test the final outcome against some control. In many cases, the whole is not the sum of its parts, and you can end up with an unholy mess if you don't use some common sense in interpreting data from various tests.

Not Understanding the Significance of Changes

This one just makes me sad. I’ve been in lots of meetings with product owners who described changes in the data for which they were responsible. Notice I said “described” and not “explained.” Product owners would tell me, “revenue increased” or “retention went from 2 months to 1.5 months” or something along those lines. Obviously, my response was, “That’s interesting. Why did it happen?”

You’d be shocked at how many product owners not only didn’t know why their data was changing, but also didn’t have a plan for figuring it out. They were generating tons of charts showing increases and decreases, but they never really understood why the changes were happening, so they couldn’t extrapolate from the experience to affect their metrics in a predictable way.

Even worse, sometimes they would make up hypotheses about why the metrics changed but not actually test them. For example, one product owner did a “Spend more than $10 and get a free gift” promo over a weekend. The weekend’s sales were slightly higher than the previous weekend’s sales, so she attributed that increase to the promotion. Unfortunately, a cursory look at the data showed that the percentage of people spending over $10 was no larger than it had been in previous weeks.

On the other hand, there had been far more people on the site than in previous weeks due to seasonality and an unrelated increase in traffic. Based on the numbers, it was extremely unlikely that it was the promotion that increased revenue, but she didn’t understand how to measure whether her changes actually made any difference.

The Fix: Say it with me, "Correlation does not equal causation!" Whenever possible, test changes against a control so that you can accurately judge what effect they’re having on specific metrics. If that’s not possible, make sure that you understand ahead of time which changes you are LIKELY to see from a particular change and then judge whether that happened. For example, a successful “spend more than $10” promo should most likely increase the percentage of orders over $10.

Also, be aware of other changes within the company so that you can determine whether it was YOUR change that affected your metrics. Anything from a school holiday to an increased ad spend might affect your numbers, so you need to know what to expect.
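As a concrete illustration of that kind of sanity check, here's a rough sketch that asks whether the "$10 promo" actually shifted the share of orders over $10. All of the order amounts are invented.

```python
# Sanity check: did the "spend more than $10" promo change the share of
# orders over $10? All order amounts below are invented.
last_weekend = [8.50, 12.00, 5.25, 15.00, 9.99, 22.00, 7.00, 11.50]
promo_weekend = [9.00, 13.50, 6.75, 14.00, 8.25, 21.00, 7.50, 12.00, 10.00, 5.00]

def share_over(amounts, threshold=10.0):
    return sum(a > threshold for a in amounts) / len(amounts)

print(f"before promo: {share_over(last_weekend):.0%} of orders over $10")
print(f"during promo: {share_over(promo_weekend):.0%} of orders over $10")

# If the promo worked, the "during" share should be clearly higher (and,
# ideally, significantly so). If it isn't, the revenue bump probably came
# from traffic or seasonality, not from your promotion.
```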

I want your feedback!

Have you had problems interpreting your quantitative data, or do you have stories about people in your company who have? Please, share them in the comments section!

Also, if your company is currently working on getting feedback from users, I’d love to hear more about what you are doing and what you’d like to be doing better. Please take this short survey!