Comparing how different audio experiences affect user behavior and attitude 

Company

Spotify

Date / Location

July - Aug. 2017
New York, NY

Methods

Interviews
Published research
Surveys
A/B testing

Topics

User research
Experiments


Problem

To prepare for the release of a new product, Spotify wanted to know how different types of audio would affect behavior and attitude for a key segment of listeners. (This research is subject to a non-disclosure agreement.)

Approach

Working with another intern, I used key informant interviews, published research, and internal data analysis to develop hypotheses about what might have the strongest effect on listeners' behavior and attitudes. We tested these hypotheses with an A/B test. To understand how the audio experiences might affect people's attitudes, we ran a survey in which users listened to an audio recording and described their perception of its content.
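For context, here is a lightweight sketch in Python of how a behavioral comparison like this might be checked for significance with a two-proportion z-test. All counts and group sizes below are illustrative placeholders, not figures from the actual study:

import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: listeners who performed the key behavior in each arm.
# These numbers are placeholders, not real experiment data.
successes = np.array([540, 562])        # control, experimental
observations = np.array([5000, 5000])   # listeners exposed per arm

z_stat, p_value = proportions_ztest(successes, observations)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")

# A p-value above the chosen alpha (e.g., 0.05) means the observed difference
# could plausibly be noise -- the kind of inconclusive result described below.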

Impact

After running our A/B test and survey, we did not detect significant differences in key behavioral and attitudinal metrics between the control and experimental audio experiences. The differences we did detect were small (1-6% relative to control). Determining whether those small differences were statistically significant would have required several thousand more survey respondents, which was too expensive to justify.
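To make the sample-size point concrete, here is a rough power calculation in Python showing how many respondents per group it takes to reliably detect a small shift in a proportion metric. The baseline rate and lift are assumptions for illustration, not the study's actual numbers:

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Illustrative assumption: a 30% baseline rate and a 1-point lift (30% -> 31%).
effect_size = proportion_effectsize(0.31, 0.30)

# Sample size per group for 80% power at alpha = 0.05.
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, ratio=1.0
)
print(f"Respondents needed per group: {n_per_group:,.0f}")

# Small differences require very large samples, which is why collecting
# several thousand more survey responses was hard to justify.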

Our interviews and review of published research did, however, uncover actionable insights into user needs and pain points. Overall, our research enabled us to:

  • Recommend that the product team abandon its pursuit of a single, optimal audio experience for a segment of this scope.
  • Define a specific user need that was not being addressed in the current design.
  • Provide concrete design recommendations for how to satisfy the unmet user need (including a mockup).
  • Share our findings and recommendations with the product team in a 20-minute presentation.

Highlights

[Redacted]

Next Steps

Spotify can take these steps to iterate on the product:

  • Create a prototype with a new design that addresses the pain point we discovered.
  • Conduct usability tests with five target users to gauge how discoverable, understandable, and efficient the new solution is. Iterate and repeat testing as needed.
  • Based on usability testing results, create hypotheses about how the new design will impact key behavioral metric(s).
  • Build the new design.
  • Run an A/B test to see if our chosen behavioral metric changes as expected.

Takeaways

"Unsuccessful" A/B tests are still valuable!

Your experiments won't always confirm your hypothesis. Sometimes they'll be inconclusive. But that doesn't mean they aren't giving you valuable information. It's important to know when a line of inquiry is too broadly scoped, or when a product feature isn't affecting the metrics you care about. That knowledge lets you focus on higher-impact features and ask questions with clearer answers. In the end, it can save significant time and money, and it can lead your team toward more valuable, high-impact research.

As researchers and data scientists, we should iterate on our experiments. A/B tests only "fail" when we fail to learn from them.