How to Use Data Analytics for Design
Incorporating data analytics into the UX practice
Or three things that I have learned from my Data and Analytics Master Class.
A while ago I wrote an article about why I decided to pursue a Master's Degree in UX Design. Since then, I have completed four courses in Northwestern University's online Master of Science in Information Design and Strategy program, and this semester I am taking a fifth. So far I have learned a lot about user-centric design, information architecture, and leadership, but the most challenging class I've taken, and at the same time the most valuable, has been Data and Analytics.
The materials, readings, and new concepts were overwhelming. My skills in statistics were limited, and my statistics vocabulary was rudimentary at best. In addition, our intimidating final assignment was to predict the outcome of the 2020 presidential campaign, which is itself a very complex topic.
Before the class I had some basic knowledge of how designers should use data to make design decisions, but as the semester progressed, I realized how shallow my perspective had been. The class gave me a great background that I would like to share with all of you.
Imperfections and errors are part of big data.
The problem starts with the process of collecting the data. To mitigate risk at this early stage, we should focus on the key questions we want answered and be clear about what we hope to achieve. Then we need to build the data collection around those specific questions rather than groping in the dark for random trends or attempting to collect and synthesize all possible data. Focusing on particular questions will help to convey the story we want to tell.
To mitigate risks post-data-collection, we should work with data scientists and follow some basic steps.
"No matter how much you trust your quants, don't stop asking them tough questions." — HBR Guide Series
First of all, we should evaluate where the data came from. We can dig deeper by assessing data quality independently from the data scientists and learn both where the data was created and how it was defined. Asking the right questions can help us better understand the data set and build a stronger relationship with data analysts.
What was the source of your data?
How well does the sample data represent the population?
Does your data distribution include outliers? How did they affect the results?
What assumptions are behind your analysis? Might certain conditions render your assumptions and your model invalid?
Why did you decide on that particular analytical approach? What alternatives did you consider?
How likely is it that the independent variables are actually causing the changes in the dependent variable? Might other analyses establish causality more clearly?
Cleansing methods such as rinse, wash, and scrub can help ensure that the data is correct, consistent, and usable. These techniques most likely won't eliminate every error, but we have to keep the risk in mind and mitigate it as much as we can.
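To make the idea of cleansing a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general idea (dropping incomplete, impossible, and duplicate rows), not the rinse, wash, and scrub methods themselves. The field names `user_id` and `session_seconds` and all of the sample records are made up for this example.

```python
def clean_records(records):
    """Drop rows that are incomplete, implausible, or duplicated."""
    seen = set()
    cleaned = []
    for row in records:
        raw_id = row.get("user_id")
        seconds = row.get("session_seconds")
        # Skip rows with a missing field.
        if raw_id is None or seconds is None:
            continue
        # Skip impossible values (a session cannot last a negative time).
        if seconds < 0:
            continue
        # Normalize the ID so " a1 " and "A1" count as the same user.
        user_id = raw_id.strip().lower()
        # Skip rows that duplicate one we already kept.
        key = (user_id, seconds)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"user_id": user_id, "session_seconds": seconds})
    return cleaned


# Hypothetical sample data, invented for illustration only.
records = [
    {"user_id": "A1", "session_seconds": 120},
    {"user_id": " a1 ", "session_seconds": 120},   # duplicate after normalizing
    {"user_id": "B2", "session_seconds": None},    # missing value
    {"user_id": "C3", "session_seconds": -5},      # impossible value
    {"user_id": "D4", "session_seconds": 300},
]

cleaned = clean_records(records)
print(len(records), "->", len(cleaned))  # 5 -> 2
```

Even in this toy case, more than half of the rows disappear, which is a good reminder to ask how much data was dropped and why.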
Noah Yonack, a data scientist at the Chan Zuckerberg Initiative and a Harvard graduate, states that "The point here is merely that most data about most things is radically imperfect, so we shouldn't throw it around without acknowledging its biases." Asking the questions above helps mitigate some of the inherent data issues that lead to wrong conclusions, and it helps extract the information that lets you make business or design decisions with confidence.
"The point here is merely that most data about most things is radically imperfect, so we shouldn't throw it around without acknowledging its biases." — Noah Yonack, Chan Zuckerberg Initiative
Quantitative data is also biased, and bias is inherent in any dataset.
As a product designer, I mostly deal with qualitative data, which is based largely on observations and interpretations. Its subjective nature makes it difficult for the researcher to be completely detached from the data. However, while quantitative data might seem more scientific or credible, it has very similar issues.
"There are three kinds of lies: lies, damned lies and statistics." - Mark Twain
One example of the issues quantitative data can have is the disappearance of outliers when data is averaged or, conversely, the skewing of an average when outliers are factored in. If we approach the same data from a qualitative research perspective, these outliers can be properly addressed. Using a mixed approach in research is always helpful.
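A small sketch shows how a single outlier can skew an average. The task-completion times below are hypothetical numbers made up for this example, and the median is used here only as one common way to see past an outlier, not as a complete fix.

```python
from statistics import mean, median

# Hypothetical task-completion times in seconds; the last user got stuck.
times = [30, 32, 35, 31, 33, 34, 300]

with_outlier = mean(times)          # average including the stuck user
without_outlier = mean(times[:-1])  # average if that user is dropped
middle = median(times)              # the median barely notices the outlier

print(f"mean with outlier:    {with_outlier:.1f}")     # 70.7
print(f"mean without outlier: {without_outlier:.1f}")  # 32.5
print(f"median:               {middle}")               # 33
```

Neither average tells the whole story. The 300-second session is exactly the kind of case a qualitative follow-up, such as an interview or a session recording, can explain.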
Correlation does not mean causation.
We must be careful not to let the available data lead us to the conclusions we would like to find. Digging deeper into the data set and understanding the topic thoroughly can help us avoid false correlations and instead reach the conclusions the data actually supports.
Just because two trends seem to fluctuate in tandem, this rule posits, that doesn't prove that they are meaningfully related to one another. — Mark Wilson, Fast Company
An old joke that went viral a few years ago can help us understand this better. In the "Bread kills" article, a series of data-backed statements claims to show that bread is bad for our health and even kills people.
Bread is associated with all the major diseases of the body. For example, nearly all sick people have eaten bread. The effects are obviously cumulative:
- 99.9% of all people who die from cancer have eaten bread.
- 100% of all soldiers have eaten bread.
- 96.9% of all Communist sympathizers have eaten bread.
- 99.7% of the people involved in air and auto accidents ate bread within 6 months preceding the accident.
- 93.1% of juvenile delinquents came from homes where bread is served frequently.
This case clearly shows how statistics and big data can be misleading and how important it is to understand context. There are many similar, more insidious examples on the Internet that do not sound as absurd as the one above but have the same issue. For example: "Number of people who drowned by falling into a swimming-pool correlates with number of films Nicolas Cage appeared in" or "Per capita consumption of cheese (US) correlates with number of people who died by becoming tangled in their bedsheets."
Seema Singh, in her article, explains clearly why correlation does not imply causation. Correlation shows how strongly a pair of variables is linearly related, but it does not tell us why. Causation, on the other hand, indicates that one event is the result of the occurrence of the other.
Just after finding correlation, don't draw the conclusion too quickly. Take time to find other underlying factors as correlation is just the first step. Find the hidden factors, verify if they are correct and then conclude. — Seema Singh
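The advice to look for hidden factors can be sketched in a few lines of Python. In this made-up example, a hidden factor (temperature) drives both ice cream sales and sunburn cases. All three lists are hypothetical, and the approach shown (removing a straight-line fit on the hidden factor and then correlating what is left) is just one simple way to check, assuming the relationships are roughly linear.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data, invented for illustration only.
temperature = [10, 12, 14, 16, 18, 20, 22, 24]      # the hidden factor
ice_cream_sales = [31, 35, 41, 49, 55, 59, 65, 73]
sunburn_cases = [21, 25, 27, 31, 35, 39, 45, 49]

# Step 1: the raw correlation looks very strong.
raw = pearson(ice_cream_sales, sunburn_cases)

# Step 2: remove the effect of temperature from both, then correlate again.
adjusted = pearson(
    residuals(ice_cream_sales, temperature),
    residuals(sunburn_cases, temperature),
)

print(f"raw correlation:      {raw:.2f}")
print(f"after removing temp.: {adjusted:.2f}")
```

In this toy data the raw correlation is close to 1, but it essentially vanishes once temperature is accounted for. Ice cream does not cause sunburn; both simply follow the weather.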
Summary
While data can empower your design process and can help validate your ideas and design decisions, it should not be used blindly.
Source: https://uxdesign.cc/incorporating-data-analytics-into-the-ux-practice-9fe1f3a6acac