Data Storytelling: How to Engage with Your Audience
Because remembering numbers is not as easy as remembering stories
People keep calling data the new oil, but what if we can’t convert it into energy? This ‘new oil’ is a wasted resource if it can’t be turned into actionable insights. We need to know how to present data so that the audience understands it easily, and how to make it engaging enough that they want to know more. In this post, let me share a few things you might need to know about storytelling with data.
Types of Storytelling
First, we need to decide which medium we are going to use to deliver the insights. Are we going to speak at a workshop, publish a YouTube video, or write an academic paper? Each medium calls for a different technique to engage the audience.
Oral presentation
When delivering data through an oral presentation, especially at a live event (e.g. a seminar, workshop, or conference), the audience may be more forgiving if you make a small mistake. The audience is also likely to be smaller. A recorded oral presentation (e.g. a YouTube or online course video), by contrast, demands more precision and accuracy, and it can be watched by a bigger audience. In both cases the duration is likely to be limited, so we need to focus on the big idea first, then explain the details in the remaining time. We can also use intonation and gestures to help engage the audience.
Written document
Unlike an oral presentation, a written format is likely to reach a bigger audience, and that audience can be less forgiving if they find a mistake. They can read our analysis over and over again. However, since there is no time limit, we can elaborate more on our data and go into more detail. It helps to combine the explanation with visualizations and to highlight the key takeaways for the audience.
Five Stages of Storytelling
To engage our audience with data storytelling, context is important. We need to make sure that they understand the background before we focus on the data and findings. Therefore, we can follow these five stages in data storytelling:
- Exposition: Explain the context, the background of the research, and the scope of the data.
- Rising action: Define the conflict and challenge the status quo. What is the hypothesis for our current problem, or what could be improved?
- Climax: Here is where data plays the most important role. Present the key findings, such as the result of hypothesis testing, the most efficient method that can be implemented, or the most promising segment to develop.
- Falling action: Elaborate on the key findings mentioned during the climax stage. What are the implications of the hypothesis testing result? On what grounds did we pick the most efficient method or the most promising segment? How can these help grow our business?
- Resolution: Propose an action plan based on the findings. We may also mention the stakeholders we need to collaborate with, as well as how we would prioritize the action plan.
For example, imagine we are data analysts for a retail company. We want to investigate our online channel sales performance from last month’s campaign, so that we can develop a better strategy for next month. This is the story we tell in the Exposition stage, so the audience understands that we are talking about last month’s data, that offline channel sales are excluded, and what the objective of the analysis is.
After that, we raise our concern that last month’s sales performance did not meet the target, and we present our hypotheses about why we underperformed, such as a saturated market or payment issues. These are the potential conflicts we tell in the Rising Action stage. Then, we test them and explain them with data in the Climax stage. We can break down sales performance by product category or by customer segment, then analyze which category or segment did not perform well. We can also trace the payment flow during the campaign and address any issues we find.
In the Falling Action stage, we elaborate on how we found that product category A or customer segment X underperformed. Did we compare them based on six-month growth or yearly growth? Or did we just compare last month’s market share? We may also add our analysis of why product category A or customer segment X is relatively less profitable. In the last stage, the Resolution stage, we present our ideas on how to handle the issue and improve sales performance in next month’s campaign. Do we need to penetrate the existing market, or should we target a new one? Or perhaps we need to develop a new product feature? The proposed action plan makes the data-driven insights more impactful for the audience.
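The breakdown from the Climax stage can be sketched in a few lines of Python. This is only a minimal illustration: the category names, sales figures, and targets are made up, and the helper functions `attainment_by_category` and `underperformers` are hypothetical names, not part of any library.

```python
# Hypothetical figures for illustration only; none of them are real data.
sales = {"A": 80_000, "B": 120_000, "C": 95_000}      # last month's online sales
targets = {"A": 110_000, "B": 115_000, "C": 100_000}  # campaign targets


def attainment_by_category(sales, targets):
    """Return each category's sales as a fraction of its target."""
    return {cat: sales[cat] / targets[cat] for cat in sales}


def underperformers(sales, targets):
    """List the categories that missed their target, worst attainment first."""
    ratios = attainment_by_category(sales, targets)
    missed = [cat for cat, ratio in ratios.items() if ratio < 1.0]
    return sorted(missed, key=lambda cat: ratios[cat])


ratios = attainment_by_category(sales, targets)
for cat in underperformers(sales, targets):
    print(f"Category {cat}: {ratios[cat]:.0%} of target")
```

A table like this gives us the finding for the Climax stage. The comparison basis we chose (here, a simple target per category) is exactly what we would need to justify in the Falling Action stage.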
Know your audience
Experts/non-experts
By knowing whether our expected audience are experts in the field or not, we can adjust how we deliver the story. Avoid technical terms if the audience are not experts; conversely, we may need to go into more detail if they are. Doctors do not explain a diagnosis to patients in medical terms. They need to put it as simply as possible so that the patient understands. However, they do need to go into those medical terms in detail when discussing the case with other doctors.
However, if we don’t really know who the audience will be, it is safer to assume they are non-experts. We can explain the general picture from a broad view first, then hint that we are willing to go into further detail in a follow-up session (if any), such as a Q&A session.
Horizontal/vertical
In a company’s organizational structure, the audience is sometimes horizontal (we present our analysis to our colleagues) and sometimes vertical (we present it to the managers, or maybe the owner!). When presenting to our colleagues, we can be a bit more relaxed and informal, and we may add a little humor. However, we want to look professional and well-prepared when presenting to the managers, so it is not the right place for humor. If we have no idea who the audience will be, it is better to keep the storytelling formal and avoid being disrespectful.
Data visualization
Choosing the right visualization is essential to deliver our idea to the audience efficiently. Conversely, the wrong visualization may leave the audience unable to understand the data, or even cause them to misunderstand it. The most commonly used chart types are bar charts, line charts, and pie charts. Sometimes we may also need scatter plots or heatmaps (but we can leave these for later).
A bar chart is useful for comparing the values of several items, mostly for discrete or categorical data. A line chart is useful for showing a trend over a specific time interval when the data is continuous (do not use a line chart for categorical data!). A pie chart is useful for representing proportions of a whole, so the proportions should total 100%. However, if a pie chart is divided into too many slices, they become hard to compare, and we can use a bar chart instead.
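These rules of thumb can be written down as a small helper. This is a sketch under a few assumptions: `suggest_chart` is a hypothetical function, the labels "categorical", "time_series", and "proportion" are names picked for this example, and the cutoff of five slices is an arbitrary default, since there is no exact number for when a pie chart has too many slices.

```python
def suggest_chart(data_kind, n_items=0, max_pie_slices=5):
    """Suggest a chart type based on the rules of thumb above.

    data_kind: "categorical", "time_series", or "proportion" (labels assumed here).
    n_items: number of slices, only used for proportions.
    max_pie_slices: assumed cutoff for "too many slices".
    """
    if data_kind == "time_series":
        return "line"  # trend over a continuous time interval
    if data_kind == "categorical":
        return "bar"   # comparing values of discrete items
    if data_kind == "proportion":
        # Too many slices are hard to compare, so fall back to a bar chart.
        return "pie" if n_items <= max_pie_slices else "bar"
    raise ValueError(f"unknown data kind: {data_kind!r}")


print(suggest_chart("time_series"))              # line
print(suggest_chart("proportion", n_items=3))    # pie
print(suggest_chart("proportion", n_items=12))   # bar
```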
There are several common mistakes in data visualization that we should be aware of:
- Misleading Y-axis
When comparing several values, we should start the Y-axis at 0. The graph below has a Y-axis that starts at 50,000, which misleads the viewer into thinking there are very wide income gaps between counties.
- Showing comparison using a line chart
The line chart below is used for the wrong purpose. The up-and-down trend line does not mean anything. So what if travel expenses increase from the IT department to the Sales department? For a comparison like this, we can use a bar chart instead.
- Pie charts are evil
There are many ways to misuse pie charts, such as slices that total more than 100%, too many slices, 3D pie charts (which are harder to read), and so on. I found a great Medium article, 5 Things You Should Know Before You Make a Pie Chart, that helps us avoid those mistakes.
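Some of the mistakes above can be caught with simple arithmetic before we share a chart. The sketch below is illustrative only: `drawn_height_ratio` and `pie_chart_problems` are hypothetical helpers, the incomes of 60,000 and 55,000 are made-up numbers, and the slice limit and rounding tolerance are assumed defaults.

```python
def drawn_height_ratio(a, b, axis_start=0):
    """Ratio of the drawn bar heights for values a and b when the Y-axis starts at axis_start."""
    if min(a, b) <= axis_start:
        raise ValueError("both values must lie above the axis start")
    return (a - axis_start) / (b - axis_start)


def pie_chart_problems(shares, max_slices=5, tolerance=0.5):
    """Return a list of problems found in pie slices given in percent."""
    problems = []
    total = sum(shares)
    if abs(total - 100) > tolerance:
        problems.append(f"slices total {total:g}%, not 100%")
    if len(shares) > max_slices:
        problems.append(f"{len(shares)} slices is more than {max_slices}")
    return problems


# Two hypothetical incomes that differ by about 9%.
print(drawn_height_ratio(60_000, 55_000))                     # Y-axis from 0
print(drawn_height_ratio(60_000, 55_000, axis_start=50_000))  # Y-axis from 50,000
print(pie_chart_problems([40, 35, 30]))
```

With the axis starting at 50,000, the first bar is drawn twice as tall as the second even though the values differ by less than 10%. That is the exaggeration a truncated Y-axis creates.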
Your audience is not you
- Guide them through the chart
By the time we present a chart, we have seen it over and over again while preparing and have become very familiar with it. But our audience hasn’t! It is the very first time they see the chart, and they need time to process the information. When the chart pops up on the screen, they will most likely look at the axes and the legend first, before analyzing the numbers. Take time to explain these first, before jumping into what the numbers in the chart mean.
- Important things first
When delivering the findings, don’t start with the details. Start with the big picture first. This way, we get the audience’s full attention and keep them from getting confused and distracted.
- Make them connected
Relevance is key. To make the audience care about our data, we need to make them feel connected to the story. That’s why we share the context of the analysis in the Exposition stage before jumping into the main findings. We may also need to research our audience’s background, if necessary. For example, a housewife may not be familiar with automotive spare parts, but we can help her relate to the story by explaining that, just as with kitchen appliances, sellers need product knowledge before they can attract and engage customers, which in turn increases sales performance.
Ethics in data storytelling
Data manipulation
Data manipulation is different from mistakes in data preparation. Data manipulation is intentional, while mistakes in processing data are unintentional. Data manipulation occurs when data is made up and then used for decision making. Oftentimes, we encounter datasets that do not meet our expectations, e.g. they have many missing values, show unexpected patterns, or do not support the hypothesis. But if that is what the data says, we should use it as it is and never make up data just to get the result we want. Doing so is data manipulation, and it will lead us to the wrong decision.
Data misrepresentation
Data misrepresentation may happen when we sample incorrectly. For example, suppose we want to understand the customer experience of purchasing through an online channel, so we survey our retail store customers. Of the 500 respondents, it turns out that 300 only browse online channels to compare prices and still make their purchases at offline stores. The survey result will be biased, since more than half of the respondents have no full experience of purchasing through the online channel, and this causes data misrepresentation. Be careful when collecting sample data, and understand the objective thoroughly first, so that we don’t end up misrepresenting the data and drawing an inaccurate conclusion.
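A quick check of the sample against the target population makes this problem visible early. The sketch below uses the numbers from the survey example (500 respondents, 300 of whom only browse online). The field name `purchased_online` and the helper `split_by_target` are hypothetical and only serve the illustration.

```python
# Numbers from the survey example: 500 respondents, 300 of whom only browse online.
# `purchased_online` is a hypothetical field name used for illustration.
respondents = (
    [{"purchased_online": False}] * 300
    + [{"purchased_online": True}] * 200
)


def split_by_target(respondents):
    """Separate respondents who match the target population from those who do not."""
    in_target = [r for r in respondents if r["purchased_online"]]
    out_of_target = [r for r in respondents if not r["purchased_online"]]
    return in_target, out_of_target


in_target, out_of_target = split_by_target(respondents)
out_share = len(out_of_target) / len(respondents)
print(
    f"{len(out_of_target)} of {len(respondents)} respondents "
    f"({out_share:.0%}) did not purchase online"
)
```

If most of the sample falls outside the target population, as it does here, we should either restrict the analysis to the respondents who did purchase online or collect a new sample.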