Data is the new oil. But is your app’s refinery set up correctly?
Just like oil, raw data isn’t valuable in and of itself - rather, its value is created when it is captured quickly, completely, and accurately.
Emma Raz, Vincent Goff, Hector Durham
Nov 9 · 4 min read
Once properly refined, data is an invaluable decision-making tool for app publishers. It has the power to:
- multiply user engagement and retention rates by informing product development, usability and content personalization
- increase in-app purchases by informing what to offer whom and when best to reach them
- strengthen user acquisition by revealing who an app’s biggest fans are
- suggest which brands are most likely to pay a premium for an app’s ad inventory
As a data technology company, we’re very aware of common mishaps publishers and data providers tend to make and the types of biases they typically fail to account for. While there’s no such thing as perfect data, by becoming aware of such issues and correcting common biases, you can get pretty close.
The best part? Data quality follows the Pareto principle. That is to say, if you know what to address, you can get 80% of the results with 20% of the effort. We put our heads together with our machine learning engineering team, which built and trained our contextual and behavioral models, and listed the low-hanging fruit that every app publisher should consider addressing.
The result is this blog post. It consists of 2 parts:
- Part 1 sheds light on the 3 key considerations that make or break app publishers’ data strategies, with actionable recommendations to maximize your data quality
- Part 2 is a script that app publishers can follow to assess the quality of data (technologies) offered by vendors
Let’s jump right in.
Key determinants of data quality
When we talk about ‘data quality’, what we mean is the extent to which the captured information represents reality. High quality data is an accurate representation of reality, while low quality data is a poor one.
When building your data collection strategy, there are 3 key considerations that will have a big impact on data quality:
- Research type: how will the data be captured?
- Sampling frequency: how up to date will the data be?
- Contextualization: does the dataset include context?
To get the most out of this section, we’d encourage you to think of an example use case you’re currently working on (e.g. personalizing content recommendations in your radio player, increasing in-app purchases in your game) and check it against each consideration. Now, let’s take a closer look at each.
1. Research type: observational vs. self-report
Observational studies, also called naturalistic observation, involve watching and recording the actions of your users without interfering with their behavior. Self-report studies, on the other hand, rely on users entering information through a survey, onboarding flow, account setup process or other data aggregation method.
While both are important and have their own use cases, self-report studies tend to be less accurate because they contain a self-report bias: the difference between the reported value and the true value. Self-report bias arises from any combination of 4 causes:
- Social desirability: what people report about themselves is typically a reflection of how they’d like to see themselves, rather than an objective measure of who they are
- Recall period limitations and the fact that “every time we take a memory off the shelf in our brain, we put it back just a tiny bit differently”
- Selective recall, which arises because memories of highly emotional experiences are more easily retrieved than memories of non-emotional experiences
- Sampling bias, since people who complete forms are categorically different to people who don’t – and may, thus, not accurately represent the population
To minimize self-report bias (e.g. from onboarding flows, profile settings, or surveys), supplement such data with observational data, which is typically analytics-based (e.g., content consumed, time spent, in-app purchases). When using third-party data services, make sure to use technologies that are based on observational techniques (such as technologies that leverage sensor data to predict the context a user is in whilst using their phone) rather than surveys or other self-reporting methods.
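To make the idea concrete, here is a minimal Python sketch of blending a user’s self-reported preferences with observed behavior, weighting the observational signal higher. The function name, genre labels, and weight are illustrative assumptions, not part of any real product:

```python
from collections import Counter

# Illustrative sketch: reconcile a user's self-reported genre preferences
# with their observed listening history, weighting observation higher
# because self-reports carry self-report bias.

def blended_preferences(self_reported, observed_plays, obs_weight=0.8):
    """Blend self-reported genres with observed play counts.

    self_reported: list of genres the user claimed to like
    observed_plays: Counter mapping genre -> observed play count
    Returns a dict of genre -> score in [0, 1].
    """
    total_plays = sum(observed_plays.values()) or 1
    genres = set(self_reported) | set(observed_plays)
    scores = {}
    for g in genres:
        reported = 1.0 if g in self_reported else 0.0
        observed = observed_plays.get(g, 0) / total_plays
        scores[g] = (1 - obs_weight) * reported + obs_weight * observed
    return scores

prefs = blended_preferences(
    self_reported=["jazz", "classical"],
    observed_plays=Counter({"pop": 70, "jazz": 30}),
)
# "pop" outscores "classical": observed behavior outweighs the claim.
```

The weight is a design choice: the higher it is, the more the system trusts what users do over what they say.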
2. Sampling frequency: one-off vs. ongoing
I’m an iPhone user, so every now and again my phone throws up a ‘last year, today’ image. More often than not, I’m some degree of mildly shocked by my outfit (did purple Crocs with green polka dots really seem like a good idea, like, ever?). In my defense, it appears I’m not alone: to see how quickly our preferences change, look no further than the booming second-hand clothing market.
If people’s preferences change quickly, any one-off datapoint is likely to be inaccurate. This inaccuracy snowballs when buying third party data, in which timestamps are often discarded: it is now altogether impossible to gauge how recent and thus relevant the data is.
With ongoing data collection, in which you are constantly re-calculating the insight based on new data points, the data is fresh by design - and therefore of higher quality.
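As a sketch of what ‘fresh by design’ can mean in practice, the hypothetical snippet below keeps every data point’s timestamp and decays older observations with a half-life, so a re-computed insight is always dominated by recent behavior. The half-life value and category labels are assumptions for illustration:

```python
import time

# Assumption for illustration: a data point loses half its relevance
# every 30 days.
HALF_LIFE_DAYS = 30.0

def recency_weight(event_ts, now_ts):
    """Exponentially decay an observation by its age in days."""
    age_days = (now_ts - event_ts) / 86400.0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

def current_interest(events, now_ts):
    """events: list of (timestamp, category) pairs.

    Returns category -> recency-weighted score, recomputed on demand
    so the insight is always based on fresh data.
    """
    scores = {}
    for ts, category in events:
        scores[category] = scores.get(category, 0.0) + recency_weight(ts, now_ts)
    return scores

now = time.time()
day = 86400
events = [
    (now - 400 * day, "rock"),    # a year old: nearly irrelevant by now
    (now - 2 * day, "podcasts"),  # fresh: dominates the score
    (now - 3 * day, "podcasts"),
]
scores = current_interest(events, now)
```

Contrast this with a one-off data point whose timestamp was discarded: there is no age to decay by, so stale and fresh signals are treated identically.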
Now, if you’re Netflix looking to recommend a show to me, the user, you are looking at everything I watched over the last few months. But why, then, are you willing to target me to become a Netflix user based on unverified datasets from walled gardens and DMPs?
Supplementing your survey data with observational data typically means an automatic increase in your sampling frequency, because observational data is often ongoing (whilst self-reported data is typically one-off). That means incorporating observational solutions such as sensor-based technologies in your set up will kill two birds with one stone.
To be sure: self-reported data can provide extremely valuable insights for app publishers. In the context of this article, we’re merely arguing that observing ‘what is’ gives the most accurate performance overview (for example, measuring campaign performance through an attribution link) - then, to understand why the results were what they were (e.g. Twitter performance went through the roof, while LinkedIn didn’t convert at all), surveys can provide an invaluable resource.
3. Contextualization: contextualized vs. non-contextualized data
So I use this biomedical app to try and keep my work-induced stress hormones in check (I know, I know). The app promises to alert me when my stress levels hit a certain threshold, and follow up with a stress-reducing protocol, to help me manage my biochemistry. Kinda cool. But then I hit the gym last week, and my phone blurts out an alarm tone and starts sending me breathing techniques. Hmm..
Here’s another one. I often get this ad for Star Trek Fleet Command, the new game by Scopely. I’m a gamer, I love Scopely, and this game is right up my alley. But I received this ad while I was out walking my dog, and then while working - and I just couldn’t be bothered. Then I received the ad again while I was chilling at home, and I installed the game. Had Scopely understood my context, that would’ve saved them some 70% of the advertising expense involved in converting me.
Context is king. And that’s not just true for product development and advertising, but also for content personalisation. Imagine a user churns mid-podcast. We’ll probably assume the content wasn’t relevant to them. But if I told you they’d just arrived at a doctor’s office, we can safely assume they got called into their appointment - and it would be a mistake to update our content recommendation algorithm based on this churn.
The point I’m making here is that in order to be able to interpret user data, we have to understand our users’ contexts. Put differently: we can only draw meaningful conclusions if we know the ‘why’ (why my stress hormones are heightened, why I’m churning mid-content).
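A hypothetical sketch of that filtering step: before a mid-content churn event is fed back into a recommendation model, check whether the user’s context plausibly explains the drop-off. The context labels and threshold here are invented for illustration, not a real taxonomy:

```python
# Contexts in which a mid-content churn probably says nothing about
# content relevance (labels are illustrative assumptions).
AMBIGUOUS_CONTEXTS = {"arriving_at_destination", "phone_call", "driving"}

def churn_is_content_signal(event):
    """event: dict with 'completed_fraction' and 'context' keys.

    Returns True only when the churn should update the recommender.
    """
    if event["completed_fraction"] >= 0.9:
        return False  # a near-complete listen isn't really churn
    return event["context"] not in AMBIGUOUS_CONTEXTS

events = [
    {"completed_fraction": 0.3, "context": "arriving_at_destination"},
    {"completed_fraction": 0.3, "context": "relaxing_at_home"},
]
signals = [e for e in events if churn_is_content_signal(e)]
# Only the at-home churn reaches the recommendation algorithm:
# the doctor's-office-style churn is discarded as situational.
```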
Basic contextualization is common in a handful of formats. Broadcasters, for example, use ‘dayparting’ to make sure their programming reaches the right audience: Sesame Street airs at 6pm and the news at 8pm, because that’s when these age groups tend to be in the context of ‘watching tv’. Such contextualisation, however, is limited. To take it one step further (and, disclaimer, this is where we come in): you can use mobile context technology, which combines all of the sensor data made available by the device to predict users’ environment and behavior.
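The dayparting approach mentioned above can be sketched in a few lines: at its simplest, contextualization is a mapping from hour-of-day to a likely audience context. The boundaries and labels below are invented for the example:

```python
# Toy dayparting table: (start_hour, end_hour, likely_context).
# Boundaries and labels are illustrative assumptions.
DAYPARTS = [
    (6, 9, "morning_commute"),
    (9, 17, "working"),
    (17, 20, "family_time"),
    (20, 24, "winding_down"),
]

def daypart(hour):
    """Map an hour of day (0-23) to a coarse context label."""
    for start, end, label in DAYPARTS:
        if start <= hour < end:
            return label
    return "sleeping"
```

Mobile context technology replaces this single coarse signal (the clock) with many live sensor-derived signals, which is what makes the prediction per-user rather than per-timeslot.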
Mobile context enhances the relevancy of virtually everything your app does, by providing a direct insight into your users’ live moments (such as at the gym, chilling at home, or partying) as well as behavioral segments (such as university students, night owls, or cardio champions). You can read more about what that can do for your ad inventory here, and for your engagement and retention rates and in-app purchases here.
Assessing third party data technology vendors
When considering purchasing data technology services, it’s important to vet both the vendor and the data thoroughly. Poor data, once adopted into your workflows, can have significant negative ramifications for your business and its impact can be difficult to reverse.
We suggest doing a preliminary interview to validate the data’s general legitimacy, followed by a series of data sampling tests following the initial integration for quality assurance.
The vendor interview
Validating the data’s legitimacy is more of an art than a science: there are no hard and fast rules. By developing an understanding of the origins of the model, the logic behind it, and the vendor’s quality verification process, you should be able to get a good idea.
The interview might include questions such as:
Where does the data that the model was trained on come from?
- A dataset that was created in collaboration with academia or with an otherwise well-established institution is likely to be reliable
- If the dataset was collected in-house, it will be extra important to check what in-house testing procedures the vendor has used
How would you explain the model / how are specific data points computed?
- If the explanations make sense, that’s a great sign because it means you can validate the logic
- If the vendor can’t explain the model, because their AI is a ‘black box’ or for any other reason, tread with caution
How do you verify the quality of your data?
Specific questions could include things like “Did you control for sources of error/bias?”, “Have you checked it against common-sense correlations?” and “Are you collecting associated data to explain the data (e.g., an ice-cream sales dataset should also contain weather data)?”
- If the vendor demonstrates they’ve thought about these questions, that’s a good sign
- If the vendor is unable to answer these questions, tread with caution
How fresh is the data?
- If the dataset includes data like purchasing habits during the pandemic, you should reasonably expect this to be updated monthly
- If the dataset regards things like the correlation between Heart Rate Variability and cardiovascular health, data freshness is less important
Once you’ve committed to a vendor and gathered about 3 months’ worth of data, run a few sampling tests to confirm that the data is of the expected quality.
Data sampling test
A good data sampling test compares the data to 3 sources: to itself, to your own data, and to established statistics. Let’s review each.
- Correlations within the dataset: confirm that there’s a correlation between characteristics that are obviously correlated, within the dataset. For example: there’s a positive correlation between users who buy trainers and users who buy treadmills.
- Agreement with your own data: confirm that users who have a certain characteristic in the dataset have corresponding characteristics in your own data. For example: people who are cardio champions tend to listen to workout music.
- Agreement with established stats: confirm that the data is similar to that of the general population (and if not, whether there is a good reason for that). For example: the data reflects that 52% of global mobile gamers are female, and reputable statistics confirm that figure. Or, if you have an app targeting people who work out, you’d expect a higher representation of joggers than in the general population.
Note that a positive outcome doesn’t mean the data is accurate - rather, each test that checks out suggests the data is not inaccurate. Thus, the statistical confidence in the accuracy of the data increases with each test for which the data holds up.
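The three checks above can be sketched in a few lines of Python. All field names, sample values, and thresholds are illustrative assumptions, not benchmarks from any real dataset:

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# 1. Correlation within the dataset: trainer buyers vs treadmill buyers
#    (toy samples; 1 = bought, 0 = didn't).
buys_trainers  = [1, 1, 0, 1, 0, 0, 1, 0]
buys_treadmill = [1, 1, 0, 0, 0, 0, 1, 0]
assert pearson(buys_trainers, buys_treadmill) > 0.3

# 2. Agreement with your own data: the vendor's "cardio champion" flag
#    vs your observed workout-playlist listens per user.
vendor_cardio   = [1, 1, 0, 0]
workout_listens = [12, 9, 1, 0]
assert pearson(vendor_cardio, workout_listens) > 0.5

# 3. Agreement with established stats: share of female gamers in the
#    sample vs a published benchmark, within a chosen tolerance.
female_share = 0.49
assert abs(female_share - 0.52) < 0.05
```

Each passing assertion fails to falsify the data rather than proving it accurate, which is exactly the logic of the note above: confidence accumulates test by test.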
NumberEight helps app publishers bring their data quality to the next level by unleashing the power of mobex (mobile context). Unlike its web-based cousin ‘context’, mobex looks beyond the content consumed to capture the environment that this content is consumed in and the actions users take while consuming it. To see which contextual signals we provide, download our taxonomy. And if you’d like to test the vendor interview on us, book a demo today!