Data Onboarding: questions to ask your onboarding partner

By Richard Foster, UK managing director, LiveRamp.

New things are great, aren’t they? They bring possibilities, interest and even excitement. Yet with newness there often comes the risk of unfamiliarity, confusion and contradiction.

Data Onboarding is one of the most significant developments in marketing for many a year, one which is worth an estimated $250m today and forecast to reach $1bn+ by 2020 (Winterberry Group, “The State of Consumer Data Onboarding: Identity Resolution in an Omnichannel Environment”). Why? Because it makes massive inroads into helping marketers deliver more targeted, personalised and measurable experiences in today’s omnichannel world.

Data onboarding takes CRM data and links it to the same individual in the digital world, resulting in far more relevant marketing: a better result for both the consumer and the brand.

It may be the exciting new kid on the block but that doesn’t mean it’s fully understood, not least because the principle is being executed in more than one way and to varying degrees of accuracy.

After all, exercising three times a week wouldn’t lead people to believe you’re likely to become an Olympic champion any time soon. In the same way, not all ‘data onboarding’ is alike. In fact, it can be pretty confusing, especially in these early stages when so many are talking about it while still learning about it.

A marketer will ‘onboard’ their CRM data to their onboarding partner, who will then help match it to individuals in the digital space and use that improved insight to generate a more relevant marketing message. This improved insight can be used to improve personalisation, make targeting more effective and make measurement more efficient. With that in mind, here are seven questions to ask a potential onboarding partner.

1. How do you match: deterministically or probabilistically (statistically)? What does each mean?

This is perhaps the single biggest question you can ask as there are two fundamental approaches within Data Onboarding. It all comes down to just how accurate your partner can be in resolving the identity of the individual in the digital space.

Probabilistic Matching uses statistical algorithms to estimate the likelihood that the person being targeted is a specific individual. It can use postcode-level data, IP addresses, device behaviour and so on to predict how likely it is that they have the right individual. For example, if someone’s behaviour is the same or similar on two devices at the same location, this is probably the same person. Perhaps this assumption is right more often than wrong, but the fact remains it’s only ‘probably’ right, to varying degrees, and not ‘definitely’ right.

Deterministic Matching uses a wider and richer set of identity resolution techniques and data, including login data, to resolve that identity to the individual level. It is not ‘probably’ that person; it is determined to be that person.
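To make the distinction concrete, here is a minimal Python sketch of the two approaches. Everything in it is an assumption made for illustration: the hashed email key stands in for ‘login data’, and the field names, point weights and function names are hypothetical rather than any onboarding partner’s actual method.

```python
import hashlib
from typing import Optional


def hash_email(email: str) -> str:
    """Normalise and hash an email so raw addresses are never compared directly."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def deterministic_match(crm_record: dict, login_index: dict) -> Optional[str]:
    """Return a digital ID only when the hashed login email is identical.

    login_index is a hypothetical lookup of hashed login email -> digital ID.
    """
    return login_index.get(hash_email(crm_record["email"]))


# Hypothetical evidence weights, in points out of 100.
SIGNAL_POINTS = {"postcode": 50, "ip_address": 30, "device_type": 20}


def probabilistic_score(crm_record: dict, digital_profile: dict) -> int:
    """Score how likely two records are the same person, from 0 to 100."""
    return sum(
        points
        for signal, points in SIGNAL_POINTS.items()
        if crm_record.get(signal) is not None
        and crm_record.get(signal) == digital_profile.get(signal)
    )


# Illustrative data only.
login_index = {hash_email("alice@example.com"): "digital-id-001"}
crm_record = {
    "email": " Alice@Example.com ",
    "postcode": "SW1A 1AA",
    "ip_address": "203.0.113.7",
}
profile = {"postcode": "SW1A 1AA", "ip_address": "203.0.113.7", "device_type": "tablet"}

exact_id = deterministic_match(crm_record, login_index)  # an exact key match, or None
likelihood = probabilistic_score(crm_record, profile)    # an evidence score, not proof
```

The deterministic function either finds an exact key or returns nothing, whereas the probabilistic one always returns a score, and someone has to decide what threshold counts as ‘probably’ the same person.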

Most marketers are familiar with the notions of reach and relevance and the fact that there is often both a link between the two and a trade-off. As you can imagine, deterministic matching is more desirable because it is more accurate and delivers better results through greater relevance. However, there is a place for probabilistic matching so long as you understand the differences.

2. When is deterministic really deterministic?

Some data onboarders are already using the term deterministic when the match isn’t determined to the individual level, only the IP level. It’s not that this onboarding isn’t valid; it’s just not truly deterministic, and here’s why.

IP addresses are useful when helping to identify households and even individuals (to a degree), but there are limitations, particularly because consumer IP addresses are dynamic, not static. When modems are rebooted, the geo-mapping can rapidly become out of date.

Compromising on accuracy can have a big impact on the success of your marketing.

For example, if your use case is ‘file suppression’ (that is, no longer hitting your existing customers with prospect campaigns), then accuracy can have a huge impact. In fact, in one test we did around new parents, when we deterministically matched to the individual level, we returned a 15% match. That’s not a huge number, but it ensures you’re speaking directly to specific individuals who recently had a baby. When we opened out the match to the postcode level, the match returned a figure of 90%! We know that 90% of the people in that postcode are not new parents, so we need to accept that for perhaps 75% of them the marketing message would range from irrelevant to wasteful.
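The arithmetic behind that test can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the file size of 100,000 is invented, and the sketch assumes that the 15% matched at the individual level are the only genuinely relevant recipients inside the 90% postcode-level match, which the test result implies but does not state.

```python
def broad_match_waste(file_size: int, individual_match_pct: int, postcode_match_pct: int) -> dict:
    """Compare an individual-level match with a postcode-level match.

    Assumes (hypothetically) that only individually matched records are relevant.
    """
    relevant = file_size * individual_match_pct // 100
    reached = file_size * postcode_match_pct // 100
    wasted = reached - relevant
    return {
        "relevant": relevant,
        "reached": reached,
        "wasted": wasted,
        "wasted_pct_of_file": wasted * 100 // file_size,
        "wasted_pct_of_reached": round(wasted / reached * 100, 1),
    }


# Hypothetical file of 100,000 records with the 15% and 90% figures from the test.
result = broad_match_waste(100_000, 15, 90)
```

On these assumptions, 75% of the file receives a message that is not meant for it, which is roughly five in every six of the people actually reached at postcode level.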

The bottom line is, you must determine what accuracy you need and determine what deterministic really means with your partner.

3. What is a match rate?


Spoiler alert. You’ll encounter more than one definition of match rate. We’ve already talked of the different match rates between probabilistic and deterministic, and you can usually expect the former to be higher than the latter; that’s pretty logical. However, another key difference is between match rate and sync rate.


The match rate is, or at least should be, the percentage of individual matches made between the onboarded file and the consumer in the digital world. The sync rate is the rate at which we can connect that individual identity to devices, from one to often several. Let’s take an example.


If you gave 1 million records to your onboarding partner, they may say they’ve achieved a 100% match when in reality the true match rate may be 20%. By the time they’ve pushed each matched ID to their DSPs, they have on average 5 devices associated with it: 5 x (1m x 20%) = 1 million. But that is 1 million devices, not individuals. This ‘sync rate’ gives you a feel for the number of ‘bites of the cherry’ you may have for an individual, but a device is not the same as a single person. The number of determined individuals matched is what we consider to be the true ‘match rate’.
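The same example can be worked through in a short Python sketch. The function and variable names are hypothetical; the figures are the ones used above.

```python
def match_and_sync(records_sent: int, individuals_matched: int, devices_per_individual: int) -> dict:
    """Separate the individual-level match rate from the device-level count."""
    devices_synced = individuals_matched * devices_per_individual
    return {
        "true_match_rate": individuals_matched / records_sent,
        "devices_synced": devices_synced,
        # What a "100% match" claim looks like when devices are counted as people.
        "apparent_match_rate": devices_synced / records_sent,
    }


# 1 million records, 20% matched to individuals, 5 devices per matched individual.
summary = match_and_sync(1_000_000, 200_000, 5)
```

Here `true_match_rate` comes out at 0.2 while `apparent_match_rate` comes out at 1.0, so asking a partner which of the two they are quoting is a quick way to compare like with like.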


4. How representative is the match data, your ‘Truth Set’, and does it skew or over-index for any audiences?


All this talk of being accurate and ensuring we can differentiate apples from pears and other fruits could run the risk of chipping away at a marketer’s enthusiasm for onboarding at all. We hope not, because if you’re that marketer, it’s extremely likely you’ll miss out and get left behind as more and more marketers attempt to close that offline-to-digital customer experience gap.


Being able to link the offline and digital worlds fundamentally depends upon identity resolution and the identity graph of individuals. Some data onboarders rely on very specific data sets, perhaps from certain industries rather than from across the population. It’s possible they may be able to generate good deterministic match rates, but those matches may be skewed. You need to ensure that the matching works not just for specific segments, but across your entire file or business.


5. How do you solve for privacy/user consent?


As soon as we marketers talk about data, consumers and identity resolution, we are rightly faced with questions around privacy. This is not a matter to be taken lightly. Indeed, it is something that must be embraced, with privacy being designed into everything we do. It is fast-evolving and far too complex a subject to address fully here, so we simply encourage you to ask your potential onboarding partner to explain their approaches and credentials. These days it’s vital to have a partner who can assure you your marketing is using data to drive consumer value within both the letter and spirit of the law.


6. Once your data is onboarded, what is the distribution reach?


Like everyone else, you’re a consumer too, and like all consumers, each of us has our own way of shopping. We research our own way, consider our own way, negotiate, buy and hold relationships with publishers and brands in our own sweet ways. For brands to reach us, they need to be where we are, when we’re there and with more relevance than ever before.


So assuming you’ve sorted the relevance angle with the right match, can you now reach the consumer where they are, across the ever more complex customer journey? In the worlds of publishing, martech and adtech, the rate of innovation far outstrips that of consolidation. Wherever your audiences are, you need to be too, so ask about the range of your partner’s integrations, their reach and, importantly, how quickly they can add new destinations that may be really important for your key customers and audiences.


7. How agnostic is your onboarding?


This is similar to previous questions, but where we have covered accuracy and skews when it comes to matching individuals in the onboarding process, this is more about any skews or biases when reaching the individuals with your marketing. Yet again, the great advantages of data onboarding come with some opacity. Some onboarding is offered by companies who are also active in other parts of the value chain, such as ad networks or a particular DMP. This links to distribution reach, because you need to be mindful that some onboarding will only work with certain other parts of the marketing ecosystem. If that reaches your audiences, fine. If not, and you need to reach wider with relevance, think again.

Savvy data-driven marketers today need to be discerning about the partners they choose. Multiple suppliers in the onboarding and activation process can ultimately lead to lower ROI and data loss. But the all-encompassing solution of a single-supplier process allows brands to segment customers, activate target audiences across multiple devices and measure the results across all channels under one roof. In turn, this helps to increase the integrity, potential and reach of the data collected. You simply have to be sure to ask the right questions, and be careful where you put your data.
