Though many companies are incorporating multi-touch attribution into their marketing strategies, they often struggle to turn data into sales. Payman Sadegh, former Visual IQ chief scientist and Bain advisor, joins Bain's Cesar Brea to discuss how companies can implement a multi-touch strategy, as well as use cases and common pitfalls.
Read a transcript of the conversation below:
CESAR BREA: Hi, everybody. Today's topic is going to be getting multi-touch attribution right, lessons from a practitioner. And very lucky today to have my friend and a Bain advisor, the former chief scientist at Visual IQ, the multi-touch attribution originator that was eventually acquired by Nielsen a couple of years ago, Payman Sadegh. Thank you very much for joining us, Payman. It's a great honor to have you with us.
PAYMAN SADEGH: Thank you so much, Cesar. Thank you. This is really such a pleasure to be speaking with you always, but specifically for this very interesting and important topic. Very nice to be here and talking to you.
BREA: Thanks for that. Very kind. So usually when we start these conversations, I want to start by just asking really about your background, your original training, kind of how you got started, how you got to the job, and a little bit about what you're up to now, which we'll come back to later.
SADEGH: OK, sure. My background, my training is in mathematical statistics. I have my PhD in that field. And how I got there was from an engineering background. So it's very much like a winding path. That's how I got into statistics, mathematical statistics, how I got interested in that field.
In the last 20 years, I've been in different roles in various industries. But the common denominator of all of them has been on data science, advanced analytics. But the industries were different. I worked in the aerospace industry for five or six years. And at some point, you know, I was working on analyzing onboard data, the aircraft onboard data, to understand how faults occur and how you can optimally maintain fleets of aircraft. So it's very interesting, heavy data science work.
I also worked as the head of data science in the education assessment industry. And the last 10 years have been mostly focused on marketing and advertising. That's where I joined Visual IQ in 2010. I was with the company until 2019—about eight and a half, nine years. Very exciting.
And one of the things, both from my background and from the work I did at Visual IQ, is that working on different problems is really exciting to me. Visual IQ gave me that opportunity—it allowed me to work with various industries, from banking and financial services to retail, automotive, and telecom, applying all the learnings, tools, approaches, and platforms we were building across multiple industries. That was really exciting.
BREA: Very, very cool. Of course, there's an arc there. You're not the only person I've spoken with who started in engineering roles and sort of physical science applications—whether it's weather or aerospace engineering—and somehow gravitated to the world of, the black hole of, marketing and advertising. But anyway, I guess we all follow the data to a certain extent.
So I want to pivot to the topic here. The central question, I think, that we're going to consider and that I think is central for a lot of CMOs and their teams today is really this question about how far to push multi-touch attribution. Like, how real is it? It's a holy grail obviously to be able to know exactly what everything contributes to a sale and to a returning customer and so forth. But I think it's important to try to understand what the practical limits of that would be.
And so I think that's a really interesting area for us to explore, given how there's definitely a move now, an accelerating move, to digital channels, both for brand purposes as well as for actual conversion. But maybe before we get into all that, maybe you could define the term a little bit further for us. You know, so how would you define what MTA is? How is it different from traditional media mix modeling and testing approaches? Just say a little bit about that to ground things.
SADEGH: Yeah, absolutely. It's a very good question—how do you even define it? So I'll start with how we defined it when I joined Visual IQ: How did we define this term, multi-touch attribution, and where did it come from?
From a historical point of view at the time, a lot of advertisers were focused on last-touch attribution, right? It's an easy way. Everyone understands it. But obviously, that's also very wrong. I mean, you cannot attribute the entire conversion to only the last touch.
And one of the examples I came across—I was thinking about this at some point, and now we're in election season, with an election in 18 days—imagine a state where one candidate wins by a single vote. You know, that could happen.
That's very unlikely. But that can happen. Actually, in Florida in 2000, we were not that far off.
BREA: Yes, 500 votes. Yeah.
SADEGH: Now if you're just thinking of that, and let's say 5 million people voted in that state for the winning party. Now whose one vote actually contributed to the win? Can you say the last person who voted? Well, maybe. That's kind of logical.
But, you know, if the first person hadn't voted, the election would also have flipped. So that's the idea: Last click was easiest, because you can always track it. OK, the last guy who voted is the winner, and all the credit goes to that last person. And then people started thinking about other ways of doing this. You could think of all those 5 million votes in that state as equally consequential, right?
As a matter of fact, at the time, equal credit was one of the approaches to attribution: Well, OK, we cannot attribute everything to the last touch, so let's spread it equally across. And then some other philosophies came about: Maybe some channels are openers, some are more solidifiers, and some are closers.
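The heuristics Payman describes—last touch, equal credit, and opener/closer weighting—can be sketched in a few lines. This is a toy illustration, not anything from Visual IQ; the conversion path, channel names, and 40/20/40 weights are all invented for the example:

```python
# Three common heuristic attribution rules applied to one conversion path.
# Path, channels, and weights are invented for illustration.

def last_touch(path):
    """All credit to the final touchpoint."""
    return {path[-1]: 1.0}

def linear(path):
    """Equal credit to every touchpoint ("all 5 million votes count equally")."""
    share = 1.0 / len(path)
    credit = {}
    for ch in path:
        credit[ch] = credit.get(ch, 0.0) + share
    return credit

def position_based(path, first=0.4, last=0.4):
    """U-shaped: opener and closer get more; the middle splits the rest."""
    if len(path) == 1:
        return {path[0]: 1.0}
    middle = path[1:-1]
    mid_share = (1.0 - first - last) / len(middle) if middle else 0.0
    credit = {}
    for i, ch in enumerate(path):
        w = first if i == 0 else last if i == len(path) - 1 else mid_share
        credit[ch] = credit.get(ch, 0.0) + w
    return credit

path = ["display", "search", "email", "search"]
print(last_touch(path))      # {'search': 1.0}
print(linear(path))          # {'display': 0.25, 'search': 0.5, 'email': 0.25}
print(position_based(path))  # {'display': 0.4, 'search': 0.5, 'email': 0.1}
```

The first rule mirrors "all the credit goes to the last person who voted"; the second mirrors "all 5 million votes count equally"; the third encodes the opener/closer philosophy with fixed weights.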
But, again, there is a lot of debate about which channel is a closer and which is an opener. Search is usually thought of as a closer. But is it really?
I mean, if you're thinking of buying a car, it's not that you see a display ad with a car, and say, OK, I want to buy a car. It doesn't usually, you know, happen that way. You are thinking about this purchase, and you go online, do some search. And you may see an ad that takes you to a website that ultimately you stick with and make your purchase.
So as a matter of fact, in this case, search is an opener, not a closer. So then the thought became: OK, how can we bring data science to this? At that point, data science was becoming a thriving field, and people were thinking about tying all kinds of data points together. That's where we started thinking about algorithmic attribution. At the time, we called it fractional attribution.
At a very high level, it was about this: Now that we live in a world where we can track every action of a consumer, especially on the digital side, how about figuring out what differentiates the converter set from the nonconverter set? If you're seeing certain touchpoints more often among the converters, they're probably more consequential than the ones you see more among nonconverters. And that's how these algorithmic approaches came about. Now, you can have multiple approaches—every vendor thinks of their solution as probably the best one, and so forth. But the bottom line is understanding how converters and nonconverters differ.
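A minimal sketch of that converter-vs.-nonconverter idea follows. The exposure paths are made up, and real algorithmic attribution models are far more sophisticated than this simple "lift" comparison—it is only meant to show the core intuition:

```python
# Toy sketch: which touchpoints differentiate converters from nonconverters?
# All data here is invented for illustration.

converters = [
    {"display", "search"},
    {"search", "email"},
    {"display", "search", "email"},
]
nonconverters = [
    {"display"},
    {"display", "email"},
    {"display"},
]

def exposure_rate(paths, channel):
    """Fraction of paths in which the channel appears."""
    return sum(channel in p for p in paths) / len(paths)

channels = {"display", "search", "email"}

# Lift: how much more often a channel appears among converters.
lift = {
    ch: exposure_rate(converters, ch) - exposure_rate(nonconverters, ch)
    for ch in channels
}

# Normalize the positive lifts into fractional credit shares.
positive = {ch: l for ch, l in lift.items() if l > 0}
total = sum(positive.values())
credit = {ch: l / total for ch, l in positive.items()}
print(credit)  # search gets 0.75, email 0.25; display gets no credit
```

Here search appears in every converter path and no nonconverter path, so it earns most of the fractional credit; display is equally common in both sets and earns none.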
Now I want to tell you another story about the definition of the term MTA. We were at a client meeting at some point, and the client started describing to us every single characteristic of an MMM solution. But they were talking about MTA. And after maybe a half hour, I asked: Are you talking about MTA or MMM? They said, well, yeah, we call this MTA because our management thought of it as MTA. So we kept the name.
You know—and in some way—well, we laughed at that point. But when you think about it, they were not that wrong, because a client doesn't think of it in engineering terms, as some of us engineers do. They think of it as a business problem. They don't have an MTA or MMM problem; they have a business problem. And to them, whether a touchpoint is an offline touchpoint that comes from TV or an online touchpoint that comes from display is more or less the same thing, right?
So they were thinking of it more holistically, across all channels. The reason I'm saying this is that the definition has evolved. It initially started as fractional attribution, in contrast to last touch. But in some cases it's evolved—at least some advertisers think of it more broadly as cross-channel, which could even, in a way, become some sort of an MMM solution.
So where do you draw that line? My definition would probably be: Any channel that is addressable, I would call within the domain of MTA, and everything else MMM. But, again, that's just a matter of definition.
BREA: MTA is personalization for MMMs, basically.
SADEGH: Yeah. Personalization, more granular. Some think of it as more granular MMM, which is actually not a very bad definition. I don't mind that definition, actually.
BREA: No. But your point about not getting hung up on the semantics—and instead focusing on really understanding, at as fine-grained a level as is practical, the relationship between the investment you make and the outcomes you get—is well taken. It just happens to take more sophistication, more machine power, to do it. So that's very helpful.
One thing we'd like to do in these conversations is to tell stories. And so we'd love to have you comment on some specific examples—no names needed—that you had a chance to work on. Tell us a little bit about the specifics, what you were brought on to do, what worked, what was challenging, how you kind of got to a good solution. Just pick a couple of stories here that are sort of your most memorable ones.
SADEGH: Sure. I can think of some retail applications. And the way I'm picking these examples is to talk about challenges, because that's probably more interesting: what the challenges are and how we overcame them. So here's a typical one—and we also had the same thing with financial services and banking; it's a problem across the board.
When you have online and offline conversions, it's often hard to tie them together—obviously, there are solutions out there that tie offline and online, but they are not very credible. So usually the reason we at Visual IQ were brought on board to solve the MTA or MMM problem was to solve the marketing problem, right? We are spending a lot of money on digital channels, on TV, across channels.
Now how can we understand what works? For example, if I'm making $500 million a year and spending about $50 million on advertising, how much of that $500 million was generated by the $50 million? That was the key question. But very quickly, we would realize that this is not really an end in itself—it's a means to an end.
It's more about: OK, now I understand how these channels work. How can I invest in the more effective channels? How do I dial back on the less effective ones? And how can I predict what happens in the future?
And this was one of the things that was always an issue for us: the predictive power of the models, right? Because sales and conversions are obviously not only a function of marketing activities. As a matter of fact, marketing is maybe, at most, 20% or 30% of it. A lot comes from other factors. And we are not there to build a holistic model across everything.
You know, how external factors contribute to this. A lot of the time, when the models are built for clients, they want a broader understanding. But to get back to the MTA question at hand: Essentially, the idea was to understand how the digital channels work, how the offline channels work, and how the interactions between them work. So that's really one of the reasons.
And in terms of challenges, there are many in MTA that I can go over, and ways to overcome them. One of them, again, is tying offline and online together. That's always a big problem. Another, especially when you're thinking about a multi-touch attribution solution: Not all the touchpoints are in the same position along the conversion funnel.
Building a model that treats them all the same is always a challenge. First of all, it's not very correct. But differentiating among them is not very practical either. So one of the key challenges we had was understanding the role of each channel within the conversion funnel. Another thing—one that more and more clients have raised and continue to raise now—is incrementality: causal vs. correlational models.
Many clients want to understand the causal relation between marketing activities and conversions. A lot of the models that many vendors offer are correlational. And when you build a model and then come to act on it, you can very easily get wrong answers, because the same correlations may not hold in the future. Building in causality is very important and very difficult—one of the challenges. And when we confronted those challenges in the solutions we offered, we always thought of it this way: You need to take a step back and use MTA as an important piece of the puzzle, but not the whole puzzle.
You need to have a broader perspective of things, and understand the causal relationships. Maybe you can complement your MTA with an MMM solution and let them work together, so that you can make more holistic decisions based on your MMM. And when you get into more granular decision making, then you can leverage your MTA.
BREA: The example of that distinction that got driven home for me a few years back was this notion that search behavior and display behavior may be correlated. But a search doesn't necessarily drive you to click on a display ad. Rather, exposure to a display ad may drive you to do a search. And that, I think, is fundamentally the difference between a correlational relationship and a causal one in this case.
One of the things I had the good fortune to work on with Visual IQ a few years back—actually, in a professional services context, in terms of the customer we were jointly serving—was to try to reel folks in from the temptation to throw all the data into the bucket together and just let the machine tell you what the right price for each channel was. I wonder if there are particular stories that illustrate the best way to sequence a roadmap, an MTA implementation. How much can you do at once? Is there an example that stands out for you that is instructive?
SADEGH: Yeah. I think you definitely want to stage it out. In terms of data collection, first of all, I think the most effective MTA solution is one where you have your data house in order—your internal, essentially your first-party data, as I call it. And when I say first party, I'm not only talking about traffic data, but something much broader.
BREA: Your house file, for example. Your CRM file.
SADEGH: Yeah, yeah, yeah, exactly. It's things that ultimately, or hopefully, end up in your CRM file.
BREA: Yeah, yeah. True.
SADEGH: And being able to collect them, basically. Obviously, you have to be collecting them. You have to be able to organize them and name them with the right attributes—the taxonomy question, right? And another thing you really want to do is connect the data. Break the data silos.
I don't think you will be able to implement anything successful before you go through this exercise. The most successful applications are where the house is in order first—or where the vendor helps; in some cases, we helped the client put the house in order. But that is really essential.
And throwing everything in the mix is always very tempting. I talked about online and offline, and about the positions of the touchpoints along the conversion funnel. That is also something you want to be very, very careful about. For example, let's say it's a bank, and you have a bunch of ads served.
And let's say someone is at the bank's ATM, and you have some solution that can tie that touchpoint to the other media you have served. But do you really want to treat them on the same level? Obviously not, because the person at the bank is very different from someone who probably doesn't even know this bank has branches in their area, right? They're very different. But clients are very tempted to put them all together.
And we always had to do a lot of educating: Do not do this; make sure to take things gradually. If you want to add more down-funnel channels, let's do it later, once you have a better grasp of what's happening at the top of the funnel. So those are a couple of things that I think will definitely be helpful in implementing this. And going back to our question about causality and so forth: A lot of these MTA solutions are correlational—unless you're willing to run experiments, do design of experiments, and measure incrementality. Which, again, you can have a technical debate about whether that actually captures the entire incrementality.
That's another thing, especially on smaller channels. A lot of clients, in our experience, were very tempted: Let's say online video works very well, and they can see that in their attribution models. They say, well, I want to put all my money on this. Well, maybe if you do that, you start getting to the point of diminishing returns. Maybe you want to run some experiments and gradually dial it up. Right?
And running those things involves some challenges too, obviously, because you have to incur the extra costs of experimentation. But going back to your point, I think it's extremely important that you don't just jump into a solution because a model indicated a positive direction. Because if you do, you may not get as much benefit as you were getting before.
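That diminishing-returns point can be made concrete with a saturating response curve. This is purely illustrative—the Hill-type functional form and its parameters are invented for the sketch, not anything from an actual client model:

```python
# Illustrative only: why "put all my money on the best-attributed channel"
# runs into diminishing returns. Curve shape and parameters are invented.

def response(spend, scale=100.0, half_sat=50.0):
    """Hill-type saturation curve: conversions flatten out as spend grows."""
    return scale * spend / (spend + half_sat)

current = response(50.0)     # conversions at today's spend -> 50.0
doubled = response(100.0)    # conversions if spend doubles -> ~66.7
linear_guess = 2 * current   # what a naive linear read-out promises -> 100.0

print(current, doubled, linear_guess)
```

Doubling spend here buys only about a third more conversions, not twice as many—which is the argument for dialing a channel up gradually and testing along the way, rather than trusting a linear extrapolation of attributed credit.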
BREA: One thing I want to come back to here that I think is really important is that you're saying, basically, that to a certain degree you have to have your data house in order before you can really take advantage of MTA—and in particular, you have to have your first-party data in good shape. What litmus test would you apply to judge whether somebody was ready to do, say, a relatively straightforward two- or three-channel cross-optimization? Let's say search, display, and email for retargeting, right? What are the things you would listen for and look for in a client's data situation to say, OK, we're ready to go?
SADEGH: Yeah. It's a good question whether there is a formula for that. But one of the very first things I would be looking for—it doesn't matter if it's two channels or three, email or display—is: Do you understand the why? Typically, you're running a bunch of campaigns. Do you know how these campaigns relate to the outcome of interest—or do you at least have some hypothesis that relates them?
Do you know why you have been running them?
BREA: Oh, interesting point. Yeah.
SADEGH: Can you even put a name on it—yeah, I'm running this to get this, right? And interestingly, you'd expect that to be known. But in many cases, it's not.
BREA: Oh, I agree. I agree. It's funny.
SADEGH: And if it is known, it's more like what we call—or used to call at Visual IQ—tribal knowledge, right? After talking to five people, you may figure it out. But it's nowhere in the data.
BREA: So that is a really interesting answer. Because I thought you were going to go someplace much more technical—to basically say, well, give us a data sample; if we see a lot of missing values or outliers or things like that, then we'll know that things are messy. But you're actually getting at a point that I think is much more practical and fundamental, which I really like: this notion that if folks can't answer a simple question like, hey, how did that campaign do? You know?
SADEGH: Or why you even ran it, right?
BREA: Then they probably don't—it's sort of like, where there's one mouse, there are many mice. There's that expression, right?
SADEGH: That's right.
BREA: And I think it's funny, because I've had a very similar experience in the past. Way, way in the deep past, I once asked a very senior executive that question. And the answer was, well, you know, we were just too busy getting the campaign out the door to actually instrument it and measure it. OK, well, that's interesting not just at a technical level—that's a very philosophically interesting sort of answer, you know?
SADEGH: At least be able to tell what the expectation of it was and why you were running it, right?
BREA: Even more fundamental, like, yeah. Like, what did you expect?
SADEGH: What was the expectation? Why were you running this? And also, be able to characterize these campaigns. Because it's really the characterization of those campaigns that ultimately helps the model be better and smarter.
But what you said on readiness is also absolutely right. That comes more once you start looking at the data—then you see, oh my goodness. In some cases, it's really in good shape; some clients are really good. And in many cases, they start from a not-so-good position, but after two or three refreshes, they have everything in order, right? And that is also extremely indicative of how ready they are and how successful the exercise will be.
BREA: Yeah. I also want to come back to the point you made about testing, because obviously for us, testing is a key piece of what we've found really helpful to clients—in particular, as they try to look forward under rapidly changing conditions. But they tend to think of testing as this thing you do off to the side, as opposed to incorporating it into your data stream and having look-back windows that are not so deeply historical that they drown out the impacts of the tests. Is there a practical balance that you've observed in the mix of historical data vs. test data for making good judgments?
SADEGH: Yeah. One thing you can do first, I think, is reserve these tests mostly for certain channels. We've talked about bottom of the funnel, top of the funnel. Usually the channels at the bottom of the funnel get overattributed. If you want to prioritize which channels to test, you may want to start with the ones toward the bottom of the funnel.
Because otherwise, they get all the credit.
But in terms of exactly how to combine them—mathematically speaking, there is a way to do that. You can think of the test as a creative of some kind, right? Then whatever gets attributed to that creative, you can take as a baseline and subtract from everything else. So if you don't want to do a separate exercise—if you want to blend everything into one holistic way of composing your data—this is one way to go.
Right? And it'll calculate a baseline based on your experiment or PSAs or what have you, and then subtract.
But there are methodologies—many, many different methodologies—for doing this. That's in and of itself a science, right?
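A hedged sketch of the baseline-subtraction idea Payman describes: treat the control (PSA) creative like any other creative, take whatever conversions the model attributes to it as the organic baseline, and subtract that from the paid channels. All numbers are invented, and real methodologies are considerably more careful than this:

```python
# Toy sketch of baseline subtraction using a PSA/control creative.
# Attributed conversion counts below are invented for illustration.

attributed = {
    "psa_control": 20.0,  # conversions the model credits to the control creative
    "search": 120.0,
    "display": 60.0,
    "email": 40.0,
}

# The control's credit approximates conversions that would happen anyway.
baseline = attributed.pop("psa_control")

# Incremental credit: attributed conversions minus the baseline,
# floored at zero so no channel goes negative.
incremental = {ch: max(v - baseline, 0.0) for ch, v in attributed.items()}
print(incremental)  # {'search': 100.0, 'display': 40.0, 'email': 20.0}
```

As the conversation notes, whether a single subtracted baseline truly captures incrementality is debatable—this is just one simple way to blend experimental and observational signals in one pass.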
BREA: That's part of the fun of this, exactly. So let me shift gears. This has been obviously a crazy year. What do you observe about trying to do MTA under the conditions that we've experienced this year?
What advice would you have for people who are trying to get something like that off the ground under these conditions? And feel free to say everything from hunker down to just put it on pause. I'm just curious where along that spectrum you might end up, or maybe there's a different axis.
SADEGH: So I think it's a fantastic question. I don't know if there's one right answer that will help everyone. But there are several things you can always say, especially when it comes to MTA. One of the challenges of MTA is that if you think of it as digital attribution, as opposed to the broader MMM or granular-MMM definition, then it usually misses all these external factors.
One of the things I personally like about granular MMM, as opposed to standard MMM, is that you can start including all these external factors in the model, right? So that's one challenge MTA definitely has. Because now that we are in this period of time, things have changed.
Some things might have temporarily been on hold, but some things might have permanently shifted. Consumer behavior, for example—we can hypothesize that it has probably permanently shifted. A lot of things that were happening offline are going to be happening online.
Even in the field that I'm now interested in, healthcare: Telemedicine was really more of an experiment. Nobody took it seriously eight months ago.
BREA: Right, right. Yeah.
SADEGH: Now it's becoming mainstream.
And I don't think this is going to go away after Covid is over. Many other things, too. So I think understanding those consumer behavior shifts is going to be critical. I think MTA will have huge challenges trying to measure that—again, because it's narrow.
BREA: It's very narrow and blinkered now.
SADEGH: It's very narrow. A lot of folks in many companies want to focus on offers and promotions—they think the best way to get out of this is to give an offer or a promotion. But do you really want to do that? Or how can you optimize it? Are offers and promotions really the solution to slow sales? Or are there other ways you can do this?
Another big thing is building models, or solutions, that use data across a bigger period spanning pre- and post-Covid. For granular MMM, for example, that's going to be very important, because you usually have to look at one year, preferably two or three years, of data. Blending these periods is going to be extremely difficult, right? How you can smartly blend pre-Covid and post-Covid data—I think that's one thing many advertisers should be thinking about at this point.
Now, again, what's the best thing to do at this point? I think understanding consumer behavior changes and shifts is extremely important. Maybe running some tests—small tests. Whether that's offers and promotions or anything else you want to test in the new market, I think that's the way to go: Start small, experiment, and collect some data. And then how smartly you can blend your data from before and after the pandemic broke out is going to be, I think, very important.
BREA: That's very helpful. You know, a popular question right now from clients—to put a buzzword on it—is, how do I do personalization 2.0? Whatever that means, and whatever 1.0 was. And when they go to that level of grain of action, a lot of people are tempted to go to a super deep level of grain of analysis and repricing of the channels.
And I think what I'm taking from this conversation—and I don't mean to put words in your mouth, just distilling it for myself—that is really powerful is this notion of trust but verify. When you're using MTA, be sure the data is good in the place you're applying it. Start by applying it narrowly. But make sure you don't lose sight of the bigger picture—the other external factors—or of whether testing should be in the mix to supplement this. It's very much a human-in-the-loop philosophy, as opposed to thinking of it as a black-box implementation that you just run with. That's actually really valuable.
So I wanted to ask a little bit about your latest venture. You're up to some really cool stuff now. Before we go, I wanted to hear a little about that: how you navigated to the idea, and how it extends what you've been doing in the past.
SADEGH: Absolutely. I'm focusing now on my company Change Squared. I'm focusing on, or we're focusing on, healthcare at this point.
And how I was drawn to this area: I identified a lack of consumer focus in healthcare in general. We can think of all the different reasons that might be the case—whether it's because our healthcare system is based on a sickness model as opposed to a wellness model, and why that is. We can always find reasons why that's the situation. But the more you think about it, consumer focus needs to be brought back into healthcare, to help with the costs of delivering care. And when we're thinking about healthcare, we always think of it as: You go to the doctor because you have to, not because you want to.
And you buy an iPhone because you want to. So there is a difference. Someone may argue, well, healthcare is not the same as, let's say, consumer products and so forth. To some extent, that's true.
But look at the broader healthcare landscape, from medical and dental practices to other businesses and organizations that have more of a consumer focus—dental implants, for example, or cosmetic surgery and things like that. They're expensive, and the time to make a decision about whether to go ahead with a procedure is long.
So one of the things we're focusing on, where we see an opportunity, is how you can best acquire customers. Patient acquisition is extremely important. Operations is extremely important—how effectively you can serve your patients and customers. Payments and reimbursements are extremely important. And these are areas where someone with my background in AI and machine learning can be very instrumental.
And you asked how this ties back to what I've been doing. We are focusing now especially on customer or patient acquisition and its operational aspects, which is very much in line with understanding how different touchpoints drive a decision. Think again of making that decision about cosmetic surgery or dental implants or what have you—these are expensive decisions, with multiple touchpoints involved, where you can run analytics, advanced analytics, and so on.
So what we're focusing on now: We're building provider-facing analytics and AI solutions, like dashboards and prescriptive analytics—what should your right patient mix be, and so forth, right? Another group of products and services we're thinking about—what I call a virtual concierge—is patient-facing or customer-facing: answering a lot of repetitive questions, or scheduling. That's a key thing we're working on, because putting the patient in the right spot is not always easy either.
There is a little bit of science involved in doing that. This is an area where we think technologies such as AI, advanced analytics, and machine learning can be very instrumental. So that's what we are building, and we're very excited about it, actually. We're having some good conversations in this area. We'll see how it works.
BREA: I'm always fascinated to hear how people like you, with your engineering and in this case entrepreneurial background, are inspired by patterns they see in one domain and are able to apply those patterns in another. And I think this is certainly one of the more interesting translations of that. When you said patient mix, my mind immediately ran to media mix. And I guess at some level, it's a little disturbing to think of a patient mix model and a media mix model having some common DNA. But there you go.
Well, Payman, thank you so much. This has been a fascinating conversation. And we're really grateful to have your insights and perspectives on this topic, because I know that it's top of mind for a lot of executives who are trying to make big decisions about how to take their whole organizations forward. And there's a lot of wisdom in what you've shared with us. So, I really appreciate it.
SADEGH: Thank you so much. It was really a pleasure to speak with you. And, yeah, I really enjoyed this conversation.
BREA: Terrific. Well, I'll look forward again to the next chat soon.