Sramana Mitra: It sounds like you need a series of ads to get people to even take a survey. Is that the right measure? Because some people will never take surveys, right?
Bill Simmons: That is true. But we also work with other ways to measure awareness. This methodology works well in some cases, and in others it doesn’t. Another way to measure awareness is to see whether people go to the site and interact with car construction widgets. That way we can measure interaction and score it as a positive signal.
SM: That is like a cookie click-through. I am trying to understand if the approach is based on clicks. What is considered engagement, and when do you consider that someone has finally engaged with the ad? How many ads did it take to get there, or how many ads does it take to get to the first survey? These are the things I am trying to understand.
BS: Typically we don’t believe in clicks that much. Most of our work is done by tagging the advertiser’s site and seeing if that stream of ads results in positive website activity, and the ultimate thing would be purchasing something, signing up for a test drive, or searching for a dealer – whatever the advertiser sees as a valuable activity. We found that these car manufacturers have an accurate dollar value as to what it is worth to them when somebody searches for a dealer on their site. They know exactly how that can create revenue for them.
SM: So the work that you do – the big data and analytics work or combinatorial optimization work – occurs more in figuring out where to place the ads to drive people to take action. Is that correct?
BS: Yes.
SM: Let’s dive into that a bit more. Could you explain, with a use case or a scenario, how that works when you either structure or capture the data? What can you tell us about the kinds of algorithms you run on this data to get to the kind of optimization you get?
BS: The first place to start is measurement – insights and analytics – and then optimization. We measure all the digital placements, clicks, any wired-up interactions, and on-site activities. All of that is collected by our system. We also overlay it with third-party data that we buy. We have a pool of anonymized data, and we match it with anonymized data from other vendors so that we can get age, gender, income, or interest data, or even signals from these third-party vendors. We overlay that with all the advertising data. This allows us to produce insights and analytics in a fairly straightforward way. However, since the data [pool] is so large, it is challenging. We can produce charts that show […] that these kinds of audiences have a lower index, but you can buy media for them at a lower price, so the ROI is better. The insights we pull generally help our advertisers, and even if they are not buying media, it helps them better allocate their media spending. If you have realized that certain segments or populations are over-indexed […] in certain combinations, you can buy magazine or print ads or TV ads that also cover that demographic.
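To make the index-versus-price trade-off concrete, here is a minimal sketch in Python. It assumes that "index" means a segment's conversion rate relative to the overall rate, scaled so that 100 is average, which is a common convention but is not spelled out in the interview. The segment names, counts, and costs are invented for illustration and are not DataXu figures.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Aggregated campaign results for one audience segment (hypothetical)."""
    name: str
    impressions: int
    conversions: int
    media_cost: float  # total dollars spent on this segment


def audience_index(segment, all_segments):
    """Segment conversion rate relative to the overall rate (100 = average)."""
    total_impressions = sum(s.impressions for s in all_segments)
    total_conversions = sum(s.conversions for s in all_segments)
    baseline_rate = total_conversions / total_impressions
    segment_rate = segment.conversions / segment.impressions
    return 100.0 * segment_rate / baseline_rate


def cost_per_conversion(segment):
    """Dollars of media spend needed to obtain one conversion."""
    return segment.media_cost / segment.conversions


# Invented numbers: the "budget" segment converts less often,
# but its media is much cheaper to buy.
luxury = SegmentStats("luxury", impressions=100_000, conversions=200, media_cost=1000.0)
budget = SegmentStats("budget", impressions=100_000, conversions=100, media_cost=250.0)
segments = [luxury, budget]

for s in segments:
    print(s.name, round(audience_index(s, segments), 1), cost_per_conversion(s))
```

With these made-up inputs the lower-index segment still costs less per conversion, which is the situation described above where a weaker audience ends up with the better ROI.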
Beyond that, the data goes into our learning system. We apply machine learning models, and we use things like decision trees or support vector machines to build production algorithms. Given a particular ad impression, which has a certain size, is on a certain page, and has a certain context to a particular user, we calculate the likelihood that the user, at a particular time and in a particular context, will take positive action, like a purchase or a conversion for the advertiser. This is all done in real time. The learning is done in batches on a daily basis, but the decision making about the media is done in real time. We have already built an algorithm, let’s call it a machine learning classifier, which is loaded into our real-time bidding engine. We currently process more than half a million potential ad impressions per second and pick and choose the best ones for our clients.
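The split between daily batch learning and real-time decisions can be sketched roughly as follows. This is not DataXu's system: in place of the decision trees or support vector machines mentioned above, it uses a deliberately simple stand-in, a smoothed conversion-rate lookup table. The feature names, the smoothing prior, the value per action, and the asking price are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    """Features of one ad opportunity (hypothetical feature set)."""
    ad_size: str
    page_context: str
    user_segment: str


class ConversionRateModel:
    """Stand-in for a trained classifier: a smoothed conversion rate
    per exact feature combination, learned from logged history."""

    def __init__(self, prior_rate=0.001, prior_weight=100.0):
        self.prior_rate = prior_rate      # assumed rate for unseen impressions
        self.prior_weight = prior_weight  # pseudo-impressions backing the prior
        self.shown = defaultdict(int)
        self.converted = defaultdict(int)

    def fit(self, history):
        """Batch step (for example, daily): history is (Impression, bool) pairs."""
        for impression, did_convert in history:
            self.shown[impression] += 1
            self.converted[impression] += int(did_convert)

    def predict(self, impression):
        """Real-time step: estimated probability of a positive action."""
        n = self.shown.get(impression, 0)
        c = self.converted.get(impression, 0)
        return (c + self.prior_rate * self.prior_weight) / (n + self.prior_weight)


def should_bid(model, impression, value_per_action, asking_price):
    """Bid only when the expected value of the impression covers its price."""
    return model.predict(impression) * value_per_action >= asking_price


# Invented history: one feature combination seen 900 times with 100 conversions.
seen = Impression("300x250", "auto-review", "in-market")
unseen = Impression("728x90", "celebrity-news", "unknown")
model = ConversionRateModel()
model.fit([(seen, True)] * 100 + [(seen, False)] * 800)

print(should_bid(model, seen, value_per_action=50.0, asking_price=0.10))
print(should_bid(model, unseen, value_per_action=50.0, asking_price=0.10))
```

The `value_per_action` parameter plays the role of the dollar value an advertiser assigns to an activity such as a dealer search. Because `fit` runs offline and `predict` is a dictionary lookup, the expensive learning stays out of the per-impression path, which is the general shape of the batch-versus-real-time split described in the answer.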
This segment is part 3 in the series: Thought Leaders in Big Data: Interview with Bill Simmons, CTO of DataXu