Naveen Sharma: Our challenge is less a question of volume and more one of variety. If you look at our call center data, for example, that’s primarily audio. If you look at transportation services, that’s primarily video-based data. Our call centers are also looking at social media data, which is unstructured. Much of the data we are dealing with is unstructured and multi-modal. The challenge for our researchers is to bring all those multi-modal data streams together and fuse them in a way that is ready for analysis, so that we can actually build a model that spans and accommodates multiple data modalities. That’s one of the challenges from the research perspective that we are constantly dealing with. Data fusion is the industry term that a lot of the publications use, but for us, data fusion has a special meaning because we deal with it in real time.
Sramana Mitra: Naveen, let’s take that point a little bit further. Your challenge is in the variety of data formats and then doing meaningful things with that. What I’d like to do is take a couple of multi-modal use cases. You can choose your vertical or a specific client scenario. Walk us through exactly what the data contains and what you are doing with it. What actionable results are coming out of it? What actionable value proposition is coming out of that process?
Naveen Sharma: Let me start with one example and then we can go to the second. The first one is in the financial services domain. The back-office process that we are looking at is student loan processing and, in more general terms, loan processing. Without giving specific examples, we do back-office loan processing for multiple big banks. Let’s take an example of a bank that has a set of clients who come in to borrow money. The client entrusted us with developing a predictive model of individual consumers. The first question was, “Is this particular consumer going to default or not on a particular loan that he or she may have taken?” The second was, “Can we predict three to six months in advance that this particular consumer is going to be delinquent?”
We’ve developed a system – think of it as some kind of early-warning system – that helps the bank predict whether customers are likely to default on their loans, let’s say, three months from now. Using this predictive model, the bank agent can call Consumer X and say, “Hey, do you foresee any problem with your payments? Maybe we can restructure the loan in such and such a way to help you out.” It’s a preemptive action. It’s reaching out to the consumer rather than just taking action after the first delinquency.
What the researchers did, essentially, was look at consumer activity through web logs. For each consumer, we looked at their payment behavior and any call center data that we may have about the consumer. For example, the consumer may have called in a few months back to ask about things such as payment options and loan balances. We also take into account any public-domain data coming from the federal government, such as employment data for the region of the country this person is in, as well as that person’s own activity. We can also take into account data from the bank’s own default database when it comes to payments and other related issues. We brought all of this data together and built the predictive model for each consumer. We are pleased to report that we were able to predict, with very high accuracy, three to six months in advance, whether the consumer was going to become delinquent. The bank fed that information to its call centers, which used it to reach out to consumers preemptively, before they became delinquent. The whole process was a success: we were able to help consumers avoid default and, in the process, the bank saved some money and was able to recover more loans.
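The fusion step described here, bringing web logs, payment history, call center records, and regional employment data together per consumer, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system discussed in the interview: every record, field name (`consumer_id`, `days_late`, `topic`), and the `fuse` helper is hypothetical, and a real pipeline would feed the resulting rows into a trained predictive model.

```python
from collections import defaultdict

# All records and field names below are invented for illustration.
web_logs = [
    {"consumer_id": "c1", "page": "payment_options"},
    {"consumer_id": "c1", "page": "loan_balance"},
    {"consumer_id": "c2", "page": "home"},
]
payments = [
    {"consumer_id": "c1", "days_late": 0},
    {"consumer_id": "c1", "days_late": 12},
    {"consumer_id": "c2", "days_late": 0},
]
call_center = [
    {"consumer_id": "c1", "topic": "payment_options"},
]
consumer_region = {"c1": "midwest", "c2": "northeast"}
regional_unemployment = {"midwest": 0.071, "northeast": 0.054}


def fuse(web_logs, payments, call_center, consumer_region, regional_unemployment):
    """Build one feature row per consumer from the separate sources."""
    rows = defaultdict(lambda: {
        "web_visits": 0,
        "payments_made": 0,
        "late_payments": 0,
        "max_days_late": 0,
        "calls": 0,
        "regional_unemployment": None,
    })
    for rec in web_logs:
        rows[rec["consumer_id"]]["web_visits"] += 1
    for rec in payments:
        row = rows[rec["consumer_id"]]
        row["payments_made"] += 1
        if rec["days_late"] > 0:
            row["late_payments"] += 1
        row["max_days_late"] = max(row["max_days_late"], rec["days_late"])
    for rec in call_center:
        rows[rec["consumer_id"]]["calls"] += 1
    # Attach the public regional figure to each consumer in that region.
    for consumer_id, region in consumer_region.items():
        rows[consumer_id]["regional_unemployment"] = regional_unemployment.get(region)
    return dict(rows)


fused = fuse(web_logs, payments, call_center, consumer_region, regional_unemployment)
for consumer_id in sorted(fused):
    print(consumer_id, fused[consumer_id])
```

Each fused row is a single feature vector per consumer, which is the shape a delinquency classifier would typically expect as input.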
I’ll give you another example. Today, with its increasing use, social media is a great source of data and information for companies to mine and use to improve customer service. There are many social media listening tools out there, but they primarily rely on keyword-centric search to determine sentiment, which isn’t always accurate. We are applying our data fusion expertise to create a new tool that automatically performs data analytics and teaches computers to accurately determine the sentiment of a comment, even detecting sarcasm and abbreviated wording in multi-modal or multi-channel data. We are using text mining, machine learning, and predictive modeling expertise to do this. Our customers can use this tool to get at the data quickly and efficiently so they can be proactive about what could become negative customer service issues.
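To see why keyword-centric sentiment detection isn’t always accurate, consider a minimal keyword scorer. This sketch is an assumption-laden illustration of the keyword approach being criticized, not the tool described in the interview: the `POSITIVE` and `NEGATIVE` word lists, the `keyword_sentiment` function, and the sample comments are all hypothetical.

```python
import re

# Hypothetical keyword lists, chosen only for this illustration.
POSITIVE = {"great", "love", "thanks", "helpful"}
NEGATIVE = {"terrible", "hate", "broken", "useless"}


def keyword_sentiment(comment):
    """Label a comment by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", comment.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "Thanks, the agent was really helpful.",    # sincere praise
    "Great, on hold for two hours. Love it.",   # sarcastic complaint
    "thx 4 nothing, app still brkn",            # abbreviated complaint
]
for c in comments:
    print(keyword_sentiment(c), "|", c)
```

The scorer labels the sarcastic complaint as positive and misses the abbreviated one entirely, which is the kind of gap a learned model trained on labeled comments is meant to close.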
Folks are talking about smart cities, and we’ve got researchers applying our competencies in big data analytics and complex systems to the problems of urban environments. Urban application areas range from transportation, public health, and government services to crime prediction and enforcement. In this effort, Xerox has joined the New York University-led Center for Urban Science and Progress (CUSP) as an industrial partner.