Sramana Mitra: Where do you sit from a competitive point of view?
Dave Rich: We think there are three major areas of interest where we are typically engaged by a customer. The first is when they already have exposure to the open source community version – they have it in research labs or on their desktops – but they are not playing with it. For the most part, they are getting comfortable with it. We call it going from little r to big R. That means they are ready to design applications, put them into production, and make them part of mission-critical operations. They need more robust capability, and they need somebody to stand behind the community version to provide service and support. That is one major reason we get called in.
Another reason is when they have an incumbent that is in a legacy environment: vendors like SAS or SPSS, which, rightly or wrongly, they don’t think scale to the new architectures. They believe that R is more suited to their needs, so they want to do a SAS-to-R conversion. That is a consideration. I spent 28 years of my career at Accenture and had major roles in a lot of industries and a lot of responsibilities for functional domain expertise areas. Advances in analytics application development remind me more of business application development back in the 1990s – things like ERP – where we went from COBOL to client-server.
In many ways this is similar, the legacy environment being SAS or SPSS. What is happening is that people want to make the transformation, which I spent the biggest part of my career doing in large companies that wanted to migrate from custom mainframe COBOL to client-server applications like SAP, Oracle, PeopleSoft or Salesforce. I think this is very similar, so we call that SAS to R.
Lastly, there is a lot of momentum around Hadoop. There is great interest in pairing up R with Hadoop for lots of reasons, like the infrastructure being new and its being a data landfill, which is what I like to call it. Maybe it is not a positive term, but the point is that we have a lot of customers who are collecting lots of data. Then they say, “Now what? We need tools to take advantage of all this data we are collecting.” Michelle mentioned network analytics. Think of the data that a typical service provider collects from the network – be that a telco, a cable company, or an Internet service provider. They collect lots of data and need tools to figure out “Now what?” That is where we come in. We are an application developer’s dream for taking what we have and building their applications to support their business customers.
SM: Who are your competitors? Big data is a crowded field. We talk to a lot of people [in our interview series], and we continue to do so.
DR: SAS is an ongoing concern. It is a three-billion-dollar company and tends to grow. The traditional competitors are SAS and SPSS. SPSS was bought by IBM, but it is not quite as big as SAS. That is the common legacy environment. Now there are a lot of other smaller players. One of them is KXEN. There are several others, but they are typically smaller. Among the emerging ones, there is a language called Mahout. Depending on the industry, there are other languages out there.
As it relates to advanced analytics or predictive analytics application development, there is only one true general-purpose tool, which is the open source community version of R. To be frank, our competitors are SAS, SPSS, or free. My experience working in large enterprises is that when it comes to actually putting something into production as a mission-critical application, most large enterprises won’t put an open source community version in production. That is why Red Hat and others exist in open source around Hadoop distribution and other tools. In our space, predictive analytics tools, there really are only SAS or SPSS and a few smaller entities which have been around for a long time. Open source R isn’t a competitor in many ways. When most of our clients want to go into production, they want someone to stand behind the open source distribution.