Billy Bosworth is the CEO of DataStax, a big data company based in San Mateo, California. It provides a scalable, flexible big data platform built on Apache Cassandra. DataStax has more than 250 customers, including startups and several Fortune 100 companies. Billy earned a degree in computer science from the University of Louisville and has more than 22 years of experience in the database industry. In this interview, he talks about how DataStax provides a back-end data store to serve its clients. He also lays out the different layers of the big data industry and offers tips for entrepreneurs who wish to enter that space.
Sramana Mitra: Billy, let’s start by setting some context about DataStax. Tell us what you do and what the scope of your work is.
Billy Bosworth: At DataStax we have a database technology that helps people write the apps to transform their business. We are a big data company.
Sramana Mitra: Could you take us through a couple of use cases of how your customers use your product?
John Plavan: The easiest one to explain is last winter, because it was such an extremely warm one. I think it was the warmest winter on record in the U.S. Going into the winter, around October, the general consensus among meteorologists was that we were going to have another relatively cold, if not extremely cold, winter, as we saw in the winter of 2010-2011. The price of natural gas in October was around $4.40 per million BTUs. That price was relatively high given our current situation of supply and demand of gas. But it was expected if you were going to have an extremely cold winter: since there was going to be a high demand for natural gas, the price should be relatively high.
Sramana Mitra: In terms of the heuristics you use to make the correlations, what is involved in that?
John Plavan: We use machine learning, genetic algorithms, network techniques, and basic statistical techniques. It depends on the type of data and how we have quantified it. We use a rotated empirical orthogonal function to define an actual weather pattern, which is the result of individual, variable weather inputs such as sea level pressure or jet stream winds or pressure. There is no set algorithm or function.
Sramana Mitra: Basically you work with the online budgets of these companies? There are obviously print, TV, and radio budgets that operate orthogonally from your budget. Those are areas that are still largely manual.
Bill Simmons: That is true.
SM: What about the big data problem in general? What other areas are you tracking that are interesting for data?
Sramana Mitra: Let’s get down to the specifics of what you are selling to these energy traders.
John Plavan: We sell subscriptions to a software-as-a-service, web-delivered product that enables our clients to identify risks of extreme heat or extreme cold events. They use that information as an input in their energy trading strategies. The current state-of-the-art forecast for a temperature event is valid a week to ten days out. Traditional weather models simulate the atmosphere, and there is too much chaos in the system, so they tend to break down after about a week to ten days. They just don’t get accurate forecasts.
Sramana Mitra: Is DataXu a 30 percent services and 70 percent products kind of company?
Bill Simmons: Currently, the majority of our customers are managed service clients, but we continue to see an uptick in self-serve clients, and we expect that to continue in 2013. I think the reason our technology gets adopted is that our service is good.
John Plavan is the CEO, cofounder, and chairman of EarthRisk Technologies, a company that uses big data tools to create weather forecasts, especially for extreme high and low temperatures. What sets EarthRisk systems apart is that they can create forecasts up to 30 or 40 days ahead of an event, far earlier than traditional weather forecast systems. In this interview, John talks about the tools and systems EarthRisk uses to create its forecasts and how this affects the energy market. He also outlines potential opportunities for future entrepreneurs in this sector.
Sramana Mitra: Let’s talk about the optimization bit. Once you have the data measurement and analytics infrastructure, what kind of optimization are you able to do? Give me some use cases, please.
Bill Simmons: The typical use case for traditional media is called direct response advertising. If you want to place ads and drive people to your site to purchase things, that is our core, and we do that very well. In newer use cases, a customer typically makes guaranteed buys from a large site. One million impressions a day for $10,000, for example.