Billy Bosworth is the CEO of DataStax, a big data company based in San Mateo, California. It provides a scalable, flexible big data platform built on Apache Cassandra. DataStax has more than 250 customers, including startups and several Fortune 100 companies. Billy earned a degree in computer science from the University of Louisville and has more than 22 years of experience in the database industry. In this interview, he talks about how DataStax provides a back-end data store to service its clients. He also lays out the different layers in the big data industry, providing interesting tips for entrepreneurs who wish to enter that space.
Sramana Mitra: Billy, let’s start by setting some context about DataStax. Tell us what you do and what the scope of your work is.
Billy Bosworth: At DataStax we have a database technology that helps people write the apps to transform their business. We are a big data company. For us, that means we are one of the databases that runs on the back end of that process. The technology started as the distributed database management system called Apache Cassandra. The company’s first task was to service that community, continue the project, advance the project, and service the customers who were moving into production with that technology. We then took that technology at the core and started to integrate and add other features on top of it to equip people with everything they needed in that back-end data store, to develop their applications and handle the type of scale they need.
SM: When you say your roots are in the Apache world, did this start as an open source project?
BB: It did. The technology we are based on – Apache Cassandra – did start as an open source project. In the big data realm there are two large buckets of problems. We will paint these with very broad brushes so it is easier to quantify: On the one hand you have the big data technology that has to deal with massive volume. You want to have a massive bit bucket collector. You put everything in there. Once that data is in there, you want to investigate it. You will ask questions, look for trends and patterns, run analysis, and find outlier conditions. That is big bucket number one.
Big bucket two in the big data world is a different problem. This market deals with a velocity issue. You have data coming through at transaction rates per second of a greater magnitude than other technologies were built to handle. This is actually the transactional engine. This is the conversation you have with your applications – we call it reads and writes. When the app needs something, it has to request the data; when it needs to put something in, it has to store the data; and it has to do this very fast. The reason it has to do it very fast is that, unlike doing an investigation, in this world you are trying to change somebody’s experience. You are trying to change the experience with your company or with your service through this application. That means you have to interact in real time. The Cassandra technology was built to handle that latter category: the hyper-velocity, the transactional nature, and the experience of the application. That is the world we play in.
SM: How much of this was developed in the open source business model? Were you operating in the open source world and servicing customers before you moved to a more commercial domain?
BB: I joined the company in May 2011, and the company was created in April 2010. I had the opportunity to partner with them at my last company, Quest Software. When we first started to partner with them, when DataStax was launched, there was no product offering to speak of. It was truly there to help people who were trying to move into production with Apache Cassandra and needed the experts. The cofounder of our company is Jonathan Ellis. Jonathan is the chairman of the Apache Cassandra project, which is how the dots got connected. When Jonathan started to look at all the use cases that were happening out there, he and his cofounder said, “Nobody is out there to help these people. They need a commercial company behind this technology.”
As to the very genesis of the company, it was all about going out and helping people use this open source technology and learning. “What are the use cases? What do you guys need? What are you not getting out of the technology today that you want?” These were the kinds of questions we would ask. I would absolutely say that our roots were in servicing the open source community. Even today that remains a very big charter of our company. We spend a material amount of money both on the marketing side and on the engineering side to make sure that community continues to thrive.