Sramana Mitra: So, the way end customers access your product is through other application vendors that are using your product to develop their applications?
Ari Zilka: It’s a 50-50 split. Half the world doesn’t even need to know we’re there. They use other applications that use us. We help those vendors sell our product to the customers to get more scale, more throughput. And then half the customers have end developers in-house who go to our site directly, download our products, and integrate them by hand into their own business logic. So, it’s not just for packaged apps. It’s also for custom apps.
SM: How big is the company?
AZ: Big by what measure?
AZ: We’re part of a public company. Software AG publishes its revenue numbers yearly. It’s roughly $1.5 billion a year in revenue. Terracotta is a business unit. We do not break out our individual revenue, but it is mid-eight digits in U.S. dollars per year.
SM: Okay. You seem to be situated at a very interesting part of the ecosystem given what’s going on around us right now. I think somebody made the comment – and I’m trying to remember who – that a very large percentage, maybe 80% or 90%, of the IT data that has been generated in the world has been generated in the past two years because of this immense escalation in social media. Do you know who I’m talking about? Is it a Sheryl Sandberg quote, maybe?
AZ: I’m not sure. The quote sounds familiar.
SM: Anyway, there is a lot more data in the world now, and there are a lot of applications that are dubbed “big data applications” that are trying to go after this data and extract meaning from it for various business applications. Would you talk about that trend and how it affects what you’re doing? What do you see in that trend?
AZ: Sure. Your main thesis is cloud and One Million by One Million. My main thesis is data management: what’s going to make the cloud really go fast and big. And you’re asking me what entrepreneurs can focus their efforts on.
SM: No, I’ve not gotten to that yet. Right now I’m asking a much broader question, which is what do you see in the big data trend, a trend that everybody is running after right now. Large companies are running after it. VCs are running after it. Entrepreneurs are running after it. I’m asking you, because of your vantage point in that space, what do you see in that trend?
AZ: Big data is contextual. The ones who invented big data, like Yahoo!, Google, Amazon, and eBay, have petabytes of data, huge volumes of data. For the rest of us, big data is terabytes. That’s the first trend I’m seeing, terabytes of data. For most of us, as IT organizations, the databases are falling over. They cannot do analysis on their data, and there are two types of analysis they want to do. They want to do real-time analysis, which a product like Terracotta helps you do. Terracotta Big Memory helps you do real-time analysis. They also want to do what’s called batch analytics, the more traditional style – let me crunch over my data through the night while my customers are asleep and look for patterns or execute tasks I need to execute. A healthcare organization goes through all the insurance claims made each day, and then in the middle of the night, prints out the benefits paperwork that you and I get after going to the hospital that says, “We got a bill for $70, and Kaiser Permanente paid $50.” That batch processor is a Terracotta-based processor at Kaiser.
The point is that companies are looking at data and saying, “I have this old, stodgy, batch-based infrastructure legacy. It is breaking down because of the influx of data, and I need to process my batch processes faster. Also, I can’t use batch-based tools to get real-time telemetry on what’s going on in my business.” So, the notion of a bank, for example, looking at their overnight positions to figure out how much money they spent yesterday, now seems ludicrous to them. They want to know in real time: What’s the running tally of the buying and selling that we, as an investment bank, have done today out in the various markets around the world? What’s our position? How much more should we buy? How much more aggressive or more conservative should we be for the rest of the day, based on the news flowing in? Banks calculate their positions overnight. The move to real time is now here, and the flood of data is making real time difficult.
The legacy from Oracle and IBM is all batch-based analytics – I’ll tell you overnight the answer to your data questions. The third thing I’m seeing is that applications are now being affected by this. I wrote some rigid business logic. It took me 12 months as, let’s say, Amazon.com. Amazon is not a customer of Terracotta’s, but just as a construct, they take months and months to build a new feature on their website. That feature has a very tight coupling on the website to the data stored inside a database below it. If the business wakes up one day, and says, I’d like to ask a new question of my data, maybe they don’t have that data stored because they couldn’t retain it because there’s a flood of data.
I have that problem at Terracotta, where we actually increase our internal databases of users using our technology by about 20 gigabytes a day, every day. I’m at a multi-terabyte store as an operator of a database right now. My marketing team wants to know what are all the regions of all the users of our products, as an example. It will take me a couple of hours to answer them. If they ask me a question like what are the cities, not just the countries, in which people are using our technology – because we’ll go hire new sales reps in the cities where the most people are using our technology – I don’t know because I don’t record cities in my database. My database is growing too fast, so I’ve made trade-offs. I’ve said, let’s not record this data. Let’s record only that data. No one needs this data. That’s the trend I’m seeing.
To summarize, you basically have a push toward real time. Batch infrastructure is falling down. And applications are finding themselves not agile enough to even be mutated to store the data that people want. Applications are making hard trade-offs about what data to keep and what data to lose. Does that make sense?