Sramana Mitra: Why is this a big data issue?
Sasha Gilenson: Because when you look at IT and the amount of operational information it creates, generic incubation is just one of the aspects that we started to cover, because the space was not addressed by any other solution. If you look at the world of IT systems in general – performance and monitoring data, change and configuration information, machine data, etc. – we are talking about tremendous amounts of information that are also changing rapidly.
I will give you an example: Something like an operating system has between 1,500 and 2,500 parameters. If you go to the configuration of a database, it is more. Application servers have over 10,000 configuration parameters. Then they have applications and a variety of other components. If you look at the average IT environment, you have millions and millions of configuration parameters, and those are changing at a rapid pace. When we speak about Fortune 100 and also Fortune 2000 companies with hundreds of thousands of servers, that is a significant amount of information you need to be able to analyze, practically in real time, to be able to address numerous use cases, including the ones that we started to discuss.
SM: I have been talking to a lot of big data companies for the series. One of the defining questions that I have come up with on big data is: “What is big data vs. straightforward analytics?” The volume of data has been very large for a long time, and business analytics has also been around for a long time. Where I put the big data bar is where you are dealing with lots of data but the solution requires applying machine learning and self-learning mechanisms in handling that data within the system. That is where I have arrived after doing some 20 of these interviews. What is your reaction to that?
SG: I am not sure if limiting it to machine learning is the right thing. Business intelligence is there – business analytics is more limited.
SM: Whichever term you would like to use, but it has been around for a long time.
SG: There is a big difference, because business intelligence essentially is a means for the user to slice and dice the data. It is a convenient way to drill down into the data, and the customer is actually responsible for the processing. When we speak about huge amounts of data, you cannot really slice and dice it. It is too large. Analytics [and other] mechanisms can make inferences for you. They take the data and provide insights. That is how you can work with tremendous amounts of information – focusing on the insights only. That is the difference. Business intelligence is about reporting and summarizing the data. It is not about making these automatic insights. This is where I think the distinction lies between analytics and intelligence. Analytics is about making automatic insights based on a significant amount of data.
SM: It sounds as though what you are doing is not using machine learning, and it sounds like it is more about giving pointers to people to then diagnose based on those pointers where the problem may be, where the changes are happening, and where there are likely possibilities of a problem lurking.
SG: Think about the facts again: We look at a system that comprises millions of data points. Then you need to pick the few that are linked to the incident you are investigating. This narrowing down from the huge amount of data to the few items that are potential root causes requires certain analytical methods. What we have are statistical algorithms and heuristic algorithms. We also have domain knowledge that we implement as part of a knowledge base to do that. Again, it is about inference, about taking a significant amount of data and reducing it to the insight specific to the use case.
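
To make the idea of narrowing many configuration changes down to a few suspects more concrete, here is a minimal sketch of one possible heuristic. It is not the method SG describes, which the interview does not detail. The `Change` record, the `rank_suspects` function, the 24-hour window, and the scoring rule (recent changes to rarely changed parameters score highest) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    """One recorded configuration change (hypothetical record layout)."""
    parameter: str
    host: str
    changed_at: datetime


def rank_suspects(changes, incident_at, window=timedelta(hours=24), top=3):
    """Return up to `top` changes most likely related to an incident.

    Illustrative heuristic only: score = recency * rarity, where recency
    falls linearly from 1 to 0 across the window before the incident and
    rarity is 1 / (how often that parameter changed in the whole history).
    """
    # Parameters that change all the time are treated as routine noise.
    frequency = Counter(c.parameter for c in changes)

    scored = []
    for c in changes:
        age = incident_at - c.changed_at
        # Ignore changes made after the incident or before the window.
        if age < timedelta(0) or age > window:
            continue
        recency = 1.0 - age / window  # timedelta / timedelta gives a float
        rarity = 1.0 / frequency[c.parameter]
        scored.append((recency * rarity, c))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top]]


incident = datetime(2024, 1, 2, 12, 0)
history = [
    Change("log.level", "app-01", incident - timedelta(hours=50)),
    Change("log.level", "app-01", incident - timedelta(hours=40)),
    Change("jvm.heap", "app-02", incident - timedelta(hours=30)),
    Change("db.pool.size", "db-01", incident - timedelta(hours=2)),
    Change("log.level", "app-01", incident - timedelta(hours=1)),
    Change("cache.ttl", "app-03", incident + timedelta(hours=1)),
]

suspects = rank_suspects(history, incident)
for s in suspects:
    print(s.parameter, s.host)
```

In this toy history, the one-off change to `db.pool.size` ranks above the more recent but routine `log.level` change, which is the kind of filtering that lets an investigator look at a handful of candidates instead of every data point.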