The DevOps world is moving fast toward total automation and self-correction. However, the holy grail is still another 3-5 years away. How do we get there? A fascinating conversation with Chris Nguyen, who is steeped in building the data infrastructure that must be in place before AI/ML algorithms can run.
Great interview, short and punchy.
Sramana Mitra: Let’s start by introducing our audience to yourself and LogDNA.
Chris Nguyen: I’m the Co-Founder and CEO of LogDNA. We empower developers to debug and monitor their infrastructure. We tell them what happened in the past, what’s happening right now, and what’s about to happen, so they can keep their systems up and running.
Sramana Mitra: So it’s a DevOps tool?
Chris Nguyen: Yes. Think of us as mission control. You want to make sure that things are healthy in your environment technology-wise. You want to make sure that you proactively see insights before things become too chaotic and crazy.
Sramana Mitra: I’d like to double-click down. Help me with a few use cases where your technology makes a real difference. Tell me the before and after LogDNA.
Chris Nguyen: A lot of the use cases are around infrastructure. You aggregate all the logs. Think of logs as the audit trail. We help engineers find those needles in a haystack. Is it a server problem? Is it a code problem? What’s going on?
One of our customers is Spin, the scooter company. They want to see that the scooters are home and connected. There are other SaaS customers. Reddit is a customer. They want to ensure that their website and servers are up and running.
At the end of the day, these engineers want to deliver the best customer experience. When things do go down, they want to be able to troubleshoot.
Sramana Mitra: You said there is a capability in your technology to predict what is likely to happen, and then subsequently prevent that. What are some examples of that?
Chris Nguyen: We monitor things like errors. If you see a couple of errors, that’s one thing. But when you see a hundred errors within a minute, that’s an alarm. That’s a key example to say, “I think there might be fire over there based on the following parameters.”
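[Editor's note: The rule Chris describes, a hundred errors inside a minute, can be sketched as a sliding-window counter. This is a minimal illustration only, not LogDNA's implementation; the class name `ErrorRateAlarm`, its parameters, and the assumption that timestamps arrive in non-decreasing order are all hypothetical.]

```python
from collections import deque


class ErrorRateAlarm:
    """Hypothetical sketch: fire when `threshold` errors fall inside one window."""

    def __init__(self, threshold=100, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()  # error times still inside the window

    def record_error(self, timestamp):
        """Record one error at `timestamp` (in seconds).

        Assumes timestamps arrive in non-decreasing order.
        Returns True when the alarm should fire.
        """
        self._timestamps.append(timestamp)
        # Forget errors that are older than the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Usage: a couple of errors is fine, a burst inside the window is an alarm.
alarm = ErrorRateAlarm(threshold=100, window_seconds=60.0)
fired = [alarm.record_error(i * 0.5) for i in range(100)]  # 100 errors in under 50 s
print(fired[1], fired[-1])
```

[With these settings the first few errors do not trip the alarm, while the hundredth error inside the same minute does. The threshold and window are the "parameters" a team would tune for its own environment.]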
Sramana Mitra: What kinds of errors?
Chris Nguyen: Access control errors. Every use case is different. If someone checked in the wrong code, there’ll be an error that something is not computing properly. It’s up to the DevOps team or the engineering team to determine if this is a human or technology error. Our goal is to determine how to fix it.
Sramana Mitra: I’m going to double-click down on what you said, and ask you a trend question. How much of this kind of error currently needs human intervention to fix? How far are we from self-correcting systems?
Chris Nguyen: I think everyone strives to move to a world of self-correction. From a roadmap perspective, you want to be in a position where you not only tell customers what’s going to happen, but also tell them the solutions they need. Everyone’s aiming for that. We’re not ready yet. We’re probably 3 to 5 years from that.