This discussion highlights a set of open problems in Big Data.
Sramana Mitra: Let’s start by introducing yourself and Search Technologies to our audience.
Paul Nelson: I’m the Chief Architect for Search Technologies. Search Technologies is a global search engine and Big Data firm. We specialize in search engines, search engine installation, and anything related to data processing. We have some 200 consultants around the world, and we also do a bit of product development, usually to fill in key missing components in the industry.
I built my own search engine in 1989. I have been working on search engines for 28 years now. I took that product to market after a couple of mergers. It eventually became a company called Excalibur Retrieval Ware. It was eventually purchased by another search engine which was later purchased by Microsoft.
Sramana Mitra: Talk to me a little bit about what kind of customers you cater to.
Paul Nelson: It’s a wide range. We don’t segment our customers into traditional industry segments. We tend to segment them by use case, and those use cases can cut across industries. A typical use case for us is corporate-wide search, where employees search across their internal data. It’s challenging because the data is usually split across many different systems with different security models.
A second common use case is Big Data search analytics. We gather data from data warehouses and log files in business systems into a Big Data machine. We index that with search engines and develop dashboards and analytics on top of it. It can be for operations and system health checks, as well as for healthcare fraud, email compliance, and life sciences research reviews. Another typical example is publishers. We have a lot of large-scale publishers as customers.
One of our customers does precision agriculture. They download data from tractors and farmers and analyze exactly how much fertilizer and what type of seed every square meter of land requires. We’re getting a lot of those use cases. Search and Big Data are a natural fit for those sorts of use cases because they can scale to just about any size of data.
Sramana Mitra: Let’s double-click down on a couple of your favorite use cases where your technology and trends really show off.
Paul Nelson: In terms of key industry trends, what we’re seeing is a general migration from data warehouses to Big Data platforms like Hadoop and its various flavors. Search is a great way to democratize that data and provide self-service analytics and dashboarding to the business.
All the old traditional data warehouse technologies like OLTP are moving over to data platforms and search. A second key industry trend we see is NoSQL, primarily for scalability. NoSQL is a backbone for large-scale data processing. We’re also seeing a lot more natural language processing. This is fun for me because my first search engine was a natural language processing-based search engine. I’ve been working on NLP ever since I started in the search engine business.