Super interesting crowdsourced data model conversation!
25,000 data scientists participating in building algorithms for specific use cases.
Sramana Mitra: Let’s start by having you introduce yourself as well as CrowdANALYTIX.
Divyabh Mishra: I am the founder and CEO of CrowdANALYTIX. I started the company nine years ago. I am an engineer with an MBA. I have been working across industries. I started in telecom and slowly moved into marketing and strategy.
I got interested in crowdsourcing, and the value it can add to companies, at my last organization before I started CrowdANALYTIX. Crowdsourcing has been applied to things like Uber. You need to go from point A to B, and the service is easy to deliver. But there are other situations where smaller companies struggle to get access to good talent. Crowdsourcing with the current technology could help in that area. That is why I created CrowdANALYTIX. We didn’t know where the applications would be. What we did know was that there was a scarcity of data scientists. We first started building out the community. I started doing that by creating a competition on our platform.
The first competition brought us 400 data scientists who were keen on trying to solve the problem. We were trying to build a model to predict the quality of wine. That got the VCs interested and so we got funded by some partners. We got seed funding of about $2 million. That got us going.
The whole idea of the seed round was to explore, build the community out, and look for opportunities and where the use cases are. We worked with companies across industries. Pfizer, AT&T, Oliver Wyman, McKinsey, KPMG, and Honeywell are some of the companies that we worked with. We tried to do a bunch of things.
We tried to build a custom solution. We faced a bunch of hurdles. One of the things that we realized is that in AI, the biggest hurdle apart from the scarcity of data scientists was the data itself. We noticed that data is usually not in order: it is untrustworthy or missing information.
This is the problem that most of the industry faces today. There are exceptions, for example autonomous cars, where a lot of data has been accumulated, but many others struggle with the data itself. That made us focus on problems where we could perhaps apply AI to structure unstructured information.
Unstructured data is anything like images, text, and video. They are reliable but not something that models can be built on top of. We started using AI and deep learning models to tag some of this data. That became a useful application. A lot of people were interested in that. We extracted valuable information and structured that information at scale. That would then lead to other solutions.
One of the biggest areas where we apply that is product catalog data. Our first big client is what got us going. They wanted to grow their catalog from 1 million SKUs to 200 million-plus. One of the biggest hurdles that you face when you are building an online catalog is the quality of your product data.
If the data is bad, then the searches do not work. You may have the product, but you cannot find it. That becomes a bigger problem as your catalog size grows. Amazon addressed this early on. Others are still struggling to do so. We are now one of the leading providers of autonomous product catalog onboarding, using various machine learning and deep learning algorithms.