Sramana Mitra: How easy is it to access the sanctions list in the structured category?
Charlie Delingpole: That is easy because the US Treasury makes it publicly available. The real challenge is name matching. With Latin alphabets, name matching is easy because simple heuristics work.
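To make "simple heuristics" concrete, here is a minimal Python sketch of Latin-alphabet name matching. It is an illustration only, not ComplyAdvantage's method: the function names (`normalize_name`, `name_similarity`, `is_possible_match`) and the 0.85 threshold are assumptions chosen for the example. It strips accents, ignores case, punctuation, and word order, and then scores the remaining difference with the standard library's `difflib.SequenceMatcher`.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Reduce a Latin-alphabet name to a canonical form for comparison."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-insensitive; treat punctuation (commas, hyphens) as whitespace.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.casefold())
    # Sort tokens so "Surname, Given" and "Given Surname" compare equal.
    return " ".join(sorted(cleaned.split()))


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def is_possible_match(candidate: str, listed: str, threshold: float = 0.85) -> bool:
    """Flag a candidate for review; the threshold is an arbitrary assumption."""
    return name_similarity(candidate, listed) >= threshold


if __name__ == "__main__":
    print(normalize_name("José García-López"))
    print(is_possible_match("Garcia Lopez, Jose", "José García López"))
    print(is_possible_match("Jon Smith", "John Smith"))
    print(is_possible_match("John Smith", "Maria Petrova"))
```

Heuristics like these depend on both names being written in the same script, which is why they work well for Latin alphabets. Names that have to be transliterated from other scripts need more than this sketch covers.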
>>>Sramana Mitra: How do you go to market? Is this a professional services go-to-market strategy, or more of a technology-enabled services strategy?
David Talby: It’s a mix. It’s about half and half. We do software licensing and we also do professional services.
Sramana Mitra: Are the professional services delivered by your own team, outsourced, or provided through system integrators?
>>>Sramana Mitra: Am I understanding it correctly if I say that you are doing natural language processing on publicly-available news coverage from around the world to identify names of people who are involved in questionable activities?
Charlie Delingpole: That is correct.
>>>Sramana Mitra: What technique do you use? Since this is not big data, you cannot use machine learning. Do you use expert systems? How are you setting these things up?
David Talby: We do use machine learning and deep learning. We use a lot of deep learning and transfer learning. We now have our own built-in embeddings.
>>>This conversation explores the use of AI to create a database of questionable players to address money laundering and other shady behavior. Excellent PaaS strategy!
Sramana Mitra: Let’s start by having you introduce yourself as well as ComplyAdvantage.
>>>Sramana Mitra: Talk about the healthcare and life science domain-specific products. Is that all internal or is some of that also open source?
David Talby: That’s our product. That’s a licensed product. We have two licensed products: Spark NLP for healthcare and Spark OCR. Spark NLP for healthcare is an extension of the open-source library but it uses a separate code base and a separate set of models.
>>>Sramana Mitra: If I understand this correctly, you have the NLP engine, which is fairly horizontal; and you are applying various domain-specific heuristics and workflows on top of that to create solutions for different use cases in different industry segments.
Although they are all data science users, the workflow is different and the domain is different. The oncology knowledge is different from the clinical trial identification.
>>>Sramana Mitra: Talk about customers that you are currently working with and also customers that you would like to work with.
David Talby: We are most famous for our work on natural language processing in the Spark NLP library. In terms of customers, we work in the healthcare and life science sector. The most recent NLP industry survey, conducted by Gradient Flow in September, shows that we have a 54% share of all the healthcare AI teams that use NLP.
>>>