Jeremy Howard is president and chief scientist at Kaggle, and sits on the faculty at Singularity University. Previously, he founded FastMail (sold to Opera Software) and Optimal Decisions (sold to ChoicePoint, now called LexisNexis Risk Solutions). Prior to that he worked in management consulting, at McKinsey & Company and A.T. Kearney.
Jeremy’s passion is applying algorithms to data. Kaggle is the world’s largest community of data scientists – more than 75,000 at last count. Kaggle’s data scientists have solved some of the toughest problems for some of the world’s smartest organizations, including NASA, Merck, Ford, Allstate, and Wikipedia. At Singularity University, Jeremy teaches data science to the elite group of students who are awarded places in the graduate program.
At FastMail, he used algorithms to automate nearly every part of the business; as a result, the company needed only three full-time staff while attracting over a million signups. Optimal Decisions was a business built to commercialize a new algorithm he designed for the optimal pricing of insurance.
Sramana Mitra: Jeremy, let’s start with some context about Kaggle, about yourself, and about how you got involved in this major big data disruption.
Jeremy Howard: Kaggle is an interesting and unusual company. We run machine learning competitions of two types: private competitions and public competitions. Machine learning competitions have been running for over 20 years. An organization makes its data available to a large number of data scientists, who examine it and try to build models. The organization sets a particular modeling challenge, and whoever comes up with the most accurate predictions wins a prize. There have been some very successful data mining and machine learning competitions. The one most people know is the Netflix Prize, which was a $1 million competition a few years ago. Kaggle was created so that any organization can benefit from using machine learning competitions.
SM: The idea came when Netflix made the announcement that they were going to give out a million dollars to solve their recommendation system problems.
JH: The idea has been around for over 20 years, mainly in academia. There have been a few academic endeavors – particularly one called KDD, which is a data mining competition that has been around since 1997. The Netflix Prize was financially the biggest competition of any kind, so it was the best known. Netflix was an inspiration in creating Kaggle, because its competition was something that was [available] to everybody.