According to an Adroit Market Research report, the data science platform market is estimated to grow to $178 billion by 2025. Data science leverages advanced technologies such as machine learning and artificial intelligence to mine actionable insights from large data sources. Databricks is a unified data analytics solution provider that has seen tremendous growth in the past few years.
San Francisco-based Databricks was founded in 2013 by the original creators of Apache Spark, Delta Lake, and MLflow. It was set up with the objective of bringing together data engineering, data science, and analytics on an open, unified platform to help data teams collaborate and innovate faster.
Its four key open-source products are Delta Lake, an open-source data lake; MLflow, an open-source project that helps data teams operationalize machine learning; Spark, its open-source analytics engine; and Koalas, a framework that brings the pandas API, familiar from single-machine work, to Spark.
Azure Databricks is a data analytics solution built on top of Microsoft Azure. It is used for managing, parsing, and processing large quantities of information, and for developing and deploying models on that data to derive actionable insights. Based on Apache Spark, it has been designed specifically for big data processing and gives data scientists built-in core APIs for languages such as SQL, Java, Python, R, and Scala. As a first-party PaaS on the Microsoft Azure cloud, it provides one-click setup, native integrations with other Azure cloud services, an interactive workspace, and enterprise-grade security to power data and AI use cases for customers ranging from small businesses to large global enterprises.
It is a fully managed PaaS offering that leverages Microsoft Cloud to scale rapidly, host massive amounts of data effortlessly, and streamline workflows for better collaboration between business executives, data scientists, and engineers.
Databricks’s solutions are delivered on a freemium model. Users can download and use its products for free, but managing and operating them is not easy. Databricks earns revenue by offering a subscription-based service that helps with the management of these tools.
Databricks is a privately held company and doesn’t publish its financials. However, in October 2019, it announced that it was operating at a $200 million run rate. In October 2020, reports revealed that it was trending at an annualized run rate of $350 million.
Databricks has raised $897 million in funding from investors including Tiger Global Management, Dragoneer Investment Group, Andreessen Horowitz, Microsoft, BlackRock, Geodesic Capital, Alkeon Capital, Green Bay Ventures, New Enterprise Associates, and Coatue. Its most recent round of funding was held in October 2019, when it raised $400 million in a round led by Andreessen Horowitz at a valuation of $6.2 billion. An earlier round held in February 2019 had valued Databricks at $2.75 billion.
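To put the run-rate and valuation figures above in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the four numbers quoted in this article. The helper names `growth_pct` and `multiple` are hypothetical, and dividing the October 2019 valuation by the October 2019 run rate is my own rough comparison, not a metric reported by Databricks.

```python
# Figures quoted in the article, in USD. Everything else is illustrative.
RUN_RATE_OCT_2019 = 200e6
RUN_RATE_OCT_2020 = 350e6
VALUATION_FEB_2019 = 2.75e9
VALUATION_OCT_2019 = 6.2e9


def growth_pct(old: float, new: float) -> float:
    """Percentage change from old to new (hypothetical helper)."""
    return (new - old) / old * 100


def multiple(numerator: float, denominator: float) -> float:
    """Simple ratio of two figures (hypothetical helper)."""
    return numerator / denominator


run_rate_growth = growth_pct(RUN_RATE_OCT_2019, RUN_RATE_OCT_2020)
valuation_step_up = multiple(VALUATION_OCT_2019, VALUATION_FEB_2019)
# Assumption: the run rate and the valuation announced in October 2019
# are close enough in time to be compared directly.
valuation_to_run_rate = multiple(VALUATION_OCT_2019, RUN_RATE_OCT_2019)

print(f"Run-rate growth, Oct 2019 to Oct 2020: {run_rate_growth:.0f}%")
print(f"Valuation step-up, Feb 2019 to Oct 2019: {valuation_step_up:.2f}x")
print(f"Oct 2019 valuation relative to run rate: {valuation_to_run_rate:.0f}x")
```

On these inputs, the run rate grew 75% in a year, the valuation rose roughly 2.25 times in eight months, and the October 2019 round priced the company at about 31 times its run rate.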
Databricks’s Redash Acquisition
In June 2020, Databricks announced the acquisition of visualization services provider Redash for an undisclosed sum. Israel-based Redash was founded in 2015 to provide easy-to-use dashboarding and visualization capabilities to both data scientists and SQL analysts. Prior to the acquisition, Redash had not raised any outside funding. Databricks has leveraged Redash’s solutions to help eliminate the complexity of moving data into other systems for analysis. The enhancements give organizations the ability to adopt a simplified, single cloud architecture for data management, helping them reduce costs and complexity while accelerating data team productivity.
Besides growing through acquisitions, Databricks is also expanding its presence through new products. Last quarter, it announced the release of SQL Analytics, which enables data analysts to run workloads on a data lake that were previously meant only for a data warehouse. This new service expands the traditional scope of the data lake from data science and machine learning to all data workloads, including business intelligence (BI) and SQL. It will empower data teams across data engineering, data science, and data analytics to work on a single source of truth for data.
Last quarter, it also announced that Microsoft Azure Databricks had received a Federal Risk and Authorization Management Program (FedRAMP) High Authority to Operate (ATO). This authorization attests to the security and compliance of Azure Databricks for high-impact data analytics and AI, and allows it to bid for a wide range of public sector, industry, and enterprise use cases.
Databricks is a rapidly growing organization, as is evident from its soaring valuation. Analysts expect the company to go public soon, but Databricks has not expressed any such interest.
Databricks has successfully built a data platform upon which customers are conducting high-end analytics and running AI projects. But it does not have a PaaS ISV program yet. As I have mentioned earlier, an active ISV strategy helps create an ecosystem that nurtures startups and small businesses while giving PaaS providers access to a ready portfolio of companies. Databricks could surely benefit from this strategy.
Disclosure: All investors should make their own assessments based on their own research, informed interpretations, and risk appetite. This article expresses my own opinions based on my own research of product-market fit, channel execution, and other factors. My primary interest is in product strategy. While this may have bearing on stock movements, my writings tend to focus on long-term implications. The information presented is illustrative and educational, but should not be regarded as a complete analysis or as a recommendation to buy or sell the securities mentioned herein. I am not a registered investment adviser and I am not receiving compensation for this article.