The News: Hewlett Packard Enterprise this week announced that it has acquired Ampool, a provider of a distributed SQL engine based on the open source Presto project that allows users to access data stored in multiple databases. Read the HPE press release here.
HPE Extends GreenLake Capability with Ampool Acquisition
Analyst Take: Data is rapidly becoming a key battleground as hybrid cloud providers look to capture database workloads as a control point for more widespread cloud deployments. As cloud adoption moves beyond test and development workloads and into more mission-critical applications, the importance of data management increases.
Databases are an essential layer for organizations: they are the crown jewels of an organization’s IT architecture. Today, clients have database administration teams focused on ensuring that data is intact, secure, and backed up. As ransomware attacks increasingly show, once hackers capture and encrypt key databases, an organization’s operations rapidly come to a halt.
On the flip side, cloud providers increasingly recognize that if they can capture the database layer, then application and middleware workloads will rapidly follow. Database workloads are ‘stickier’ than more transitory workloads and, once deployed, are harder to migrate, giving cloud providers an element of client lock-in. This trend is increasingly relevant when application and middleware layers are deployed in containers and orchestrated via Kubernetes, as containers are, by definition, designed to be portable and easily moved.
This means that a cloud provider that wins the database layer is more likely to keep that client in the medium to long term.
Another factor to consider is that data management maturity varies widely across organizations. The majority of data today is still managed within the context of the application used to create it. Organizations are just beginning to aggregate that data into data lakes that enable applications to access data, regardless of how it was created or ultimately stored. Loosely coupled query engines that let multiple analytics and business intelligence tools access a range of backend data sources have become a critical requirement.
Against this backdrop, the acquisition of Ampool by HPE makes perfect sense as the company tries to cement its Ezmeral platform as a key component in its clients’ hybrid cloud strategies. HPE Ezmeral, and in particular its Data Fabric data platform, builds on innovations by MapR Technologies to deliver a unified data platform to ingest, store, manage, process, apply, and analyze all data types from any data source, using a variety of ingestion mechanisms. In 2019, HPE acquired MapR, a data platform focused on artificial intelligence and analytics applications and powered by scale-out, multi-cloud, and multi-protocol file system technology.
Ampool is a provider of a distributed SQL engine based on the open source Presto project that allows users to access data stored in multiple databases. The Presto Project joined the Linux Foundation in 2019 as part of the wider Linux Foundation collaborative projects structure. The Presto distributed SQL engine is already available in a container format. Ampool is currently in the process of adding support for that format to its distribution of Presto, said Anant Chintamaneni, general manager for HPE Ezmeral.
According to the HPE announcement, the company plans to incorporate the distributed SQL engine it has gained into the HPE Ezmeral container platform, which is based on Kubernetes clusters. The Ezmeral platform offers IT organizations a range of data services that include support for the Apache Spark framework, frameworks for machine learning operations (MLOps), and now SQL platforms. Ampool has also developed the ability to create a metastore for accessing data stored in multiple databases. Once joins across those data sources are created, Ampool allows customers to store them in cache memory to boost overall performance, using a tool based on open source Apache Geode software. The overall goal of this approach is to reduce the overhead associated with providing access to multiple data sources by building a data federation layer on top of an acceleration engine that speeds up analytical query processing at scale.
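To make the idea of a federation layer backed by a cache more concrete, here is a minimal Python sketch. It is not Ampool, Presto, or Apache Geode code. Two in-memory SQLite databases stand in for separate backend systems, and the class name `FederatedJoinCache`, the table layouts, and the sample rows are all hypothetical, chosen only for illustration. The sketch shows the general pattern: a join that spans two sources is computed once, and repeat requests are answered from memory rather than by reading the backends again.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database as a stand-in for one backend."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two independent databases (hypothetical sample data).
customers_db = make_source(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
orders_db = make_source(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)


class FederatedJoinCache:
    """Hypothetical federation layer: joins two sources and caches the result."""

    def __init__(self, customers_conn, orders_conn):
        self.customers_conn = customers_conn
        self.orders_conn = orders_conn
        self._cache = {}
        self.backend_reads = 0  # counts queries sent to the backends

    def revenue_by_customer(self):
        key = "revenue_by_customer"
        if key in self._cache:
            # Cache hit: no backend is touched.
            return self._cache[key]

        # Read from the first source.
        names = dict(self.customers_conn.execute("SELECT id, name FROM customers"))
        self.backend_reads += 1

        # Read from the second source and aggregate per customer.
        totals = {}
        for customer_id, amount in self.orders_conn.execute(
            "SELECT customer_id, amount FROM orders"
        ):
            totals[customer_id] = totals.get(customer_id, 0.0) + amount
        self.backend_reads += 1

        # Join the two result sets in the federation layer and cache the result.
        result = {names[cid]: total for cid, total in totals.items()}
        self._cache[key] = result
        return result

    def invalidate(self):
        """Drop cached joins, for example after the source data changes."""
        self._cache.clear()


layer = FederatedJoinCache(customers_db, orders_db)
first = layer.revenue_by_customer()   # reads both backends
second = layer.revenue_by_customer()  # served from the cache
```

A production system would also need to handle cache invalidation, memory limits, and distribution across nodes, which this sketch deliberately leaves out.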
As organizations begin to realize that data is a business asset, and that its management is becoming more crucial, the processes for managing it are becoming more structured. The way SQL requests are generated has also changed. Canned SQL queries launched by business intelligence and reporting tools are giving way to far more ad hoc queries that are harder to predict. As such, the level of processing horsepower that needs to be available on demand has steadily increased.
SQL continues to be a predominant database workload in many organizations, despite the efforts of NoSQL database proponents. HPE’s position is that, based on its experience with ISV partners, a clear need has emerged to modernize the SQL stack. HPE is betting that current on-premises SQL technologies are not suited to the new requirements around hybrid cloud and scale. The company believes that analytics transformations will become essential for clients as they try to address the challenges of hybrid cloud and disparate data.
I see this Ampool deal as adding a strategic capability to the company’s as-a-Service model. HPE is creating a platform that enables an internal IT team to manage data as a service, regardless of where it is stored. HPE also plans to integrate Ampool with the managed HPE GreenLake service it provides for its servers running in on-premises IT environments. This acquisition builds on the recently announced range of containerized services that will be delivered using an instance of the HPE Ezmeral platform accessed via the HPE GreenLake service.
I expect it will be a while before data is truly managed as a service within most organizations. However, I believe that some, arguably long overdue, progress is being made in this domain. The challenge the industry faces is that melding all the data science, engineering, and management expertise required to realize that goal spans a range of technology and cultural challenges that are not easily overcome. Line of Business executives assume that data should be easily accessible whenever required. Explaining why this nirvana remains out of reach is one of the primary reasons the divide between IT and the rest of the business remains as wide as it is.
Disclosure: Futurum Research is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.