Wednesday, 22nd May 2019

MapR accelerates separation of compute and storage

New advances make it easy to deploy native Spark and Drill applications in Kubernetes.

MapR Technologies has introduced innovations in the MapR Data Platform that accelerate the separation of compute and storage through new, deep integrations with core Kubernetes components for primary workloads on Spark and Drill. These innovations make it easier to manage highly elastic workloads, while also facilitating in-time deployments and the ability to scale compute and storage separately. Organizations restructuring their applications or building next-generation real-time data lakes can use these capabilities to run Spark and Drill in a Kubernetes model and take advantage of the elasticity and agility of such clusters.

“Having run a recent survey on organizations’ use of containers to support AI and analytics initiatives, it is clear that a majority of them are exploring the use of containers and Kubernetes in production,” said Mike Leone, senior analyst, ESG. “We are also seeing compute needs are growing rapidly and bursty due to the unpredictability of compute-centric applications and workloads. MapR is solving for this need to independently scale compute while also tightly integrating with Kubernetes in anticipation of organizations’ rapid container adoption.”

In early 2019, MapR enabled persistent storage for compute running in Kubernetes-managed containers through a CSI-compliant volume driver plugin. With this announcement, MapR further expands its portfolio of features and allows Spark and Drill to be deployed as compute containers orchestrated by Kubernetes. This deployment model allows end users, including data engineers, to run compute workloads in a Kubernetes cluster that is independent of where the data is stored or managed. The following core capabilities are included in this release:

  • Tenant Operator: Creates tenant namespaces (Kubernetes Namespaces) for running compute applications, allowing for a simple way to start complex applications in containers within Kubernetes. An end user can run Spark, Drill, Hive Metastore, Tenant CLI, and Spark History Server in these namespaces. These tenants can, in turn, point to a storage cluster that is located elsewhere.
  • Spark Job Operator: Creates Spark jobs, allowing for separate versions of Spark to be deployed in separate pods, facilitating the multiple stages of dev, test, and QA that are typical in a data engineer’s workflow.
  • Drill Operator: Starts a set of Drillbits, allowing for auto-scaling of queries based on demand.
  • CSI Driver Operator: Provides a standard plugin for mounting persistent volumes so that stateful applications can run in Kubernetes.

“MapR is paving the way for enterprise organizations to easily do two key things: start separating compute and storage and quickly embrace Kubernetes when running analytical AI/ML apps,” said Suresh Ollala, SVP Engineering, MapR. “Deep integration with Kubernetes core components, like operators and namespaces, allows us to define multiple tenants with resource isolation and limits, all running on the same MapR platform. This is a significant enabler for not only applications that need the flexibility and elasticity but also for apps that need to move back and forth from the cloud.”

In this release, MapR delivers on six key benefits:

  • Handle compute bursts by spinning additional compute containers without having to add more physical host servers;
  • Isolate resources and prevent applications from starving each other of resources by setting granular limits on quotas, and by using Spark job operators to create different Spark clusters;
  • Accommodate fluctuating query workload by growing Drillbits dynamically based on load and demand;
  • Run different versions of Spark and Drill on the same platform;
  • Allow for multiple tenants to co-exist; and
  • Deploy Spark and Drill container applications, along with MapR volumes, across multi-cloud environments, including private, hybrid and public clouds.