Saturday, 16th November 2019
Powering real-time analytics

GridGain Systems has launched the GridGain Data Lake Accelerator, an in-memory solution for digital businesses that need to enrich operational data with historical data stored in data lakes to improve real-time analytics and decision automation. The GridGain Data Lake Accelerator is available for use with the GridGain Enterprise Edition and GridGain Ultimate Edition.

The GridGain Data Lake Accelerator boosts data lake access by providing bi-directional integration with Apache™ Hadoop®. This integration brings the historical data into the same in-memory computing layer as the operational data, enabling real-time analytics and computing on the combined data to drive real-time business processes. It leverages the GridGain Unified API and native Apache Spark™ connector to power real-time HTAP (hybrid transactional/analytical processing) in which transactions and analytics are performed on the same operational dataset.

“Many of today’s digital transformation and IoT use cases require real-time analytics against a combination of data lake and operational data,” said Abe Kleinfeld, president and CEO of GridGain. “The GridGain Data Lake Accelerator addresses the requirements of today’s businesses to gain instant insight, capitalize on opportunities as they arise and automate decision making.”

“Many companies have created Hadoop-based data lakes with a view to consolidating data from multiple data sources and serving the processing and analytics needs of multiple use-cases, but have then struggled to generate the expected value,” said Matt Aslett, Research VP, Data, AI and Analytics, 451 Research. “By bringing its in-memory compute functionality to the data lake, GridGain is providing an option for accelerating access to historical and live data to support real-time decision-making.”

Typical use cases for the GridGain Data Lake Accelerator include using historical data to enrich real-time data streams, calculating thresholds for real-time operational triggers from historical trends, and displaying historical and real-time data together in operational dashboards.

For example, a transportation company might be collecting a continuous stream of data from its vehicle engines. The data is ingested, processed and analyzed and then stored in a data lake, with only the most recent data retained in the operational data store. When an anomalous reading in the live data triggers an alert for a particular engine, the system needs to analyze the engine data to identify the root cause of the problem.

An infrastructure powered by GridGain’s in-memory computing platform, Kafka, Spark and Hadoop makes this possible. Apache Kafka feeds the live streaming data to the GridGain in-memory computing platform and to the Hadoop data lake. Spark retrieves the required data from the data lake and delivers it to the in-memory computing platform. The GridGain in-memory computing platform maintains the combined data set in memory and runs real-time queries across the data set. The result is deep and immediate insight into the causes of the anomalous reading.
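The engine example above can be illustrated with a minimal sketch in plain Python. It does not use the GridGain, Kafka, Spark or Hadoop APIs: the `HISTORICAL` dictionary stands in for the data lake, the `InMemoryStore` class for the in-memory computing layer, and a list of tuples for the live stream. All names, the sample readings and the mean-plus-three-standard-deviations threshold rule are assumptions made for illustration only.

```python
from statistics import mean, stdev

# Hypothetical historical readings per engine, standing in for the data lake.
HISTORICAL = {
    "engine-1": [90.0, 92.0, 91.0, 89.0, 93.0],
    "engine-2": [70.0, 71.0, 69.0, 70.0, 72.0],
}


def threshold(history, k=3.0):
    """Assumed alert rule: mean plus k sample standard deviations."""
    return mean(history) + k * stdev(history)


class InMemoryStore:
    """Toy stand-in for the in-memory layer holding live and historical rows."""

    def __init__(self):
        self.rows = []        # tuples of (engine_id, source, value)
        self.loaded = set()   # engines whose history is already in memory

    def ingest_live(self, engine_id, value):
        self.rows.append((engine_id, "live", value))

    def load_history(self, engine_id, values):
        # Load each engine's history at most once.
        if engine_id not in self.loaded:
            self.rows.extend((engine_id, "historical", v) for v in values)
            self.loaded.add(engine_id)

    def query(self, engine_id):
        """Return every row, live and historical, for one engine."""
        return [row for row in self.rows if row[0] == engine_id]


def process_stream(store, stream, historical):
    """Ingest live readings; on an anomaly, pull history and query the combined set."""
    alerts = []
    for engine_id, value in stream:
        store.ingest_live(engine_id, value)
        if value > threshold(historical[engine_id]):
            store.load_history(engine_id, historical[engine_id])
            alerts.append((engine_id, value, store.query(engine_id)))
    return alerts


if __name__ == "__main__":
    store = InMemoryStore()
    live = [("engine-1", 91.5), ("engine-2", 95.0)]
    for engine_id, value, combined in process_stream(store, live, HISTORICAL):
        print(engine_id, value, len(combined))
```

In this sketch only the second reading exceeds its engine's threshold, so only that engine's history is brought into memory and queried alongside the live reading. A real deployment would replace each stand-in with the corresponding platform component.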
