Wednesday, 19th June 2019

Powering real-time analytics

GridGain Systems has launched the GridGain Data Lake Accelerator, an in-memory solution for digital businesses that need to enrich operational data with historical data stored in data lakes to improve real-time analytics and decision automation. The GridGain Data Lake Accelerator is available for use with the GridGain Enterprise Edition and GridGain Ultimate Edition.

The GridGain Data Lake Accelerator speeds up data lake access by providing bi-directional integration with Apache™ Hadoop®. This integration brings the historical data into the same in-memory computing layer as the operational data, enabling real-time analytics and computation on the combined data to drive real-time business processes. It uses the GridGain Unified API and native Apache Spark™ connector to power real-time HTAP (hybrid transactional/analytical processing), in which transactions and analytics are performed on the same operational dataset.

“Many of today’s digital transformation and IoT use cases require real-time analytics against a combination of data lake and operational data,” said Abe Kleinfeld, president and CEO of GridGain. “The GridGain Data Lake Accelerator addresses the requirements of today’s businesses to gain instant insight, capitalize on opportunities as they arise and automate decision making.”

“Many companies have created Hadoop-based data lakes with a view to consolidating data from multiple data sources and serving the processing and analytics needs of multiple use-cases, but have then struggled to generate the expected value,” said Matt Aslett, Research VP, Data, AI and Analytics, 451 Research. “By bringing its in-memory compute functionality to the data lake, GridGain is providing an option for accelerating access to historical and live data to support real-time decision-making.”

Typical use cases for the GridGain Data Lake Accelerator include using historical data to enrich real-time data streams, calculating thresholds for real-time operational triggers from historical trends, and displaying historical and real-time data together in operational dashboards.

For example, a transportation company might be collecting a continuous stream of data from its vehicle engines. The data is ingested, processed and analyzed and then stored in a data lake, with only the most recent data retained in the operational data store. When an anomalous reading in the live data triggers an alert for a particular engine, the system needs to analyze the engine data to identify the root cause of the problem.

An infrastructure powered by GridGain’s in-memory computing platform, Kafka, Spark and Hadoop makes this possible. Apache Kafka feeds the live streaming data to the GridGain in-memory computing platform and to the Hadoop data lake. Spark retrieves the required data from the data lake and delivers it to the in-memory computing platform. The GridGain in-memory computing platform maintains the combined data set in memory and runs real-time queries across the data set. The result is deep and immediate insight into the causes of the anomalous reading.
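The engine-monitoring scenario can be illustrated with a small, library-free Python sketch. It does not use any GridGain, Kafka, Spark or Hadoop API. Plain dictionaries and lists stand in for the data lake and the live stream, and every name, the sample readings and the "mean plus three standard deviations" rule are assumptions chosen for illustration only.

```python
from statistics import mean, stdev

def threshold_from_history(history, k=3.0):
    """Alert threshold: historical mean plus k sample standard deviations (assumed rule)."""
    return mean(history) + k * stdev(history)

def find_anomalies(live, history_by_engine, k=3.0):
    """Return (engine_id, value, threshold) for live readings above their engine's threshold."""
    alerts = []
    for engine_id, value in live:
        history = history_by_engine.get(engine_id, [])
        if len(history) < 2:
            continue  # too little history to derive a threshold
        limit = threshold_from_history(history, k)
        if value > limit:
            alerts.append((engine_id, value, limit))
    return alerts

def combined_view(engine_id, live, history_by_engine):
    """Historical readings followed by the live readings for one engine."""
    recent = [value for eid, value in live if eid == engine_id]
    return history_by_engine.get(engine_id, []) + recent

# Hypothetical sample data: historical readings (stand-in for the data lake)
# and a short live stream of (engine_id, reading) pairs.
history_by_engine = {
    "engine-1": [90.0, 92.0, 91.0, 89.0, 93.0],
    "engine-2": [70.0, 71.0, 69.0, 70.0, 72.0],
}
live = [("engine-1", 92.5), ("engine-2", 95.0), ("engine-1", 91.0)]

alerts = find_anomalies(live, history_by_engine)
for engine_id, value, limit in alerts:
    print(f"{engine_id}: reading {value} exceeds threshold {limit:.2f}")
    print("combined data:", combined_view(engine_id, live, history_by_engine))
```

With this sample data only the engine-2 reading is flagged, and the combined view places its historical and live readings side by side for root-cause analysis. In the architecture described above, that combined data set would be held and queried in the in-memory computing layer rather than in Python lists.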
