Fivetran introduces new Managed Data Lake Service

Data teams are jumpstarting their generative AI and LLM initiatives with Fivetran Managed Data Lake Service by improving the quality, completeness and timeliness of enterprise data and reducing the complexity of managing data integration.


Fivetran has introduced the Fivetran Managed Data Lake Service, designed to automate and simplify data lake management for businesses of all sizes. Fivetran supports over 500 pre-built as well as custom data sources, seamlessly integrating them into any major data lake destination while employing powerful change data capture, normalisation, compaction and deduplication processes. Fivetran’s Managed Data Lake Service is currently available on Amazon S3, Azure Data Lake Storage (ADLS) and Microsoft OneLake.
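To make the change data capture and deduplication steps concrete, here is a minimal sketch in Python. It is not Fivetran's implementation: the `ChangeEvent` record, its fields and the `apply_changes` function are hypothetical names chosen for illustration, and the sketch assumes every event in a batch is newer than the state already in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One hypothetical change record captured from a source table."""
    key: int                    # primary key of the source row
    op: str                     # "upsert" or "delete"
    seq: int                    # increasing change sequence number
    row: Optional[dict] = None  # full row image for upserts, None for deletes


def apply_changes(table: dict, events: list) -> dict:
    """Return a new table state after applying a batch of change events.

    Repeated deliveries of the same change are ignored, and for each key
    only the event with the highest sequence number is applied.
    """
    latest = {}
    for ev in events:
        seen = latest.get(ev.key)
        if seen is None or ev.seq > seen.seq:
            latest[ev.key] = ev

    result = dict(table)  # leave the input table untouched
    for key, ev in latest.items():
        if ev.op == "delete":
            result.pop(key, None)
        else:
            result[key] = ev.row
    return result


table = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
events = [
    ChangeEvent(1, "upsert", 10, {"name": "Ada L."}),
    ChangeEvent(1, "upsert", 10, {"name": "Ada L."}),  # duplicate delivery
    ChangeEvent(2, "delete", 11),
    ChangeEvent(3, "upsert", 12, {"name": "Cy"}),
    ChangeEvent(3, "upsert", 9, {"name": "stale"}),    # older, out of order
]
current = apply_changes(table, events)
```

After this run, `current` holds the updated row 1 and the new row 3, and row 2 is gone. A managed service would also handle schema changes and file compaction, which this sketch leaves out.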

The Fivetran Managed Data Lake Service simplifies data lake management by automatically converting customer data to open table formats (Apache Iceberg or Delta Lake) before landing it in the data lake. Combined with Fivetran's ongoing table management and maintenance, this gives customers the queryability and ease of use of a cloud data warehouse with the flexibility and scale of a data lake. Fivetran says no other data provider manages data lakes in this way, and that its customers get both the low cost of a data lake and the structure and reporting capabilities of a data warehouse. Users can build out their data lake with query-ready data that data warehouses can read through their external table features, without moving or duplicating records across multiple locations. This supports a range of use cases, including analytical, operational and genAI workloads.

The new Fivetran Managed Data Lake Service differentiates itself by not only converting data and centralising it in the lake but also providing an end-to-end data lake management service that automates low-level data management tasks entirely.

"Fivetran does the heavy lifting of change data management, PII detection, deduplication and other low-level table maintenance so that developers don't waste time on work that can be automated," said George Fraser, Fivetran CEO. "We hope to make business users and data scientists alike more productive by providing clean, centralised, optimised data from any source."

This level of automation and maintenance is crucial for many organisations. As Nick Chmura, Head of Data at Luma Financial Technologies, explains, “Automated table maintenance is the killer feature for us with Fivetran because we have so many different source connectors. To try to build change data capture and manage that for everything…would be prohibitively costly in terms of time.”

Fivetran Managed Data Lake Service helps transform traditionally ungoverned data lakes into organised, governed, continuously optimised data stores. Through native integrations with data catalogues including AWS Glue, Databricks Unity Catalog and Microsoft Purview, users can quickly discover, access and govern key datasets in the lake. From there, they can query and modify the data with Python, SQL or other supported languages using compatible compute engines such as Databricks, Snowflake, Starburst or Redshift. They can also transform the data with tools like dbt, visualise it with Power BI, or build and deploy AI/ML models with tools like Amazon SageMaker, Azure Machine Learning or Databricks Mosaic AI.

Fivetran Managed Data Lake Service supports over 500 data sources, including on-premises and cloud databases like Postgres, MySQL, Oracle and SAP, SaaS applications, data warehouses, events and files. Fivetran can also create custom connectors, ensuring support for any data source without requiring precious engineering resources for pipeline management or connector development. This broad portfolio of source compatibility enables customers to unify their data in the data lake, regardless of where it currently resides.

“We are very excited about Fivetran supporting Delta Lake as a direct destination,” said Himanshu Raja, Director of Product, Databricks. “With this new capability, customers can now use Fivetran to build an open lakehouse with Delta Lake powered by the Databricks Data Intelligence Platform. We are also very excited about the upcoming Fivetran integration with Unity Catalog to provide out-of-the-box governance and security for all Fivetran-generated tables.”

The benefits of Fivetran Managed Data Lake Service include:

Empowering business users and data scientists with centralised, democratised, query-ready data that adds context, invites insights and drives data discovery

Enhanced operational efficiency by automatically converting the organisation’s data to open table formats (Delta Lake / Apache Iceberg) with robust data cataloguing and governance features

Reduced developer workload by having Fivetran do the heavy lifting of table updates, PII detection, deduplication and other low-level table maintenance tasks that can be automated

Reduced costs by automating data migration away from legacy data warehouses that lock organisations in with proprietary data formats

Peace of mind through worry-free data replication that ensures datasets always arrive clean and complete, with every change captured

In response to the surging demand for advanced AI, Fivetran has seen substantial growth and customer interest in data lake destinations, particularly among large enterprises. This increase in demand underscores the critical need for cost-effective, flexible data architectures that enable customers to achieve success with AI and machine learning.

Others in the industry are also seeing demand for new architectures to meet evolving needs. “Starburst continues to see open architecture adoption gain momentum and Fivetran’s adoption of Iceberg in its Managed Data Lakes Service further validates the Icehouse architecture,” said Anders Holden, Director of Product Management at Starburst Data. “This service simplifies the ingestion process and removes complexity around data lake management. By using Starburst to run fast SQL analytics on the data lake, businesses will see a faster time to insights and a streamlined data pipeline process, while saving time and reducing costs.”
