Talend joins Google to propose Dataflow as an ASF Incubator Project

Hadoop and the broader Big Data ecosystem continue to innovate at an incredible rate. By harnessing the power of the community and creating a survival-of-the-fittest competitive landscape, the open-source development approach helps not only fuel the pace of innovation but also drive buyer confidence and market adoption.

Open source is also important to a growing number of developers who are moving away from proprietary software in search of greater efficiency and transparency. In our experience, an open source approach makes for better software and happier customers, so we are all for it!
Among the various Big Data open source projects, the data processing space is probably the most active and promising. There are many data processing engines and frameworks out there: some are fully open source, such as Apache Spark, Apache Flink, and Apache Apex, while others, such as Google Dataflow, are packaged and available as a service. Most of the Apache open source projects combine streaming and batch data processing, and provide APIs at various levels to help developers build pipelines or data flows programmatically. Google is helping to lead this charge with an abstraction layer that allows pipelines defined with the Dataflow SDK to run on different runtime environments.
Google Leading the Open Source Charge with Dataflow SDK
A little over a year ago, Google open sourced its Dataflow SDK, which provides a programming model for expressing data processing pipelines (input/source -> transformation/enrichment -> output/target) very easily. What is great about this SDK is the level of abstraction it provides: you can think of your pipeline as a simple flow without worrying too much about the underlying complexity of the distributed and parallel data processing steps required to execute it.
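To make the "source -> transformation -> target" idea concrete, here is a minimal sketch in plain Python. It is not the Dataflow SDK API: the `Pipeline` class, its `apply` and `run` methods, and the sample records are all hypothetical names invented for illustration. It only shows the shape of the model, where a pipeline is declared as a chain of steps and executed later.

```python
from typing import Callable, Iterable, List


class Pipeline:
    """Toy stand-in for an SDK pipeline (hypothetical, not the Dataflow SDK)."""

    def __init__(self, source: Iterable[str]):
        self._source = source
        self._steps: List[Callable[[Iterable], Iterable]] = []

    def apply(self, step: Callable[[Iterable], Iterable]) -> "Pipeline":
        # Record the step only; nothing is executed until run() is called.
        self._steps.append(step)
        return self

    def run(self) -> list:
        # Feed the source through every recorded step, in order.
        data: Iterable = self._source
        for step in self._steps:
            data = step(data)
        return list(data)


def parse(lines: Iterable[str]):
    """Transformation: turn 'name,amount' text lines into (name, amount) tuples."""
    for line in lines:
        name, amount = line.split(",")
        yield name.strip(), int(amount)


def only_large(records):
    """Enrichment/filter: keep records with an amount of at least 100."""
    return (record for record in records if record[1] >= 100)


# Input/source -> transformation -> output/target, declared as one flow.
result = (
    Pipeline(["alice,120", "bob,40", "carol,300"])
    .apply(parse)
    .apply(only_large)
    .run()
)
print(result)
```

The point of the sketch is that the author of the flow only describes what should happen to the data; how and where the steps are executed is left to whatever runs the pipeline.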
Talend has a long history with the Apache Software Foundation (it already has committers on key enterprise Apache projects such as Apache ActiveMQ, Camel, CXF, Karaf, Syncope, and Falcon) and has long focused on developer productivity. Given this, when Google announced its proposal for Dataflow to become an Apache Software Foundation incubator project, it was only natural for Talend to join in and help accelerate development, along with a few other companies that share similar interests and core values.
A Series of Firsts for the Apache Software Foundation
Upon acceptance, Dataflow will be the first Apache Software Foundation project to offer a set of SDKs that abstract both the definition and the execution of data processing pipelines and workflows, supporting complex enterprise data ingestion and integration patterns, including routing as well as data and message transformations.
Open Source, Future-Proof
Developers leveraging the Dataflow framework won't be "locked in" to a specific data processing runtime. They will be able to take advantage of new data processing frameworks as they emerge without having to rewrite their Dataflow pipelines, which makes those pipelines future-proof.
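The runtime independence described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the Dataflow SDK or any real runner: `LocalRunner`, `ThreadedRunner`, and the `transforms` list are hypothetical names. The same pipeline definition is handed to two different execution back ends and produces the same result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Transform = Callable[[int], int]


def apply_all(transforms: Sequence[Transform], item: int) -> int:
    """Run one element through every transform, in order."""
    for transform in transforms:
        item = transform(item)
    return item


class LocalRunner:
    """Hypothetical runner: processes elements one by one in the caller's thread."""

    def run(self, transforms: Sequence[Transform], data: Sequence[int]) -> List[int]:
        return [apply_all(transforms, item) for item in data]


class ThreadedRunner:
    """Hypothetical runner: processes elements concurrently on a thread pool."""

    def __init__(self, workers: int = 4):
        self.workers = workers

    def run(self, transforms: Sequence[Transform], data: Sequence[int]) -> List[int]:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(lambda item: apply_all(transforms, item), data))


# The pipeline is defined once, independently of any runner.
transforms: List[Transform] = [lambda x: x + 1, lambda x: x * 2]
data = [1, 2, 3]

local_result = LocalRunner().run(transforms, data)
threaded_result = ThreadedRunner().run(transforms, data)
print(local_result, threaded_result)
```

Swapping the runner changes how the work is executed but not the pipeline definition, which is the property that lets a pipeline move to a new engine without being rewritten.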
Moving forward, Talend will commit developers to the Dataflow framework, specifically on the ingestion and integration front, and will work with the community on future runners. We look forward to contributing to this project and the broader Big Data community.