Digitalisation World Q&A on Kafka

With Michael Noll, Principal Technologist, Office of the CTO at Confluent.

Where did Kafka originate from?


Kafka started life at LinkedIn, initially as high-throughput, real-time infrastructure for moving data into Hadoop. It became apparent as time passed that this conduit was a precious resource in and of itself, because it could store large volumes of data and deliver them not only to Hadoop but also to the plethora of other services that kept LinkedIn’s social network alive: people you may know, who’s viewed your profile, and so on. As Kafka's usage grew, the team added stream processing features, allowing users to perform SQL-like operations on high-throughput datasets as they moved through the company. The project was open sourced and donated to the Apache Foundation in 2011. As the benefits of messaging, storage, and processing in a single system propagated through the tech industry, a new category of data infrastructure became established. Kafka is now one of the most active open source projects and is used by the majority of listed companies the world over.

Why did Kafka gain momentum so quickly?

Kafka is in the right place at the right time: it has become the technological foundation and digital 'central nervous system' for the always-on world, where businesses are increasingly software-defined and automated, and where the user of software is more software. We only need to look back at the past few years to see that major trends like cloud computing, artificial intelligence, ubiquitous mobile devices, and the Internet of Things have caused an explosion of data. Companies have been struggling to keep pace, to collect, process, and act on all this information as quickly as possible, whether to serve their customers faster or to gain an edge on the competition. The result? Whether you shop online, make payments, order a meal, book a hotel, use electricity, or drive a car, it's very likely that, in one form or another, this is nowadays powered by Kafka.

What problems do these ‘event streams’ solve for organisations?

At a high level, event streams connect data in different parts of a system or organisation together. But a simple explanation like this belies the real impact that event streaming has had on organisations. Event streaming systems provide storage and processing capabilities in addition to high-throughput messaging, properties that prove critical in real-world organisations because of the various ways in which they adapt and evolve. Here, answering simple questions like ‘how do you get System A to share data with System B?’ is rarely enough. Modern digital companies face more nuanced questions: “Where do I get this data from?”, “How do I get both real-time and historical data?” or “Why doesn’t this data have everything I need?”. Event streaming systems like Kafka lay the foundations for answering these trickier questions by combining stored historical messages, in-built processing, and real-time data. All in all, it’s a more powerful and nuanced solution than those that preceded it.
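The property described above can be illustrated in a few lines. The sketch below is plain Python, not the actual Kafka API: it models a single append-only log, loosely analogous to one Kafka topic partition, where each consumer tracks its own offset. The class and event names are hypothetical, chosen only to show why the same stream can serve both a replay of historical data and a real-time reader.

```python
class EventLog:
    """A minimal in-memory sketch of an append-only event log.

    This is an illustration of the idea behind event streaming, not Kafka
    itself: producers append; consumers read from whatever offset they hold.
    """

    def __init__(self):
        self._events = []

    def append(self, event):
        """Append an event and return its offset in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset):
        """Return every event at or after the given offset."""
        return self._events[offset:]


# Producers append events as they happen.
log = EventLog()
log.append({"user": "alice", "action": "viewed_profile"})
log.append({"user": "bob", "action": "sent_connection_request"})

# A consumer that joins later can replay all history from offset 0 ...
historical = log.read_from(0)

# ... while a real-time consumer remembers its offset and only sees
# events that arrive after it.
checkpoint = len(historical)
log.append({"user": "carol", "action": "viewed_profile"})
new_events = log.read_from(checkpoint)
```

Because the log stores events rather than discarding them on delivery, "historical" and "real-time" are just different starting offsets over the same data, which is what lets one system answer both kinds of question.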

Are there still barriers holding Kafka back, or slowing the market’s adoption of it? What challenges lie ahead?

Apache Kafka is at its most powerful when it acts as a central nervous system for all of an organisation's data. This is when all the data in an organisation is instantly available to all applications and people through the platform. With this wealth of data available, new business opportunities can be uncovered, customer expectations exceeded, and new operational efficiencies realised.

However, Kafka is a complex distributed system, and most organisations are built on top of a spaghetti mess of other systems. Without a team of Kafka experts, the bar for getting to a central nervous system is too high for many companies. This is why we at Confluent created Project Metamorphosis. By bringing the attributes of modern cloud computing systems, like elasticity, self-management, and unlimited storage, to Kafka, we're solving the most pressing issues that organisations run into when making event streaming a pervasive part of their business.
