Adaptive computing platforms deliver efficient AI acceleration

By Greg Martin, Director, Strategic Marketing, Xilinx, Inc.

AI has begun to change many facets of our lives, driving tremendous societal advances. From self-driving automobiles to AI-assisted medical diagnosis, we are at the beginning of a truly transformative era.

But with opportunity comes challenge. AI inference, the process of making predictions with a trained machine learning model, requires high processing performance within a tight power budget, regardless of where it is deployed – cloud, edge or endpoint. It is generally accepted that CPUs alone are not keeping up, and that some form of compute acceleration is needed to process AI inference workloads efficiently.

At the same time, AI algorithms are evolving faster than traditional silicon development cycles. Fixed-silicon devices, such as ASIC implementations of AI networks, risk rapid obsolescence as state-of-the-art AI models continue to improve.

Whole Application Acceleration

There is a third, less well-known challenge: AI inference is not deployed in isolation. Real AI deployments typically require non-AI processing both before and after the AI function. For example, an image may need to be decompressed and scaled to match the AI model's input requirements. These traditional processing functions must run at the same throughput as the AI function, again with high performance and low power. Like AI inference itself, the non-AI pre- and post-processing functions are beginning to need some form of acceleration, as the sketch below illustrates.
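As a minimal sketch of that structure – plain Python with placeholder stage implementations, not any particular vendor's API – the AI model is only the middle stage of the application:

```python
# Minimal sketch of a whole AI application: inference is only the middle
# stage. The stage bodies below are illustrative placeholders, not real
# decode/scale/model code.

def decode(compressed: bytes) -> list:
    return list(compressed)        # stand-in for JPEG/H.264 decompression

def resize(image: list, size: int) -> list:
    return image[:size]            # stand-in for scaling to the model's input shape

def infer(tensor: list) -> list:
    return [sum(tensor)]           # stand-in for the trained model

def annotate(image: list, preds: list) -> list:
    return image + preds           # stand-in for overlaying results

def process_frame(compressed: bytes) -> list:
    image = decode(compressed)     # pre-processing (non-AI)
    tensor = resize(image, 224)    # pre-processing (non-AI)
    preds = infer(tensor)          # AI inference
    return annotate(image, preds)  # post-processing (non-AI)

print(process_frame(b"\x01\x02\x03"))
```

Every stage in this chain, not just infer, must sustain the application's target throughput.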

To build a real application, the whole application must be implemented efficiently. In a data center, an application may run as thousands or even millions of parallel instances, so every fraction of a watt saved per instance makes a huge difference to overall power consumption.
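To make the scale concrete with round numbers (an illustration, not a measurement): saving just 0.5 W in each of one million instances removes 500 kW of continuous draw – roughly 4.4 GWh per year (500 kW × 8,760 hours).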

A solution is viable only if the whole application meets both the performance goal, through acceleration, and the power requirement, through greater efficiency. So how do we implement whole application acceleration viably?

There are three key elements: the ability to build a custom data path; a single-device implementation; and the ability to take advantage of the latest AI models as they continue to evolve and improve. Let's look at each in turn.

The ability to build a custom data path

Most forms of AI inference operate on streaming data. Often the data is in motion: part of a video feed, medical images being processed, or network traffic being analyzed. Even when data is stored on disk, it is read off disk and streamed through the AI application. A custom data path is the most efficient way to process such streams. It frees the application from the limitations of a traditional von Neumann CPU architecture, in which data is read from memory in small chunks, operated on, and written back to memory. Instead, a custom data path passes data from one processing engine to the next, with low latency and the right level of performance. Too little processing performance fails to meet the application's requirements; too much is inefficient, wasting power or physical space on capability that sits idle. A custom data path provides the balance, right-sizing the implementation for the application.
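As a loose software analogy – Python generators standing in for hardware processing engines, illustrative only – each stage streams its output directly to the next, with no round trip through a central memory between steps:

```python
# Conceptual analogy for a custom data path: each generator stage passes
# items straight to the next stage instead of writing intermediate results
# back to a shared memory. Stage bodies are illustrative placeholders.

def source(n):
    for i in range(n):
        yield i                      # e.g. frames arriving from a sensor

def preprocess(stream):
    for item in stream:
        yield item * 2               # stand-in for decode/scale

def infer(stream):
    for item in stream:
        yield item + 1               # stand-in for the AI model

def postprocess(stream):
    for item in stream:
        yield f"result:{item}"       # stand-in for formatting the output

# Data flows engine to engine, one item at a time.
for result in postprocess(infer(preprocess(source(3)))):
    print(result)
```

On an adaptive computing device, each of these stages would be sized in hardware to exactly the throughput the application needs.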

Single device implementation

Some solutions are good at AI inference but not at whole application processing. Fixed-architecture devices such as GPUs generally fall into this category. GPUs can post high tera-operations-per-second (TOPS) figures, a common performance metric, but AI inference performance must be matched by pre- and post-processing performance. If the non-AI components cannot be implemented efficiently on the same GPU, a multi-device solution is needed, and moving data between devices is both slow and costly in power. A single device that can efficiently implement the whole application therefore has a significant advantage in real-world AI inference deployments.
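A back-of-the-envelope model, using made-up stage timings rather than measured figures, shows why: every device-to-device hop adds transfer time (and spends power) that a single-device pipeline avoids.

```python
# Illustrative model with invented numbers: per-frame latency grows with
# every device-to-device hop, on top of the processing stages themselves.

STAGE_MS = {"pre": 4.0, "infer": 2.0, "post": 3.0}  # per-frame stage times (ms)
TRANSFER_MS = 2.5                                   # one device-to-device hop (ms)

def frame_latency_ms(stages, hops):
    return sum(stages.values()) + hops * TRANSFER_MS

print(f"single device: {frame_latency_ms(STAGE_MS, 0):.1f} ms/frame")
print(f"three devices: {frame_latency_ms(STAGE_MS, 2):.1f} ms/frame")  # pre -> infer -> post
```

Note that the raw TOPS of the inference device never appears in this arithmetic; the pipeline is only as good as its slowest link and the glue between links.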

Adapt and evolve with the latest AI models

The pace of innovation in AI is staggering. What’s considered the state of the art today could easily be rendered nearly obsolete six months from now. Applications that use older models risk being uncompetitive, so the ability to rapidly implement the latest models is critical.

So what technology allows AI models to be updated dynamically, while also providing the ability to build a custom data path that accelerates both AI and non-AI processing in a single device? The answer is adaptive computing platforms.

Adaptive Computing Platforms

Adaptive computing platforms are built on hardware that can be dynamically reconfigured after manufacturing. This includes longstanding technologies such as FPGAs, as well as more recent innovations such as Xilinx's AI Engine. A single-device platform such as Xilinx's Versal™ Adaptive Compute Acceleration Platform can accelerate both the AI and non-AI processing functions by allowing custom data paths to be built. Such platforms can also implement the latest AI models quickly and efficiently, because the hardware can be rapidly reconfigured. Adaptive computing devices provide the best of both worlds: the efficiency benefits of custom ASICs without the lengthy and expensive design cycles.

(Image: Xilinx Versal AI Core Series VC1902)

The best implementation of an AI application doesn't need to be the fastest; it needs to be the most efficient while remaining flexible. It must be right-sized, delivering the performance that's needed – nothing more, nothing less.

Summary

As AI inference becomes more pervasive, the challenge is not just how to deploy the AI model, but how to deploy the whole AI application most efficiently. When applications are replicated thousands or even millions of times, a small energy saving in each instance could add up to an entire power station's worth of energy. Multiplied across the myriad new AI applications under development, the effect will be dramatic. Efficient acceleration of whole AI applications should be a goal for everyone in the technology industry, and adaptive computing platforms provide a compelling solution.
