Containing cloud costs while accelerating AI deployments

By Rupert Colbourne, Chief Technology Officer, Orbus Software.


Organisations everywhere are exploring the benefits of AI. As a notoriously data-hungry technology, however, it demands greater cloud storage and compute capacity, and in the rush to deploy it, many companies have seen cloud costs rise. This can present a significant risk to already stretched IT budgets. It’s vital, then, that organisations take steps to contain such costs when using the cloud to accelerate their AI deployments.

The need for cloud

According to IBM’s Global AI Adoption Index 2023, 42 per cent of enterprises have actively deployed AI in their business, with a similar number (40 per cent) currently exploring or experimenting with the technology. And the pace of adoption is increasing: three in five (59 per cent) of the companies that have either deployed or are exploring AI say they have accelerated their rollout of, or investment in, the technology.

This is hardly surprising. From predicting market trends to enhancing decision-making processes and optimising operations, the application of AI offers a range of potentially transformative capabilities. Embedding AI into the very fabric of its business operations allows an enterprise to not only anticipate changes in the market but also adapt and react to them with unprecedented speed and efficiency. And its ability to analyse vast amounts of data at breakneck speeds enables organisations to make informed decisions in real time, ensuring they remain agile in a constantly shifting business landscape.

AI, however, is only as good as the data used to train and inform it. Without the right data, an organisation’s AI initiatives are unlikely to deliver the competitive advantage it seeks. The long-running struggle to derive value from exponential growth in data is only amplified in the context of AI. Accenture has found that just 32 per cent of companies are able to realise tangible, measurable value from their data, and even fewer (27 per cent) say their data and analytics projects produce highly actionable insights and recommendations. That leaves the majority of companies struggling to harness the full potential of their data.

To ensure they have sufficient storage and compute power for the volume of data required, organisations are turning to the cloud. Indeed, according to Gartner, 50 per cent of CIOs and technology executives increased their investment in cloud platforms in 2023.

There is often an expectation among businesses that moving applications to the cloud will resolve all their issues. The reality, though, is that every decision concerning the cloud has cost consequences that must be managed if AI is to be deployed successfully and cost-effectively.

Proactive cost management

Building cloud-native applications, and creating the architecture needed to accommodate and manage the processing power, memory, and storage necessary for training and running AI models can drive up cloud costs. Without careful management, these can quickly spiral out of control, and lower the return on investment of implementing AI.

The issue, according to Gartner, is that what are commonly considered the most beneficial characteristics of the cloud – shared and dynamic infrastructure, on-demand as a service, elasticity and scalability, consumption metering, and cross-network availability – are not actually infrastructure features. They only emerge from architecture built on cloud-native principles. And every architecture has cloud cost consequences: it needs to be built, and that will cost money over time.

Therefore, when building the cloud-native architecture required for data management and AI deployment – as with any digital transformation exercise – proactive cost management and architecture strategies are essential. Every decision must be weighed against its impact on the organisation’s overall IT architecture, its management practices and, of course, its costs. One example of an increasingly common management practice is data meshing.

Tactics and practices

There are a number of tactics and practices an organisation can employ as part of its strategy to bring cloud costs under control. Efficient data storage practices, for instance, like eliminating silos and redundant storage, can optimise existing cloud usage to open up resources for AI implementation. 

According to Sysdig, organisations of all sizes could be overspending by up to 40 per cent due to misallocated storage. Its recent analysis found that 59 per cent of cloud containers have no CPU limits and 69 per cent of requested CPU resources were underused. Developers need to be aware of where their cloud resources are over- or under-allocated, otherwise cloud costs will remain unpredictable.
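As a rough illustration of that kind of audit (this is a sketch, not Sysdig’s tooling; the container names, core counts and 20 per cent tolerance are all invented for the example), a check of requested versus observed CPU per container might look like:

```python
# Illustrative sketch: flag containers whose requested CPU diverges
# from observed usage. All figures below are hypothetical.

def classify_allocation(requested_cores, used_cores, slack=0.2):
    """Return an allocation verdict for one container's CPU request."""
    if requested_cores == 0:
        return "no-limit"          # mirrors containers with no CPU limits set
    ratio = used_cores / requested_cores
    if ratio < 1 - slack:
        return "over-allocated"    # paying for cores that sit idle
    if ratio > 1 + slack:
        return "under-allocated"   # risk of throttling at peak
    return "ok"

containers = {
    "checkout-api":  (4.0, 0.9),   # (requested cores, observed peak cores)
    "batch-trainer": (8.0, 7.9),
    "log-shipper":   (0.0, 0.3),
}

for name, (req, used) in containers.items():
    print(name, classify_allocation(req, used))
```

Surfacing these three buckets per container is exactly the visibility the paragraph above calls for: developers can then shrink over-allocated requests and raise under-allocated ones before costs drift.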

The use of rightsizing and autoscaling can reduce the consumption of cloud computing and storage. These techniques are based on analysing where provisioned cloud compute exceeds the peak requirements of a given application, where storage is over-provisioned or where unnecessary data volumes are being retained.
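A minimal rightsizing sketch, assuming hourly usage samples and a made-up 25 per cent headroom policy (the sample values and core counts are invented), shows the underlying analysis:

```python
# Recommend a smaller allocation when provisioned compute exceeds
# observed peak demand plus a safety margin. Figures are illustrative.

def rightsize(provisioned_cores, usage_samples, headroom=0.25):
    """Suggest a core count covering peak usage plus headroom."""
    peak = max(usage_samples)
    needed = peak * (1 + headroom)
    if needed < provisioned_cores:
        return max(1, round(needed))   # shrink, but never below one core
    return provisioned_cores           # already tight; autoscaling covers spikes

hourly_cores_used = [2.1, 1.8, 3.0, 2.4, 1.2, 2.9]  # one day's samples
print(rightsize(16, hourly_cores_used))  # → 4: 16 cores provisioned for a ~3-core peak
```

The same peak-versus-provisioned comparison applies to storage: volumes sized far above their high-water mark, or retaining unnecessary data, are candidates for the same treatment.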

In addition, a strict IT portfolio management approach based on asset lifecycle management practices will enable the retirement of redundant or rarely-used cloud-hosted applications. Organisations can look at where cloud services can be replaced by lower cost equivalents, or whether the same service could be provided by an alternative – and less expensive – cloud provider.
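The lifecycle-management side of this can be reduced to a very simple rule, sketched below with invented application names, dates and a hypothetical 180-day retirement threshold:

```python
# Illustrative lifecycle check: flag cloud-hosted apps whose last use
# is older than a retirement threshold. All data here is invented.

from datetime import date

RETIRE_AFTER_DAYS = 180

apps = {
    "legacy-reporting": date(2023, 11, 1),  # app -> last recorded use
    "ai-feature-store": date(2024, 6, 1),
}

today = date(2024, 6, 30)
to_retire = [name for name, last_used in apps.items()
             if (today - last_used).days > RETIRE_AFTER_DAYS]
print(to_retire)  # → ['legacy-reporting']
```

In practice the last-used dates would come from access logs or the cloud provider’s billing and usage APIs rather than a hand-written dictionary, but the retire-or-replace decision rests on the same comparison.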

Data meshing – an attractive option

To get a handle on cloud costs, and to accelerate the tactics for doing so, organisations can look to put a data mesh in place: an architectural framework that addresses advanced data security challenges through distributed, decentralised ownership.

A data mesh architecture effectively unites disparate data sources from different lines of business, linking them through centrally managed data-sharing and governance guidelines for the purpose of analytics.

Working in a data mesh architecture, business functions can better maintain control over their storage costs and over how shared data is accessed, by whom, and in what formats. A data mesh adds complexity to the architecture, but it also brings efficiency by improving data access, security and scalability.

For a data mesh implementation to be successful, every domain team needs to apply product thinking to the datasets they provide. They must consider their data assets as their products and dimension the compute and storage requirements locally. In short, making data meshing a success requires strong levels of internal buy-in and prioritisation. But there are serious gains for the taking for organisations that pull it off.
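That product thinking can be sketched in miniature. Assuming a hypothetical catalogue where each domain registers a data product and declares who may read it and in which formats, while a thin central layer enforces only those declarations (every name below is invented), the split of responsibilities looks like:

```python
# Hedged data-mesh sketch: domains own their data products; a central
# catalogue handles registration and access checks only. Storage and
# compute stay with the owning domain. All names are hypothetical.

from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str
    owning_domain: str
    formats: set = field(default_factory=set)   # e.g. {"parquet", "csv"}
    readers: set = field(default_factory=set)   # teams granted access

class MeshCatalog:
    """Centrally managed governance layer: no data passes through it."""
    def __init__(self):
        self._products = {}

    def register(self, product):
        self._products[product.name] = product

    def can_read(self, team, name, fmt):
        p = self._products.get(name)
        return bool(p) and team in p.readers and fmt in p.formats

catalog = MeshCatalog()
catalog.register(DataProduct("orders", "sales", {"parquet"}, {"analytics"}))
print(catalog.can_read("analytics", "orders", "parquet"))  # → True
print(catalog.can_read("marketing", "orders", "parquet"))  # → False
```

Because the owning domain dimensions its own compute and storage, cost accountability sits with the team that can actually act on it, which is the efficiency the approach promises.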

Greater visibility

Whichever techniques or practices it employs to contain its cloud costs, an organisation will benefit from a digital blueprint of its entire IT infrastructure. With greater visibility of cloud use across the whole organisation, it can see where resources are under- or over-utilised, which are redundant, and where more efficient replacements can be made. It will then be better placed to ensure its cloud resources support the implementation of AI technologies without drastically increasing costs.

Shifting to the cloud is a logical move for organisations looking to capitalise on the many advantages offered by deploying AI across their business, but it can come at a cost. A strategic approach to managing cloud resources can help companies realise the benefits of their AI deployments while making the management of cloud costs as pain-free as possible.
