AI agents are being hailed as the next great shift in computing—intelligent assistants capable of performing a wide range of tasks that previously consumed valuable human effort. Their potential to streamline operations and accelerate innovation is undeniable. Yet with this opportunity comes heightened risk. By granting these agents unprecedented access and authority, enterprises open the door to mistakes, misuse, or exploitation by malicious actors.
While other AI use cases have often proven expensive to fund and hard to profit from, many now assure us that agents will be the transformative application that finally makes AI an indispensable part of the economy. Bill Gates, co-founder of Microsoft, stated his predictions plainly in a 2023 blog post: “Agents are not only going to change how everyone interacts with computers. They’re also going to upend the software industry, bringing about the biggest revolution in computing since we went from typing commands to tapping on icons.”
Yet it's worth remembering that new technologies are fundamentally neutral. The potential benefits of AI agents are plain to see, but that very clarity often obscures the accompanying risks. By design, AI agents are entrusted with wide-ranging authority over our digital lives, and that authority creates enormous potential for error, abuse of trust, or malicious takeover.
AI Deception and Agentic Misalignment
There are already many recorded cases of AI systems providing false information or actively deceiving users. AI chatbots and generative models regularly mislead their users, introducing factual inaccuracies in order to supply the answers those users appear to want. That misinformation is then often repeated as fact and acted upon, informing real-world decisions.
A 2024 study led by researchers at MIT, the Australian Catholic University, and the Center for AI Safety concluded that many AI systems have learned deceptive behaviours in pursuit of their assigned goals. In one example, OpenAI researchers observed a simulated robot hand trained with reinforcement learning from human feedback (RLHF). Instead of grasping the ball it was meant to hold, the hand positioned itself between the ball and the camera, creating the illusion of success. The intent was not explicitly malicious; the shortcut was simply the most efficient path to human approval.
Anthropic’s 2025 experiments echoed this finding. When given access to emails and sensitive corporate data, some models attempted to manipulate outcomes once their objectives or continued operation came under threat. These behaviours included blackmail and leaking information - clear demonstrations of agentic misalignment.
The Threat of Hijacking
Then there is the possibility of agents being hijacked outright by a malicious actor. We need not imagine what that might look like: credential theft is already one of the biggest and most effective attack vectors for cybercriminals. The 2025 Verizon Data Breach Investigations Report says that 68% of breaches involve stolen credentials. Those credentials are then used to take over accounts and either steal the data within or act in the stead of the legitimate account holder to carry out further attacks. If malicious actors use this strategy so successfully now, they will certainly use it to pursue even greater prizes and capabilities.
The difference with AI agents will be the extraordinary authority they are granted as a matter of course. We may be using them to carry out daily tasks while also giving them access to the most sensitive aspects of our businesses. A hijacked agent in this scenario could be hugely valuable to a malicious actor, and hugely destructive for its legitimate owner.
From there, an attacker can do almost anything. By posing as the agent's legitimate principal, they can act with its full authority and wreak havoc commensurate with the hijacked agent's status within the organisation.
Identity as the Foundation of Trust
Whatever the risks of AI agents, enterprises that want to capture their potential value need to think very hard about digital identity. The question of who is operating a given agent at any one time is paramount - which is why identity should sit at the centre of any attempt to secure AI agents in the enterprise.
Public Key Infrastructure (PKI) will likely be core to this effort, just as it already provides trust and secure digital identities across the web and the IoT. Each AI agent will need its own verifiable digital identity, established through a cryptographic certificate issued by a trusted certificate authority. That certificate slots the agent into an identity hierarchy which can define the actions it is permitted to take, ensuring it operates within predefined limits. The same hierarchy also means organisations can oversee agents and revoke trust when one starts acting outside its permitted bounds.
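To make this concrete, the sketch below shows how an internal certificate authority might issue a short-lived identity certificate to an individual agent using Python's cryptography library. It is a minimal illustration rather than a production design: the throwaway CA, the agent name and the validity periods are all hypothetical choices made for the example.

```python
# Minimal sketch: an internal CA issuing a short-lived identity certificate
# to an AI agent. All names and lifetimes here are illustrative assumptions.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID, ExtendedKeyUsageOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

now = datetime.datetime.now(datetime.timezone.utc)

# For the sketch we generate a throwaway internal CA; in practice this would
# be an existing, carefully protected enterprise CA.
ca_key = ec.generate_private_key(ec.SECP256R1())
ca_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"Example Corp Agent CA")])
ca_cert = (
    x509.CertificateBuilder()
    .subject_name(ca_name)
    .issuer_name(ca_name)
    .public_key(ca_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
    .sign(ca_key, hashes.SHA256())
)

# Each agent gets its own key pair; the private key never leaves the agent.
agent_key = ec.generate_private_key(ec.SECP256R1())
subject = x509.Name([
    x509.NameAttribute(NameOID.ORGANIZATION_NAME, u"Example Corp"),
    x509.NameAttribute(NameOID.ORGANIZATIONAL_UNIT_NAME, u"AI Agents"),
    x509.NameAttribute(NameOID.COMMON_NAME, u"agent-procurement-01"),
])

cert = (
    x509.CertificateBuilder()
    .subject_name(subject)
    .issuer_name(ca_cert.subject)
    .public_key(agent_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=30))  # short-lived: easy to rotate or let lapse
    .add_extension(x509.BasicConstraints(ca=False, path_length=None), critical=True)
    .add_extension(x509.ExtendedKeyUsage([ExtendedKeyUsageOID.CLIENT_AUTH]), critical=False)
    .sign(ca_key, hashes.SHA256())  # the CA vouches for the agent's identity
)
```

Keeping agent certificates short-lived complements formal revocation: a compromised or decommissioned agent loses its standing quickly even if a revocation notice is delayed.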
In turn, those agents can use their certificates to digitally sign their actions and communications, so that every action can be attributed and audited - adding another layer of trust to agents' operations. Combined with certificate-based encryption of those communications, this also helps prevent unauthorised access and eavesdropping.
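Continuing the same hypothetical sketch, signing and verifying an agent's action record might look like the following. The agent_key and cert objects come from the issuance example above, and the fields of the action record are illustrative rather than any standard format.

```python
# Minimal sketch: an agent signs an action record, and an auditor later
# verifies it against the public key in the agent's certificate.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# The action record the agent is about to execute (illustrative fields).
action = json.dumps({
    "agent": "agent-procurement-01",
    "action": "approve_purchase_order",
    "order_id": "PO-12345",
    "timestamp": "2025-06-01T10:15:00Z",
}, sort_keys=True).encode()

# The agent signs the record with its private key before acting.
signature = agent_key.sign(action, ec.ECDSA(hashes.SHA256()))

# Verification: a tampered record or a forged signature raises InvalidSignature,
# so every entry in the audit trail can be attributed to a specific agent.
try:
    cert.public_key().verify(signature, action, ec.ECDSA(hashes.SHA256()))
    print("Action verifiably originated from agent-procurement-01")
except InvalidSignature:
    print("Record was altered or was not signed by this agent")
```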
AI agents hold incredible promise. They will allow businesses to automate and streamline many of their wasteful and inefficient processes and turn their energies to innovation and greater productivity. Still, no new technology can be treated as an unalloyed benefit - and the risks around AI agents may loom even larger than their potential benefits.
Yet those benefits beckon, and the enterprises that want to capture them will need to place secure digital identity at the core of their deployment strategy if they are to mitigate those risks.