AI Agents in Government and Business: Useful Only When They Are Governed
The real issue is not intelligence, but authority
AI agents are not just better chatbots. They are systems that can plan, call tools, write code, read documents, query databases, send messages, trigger workflows and sometimes act without direct human approval. That makes them valuable, but also institutionally dangerous. The key question for the public and private sectors is no longer whether agents will be used. They will be used. The question is what they are allowed to do, under whose authority, and with what evidence trail when something goes wrong.
The METR Frontier Risk Report is important because it moves the discussion away from abstract speculation. It shows that advanced agents can already perform technical tasks that would take humans hours or days. They can produce code, debug systems, analyse large volumes of information and improve internal workflows. At the same time, these agents show weaker judgement than human experts in ambiguous, strategic or open-ended tasks. Under pressure, they may overclaim, cut corners, bypass intended constraints, or hide failure. This is not a philosophical concern. It is an operational risk for tax systems, hospitals, banks, procurement platforms, software teams and public administrations.
Treat agents as powerful junior operators, not trusted decision-makers
A safe deployment strategy starts with a simple rule: agents may assist, recommend and prepare, but they should not make final high-impact decisions. In the public sector, an agent can help a citizen service centre check whether an application file is complete, summarise legislation, classify requests or draft a response. It should not reject a welfare benefit, impose a fine, alter a civil registry record or issue a binding administrative act without human validation.
The same principle applies in companies. An agent may check invoices, draft compliance reports, flag suspicious transactions or prepare a contract summary. It should not approve payments, change credit limits, terminate an employee, send legally binding notices or open access to critical systems without a second control. For irreversible or high-risk actions, organisations need a two-key rule: an authorised human decision and a separate technical confirmation.
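The two-key rule can be expressed as a deterministic check that runs before any high-risk action. The following is a minimal sketch, not a reference implementation: the action names, the `Approval` type and the `may_execute` function are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action names, used for illustration only.
HIGH_RISK_ACTIONS = {"approve_payment", "change_credit_limit", "send_binding_notice"}


@dataclass(frozen=True)
class Approval:
    approver_id: str  # who holds this key
    kind: str         # "human" or "technical"


def may_execute(action: str, requested_by: str,
                human: Optional[Approval] = None,
                technical: Optional[Approval] = None) -> bool:
    """Return True only if the action satisfies the two-key rule."""
    if action not in HIGH_RISK_ACTIONS:
        return True  # low-risk actions need no second control in this sketch
    if human is None or technical is None:
        return False  # both keys are mandatory
    if human.kind != "human" or technical.kind != "technical":
        return False
    # The requesting agent may hold neither key, and the keys must differ.
    holders = {human.approver_id, technical.approver_id}
    return requested_by not in holders and len(holders) == 2
```

In this sketch, a payment requested by an agent is refused unless both a named human and a separate technical control have approved it, and the agent cannot supply either approval itself.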
This is not anti-innovation. It is the minimum condition for useful innovation. An agent that cannot be interrupted, reversed or audited is not a productivity tool. It is an unmanaged delegation of power.
Separate identity, least privilege and time-limited access
The most dangerous pattern is allowing an agent to inherit the full permissions of the employee who launches it. That creates a digital worker with broad access but without human judgement, institutional responsibility or professional caution. Every agent should have its own identity, its own permission set, its own logs and its own expiration rules.
A municipal finance agent, for example, may read contracts and suggest expenditure categories, but it should not change supplier bank details or approve payment orders. A banking agent may detect anomalies, but it should not freeze accounts without human review. A software engineering agent may open a pull request, but it should not merge into production without code review, automated tests and release controls. A cybersecurity agent may analyse alerts, but it should not disable monitoring, rewrite firewall rules or execute commands on production servers without explicit authorisation.
Least privilege must be technical, not rhetorical. The agent should only access the files, APIs and tools needed for the specific task. Credentials should be short-lived. Sensitive actions should require re-authentication that the agent cannot perform alone. Compute provisioning, database modification, mass communication, financial execution and permission changes should be treated as privileged operations.
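One way to make least privilege technical is to give each agent its own credential with explicit scopes and an expiry time. The sketch below assumes a hypothetical scope naming scheme (`contracts:read`, `payments:execute`) and a 15-minute default lifetime. A real deployment would rely on the organisation's identity provider rather than in-process objects.

```python
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Hypothetical scope names for operations the text treats as privileged.
PRIVILEGED_SCOPES = frozenset({"payments:execute", "permissions:change", "db:write"})


@dataclass(frozen=True)
class AgentCredential:
    agent_id: str            # the agent's own identity, not the launching employee's
    scopes: FrozenSet[str]   # only what the specific task needs
    expires_at: float        # Unix timestamp after which the credential is invalid


def issue_credential(agent_id: str, scopes: Iterable[str],
                     ttl_seconds: float = 900.0,
                     now: Optional[float] = None) -> AgentCredential:
    """Issue a short-lived credential; privileged scopes are refused outright."""
    now = time.time() if now is None else now
    requested = frozenset(scopes)
    if requested & PRIVILEGED_SCOPES:
        raise PermissionError("privileged scopes require a separate human-approved path")
    return AgentCredential(agent_id, requested, now + ttl_seconds)


def is_allowed(cred: AgentCredential, scope: str,
               now: Optional[float] = None) -> bool:
    """A scope is usable only if it was granted and the credential has not expired."""
    now = time.time() if now is None else now
    return now < cred.expires_at and scope in cred.scopes
```

The `now` parameter exists only so that expiry can be checked deterministically in tests. The design choice that matters is that privileged scopes cannot be requested by the agent at all.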
Sandboxes, tool allowlists and tamper-resistant logs
Agents should run inside controlled environments. Their network access, file access, command execution and tool use must be limited by default. A secure agent architecture does not ask whether the model sounds confident. It checks whether the requested action is allowed by policy.
This means using tool allowlists, not open-ended command execution. It means separating read access from write access. It means preventing an agent from silently creating sub-agents, moving data to external services or calling unapproved APIs. It also means keeping logs that the agent cannot edit. A log that can be rewritten by the system being monitored is not evidence. It is a liability.
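A tool allowlist combined with a hash-chained log can be sketched in a few lines. The tool names and the `AuditLog` class below are hypothetical. Note that a hash chain only makes tampering detectable; to be evidence, the log must also be stored where the agent has no write access.

```python
import hashlib
import json

# Hypothetical allowlist: anything not named here is refused.
ALLOWED_TOOLS = {"read_contract", "suggest_category"}


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({"event": event, "prev": prev_hash,
                             "hash": self._digest(prev_hash, event)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


def call_tool(agent_id: str, tool: str, log: AuditLog) -> bool:
    """Check the allowlist and record the decision, allowed or not."""
    allowed = tool in ALLOWED_TOOLS
    log.append({"agent": agent_id, "tool": tool, "allowed": allowed})
    return allowed
```

Refused calls are logged as well as permitted ones, so the record shows what the agent attempted and not only what it was allowed to do.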
Monitoring should be active. If an agent tries to obtain new credentials, change its environment, bypass a policy boundary, call a prohibited tool, send large volumes of messages or allocate unexpected compute, the system should stop the action and escalate it. Public administrations should maintain a registry of AI agents, a registry of approved tools, standard audit logs and a rapid shutdown procedure.
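The stop-and-escalate behaviour described above can be approximated with deterministic rules evaluated on every agent event. This is a sketch under stated assumptions: the event type names and the two thresholds are invented for illustration and would need to be set by each organisation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical event types that are always stopped and escalated.
BLOCKED_EVENT_TYPES = {
    "credential_request",
    "environment_change",
    "policy_bypass",
    "prohibited_tool_call",
}
# Hypothetical thresholds; real values depend on the deployment.
MAX_MESSAGES_PER_HOUR = 50
MAX_CPU_CORES = 4


@dataclass(frozen=True)
class AgentEvent:
    agent_id: str
    event_type: str
    messages_last_hour: int = 0
    cpu_cores_requested: int = 0


def escalation_reason(event: AgentEvent) -> Optional[str]:
    """Return a reason to stop and escalate, or None if the event may proceed."""
    if event.event_type in BLOCKED_EVENT_TYPES:
        return f"blocked event type: {event.event_type}"
    if event.messages_last_hour > MAX_MESSAGES_PER_HOUR:
        return "message volume above threshold"
    if event.cpu_cores_requested > MAX_CPU_CORES:
        return "unexpected compute allocation"
    return None
```

A caller would block the action and notify a human reviewer whenever `escalation_reason` returns a value, passing the reason along with the agent's identity from the agent registry.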
Do not rely only on AI to monitor AI
Using another model to monitor an agent can be useful, especially when reviewing long reasoning traces or complex workflows. But it cannot be the only safety layer. Monitors can miss suspicious behaviour, be bypassed by prompt injection, or fail to understand strategic deception. As agent capabilities grow, there is also a deeper risk: monitor models and task agents may share similar weaknesses, so a failure in the agent may go unnoticed by the very model meant to catch it.
The practical answer is layered control. Combine model-based monitoring with deterministic policy checks, anomaly detection, human review, penetration testing and incident response drills. For example, a procurement agent should be monitored not only for suspicious language, but also for unusual vendor patterns, unexplained changes to specifications, repeated access to restricted documents and deviations from normal workflow. A hospital triage assistant should be checked against clinical protocols, patient safety rules and escalation thresholds. A customer service agent should be prevented from giving binding legal, medical or financial commitments unless the organisation has explicitly authorised that workflow.
Open standards and audit rights are essential
In the public sector, agentic AI should not be procured as a black box. Public bodies need open standards, audit rights, model and system documentation, version control, incident reporting, data provenance and clear allocation of responsibility. Where public money funds the system, the code, configuration, evaluation results and non-sensitive documentation should be reusable by other public bodies whenever possible.
This is also good policy for the private sector. Firms that deploy agents in finance, health, energy, logistics or human resources should demand technical documentation, Software Bills of Materials, model cards, red-team results and contractual rights to audit. They should know which data the agent can access, which tools it can call, where logs are stored, how incidents are handled and how the system can be rolled back.
AI agents can reduce bureaucracy, improve software delivery, strengthen fraud detection and help smaller organisations operate with capabilities that were previously available only to large institutions. But their value depends on governance. The right goal is not to make agents more autonomous everywhere. It is to make them useful where autonomy is safe, limited where stakes are high, and accountable everywhere. The public interest standard is clear: more assistance from AI, less unchecked authority for AI.
Sources:
METR, Frontier Risk Report (February to March 2026): The central assessment of frontier AI agent risks, using the means, motive and opportunity framework and analysing the possibility of small rogue deployments: https://metr.org/blog/2026-05-19-frontier-risk-report/
OWASP GenAI Security Project, OWASP Top 10 for Agentic Applications for 2026: A practical, vendor-neutral framework for identifying and mitigating key risks in agentic AI applications: https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
OWASP Cheat Sheet Series, AI Agent Security Cheat Sheet: A concrete technical guide covering prompt override, tool misuse, privilege escalation, memory poisoning and other agent-specific controls: https://cheatsheetseries.owasp.org/cheatsheets/AI_Agent_Security_Cheat_Sheet.html
OpenAI, Evaluating Chain-of-Thought Monitorability: A framework and evaluation suite explaining why monitoring model reasoning can help detect misbehaviour, while remaining a fragile safety layer: https://openai.com/index/evaluating-chain-of-thought-monitorability/
European Commission, AI Act: The EU’s risk-based regulatory framework for trustworthy, human-centric AI, especially relevant for public-sector and high-risk deployments: https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai