Dalal Abdullah AlWuhaib,
Senior Auditor
Pre-Audit and Technical Support Sector
State Audit Bureau of Kuwait
Government audit institutions around the world are at a genuine turning point. The rise of artificial intelligence is forcing these agencies to ask a question they have never had to ask before: how do you audit a government that is itself being run by algorithms?
Keywords: artificial intelligence, government auditing, public accountability, Supreme Audit Institutions, algorithmic governance, audit technology
1. Introduction
Auditing government spending has never been a glamorous job, but it is one of the more important ones in any functioning democracy. When done well, it keeps public money accountable, surfaces waste and fraud, and gives legislators the information they need to do their jobs. For most of the past century, the basic method has stayed the same: auditors review records, test samples, interview officials, and write reports. That model worked reasonably well when governments were smaller and simpler.
Today, governments process millions of transactions every day across sprawling digital systems. Benefits are disbursed automatically. Procurement happens online. Revenue is collected through algorithms. The sheer volume and speed of modern public administration have outrun the capacity of traditional audit methods. You simply cannot review a meaningful sample of 50 million transactions by hand.
This is where artificial intelligence enters the picture. AI tools — particularly those built around machine learning — can process enormous datasets, flag anomalies, and identify patterns that no human auditor could detect in a reasonable timeframe. The promise is real. But so are the complications, and the audit institutions that have moved furthest in this space have found that the technology is only a fraction of the challenge.
This paper examines the current state of AI in government auditing, grounded in what institutions have actually reported doing and finding. It does not speculate about hypothetical capabilities. Every claim here traces back to a documented, publicly available source.
The honest answer is that most government audit institutions are still in the early stages. The UK's National Audit Office (NAO) published a major review of AI use across UK government departments in March 2024 and found that while 70% of government bodies surveyed were piloting or planning AI use cases, actual deployment remained limited (National Audit Office, 2024). The report was candid about the gap between ambition and reality, noting that less than a quarter of government bodies had a formal AI strategy in place.
That finding is not unique to the UK. The United Nations Department of Economic and Social Affairs noted in early 2025 that Supreme Audit Institutions — the independent bodies that oversee public spending in most countries — are still largely dependent on the budgets allocated by the very governments they oversee, which creates a structural barrier to rapid technology investment (United Nations DESA, 2025). The cost of implementation and the need for specialized talent are consistently cited as the two biggest obstacles.
2.2 The GAO's Approach in the United States
The U.S. Government Accountability Office has been among the more methodical institutions in thinking through how AI fits into public sector accountability. In June 2021, the GAO published an AI Accountability Framework organized around four principles: governance, data, performance, and monitoring (GAO, 2021). Rather than deploying AI systems first and figuring out the rules later, the GAO's approach was to establish the accountability structures that should govern any AI system before those systems are built.
By early 2024, the GAO had begun deploying a large language model internally — primarily to help synthesize past reports, assist with editorial reviews, and scan congressional documents (GAO, 2024). The institution was transparent about the fact that developing these internal tools was also a way of gaining firsthand knowledge about AI's limitations, knowledge that would help the GAO evaluate other agencies' use of AI more effectively.
This approach — learning by doing, but doing it carefully — reflects a broader philosophy that appears throughout GAO's published work: accountability cannot be an afterthought. A December 2023 GAO report reviewing AI implementation at major federal agencies made 35 recommendations to 19 separate agencies, finding that most had not fully implemented required AI governance practices (GAO, 2023).
2.3 The European Dimension
In May 2024, the European Court of Auditors published a special report assessing the European Commission's contribution to the EU's AI ecosystem. The findings were mixed. The Court found that the Commission's coordination measures were not effectively implemented, EU AI investment was not keeping pace with global competitors, and the infrastructure support meant to help smaller businesses adopt AI had not yet produced meaningful results (European Court of Auditors, 2024).
The EU's approach to AI governance more broadly, anchored in the AI Act, which entered into force in August 2024, sets a precedent that audit institutions in member states will increasingly have to engage with. The Act follows a risk-based framework: higher-risk AI applications face stricter requirements around transparency, testing, and human oversight. For government auditors, this matters both because they use AI tools themselves and because they will increasingly be asked to audit whether the agencies they oversee are complying with these requirements.
3.1 The Population Problem
The most practical advantage AI offers auditors is scale. Traditional audit methodology relies on sampling — you cannot check every transaction, so you select a representative subset and extrapolate. This works reasonably well for detecting systematic problems, but it means that individual irregularities outside the sample go undetected. A fraudulent contract that does not happen to land in the sample is invisible.
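The sampling limitation described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the figures in the usage note are assumptions, not drawn from any audit institution's methodology. It computes the chance that a simple random sample, drawn without replacement, contains none of the irregular items in a population.

```python
def probability_sample_misses(population, irregular, sample_size):
    """Chance that a simple random sample (without replacement)
    contains none of the irregular items in the population."""
    if not 0 <= irregular <= population:
        raise ValueError("irregular must be between 0 and population")
    if not 0 <= sample_size <= population:
        raise ValueError("sample_size must be between 0 and population")
    probability = 1.0
    for drawn in range(sample_size):
        # Probability that the next draw is also a regular item.
        probability *= (population - irregular - drawn) / (population - drawn)
        if probability <= 0:
            return 0.0  # sample is large enough that a hit is certain
    return probability
```

As a hypothetical illustration, with 50 million transactions, 10 irregular items, and a sample of 5,000, `probability_sample_misses(50_000_000, 10, 5_000)` comes out at roughly 0.999: the sample would almost certainly contain none of them.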
Machine learning models can be trained to scan entire transaction populations. They do not replace the auditor's judgment about what matters, but they can narrow the field from millions of records to a few thousand flagged items worth a closer look. This is not a hypothetical benefit — it is the basic operating logic behind fraud detection systems that have been in use in financial services for decades and are now being adapted for public sector contexts.
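A minimal sketch of what "narrowing the field" can look like follows. It does not use a trained machine learning model. As a simpler stand-in, it applies a robust z-score based on the median absolute deviation to every amount in the population and returns the positions of the outliers. The function name and the 3.5 cut-off are assumptions chosen for illustration.

```python
from statistics import median

def flag_outliers(amounts, cutoff=3.5):
    """Return the indexes of amounts whose robust z-score exceeds cutoff.

    The robust z-score uses the median and the median absolute
    deviation (MAD), so a few extreme values do not distort the baseline.
    """
    centre = median(amounts)
    mad = median(abs(a - centre) for a in amounts)
    if mad == 0:
        return []  # no spread to measure against; nothing can be scored
    return [
        index
        for index, amount in enumerate(amounts)
        if 0.6745 * abs(amount - centre) / mad > cutoff
    ]
```

Every record is scored, but only the flagged positions go to a human reviewer, who still decides whether each item is actually a problem.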
3.2 Continuous Monitoring vs. Periodic Review
Traditional government auditing is inherently retrospective. An institution gets audited, problems are found, a report is issued, recommendations are made, and the agency has a year or two to respond. By the time corrective action happens, the original problem may have been running for years.
AI-enabled continuous monitoring changes that timeline. Systems can flag anomalies in near real-time — an unusual spike in procurement approvals, a pattern of payments that do not match contracts, a string of identical amounts just below approval thresholds. This does not eliminate the need for a formal audit, but it can trigger earlier attention to problems that would otherwise fester.
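One of the patterns mentioned above, repeated payments just below an approval threshold, is simple enough to sketch as a rule. The example is hypothetical: the function name, the 5% margin, and the minimum count of three are assumptions, and a real monitoring system would tune them to its own approval rules.

```python
from collections import defaultdict

def just_below_threshold(payments, threshold, margin=0.05, min_count=3):
    """Group payments that fall within `margin` below `threshold` by vendor.

    `payments` is an iterable of (vendor, amount) pairs. Only vendors
    with at least `min_count` such payments are returned.
    """
    lower_bound = threshold * (1 - margin)
    near_misses = defaultdict(list)
    for vendor, amount in payments:
        if lower_bound <= amount < threshold:
            near_misses[vendor].append(amount)
    return {
        vendor: amounts
        for vendor, amounts in near_misses.items()
        if len(amounts) >= min_count
    }
```

Run over each day's payments, a check like this would surface a vendor with several near-threshold invoices long before a periodic audit reaches the same records.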
3.3 Processing Unstructured Information
Much of what auditors need to review is not neatly formatted data. It is emails, contracts, meeting minutes, inspection reports, and policy documents. Natural language processing tools — a branch of AI that allows computers to read and analyze text — can help auditors work through large volumes of unstructured information more efficiently. The GAO has used these tools to help synthesize findings across previous reports, which is a practical example of the technology adding value without replacing human judgment (GAO, 2024).
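To give a sense of the simplest end of this spectrum, the sketch below ranks a set of documents by how often they mention the terms in a query. It is a plain keyword count, far simpler than the language models the GAO describes, and the function names and sample documents are assumptions for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def rank_documents(query, documents):
    """Return document names ordered by how often they mention query terms.

    `documents` maps a name to its text. Documents that mention none
    of the query terms are left out.
    """
    query_terms = set(tokenize(query))
    scored = []
    for name, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically by name.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

For example, given meeting minutes, a contract, and an unrelated memo, a query for "procurement" would return the contract and the minutes and drop the memo.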
4.1 The Skills Gap Is Not Small
The NAO's 2024 review of AI in the UK government found a consistent pattern: enthusiasm for AI adoption running well ahead of the workforce capacity to implement it responsibly (National Audit Office, 2024). The report recommended upskilling AI capabilities across all government departments as a priority. Parliament's Public Accounts Committee echoed this, noting that the government had already been warned about digital skills shortages in a 2023 report and that staffing cuts were making the problem worse, not better.
For audit institutions specifically, this is a structural issue. Auditors are trained in accounting, law, and professional scepticism. They are not generally trained in data science, statistical modelling, or software engineering. Building the kind of multidisciplinary teams that effective AI deployment requires is not something that happens quickly. It means either training existing staff in technical skills that are genuinely hard to acquire, or recruiting specialists who typically have more lucrative options outside government.
4.2 Data Quality Is a Prerequisite, Not a Given
Machine learning only works as well as the data it runs on. Garbage in, garbage out — the cliché exists because it is accurate. Government data, in most countries, is fragmented across legacy systems that were built at different times, to different standards, with no particular thought given to interoperability. The NAO has consistently flagged poor data infrastructure as one of the central barriers to effective AI deployment across the UK government (National Audit Office, 2024).
This is not a problem that AI can solve. It requires investment in underlying data systems, clear standards for data collection and storage, and governance frameworks that ensure quality is maintained over time. These are expensive, unglamorous, long-term projects. They are also essential.
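Quality controls of the kind described above often start with very basic checks. The following sketch is hypothetical: the record layout, the `id` field, and the function name are assumptions. It reports which records repeat an identifier and which are missing required fields.

```python
def quality_report(records, required_fields):
    """Report basic data-quality problems in a list of record dicts.

    Returns the positions of records that repeat an earlier `id`
    and of records with a required field missing or empty.
    """
    seen_ids = set()
    duplicates = []
    incomplete = []
    for position, record in enumerate(records):
        record_id = record.get("id")
        if record_id is not None:
            if record_id in seen_ids:
                duplicates.append(position)
            seen_ids.add(record_id)
        if any(record.get(field) in (None, "") for field in required_fields):
            incomplete.append(position)
    return {"duplicates": duplicates, "incomplete": incomplete}
```

Checks like these do not fix fragmented systems, but they show how much of a dataset is usable before any model is trained on it.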
4.3 Transparency and the Black Box Problem
A government auditor's findings have to be defensible. When an audit report concludes that an agency mismanaged funds, that conclusion has to rest on evidence that can be explained, challenged, and tested. This is where some AI approaches run into serious trouble.
Many of the most powerful machine learning models — particularly deep neural networks — work in ways that are not easily explainable even to their designers. The model flags a transaction as suspicious, but articulating precisely why in terms that would hold up in a hearing or a court is a genuine challenge. For government auditing, where public accountability is the entire point, deploying AI tools that cannot explain their own outputs is a significant problem.
This is why the EU's AI Act, with its emphasis on transparency and explainability requirements for high-risk AI applications, is directly relevant to the audit world. It is also why institutions like the GAO have been careful to frame AI as a tool that assists human judgment rather than replaces it.
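At the explainable end of the spectrum, every flag can carry its own reason. The sketch below is a hypothetical rule-based check, not a method taken from the GAO or from the AI Act. The field names (`amount`, `approved_by`, `contract_value`, `vendor_registered`) are assumptions made for illustration.

```python
def explain_flags(payment, approval_limit):
    """Return a plain-language reason for each rule the payment breaks.

    An empty list means no rule was triggered.
    """
    reasons = []
    if payment["amount"] >= approval_limit and not payment.get("approved_by"):
        reasons.append("amount at or above approval limit with no recorded approver")
    contract_value = payment.get("contract_value")
    if contract_value is not None and payment["amount"] > contract_value:
        reasons.append("payment exceeds contract value")
    if payment.get("vendor_registered") is False:
        reasons.append("vendor not found in supplier register")
    return reasons
```

Rules like these will miss patterns a neural network might catch, but each flag can be read out, challenged, and tested, which is the property an audit finding needs.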
4.4 Auditing AI Systems Is Its Own Challenge
Here is the wrinkle that does not get enough attention: government agencies are deploying AI to make consequential decisions about people — who gets benefits, who gets flagged for tax review, who qualifies for housing assistance. Those systems need to be audited. But auditing an AI system requires a fundamentally different skill set than auditing a procurement process.
The European Court of Auditors' 2024 special report on the EU's AI strategy found that governance frameworks were incomplete, implementation was behind schedule, and there was a lack of effective coordination between national and EU-level oversight (European Court of Auditors, 2024). The institutional infrastructure for auditing AI systems is still being built, at the same time those systems are already making decisions.
The INTOSAI community, the international network of Supreme Audit Institutions, has recognized this dual challenge explicitly. The Moscow Declaration, adopted at INTOSAI's XXIII congress in Moscow in 2019, called on audit institutions to equip their auditors with data analytics and AI capabilities while also developing the capacity to audit AI systems used by government agencies (INTOSAI Journal, 2021). Calling for something and actually achieving it are, of course, different things.
5. What Responsible Adoption Looks Like
Based on what institutions that have moved carefully in this space have learned, a few principles seem to matter consistently.
The first is that governance has to come before deployment. The GAO's AI Accountability Framework was built on the premise that you cannot responsibly adopt AI without first establishing who is accountable for the system, how its performance will be measured, and what happens when it goes wrong (GAO, 2021). Institutions that skip this step tend to end up with AI tools that produce outputs nobody fully trusts and no clear process for addressing problems.
The second is that data infrastructure investment is not optional. AI tools built on fragmented, low-quality data will not deliver reliable results. Institutions that have invested seriously in data governance — cleaning up legacy systems, standardizing formats, establishing quality controls — are consistently better positioned to use AI effectively.
The third is that transparency has to be a non-negotiable design requirement, not an add-on. For government audit work, being able to explain how a finding was reached is fundamental to the institution's credibility. This points toward AI tools that are designed with explainability in mind from the start, even if that means accepting some loss of predictive power.
AI is not going to solve government auditing's fundamental challenges, but it can meaningfully change what is possible. The ability to analyze complete datasets rather than samples, to flag anomalies in real time rather than after the fact, and to process vast amounts of unstructured information changes the scale at which oversight can operate. That matters.
But the institutions that have moved most thoughtfully in this space — the GAO, the NAO, the European Court of Auditors — have all been clear-eyed about the limits. The technology is only as good as the data beneath it. It has to be explainable to be usable in accountability contexts. The people deploying it have to actually understand it. And as governments deploy AI to run public services, audit institutions face the added challenge of developing the capacity to audit those systems too, at the same time they are learning to use AI themselves.
None of this is impossible. But it requires treating AI adoption as a serious institutional undertaking — not a technology project that can be handed off to an IT department, but a transformation that touches governance, workforce, data management, and professional standards all at once. The institutions that understand that are the ones making real progress.