
Expense management with AI agents

How AI agents turn receipts, policy checks, categorization, and approvals into one quiet background process, and what expense automation actually looks like in practice.

By Andrew Pagulayan · Published May 17, 2026

Expenses are the part of finance everyone agrees to hate. The traveler hates digging a crumpled receipt out of a coat pocket three weeks after a trip. The manager hates the Friday queue of approvals that all need a decision and arrive with no context. The finance team hates chasing missing documentation, re-keying line items into the ledger, and explaining to an auditor why one dinner got coded to marketing and an identical one got coded to travel. Nobody enjoys any of it, and yet most companies still run the whole thing on a mix of spreadsheets, email threads, and a tool that was bought to fix the problem but mostly just moved it into a different screen.

The reason expenses stay painful is that the work is small but constant. A single report is trivial. A thousand reports a month, each with its own receipts, its own policy edges, and its own approver, is a grind that scales linearly with headcount. This is exactly the shape of problem that AI agents are good at. Not the flashy generative kind that writes you a poem, but the patient kind that reads a receipt, checks it against a rule, files it in the right account, and routes it to the right person without being asked twice. That is what expense automation means in 2026, and it is a lot closer to boring plumbing than to science fiction.

This post walks through the four jobs an expense agent actually does (receipts, policy checks, categorization, and approvals) with concrete detail on each. The goal is not to sell you on the idea that AI is magic. It is to show you where the labor really goes, which parts a machine handles well, which parts still need a human, and how to wire the whole thing together so the result is faster and more trustworthy than what you have now.

What expense automation actually means

Strip away the marketing and expense automation is a pipeline. Something enters the system (a photo, a card transaction, a forwarded email), and a series of steps runs on it until it is either posted to the books or returned to a human with a clear reason. The old version of this pipeline had a person at every step. The automated version keeps a person at the steps that need judgment and lets software handle the steps that are pure mechanics.

It helps to name the steps plainly. Capture pulls the raw data in. Extraction turns a picture into structured fields like vendor, date, amount, and tax. Matching ties the receipt to a card charge so the same expense is not counted twice. Validation runs the company policy against the line. Coding assigns the right ledger account and cost center. Routing sends it to the correct approver. Posting writes it to the accounting system. For decades, the middle of that list was manual data entry, and most of the time and most of the errors lived right there.
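To make the pipeline shape concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Expense` record, the step functions, and the 100.00 cap are hypothetical stand-ins, and only three of the seven steps are shown. The point is the control flow: steps run in order, and the first step that cannot proceed hands the item back to a human with a reason.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Expense:
    # Hypothetical record; field names are illustrative only.
    raw: str                                   # what entered the system
    data: dict = field(default_factory=dict)   # structured fields filled in by steps
    returned_reason: Optional[str] = None      # set when a step returns the item to a human
    history: list = field(default_factory=list)


Step = Callable[[Expense], Expense]


def run_pipeline(expense: Expense, steps: list) -> Expense:
    """Run named steps in order; stop at the first one that returns the item to a human."""
    for name, step in steps:
        expense = step(expense)
        expense.history.append(name)
        if expense.returned_reason is not None:
            break
    return expense


def extract(e: Expense) -> Expense:
    # Stand-in for document understanding: parse a "vendor, amount" string.
    vendor, amount = e.raw.split(",")
    e.data.update(vendor=vendor.strip(), amount=float(amount))
    return e


def validate(e: Expense) -> Expense:
    # Illustrative policy: anything over 100.00 goes back with a clear reason.
    if e.data["amount"] > 100.0:
        e.returned_reason = "amount exceeds illustrative 100.00 cap"
    return e


def post(e: Expense) -> Expense:
    # Stand-in for writing to the accounting system.
    e.data["posted"] = True
    return e


STEPS = [("extract", extract), ("validate", validate), ("post", post)]
```

Calling `run_pipeline(Expense("Cafe, 42.50"), STEPS)` runs all three steps, while a 250.00 charge stops at `validate` and never reaches `post`.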

What changed is that the middle of the list is now where the agent does its best work. Document understanding has gotten good enough to read a wrinkled receipt photographed at an angle in bad lighting. Reasoning has gotten good enough to read a written policy and decide whether a given line violates it. The combination means the human moves up the value chain, from typing numbers to making the handful of real decisions that the numbers surface.

Receipts: from photo to structured data

Receipt capture is where automation either earns trust or loses it on day one. The classic optical character recognition approach reads characters but does not understand them. It can tell you the string "forty-two dollars" appears on the page, but it cannot reliably tell you whether that is the subtotal, the tip, the tax, or the grand total. Receipts from a restaurant, a parking garage, and a software vendor look nothing alike, and a brittle template breaks the moment a vendor redesigns its layout.

An AI agent reads the receipt the way a person does, by understanding what each region of the document means rather than just transcribing it. It extracts the vendor, the transaction date, the currency, the subtotal, the tax, the tip, and the total, and it can pull individual line items when they matter, for example separating the alcohol on a client dinner from the food because one is reimbursable and the other may not be. It handles foreign receipts, converts currency at the transaction date rate, and flags when an image is too blurry to trust instead of guessing.

The practical payoff shows up in two numbers most teams already track: the time to file a report and the rate of reports that bounce back for correction. When extraction is reliable, the employee snaps a photo and the fields are already filled, so filing drops from minutes to seconds. When extraction is honest about its own uncertainty, the agent asks for a better photo up front instead of letting a bad read travel all the way to the approver and get kicked back. The second behavior matters more than the first. An automation that fails loudly and early is worth far more than one that fails silently and late.
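The "fail loudly and early" behavior can be sketched as a simple gate on extraction confidence. This is a sketch under assumptions: it supposes the extraction model reports a confidence score between 0 and 1 per field, and the `ExtractedField` type, the 0.85 threshold, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical extraction model


def fields_needing_retake(fields, threshold=0.85):
    """Return the names of fields read with confidence below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


def capture_decision(fields, threshold=0.85):
    """Accept the read, or ask for a better photo and say which fields were unclear."""
    low = fields_needing_retake(fields, threshold)
    if low:
        return "retake: could not read " + ", ".join(low)
    return "accept"
```

The design choice is that the request for a better photo happens at capture time and names the unreadable field, so a bad read never travels as far as the approver.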

Policy checks that run on every line

Every company has an expense policy, and almost nobody reads it. It lives in a PDF that was last opened during onboarding. The per-diem caps, the receipt-required threshold, the rule about booking economy under a certain flight time, the list of categories that need a project code, all of it is written down and almost none of it is enforced consistently, because enforcing it by hand on every line is more work than any approver will actually do.

This is the single best fit for an agent, because policy is a set of rules and rules are checkable. The agent reads the company policy once, in plain language, and applies it to every line on every report, every time, without fatigue and without playing favorites. A meal over the dinner cap gets flagged with the exact amount it exceeds by. A hotel night booked above the city rate gets noted. A taxi with no receipt above the documentation threshold gets held. A duplicate (the same vendor, same amount, and same day, submitted twice) gets caught before it is paid.
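Three of those checks are mechanical enough to sketch directly. The code below is illustrative, not a real policy engine: the `Line` record, the 75.00 dinner cap, and the 25.00 receipt threshold are assumed values, and an actual agent would take its rules from the written policy rather than from constants.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Line:
    # Hypothetical expense line; field names are illustrative only.
    vendor: str
    amount: Decimal
    day: date
    category: str
    has_receipt: bool


DINNER_CAP = Decimal("75.00")         # assumed cap, not from any real policy
RECEIPT_THRESHOLD = Decimal("25.00")  # assumed documentation threshold


def check_line(line: Line) -> list:
    """Return human-readable flags for one line; an empty list means it passes."""
    flags = []
    if line.category == "dinner" and line.amount > DINNER_CAP:
        flags.append(f"over dinner cap by {line.amount - DINNER_CAP}")
    if not line.has_receipt and line.amount > RECEIPT_THRESHOLD:
        flags.append("receipt required")
    return flags


def find_duplicates(lines: list) -> list:
    """Return lines that repeat an earlier vendor, amount, and day."""
    seen = set()
    dupes = []
    for line in lines:
        key = (line.vendor.lower(), line.amount, line.day)
        if key in seen:
            dupes.append(line)
        else:
            seen.add(key)
    return dupes
```

Amounts use `Decimal` rather than floats so that the "exceeds by" figure is exact to the cent.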

The win is not that the agent rejects more expenses. It is that it applies one consistent standard to everyone, so the rule in the document and the rule in practice are finally the same rule.

There is a softer benefit that compliance teams care about. When checks run on every line, the rare real problem (the fabricated receipt, the personal charge slipped into a business report, the vendor that does not exist) stops hiding inside the noise of a thousand legitimate transactions. The agent does not accuse anyone. It surfaces the handful of lines that do not fit and lets a human decide, which is exactly the division of labor you want between a machine and a person on anything that touches money. For a broader look at how this pattern generalizes, see our overview of AI automation.

Categorization and the general ledger

Coding an expense to the right account sounds like a clerical detail, and it is the detail that quietly breaks the monthly close. If the same kind of spend lands in three different accounts because three different people guessed three different ways, the financial reports drift away from reality and someone spends the last week of the month reconciling instead of analyzing. Consistent categorization is the difference between books you can trust and books you have to forensically reconstruct.

An agent categorizes by learning the company chart of accounts and the patterns in past coding. A rideshare to the airport goes to travel. A subscription to a design tool goes to software. A client lunch goes to meals and entertainment with the client recorded. When a line is genuinely ambiguous, the agent does not silently pick one and move on. It proposes its best guess, shows why, and asks the submitter or the finance reviewer to confirm. Over time the confirmations sharpen the model for that specific business, so the gray area shrinks month over month.

A few categorization habits separate a system people trust from one they quietly route around:

Tie every coded line back to the source receipt and the card transaction, so an auditor can trace any number to its origin in one click rather than a week of email.
Make the agent show its reasoning for non-obvious codes, so a reviewer can correct the logic and not just the single line.
Capture the cost center and project code at capture time, while the employee still remembers the context, instead of reconstructing it during close.
Treat low-confidence codes as a queue for a human, never as a number to bury, because one wrong account multiplied across a quarter is what blows up the variance report.
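One simple way to picture "propose a best guess, show why, and queue the uncertain ones" is a frequency count over past coding. This is a deliberately naive sketch and an assumption on our part, not how any particular product categorizes: the function names, the vendor-only lookup, and the 0.8 agreement threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_history(coded_lines):
    """Map each vendor to a count of the accounts it was coded to in the past."""
    history = defaultdict(Counter)
    for vendor, account in coded_lines:
        history[vendor.lower()][account] += 1
    return history


def suggest_account(vendor, history, min_share=0.8):
    """Propose an account with a reason; mark it for review when past coding disagrees."""
    counts = history.get(vendor.lower())
    if not counts:
        return {"account": None, "needs_review": True,
                "reason": "no past coding for this vendor"}
    account, hits = counts.most_common(1)[0]
    total = sum(counts.values())
    return {
        "account": account,
        # Below the agreement threshold, the guess goes to a human queue.
        "needs_review": hits / total < min_share,
        "reason": f"{hits} of {total} past lines for this vendor were coded to {account}",
    }
```

Each confirmation from a reviewer becomes another `(vendor, account)` pair in the history, which is one way the gray area can shrink over time.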

Approvals without the bottleneck

Approval is where good intentions go to wait. A report sits in a manager queue because the manager has forty of them and no context on any. They either rubber-stamp the batch, which defeats the purpose of an approval, or they sit on it, which delays reimbursement and annoys everyone. The bottleneck is rarely the decision. It is the missing context that makes each decision feel risky enough to postpone.

An agent reframes approval as exception handling. Reports that pass every policy check, match a card charge, and code cleanly to an account do not need a human signature at all, or they get a one-line summary and a single confirmation. The approver's attention goes only to the lines that actually need a judgment call: the over-cap dinner with a written justification, the missing receipt, the unusually large one-off. Each is presented with the policy it touches and the agent's recommendation already attached. The manager reviews ten meaningful items instead of skimming four hundred routine ones.

This is also where automation should know its limits. Approval routing, escalation when an approver is out, and a hard rule that any expense above a threshold gets a second set of eyes are guardrails the agent enforces but does not override. The agent makes the easy yeses instant and the hard cases legible. It does not get to spend the money. That boundary (the machine handles mechanics, the human owns the spend decision) is the whole design principle for putting AI anywhere near a budget.
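Those guardrails can be expressed as routing logic that decides who looks at a report and how, without ever approving anything itself. The sketch below assumes a hypothetical 1000.00 second-approver threshold and made-up function names; it returns a routing decision only, and the spend decision stays with a person.

```python
from decimal import Decimal

SECOND_APPROVER_THRESHOLD = Decimal("1000.00")  # assumed value for illustration


def route(total: Decimal, flags: list, card_matched: bool) -> dict:
    """Decide how a report is presented to humans; never approve it here."""
    if flags or not card_matched:
        mode = "review"   # judgment call, shown with the flags attached
    else:
        mode = "confirm"  # one-line summary and a single confirmation
    return {
        "mode": mode,
        # Hard rule: large amounts always get a second set of eyes.
        "second_approver": total > SECOND_APPROVER_THRESHOLD,
        "flags": list(flags),
    }


def pick_approver(manager: str, out_of_office: set, delegates: dict):
    """Escalate to the manager's delegate when the manager is out.

    Returns None when no delegate is configured, so a person must assign one.
    """
    if manager in out_of_office:
        return delegates.get(manager)
    return manager
```

Note that neither function has an "approved" outcome: a clean report is routed for a quick confirmation, not auto-paid.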

A mini walkthrough: one receipt, end to end

Concrete beats abstract, so here is a single expense moving through the pipeline. An employee finishes a client dinner in another city and photographs the receipt at the table.

Capture. The photo lands in the system. The agent reads it: vendor name, date, a subtotal, tax, a tip, and a total of one hundred forty dollars, with two of the line items being wine.
Match. A card charge for the same vendor, same day, same amount arrives from the bank feed. The agent links them so the dinner is recorded once, not twice.
Validate. Policy says client meals are reimbursable up to one hundred dollars per head and alcohol is reimbursable only with a named client present. The employee logged two attendees and one client, so the per-head math passes. The agent notes the wine and confirms the client field is filled.
Categorize. The line is coded to meals and entertainment, the client is attached, the trip's project code is carried over from the travel booking, and any foreign currency is converted at the transaction date's rate.
Route. Because the report is under cap, matched, and clean, it goes to the manager as a one-line summary with the agent's note about the alcohol, flagged but compliant.
Post. The manager confirms in a tap. The expense writes to the accounting system with the receipt, the card match, and the policy check all attached for the audit trail.

Total human time spent: one photo and one tap. Total finance time spent: zero, until the close, when the line is already coded and documented. Multiply that by every dinner, taxi, and subscription in a month and you can see where the hours come back.
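The Validate step in the walkthrough is simple arithmetic, sketched here with the figures from the example. One assumption to flag: the code reads "two attendees and one client" as two diners in total, one of them the client, which gives seventy dollars per head. With three diners the per-head figure would be lower and the check would still pass. The function name and return shape are hypothetical.

```python
from decimal import Decimal


def validate_client_meal(total, attendees, clients, has_alcohol,
                         per_head_cap=Decimal("100")):
    """Check a client meal against a per-head cap and the alcohol rule."""
    per_head = total / attendees
    passes = per_head <= per_head_cap
    notes = []
    if not passes:
        notes.append(f"per-head spend exceeds cap by {per_head - per_head_cap}")
    if has_alcohol:
        if clients >= 1:
            # Compliant, but still surfaced so the approver sees it.
            notes.append("alcohol present, client named: compliant")
        else:
            passes = False
            notes.append("alcohol without a named client")
    return {"per_head": per_head, "passes": passes, "notes": notes}


# Figures from the walkthrough: 140 dollars total, wine on the bill.
result = validate_client_meal(Decimal("140"), attendees=2, clients=1, has_alcohol=True)
```

The result passes and carries the alcohol note, which matches the "flagged but compliant" summary the manager sees in the Route step.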

Common mistakes when automating expenses

Plenty of expense automation projects disappoint, and the failures rhyme. The most common is trusting the machine too much. A system that auto-approves everything it is confident about and never surfaces its confidence level will eventually pay a fabricated receipt with total assurance. Confidence is not accuracy. Build the queue for uncertain items first, and treat a high auto-approval rate as something to watch, not something to brag about.

The second mistake is automating a broken policy. If the underlying rules are vague, contradictory, or out of date, the agent will enforce the mess consistently, which feels worse than the old inconsistent mess because now it is everywhere at once. Clean the policy before you automate it. The exercise of writing rules clear enough for an agent to follow usually exposes the ambiguities a human reviewer was quietly papering over anyway.

The third mistake is islands. The expense agent reads receipts in one tool, the policy lives in a document in another, the chart of accounts sits in the accounting system, and the approvals happen in email. Every boundary between those systems is a place where context drops and a human has to re-key. The teams that get real leverage put the data, the rules, and the agent in one place so nothing has to be copied across a seam. That is the argument for an integrated AI workspace rather than five disconnected apps stitched together by hand. You can see more patterns like this in our use cases.

Where the workspace fits in

The reason expense automation tends to stall is structural, not technical. The receipts are files, the policy is a document, the spend is a database of transactions, and the approvals are messages between people. In most companies those four things live in four different products that were never designed to talk to each other, so an agent that wants to reason across all of them has to be bolted on from the outside and fed copies of everything.

Team Brain takes a different starting point. Docs, databases, files, and AI agents already live in one workspace, which means the policy document, the receipt files, the expense database, and the agent that reads all three share the same home. An expense agent here is not integrating across vendors. It is reading the file next to the rule next to the row, and writing its result back into the same place a human would look. That is less of a feature and more of a precondition: automation gets easy exactly when the data stops being scattered.

None of this requires a moonshot. Start with one painful step, usually receipt capture or duplicate detection, prove it saves real time, then extend the agent up the pipeline into policy and coding as trust builds. If you want to see what that looks like on your own spend, you can start for free and wire up a single agent before you commit to anything bigger.
