The audit trail for AI banking decisions: what we log, and why.
A regulator's first question about any AI-assisted decision is the same: show me the trail. Not a reconstruction. Not a summary. The actual record of what the model saw, what it produced, who reviewed it, and what they decided. The trail is the evidence that the controls system is real.
This post describes the structure of the audit log Veetso keeps for every AI-assisted action, what is in each entry, and how a regulator would use it.
What "audit trail" means in our system
The audit trail is a single, append-only log. One entry per AI interaction, hashed and timestamped at the moment it is written. Nothing in the trail is reconstructed retrospectively; if it is not in the trail when it happens, it is not in the trail at all.
Every action that touches the AI layer writes one entry. That includes:
- A query to the Brain.
- A draft generated in drafting mode.
- A KYC reviewer reading a model-produced narrative.
- An analyst opening a triaged alert.
- A reviewer approving a regulated decision.
The trail is the system of record for AI-touching workflows. It is not the only system of record (the bank's general ledger and customer files exist separately), but for AI-touching activity it is the canonical one.
What is in each entry
Every entry contains the same eight fields, regardless of which workflow produced it.
01. Timestamp
UTC, millisecond precision. Monotonic within a single process. Cross-process ordering is derived from the hash chain rather than the timestamp.
02. Hash
A SHA-256 hash of the entry contents plus the previous entry's hash. The chain is what makes tampering visible: changing any historical entry changes its hash, which no longer matches the value folded into the next entry, so every entry after it fails verification.
03. Actor
Either the human user (their identity) or the system component that produced the entry. Never anonymous.
04. Workflow
The use-case identifier from the use-case register. Tells the auditor what kind of action this is and what controls apply. See the gates post for how the register works.
05. Inputs
The exact inputs the model received. For a Brain query, this is the question and the set of source documents retrieved. For a KYC narrative, this is the document references. We store the references rather than the document content; the documents themselves live in their own classified store.
06. Outputs
The exact outputs the model produced. For a Brain answer, the text plus the per-claim citations. For a triage score, the numeric score plus the feature contributions. Outputs are stored verbatim; we do not summarise.
07. Decision
If a human took an action based on the outputs, the action they took. "Reviewer X approved customer file Y at timestamp Z" is a decision entry that points back to the model entries that informed it.
08. Disposition
The downstream resolution. For an alert, this is close / escalate / file SAR. For a Brain answer, usually empty (most answers are informational). The disposition is what regulators want to be able to count, group, and sample.
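To make the schema concrete, here is a minimal sketch of the eight fields as a Python record. It is an illustration under assumptions, not Veetso's actual schema: the class name `AuditEntry`, the helper `utc_now_ms`, the field types, and the sample identifiers are all hypothetical, and the hash is shown as a placeholder because it would be computed at write time.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional
import json


def utc_now_ms() -> str:
    """UTC timestamp in ISO 8601 form with millisecond precision."""
    return datetime.now(timezone.utc).isoformat(timespec="milliseconds")


@dataclass(frozen=True)  # frozen: an entry is not edited after it is written
class AuditEntry:
    timestamp: str                      # 01. UTC, millisecond precision
    hash: str                           # 02. SHA-256 chained to the previous entry
    actor: str                          # 03. human identity or system component
    workflow: str                       # 04. use-case identifier from the register
    inputs: dict                        # 05. references, not document content
    outputs: dict                       # 06. verbatim model output
    decision: Optional[str] = None      # 07. human action, if any
    disposition: Optional[str] = None   # 08. downstream resolution, if any

    def to_json(self) -> str:
        """Serialise with sorted keys so the same entry always yields the same text."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical triage entry: no human decision or disposition yet.
entry = AuditEntry(
    timestamp=utc_now_ms(),
    hash="<computed at write time>",
    actor="system:triage-model",
    workflow="UC-ALERT-TRIAGE",
    inputs={"refs": ["alert/0001"]},
    outputs={"score": 0.82, "feature_contributions": {"velocity": 0.5}},
)
```

The decision and disposition fields default to empty, which matches the point above that most Brain answers never acquire a disposition.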
How a regulator would use the log
Three queries cover most regulatory information requests, and the log is designed for all three.
"Show me every AI-assisted decision on customer X."
Filter by actor (the customer's reviewer) or by referenced inputs (the customer's file references). Returns every model interaction that contributed to a decision on that customer, in order.
"Show me every decision the model recommended that the reviewer overrode."
Filter for decisions whose disposition differs from the model's output. This is the model's quality signal as seen through human review. A rising override rate is an early signal that the model is drifting.
"Reconstruct the chain for this specific SAR."
Filter by the SAR identifier; the log returns every alert, every model triage entry, every analyst note, every escalation, and every supervisor approval that led to the filing. The chain is unbroken from the rule that fired to the SAR that was filed.
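The three queries can be sketched as plain filters over a list of entries. This is an illustration only: the post does not say how entries are stored or how a decision entry points back to the model entries that informed it, so the keys `refs`, `recommendation`, `informed_by`, and `sar_id` below are hypothetical, as are the function names and sample data.

```python
def for_customer(log: list[dict], customer_ref: str) -> list[dict]:
    """Entries whose input references include the customer's file, in log order."""
    return [e for e in log if customer_ref in e["inputs"].get("refs", [])]


def overrides(log: list[dict]) -> list[dict]:
    """Decision entries whose disposition differs from what the model recommended."""
    by_hash = {e["hash"]: e for e in log}
    result = []
    for e in log:
        if e.get("decision") is None:
            continue
        recommended = {
            by_hash[h]["outputs"].get("recommendation")
            for h in e.get("informed_by", [])
            if h in by_hash
        }
        recommended.discard(None)  # ignore linked entries with no recommendation
        if recommended and e.get("disposition") not in recommended:
            result.append(e)
    return result


def chain_for_sar(log: list[dict], sar_id: str) -> list[dict]:
    """Walk back from the filing entry through every entry that informed it."""
    by_hash = {e["hash"]: e for e in log}
    pending = [e["hash"] for e in log if e.get("sar_id") == sar_id]
    seen = set()
    while pending:
        h = pending.pop()
        if h in seen or h not in by_hash:
            continue
        seen.add(h)
        pending.extend(by_hash[h].get("informed_by", []))
    return [e for e in log if e["hash"] in seen]


# Hypothetical five-entry log: one escalated alert that became a SAR, one closed alert.
log = [
    {"hash": "h1", "inputs": {"refs": ["cust/42"]}, "outputs": {"recommendation": "close"}},
    {"hash": "h2", "inputs": {"refs": ["cust/42"]}, "outputs": {},
     "decision": "escalate", "disposition": "escalate", "informed_by": ["h1"]},
    {"hash": "h3", "inputs": {"refs": ["cust/42"]}, "outputs": {},
     "decision": "approve filing", "disposition": "file SAR",
     "informed_by": ["h2"], "sar_id": "SAR-001"},
    {"hash": "h4", "inputs": {"refs": ["cust/7"]}, "outputs": {"recommendation": "close"}},
    {"hash": "h5", "inputs": {"refs": ["cust/7"]}, "outputs": {},
     "decision": "close", "disposition": "close", "informed_by": ["h4"]},
]

sar_chain = chain_for_sar(log, "SAR-001")
```

In this sample, `overrides(log)` returns only the analyst entry `h2`, because the model recommended closing the alert and the analyst escalated it.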
What the log does not contain
It does not contain customer document content. The references in the inputs field point to the classified document store, where access is gated by the same controls as the underlying systems. Splitting the log from the content keeps the log inspectable by a wider audience while keeping sensitive data on a need-to-know basis.
It does not contain the model weights or the prompt template. Those live in versioned artefacts referenced by an identifier. The log records which version was used; the artefact store holds the artefact itself.
It does not contain personal opinions or freeform notes outside the structured fields. Analysts' detailed notes live in the customer file, not in the AI log. The log is for the verifiable chain, not the qualitative record.
How long we keep it
Each entry is retained for the longest of the periods that apply to it: the jurisdictional minimum for records related to the regulated activity, seven years for AML-relevant entries, and the life of the customer relationship plus six years for KYC-relevant entries.
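The longest-applicable rule is simple arithmetic, and a short sketch shows it per entry. Several things here are assumptions the post does not state: that the jurisdictional minimum is expressed in whole years, that the seven-year AML period runs from the date the entry was written, and that a KYC-relevant entry for a still-open relationship has no end date yet. The function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retain_until(
    written: date,
    jurisdiction_min_years: int,
    aml_relevant: bool,
    kyc_relevant: bool,
    relationship_end: Optional[date] = None,
) -> Optional[date]:
    """Latest applicable retention date, or None while the relationship is open."""
    candidates = [add_years(written, jurisdiction_min_years)]
    if aml_relevant:
        candidates.append(add_years(written, 7))
    if kyc_relevant:
        if relationship_end is None:
            return None  # assumed: the six-year clock has not started yet
        candidates.append(add_years(relationship_end, 6))
    return max(candidates)


# Hypothetical entry: AML- and KYC-relevant, relationship closed mid-2030.
until = retain_until(date(2020, 3, 1), 5, True, True, date(2030, 6, 30))
```

With these sample inputs the KYC rule wins, so the entry would be kept until 30 June 2036.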
How the log helps us, not just the regulator
We use the same log internally to:
- Sample model output for quality review on a weekly schedule.
- Train new analysts by replaying real alert dispositions.
- Detect drift in the model's behaviour by comparing recent override rates to historical baselines.
- Answer "what happened?" when something goes wrong, without having to interview anyone.
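The drift check in the list above can be sketched as a comparison of two override rates. This is a deliberately plain threshold rather than a statistical test, and the `tolerance` value, the function names, and the sample counts are hypothetical; a production check would likely account for sample size.

```python
def override_rate(overridden: list[bool]) -> float:
    """Share of reviewed decisions where the reviewer overrode the model."""
    return sum(overridden) / len(overridden) if overridden else 0.0


def drift_alert(baseline: list[bool], recent: list[bool], tolerance: float = 0.05) -> bool:
    """True when the recent override rate exceeds the baseline by more than tolerance."""
    return override_rate(recent) - override_rate(baseline) > tolerance


# Hypothetical windows: 10% overrides historically, 20% in the recent period.
baseline = [True] * 10 + [False] * 90
recent = [True] * 20 + [False] * 80
alert = drift_alert(baseline, recent)
```

Each boolean would come from the log itself: one value per decision entry, true when the disposition differs from what the model recommended.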
A log built only for regulators is a cost centre. A log that also serves internal quality is the difference between AI you can operate and AI you cannot.
FAQ
Questions readers ask
What is an AI audit trail in a bank?
An AI audit trail is the canonical record of every AI-assisted action: which model ran, what inputs it saw, what it produced, who reviewed it, and what they decided. At Veetso it is a single append-only log, hashed and timestamped at write, with one entry per AI interaction. It is the system of record for AI-touching workflows.
What should be logged for AI compliance?
Eight fields per entry are enough to answer most regulatory queries: timestamp, hash (chained to the previous entry), actor (human or system component), workflow (use-case identifier), inputs the model saw, outputs the model produced, the human decision if any, and the downstream disposition. The same schema applies across every workflow.
How does hash chaining make the log tamper-evident?
Each entry's hash is computed over its contents plus the previous entry's hash. Changing any historical entry invalidates every entry after it, so anyone holding the log can detect tampering by recomputing the hashes rather than relying on a separate investigation. A regulator can verify the chain end to end without trusting the bank's word.
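A minimal sketch of the chaining and the verification pass, using Python's standard `hashlib` and `json` modules. The post specifies SHA-256 over the entry contents plus the previous hash; everything else here is an assumption for illustration, including the JSON serialisation, the all-zero `GENESIS` value for the first entry, and the function names.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed previous-hash value for the very first entry


def entry_hash(contents: dict, prev_hash: str) -> str:
    """SHA-256 over a canonical serialisation of the contents plus the previous hash."""
    payload = json.dumps(contents, sort_keys=True, separators=(",", ":")) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list[dict], contents: dict) -> None:
    """Write one entry, chaining it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"contents": contents, "hash": entry_hash(contents, prev)})


def first_broken_index(log: list[dict]) -> Optional[int]:
    """Recompute every hash in order; return the first mismatch, or None if intact."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry_hash(entry["contents"], prev) != entry["hash"]:
            return i
        prev = entry["hash"]
    return None


# Hypothetical three-entry chain.
chain: list[dict] = []
append(chain, {"actor": "system:brain", "workflow": "UC-QUERY"})
append(chain, {"actor": "analyst:a", "workflow": "UC-ALERT-TRIAGE"})
append(chain, {"actor": "supervisor:b", "workflow": "UC-ALERT-TRIAGE"})
intact = first_broken_index(chain)
```

Editing the contents of the second entry makes `first_broken_index` report index 1. If the editor also recomputes that entry's hash to cover the change, the mismatch moves to index 2, which is why a single altered entry cannot be hidden without rewriting everything after it.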
How long must AI audit logs be retained?
Longest of: the jurisdictional minimum for records related to the regulated activity, seven years for AML-relevant entries, and the customer relationship plus six years for KYC-relevant entries. Veetso applies the longest-applicable rule to each entry.
Does the audit log contain customer documents?
No. The log stores references to documents in the bank's classified document store, not the content itself. This keeps the log inspectable by a wider audience (operations, audit, regulators) while keeping sensitive content gated by the document store's access controls.
Further reading
- Part of the Responsible AI in banking series, the full set of essays on the controls system.
- The six gates we apply to every AI workflow · the log is gates 04 and 05 in a single artefact.
- Why every Brain answer ships with a citation · what populates the inputs and outputs fields.
- AI-assisted KYC review and alert triage · the workflows that fill the log.