Evidence-grounded discovery for accountability work

For accountability work, "close enough" is not good enough.

LLMs paraphrase, approximate, and hallucinate citations. I built a tool that only surfaces what it can validate character-by-character against source documents — verbatim excerpts, exact page numbers, zero inference.

Built for journalism, oversight, investigations, and research. Currently a working prototype — looking for pilot collaborators.

Guardrails

What I built it to do, what I deliberately kept out, and where your judgment still matters.

What it does

  • Verifies every excerpt verbatim against source text before it surfaces
  • Links every signal to verbatim source text with a precise page location
  • Processes full documents — body text, appendices, footnotes — with equal scrutiny
  • Behaves consistently across documents, years, and repeat runs

What it does not do

  • Infer intent, wrongdoing, or "what they really meant"
  • Paraphrase or alter source text
  • Surface signals without validated evidence
  • Tell you which signals matter (that's your judgment)

What still needs human judgment

  • Deciding which threads to follow
  • Curating the story from hundreds of signals
  • Determining what's significant vs. routine
  • Knowing when you've found enough

The validation is bulletproof. The storytelling is still yours.

First things first

Why this exists and how you can help steer it.

Why I built this

I'm Andrew. I built this because I needed it.


I volunteer on boards and do governance work in my spare time. Last year I had to review ten years of board papers, audit reports, and committee transcripts to understand how a major project went wrong.

I tried ChatGPT. It found things that were "close enough" — paraphrased quotes, approximate summaries, contextually similar passages. But never verbatim text. When I asked for the exact page number, it confidently pointed to the wrong place.

I tried Claude with extended context. It could handle more documents but had the same problem: it would synthesize, it would interpret, it would "helpfully" rephrase. I still had to manually verify every single excerpt.

The problem was not that LLMs are bad at reading — they are incredible. The problem was that for accountability work, "close enough" is not good enough. If I am going to cite something in a report or put it in a story, I need the exact words, the exact page, and I need to know it is real.

So I built Accounter. It only surfaces what it can validate character-by-character. No inference. No paraphrasing. No helpful summaries. Just: "This exact text appears on this exact page of this specific document."

I built it in a month. It works for my use case. Now I need to know if it works for anyone else's.

What I'm looking for

I need real document sets and candid feedback.


Accounter is a working prototype. The validation works but I don't know yet if the salience ratings prioritize the right signals, if the auto-tags help or clutter, or whether going from 355 signals to a story is still too manual.

I'm looking for 3-5 pilot collaborators who:

  • Have a real document corpus (50+ documents, 500+ pages)
  • Work in journalism, investigations, oversight, or research
  • Can provide honest feedback on what worked and what broke

What pilot collaborators get

  • Free processing of your corpus
  • Direct access to me for iterations
  • Professional attribution in any case study
  • Recognition as a founding pilot user
  • Your feedback shapes what this becomes

What I'm not promising

  • Perfect results on your documents
  • That it will save you time
  • Production-ready software

If you're wrestling with hundreds of documents, let's talk.

What you get

A permanent, structured record of everything worth attention in your documents. Each signal pairs an editable label with a verbatim evidence excerpt and its precise source location. These are real signals from my HS2 pilot run — the same data explored in the case studies.

Signals ledger — HS2 corpus (3 of 2,253)
Nov 2015 · HS2
Senior HS2 staff allegedly withheld true land costs from government
"My observation was that senior HS2 directors and staff were deliberately failing to inform the government about the true estimated cost of HS2 land and property and were actively instructing junior members of the HS2 staff to intentionally use lower estimates even when they were known to be wrong."
Written evidence — Lt Col Andrew Bruce, page 1

Jan 2024 · HS2
Cancelling HS2 phases projected to free £36bn including £6.5bn from Euston
"The Department has said that the decision not to proceed with HS2 phases 2a, 2b and East will unlock £36 billion (2023 prices) and that 'every penny of the £19.8 billion committed to the Northern leg of HS2 will be reinvested in the North'."
NAO Report, page 10

Jul 2025 · HS2
Whistleblower claims of inflated invoices prompt HS2 Ltd investigation
"The allegations concern inflated invoices and improper PAYE charges, potentially defrauding taxpayers. HS2 Ltd treats all whistleblower allegations seriously and an investigation was launched earlier this year into these allegations."
Review of High Speed Two (HS2), page 8

Get in touch

I'm looking for 3-5 pilot collaborators. Tell me about your document corpus and what you're trying to understand.
