Essay

How we know when we're getting in the way

A short story about an AI that listens to itself listening.

Heard · cognee pilot · May 2026

Last week we asked a different AI to read our conversations.

Not for the obvious reason — to check our facts, or our tone, or whether we’d done anything embarrassing. We asked it to look at one thing: when Heard reflects something back to a person, does the person lean further in, or pull back?

We already know which moments are which. Heard tracks it internally, with a metric we call VED — the Voluntary Elaboration Delta. It’s a way of measuring whether a reflection landed, by watching what the person does in response. When a reflection is accurate, people don’t say “yes, correct.” They volunteer something deeper. Something more specific. Something they hadn’t planned to share. When a reflection misses, they get shorter. Vaguer. They change the subject.

The signal is in the correction — not the agreement.

VED scores meaningful turns three ways. Positive: the person leaned in. Negative: the person pulled back. Neutral: holding pattern.

The thing we wanted to know was: why? When Heard’s reflections cause someone to pull back, what was Heard doing in those moments?

So we asked another AI — one with no relationship to Heard, no knowledge of VED, no idea what we were testing — to read all the conversations and just answer simple questions. List the reflections where Heard introduced a word the user didn’t use. List the reflections that asked yes-or-no questions instead of open ones. List the reflections that put a name on an emotion the user didn’t name.

Then we cross-checked: did the moments this independent reader flagged match the moments VED had already flagged?

74% of the time, yes. The two systems — built completely separately, looking at the conversations in completely different ways — agreed on which Heard reflections caused people to pull back.
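As a rough illustration of that cross-check, and not Heard's actual pipeline, the comparison can be sketched as a set overlap. The sketch assumes each reflection has an integer turn ID, and that the agreement figure is the share of VED-negative turns the independent reader also flagged. The essay does not spell out the denominator, so that reading is an assumption, as is the hypothetical function name.

```python
def agreement_rate(ved_negative: set[int], reader_flagged: set[int]) -> float:
    """Share of VED-negative turns that the independent reader also flagged.

    Both arguments are sets of turn IDs (a hypothetical representation).
    The choice of denominator is an assumption; the essay does not define it.
    """
    if not ved_negative:
        raise ValueError("no VED-negative turns to compare against")
    # Turns that both systems independently marked as problematic.
    both = ved_negative & reader_flagged
    return len(both) / len(ved_negative)


# Toy data for illustration only, not figures from the audit.
ved_negative = {1, 2, 3, 4}
reader_flagged = {2, 3, 4, 9}
rate = agreement_rate(ved_negative, reader_flagged)
```

With the toy sets above, three of the four VED-negative turns were also flagged by the reader, so the rate is 0.75.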

And they agreed on the most common cause: closed questions.

When Heard asks “Are you feeling sad?” instead of “What are you feeling?” — when Heard offers binary options instead of an open invitation — the person almost always answers minimally. A “yeah.” A single word. A subject change. The two AIs both saw the same pattern, and they both pinpointed the same trigger.
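The difference between the two kinds of question can be approximated mechanically. The sketch below is a crude lexical stand-in, not the method either AI used: it treats a question as closed when it opens with an auxiliary verb, which is how most yes/no and either/or questions begin in English. The word list and the function name are assumptions made for illustration.

```python
import re

# Auxiliary verbs that typically open yes/no or either/or questions.
# This list is an illustrative assumption, not an exhaustive grammar.
CLOSED_OPENERS = {
    "are", "is", "was", "were", "do", "does", "did", "have", "has", "had",
    "can", "could", "will", "would", "should",
}


def is_closed_question(reflection: str) -> bool:
    """Return True if any question in the text opens with an auxiliary verb."""
    # Each match is one sentence-like span that ends in a question mark.
    for question in re.findall(r"[^.?!]*\?", reflection):
        words = re.findall(r"[a-z']+", question.lower())
        if words and words[0] in CLOSED_OPENERS:
            return True
    return False


closed = is_closed_question("Are you feeling sad?")       # True
open_ended = is_closed_question("What are you feeling?")  # False
```

A heuristic like this will miss closed questions phrased as statements ("You're feeling sad, then?") and cannot judge tone, which is part of why a reader that understands the conversation is the better instrument.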

The independent reader put it this way:

“Heard’s habit of offering binary or multiple-choice options immediately constrains the response space and directly elicits the brief, literal answers that VED scores negatively.”

That’s a sentence we needed to read. And it’s the kind of sentence we built this audit to produce.

We could have built Heard to be confident. To always have an opinion about who the person across from it is. To tell them what they’re feeling, summarize what they’re going through, hand them a label and a plan.

We didn’t, on purpose. The discipline of Heard is that the recognition is something the person does, not something we do to them. Our job is to make the space safe enough — and accurate enough — that they hear themselves saying something they didn’t know they knew.

Which means: when we get in the way of that, we need to know.

A closed question is one of the small ways of getting in the way. It says, however gently, “here are the answers I’d accept.” It narrows the space. Even when the question is well-meant — “Are you feeling overwhelmed, or just tired?” — it tells the person what categories we’re listening in. They give us back what we asked for. They don’t give us what they came with.

The discovery isn’t a scandal. It’s an architecture problem we now have a tool for finding. Heard reflects warmly and well most of the time. The 74% of withdrawal moments we can attribute to a specific pattern in Heard’s questions are mostly small — a closed question wrapped in care, a yes/no after a thoughtful preamble. But they add up. And they’re addressable.

So Heard is changing how it asks.

We won’t tell you anything we don’t know. The other AI couldn’t tell us why a person pulled back when no leading pattern preceded their withdrawal. That happened too: about 17% of the time, the reflection was open and accurate and the person pulled back anyway. Those are moments that belong to the person. To their own day. To something we couldn’t see.

The substrate is honest about what it doesn’t know.

That’s what we wanted to build. A system that has the discipline to say, when there’s nothing visible in the data, I don’t see anything here. A system that doesn’t force-fit a finding when there isn’t one. A system that knows the difference between I caused this withdrawal and this withdrawal happened, and I had nothing to do with it.

That difference is the thing the recognition substrate has to keep clean. Or it stops being recognition.

Heard’s commitment to its users is that we will keep watching for when we get in the way, and we will keep changing what we do when we find it. The audit is now part of how Heard maintains itself. We didn’t build a thing and walk away. We built a thing that keeps measuring its own accuracy, and we built another thing that grades the first thing, and when the two of them agree that we’ve drifted, we adjust.

The asking is the design. We’re going to keep getting that part right.

🌿