Writing

Ground Truth: AI product practice, examined.

Most AI product content is written by people excited about what AI can do. Ground Truth is where I work through what rigorous practice actually looks like: from evaluation frameworks and measurement strategy to the organizational conditions that determine whether good research ever changes anything.


Series 3 • May 2026

The Capability Question: What AI Enablement Actually Does to People

Adoption metrics measure whether people use a tool. They do not measure whether people are getting better at their jobs. This series examines what capability actually means in the context of AI enablement and how to design research that answers that question honestly.

May 27, 2026

The Confidence Problem

The tool was certain. The employees were certain. The outputs were wrong.

May 20, 2026

What Does Capable Even Mean?

Employees are finishing tasks. That is not the same as getting better at their jobs.

May 13, 2026 • Series Introduction

Introducing The Capability Question: What AI Enablement Actually Does to People

The adoption metrics look fine. That is not the same as knowing whether your people are getting better.

Series 2 • April 2026

The Measurement Traps That Break Strategy

Organizations make consequential decisions from evidence that was never designed to support them. This series examines the structural incentives that produce bad measurement and what it takes to build insight functions that actually change decisions.

Apr 29, 2026

The Quant Default

When quantitative and qualitative evidence conflict, organizations default to the quantitative, not because it is more accurate but because it is easier to defend.

Apr 22, 2026

How Research Gets Designed to Survive Scrutiny Rather Than Generate Insight

The incentive to be defensible and the incentive to be useful are not the same incentive.

Apr 15, 2026

Why Most Insight Functions Produce Answers Nobody Asked For

Insight functions optimized for research production will produce research. Whether anyone uses it is a different question.

Apr 8, 2026

The Difference Between a Finding and a Decision

Most research tells you what is true. Decisions require something else entirely.

Apr 1, 2026 • Series Introduction

Introducing The Measurement Traps That Break Strategy

Organizations are making consequential decisions from evidence that was never designed to support them.

Series 1 • February – March 2026

Rigorous AI Product Practice: Evaluation, Metrics, and Governance

Most AI product teams instrument only one of the four evaluation layers that matter. This series builds the case for a more rigorous approach: evaluation frameworks, metric failure modes, human-in-the-loop design, and governance checklists for shipping responsibly.

Mar 19, 2026

How to Design Human-in-the-Loop AI Systems That Don't Collapse

Adding a human review step is not the same as having oversight. Here's why HITL systems fail and what holds up in practice.

Mar 12, 2026

A Governance Checklist for Shipping LLM Features in Consumer Apps

A checklist is not a governance system. Here's the difference and what actually needs to be in place before you ship.

Mar 5, 2026

Why Most AI Product Metrics Fail Under Distribution Shift

Why your dashboard looks green while your product is quietly failing a subset of users.

Feb 26, 2026

A Practical AI Evaluation Framework for Consumer Products

Most teams instrument only one of the four evaluation layers that matter. Here's the full framework.