Writing

Ground Truth: AI product practice, examined.

Most AI product content is written by people excited about what AI can do. Ground Truth is where I work through what rigorous practice actually looks like: from evaluation frameworks and measurement strategy to the organizational conditions that determine whether good research ever changes anything.

Subscribe on Substack
Ground Truth — AI Product Practice
Series 3  •  May 2026

The Capability Question: What AI Enablement Actually Does to People

Adoption metrics measure whether people use a tool. They do not measure whether people are getting better at their jobs. This series examines what capability actually means in the context of AI enablement and how to design research that answers that question honestly.

The Confidence Problem
The tool was certain. The employees were certain. The outputs were wrong.
What Does Capable Even Mean?
Employees are finishing tasks. That is not the same as getting better at their jobs.
Introducing The Capability Question: What AI Enablement Actually Does to People
The adoption metrics look fine. That is not the same as knowing whether your people are getting better.
Series 2  •  April 2026

The Measurement Traps That Break Strategy

Organizations make consequential decisions from evidence that was never designed to support them. This series examines the structural incentives that produce bad measurement and what it takes to build insight functions that actually change decisions.

The Quant Default
When quantitative and qualitative evidence conflict, organizations default to the quantitative, not because it is more accurate but because it is easier to defend.
How Research Gets Designed to Survive Scrutiny Rather Than Generate Insight
The incentive to be defensible and the incentive to be useful are not the same incentive.
Why Most Insight Functions Produce Answers Nobody Asked For
Insight functions optimized for research production will produce research. Whether anyone uses it is a different question.
The Difference Between a Finding and a Decision
Most research tells you what is true. Decisions require something else entirely.
Introducing The Measurement Traps That Break Strategy
Organizations are making consequential decisions from evidence that was never designed to support them.
Series 1  •  February – March 2026

Rigorous AI Product Practice: Evaluation, Metrics, and Governance

Most AI product teams only instrument one of the four evaluation layers that matter. This series builds the case for a more rigorous approach: evaluation frameworks, metric failure modes, human-in-the-loop design, and governance checklists for shipping responsibly.

How to Design Human-in-the-Loop AI Systems That Don't Collapse
Adding a human review step is not the same as having oversight. Here's why HITL systems fail and what holds up in practice.
A Governance Checklist for Shipping LLM Features in Consumer Apps
A checklist is not a governance system. Here's the difference and what actually needs to be in place before you ship.
Why Most AI Product Metrics Fail Under Distribution Shift
Why your dashboard looks green while your product is quietly failing a subset of users.
A Practical AI Evaluation Framework for Consumer Products
Most teams only instrument one of the four evaluation layers that matter. Here's the full framework.
Introducing Ground Truth: A Publication on Rigorous AI Product Practice
Most AI product content is written by people who are excited about what AI can do.
Ground Truth: new articles publish every week.
Get In Touch
Open to the right conversation.