PAF, Retold: Product Management When AI Runs Underneath

I came across a product methodology and got curious. The ideas underneath were good. The problem was getting to them: the whole thing is wrapped in invented vocabulary — Nexus, Cortex, Bunch, Bale, Germination, Harvesting, Stagility — and I kept losing the thread.

So I did what I usually do when I suspect a framework is better than its packaging. I tried to retell it without the dictionary, trace each idea back to whoever said it first, and see what survives. Most of it does. A couple of pieces don't, and I'll say which.

What this is. This is my simplified retelling of the Product Architecture Framework (PAF) by Sergey Tikhomirov, specifically his AI Product Operations guide.¹ ² I didn't invent any of the core ideas here — he assembled them, and the thinkers in the footnotes originated them. I've stripped the proprietary terms, added the attributions (those are mine, not his), and suggested concrete tools (also mine). Any errors introduced by simplifying are on me. The original is published under CC BY-SA 4.0, so this retelling is too.¹

Where the whole thing starts

The story begins on a Toyota factory floor. Taiichi Ohno built the Toyota Production System in the postwar years to squeeze waste out of manufacturing — same car, made cheaper and faster.³ Decades later, Ken Schwaber and Jeff Sutherland adapted that lineage into Scrum,⁴ and David Anderson carried Kanban from the same Toyota roots into software.⁵

These are production systems. They answer one question well: how do we ship faster, with less waste? They say nothing about whether you're shipping the right thing. That gap is the reason an entire second discipline exists.

Marty Cagan spent a career drawing the line between discovery (working out what's worth building) and delivery (building it well).⁶ Steve Blank hit the same wall from the startup side and called the fix Customer Development: stop optimizing the build until you've confirmed there's value to build.⁷ Eric Ries packaged that into Lean Startup and gave it a slogan, validated learning.⁸

The practical version is simple. A delivery system makes you faster at building whatever you've already decided to build. It has nothing to say about whether that thing was worth building in the first place. So when a company's growth stalls and the answer is to roll out Scrum or Agile, the growth usually stays stalled. You've made the factory quicker, but the factory was never the problem. A fast line pointed at the wrong product just reaches the wrong product sooner.

That's where PAF starts, and it points the idea straight at the people running the company: you bought a way to build faster and hoped it would fix a problem about what to build, then wondered why nothing changed. Cagan, Blank, and Ries each said a version of this years ago, so the insight isn't new. What PAF adds is the aim at leadership, a message most organizations still haven't taken in.

PAF's own framing is that managing a product is closer to cultivation than production — you're growing something in conditions you don't fully control, not assembling it on a line. I find that metaphor does real work, so I'll keep using it.

The one idea everything hangs on

Keep one always-current knowledge base per product. Everything the company knows about the market, the product, and how it makes money lives in one place that stays up to date, instead of rotting across old decks, Slack threads, and people's heads. PAF calls this the Nexus; you can call it a living source of truth.

"Single source of truth" is old engineering hygiene, and Cagan has preached "deep product knowledge" for years. What makes it urgent now is the 2025 turn toward context engineering — the reframe, pushed by people like Shopify's Tobi Lütke⁹ and Andrej Karpathy,¹⁰ that once the models are capable enough, your bottleneck isn't how you word the prompt, it's the quality and freshness of the context you feed it.¹¹ PAF's contribution is applying that to a whole product org rather than a single chat window. The line I'd underline from the original: the company with better context, managed better, wins.

PAF also asks you to track how good that context is on two axes: how complete it is and how recently it was refreshed. An entry that was true six months ago and never updated is a liability, not an asset. That's a genuinely useful discipline most knowledge bases lack.
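One way to make the two axes concrete is sketched below. It assumes each knowledge-base entry declares the sections it is expected to cover and records when someone last confirmed it. The `Entry` class, its field names, and both thresholds are illustrative choices of mine, not part of PAF.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:
    """One knowledge-base topic (hypothetical structure, not PAF's)."""
    name: str
    expected_sections: set[str]   # what the entry should cover
    filled_sections: set[str]     # what it actually covers today
    last_reviewed: date           # last time someone confirmed it is still true

    def completeness(self) -> float:
        """Share of expected sections that are filled, from 0.0 to 1.0."""
        if not self.expected_sections:
            return 1.0
        covered = self.expected_sections & self.filled_sections
        return len(covered) / len(self.expected_sections)

    def age_days(self, today: date) -> int:
        """Days since the entry was last reviewed."""
        return (today - self.last_reviewed).days


def needs_attention(entry, today, max_age_days=90, min_completeness=0.8):
    """Flag an entry that is too old or too thin (both thresholds are assumptions)."""
    return (entry.age_days(today) > max_age_days
            or entry.completeness() < min_completeness)


# Made-up example entry.
pricing = Entry(
    name="pricing",
    expected_sections={"plans", "competitors", "willingness_to_pay"},
    filled_sections={"plans", "competitors"},
    last_reviewed=date(2025, 1, 10),
)
today = date(2025, 7, 10)
print(pricing.name, round(pricing.completeness(), 2), pricing.age_days(today))
print(needs_attention(pricing, today))
```

Keeping the two numbers separate, rather than blending them into one score, makes it obvious whether an entry needs more research or just a re-check.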

The loop, in plain words

Here's the working cycle, with the original terms dropped.

1. Measure the gap. State where the product should be — that's vision plus goals, the OKR practice Andy Grove invented at Intel and John Doerr spread.¹² Compare it to where the product actually is. The distance between them is your gap.

2. Write a fresh shortlist, not a backlog. Instead of grooming an aging pile of stakeholder requests, you regenerate a short list of changes from current goals plus current evidence. Itamar Gilad makes this case directly in GIST Planning: long backlogs and roadmaps accumulate stale, unsourced items, so you're better off deriving the list anew each cycle.¹³

This is PAF's "there is no backlog" claim, and I'll be honest about it — it's half real, half marketing. A set of candidate changes carried over time with confidence scores attached is a backlog. A better one. Declaring backlogs abolished is showmanship. But the underlying habit — kill stale low-provenance items, regenerate from the current state — is correct and worth adopting.

3. Score by confidence and attack the weak ones. For each candidate, ask two questions: how sure are we this works, and on what evidence? Gilad's Confidence Meter is the rubric PAF actually cites — it ranks evidence from mere self-conviction up through data, experiments, and live results.¹³ ICE and RICE are the lighter-weight cousins.¹⁴ Then have the team try to break each argument and drop what can't survive. Run the shortlist through an AI first with a "give me every reason each of these fails" prompt — that's a pre-mortem, Gary Klein's technique, automated.¹⁵

4. Where confidence stays low, go learn. This is the part PAF frames as scouting rather than prioritizing: your job isn't ranking a wishlist, it's reducing uncertainty until the best move is obvious. Teresa Torres's opportunity solution trees give the structure for that learning,¹⁶ and Blank's original rule still holds — get out of the building and talk to real users. AI can draft the research plan and synthesize the notes. It does not replace talking to customers, whatever the framework implies.

5. Build the survivors. On whatever process you like. PAF leans toward Kanban and an event-based rhythm, so a release can land whenever it's ready instead of waiting for a sprint boundary.⁵

6. Ship with a real offer, then measure and fold back. Package each change with an actual offer and a distribution plan — value first, monetization second, which is just product/market fit.¹⁷ Then measure the impact, write what you learned back into the knowledge base, and the next shortlist regenerates from the new state. The loop turns again.
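Steps 2 and 3 of the loop can be sketched in a few lines. This is a minimal illustration, not PAF's or Gilad's procedure: it uses RICE as the scoring formula because its definition is simple (reach times impact times confidence, divided by effort), and it treats "stale or unsourced" as "no evidence note, or evidence older than a cutoff". The field names, the 180-day cutoff, the shortlist size, and every example row are assumptions made up for the demo.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    """A proposed change plus the evidence behind it (fields are illustrative)."""
    title: str
    reach: float          # users affected per quarter
    impact: float         # relative effect per user
    confidence: float     # 0.0 to 1.0, set from the strength of the evidence
    effort: float         # person-months
    evidence: str         # link or note; empty means no provenance
    evidence_date: date   # when that evidence was gathered

    def rice(self) -> float:
        """RICE score: reach * impact * confidence / effort."""
        return self.reach * self.impact * self.confidence / self.effort


def regenerate_shortlist(candidates, today, max_evidence_age_days=180, size=5):
    """Drop unsourced or stale items, then rank the rest by RICE."""
    fresh = [
        c for c in candidates
        if c.evidence and (today - c.evidence_date).days <= max_evidence_age_days
    ]
    return sorted(fresh, key=lambda c: c.rice(), reverse=True)[:size]


# Invented candidates, for illustration only.
candidates = [
    Candidate("Annual billing", 2000, 1.0, 0.8, 2.0, "pricing survey", date(2025, 5, 1)),
    Candidate("Dark mode", 5000, 0.5, 0.5, 1.0, "", date(2025, 6, 1)),
    Candidate("SSO", 300, 3.0, 0.5, 3.0, "sales-call notes", date(2024, 3, 1)),
    Candidate("Onboarding checklist", 4000, 0.5, 0.5, 1.0, "A/B test", date(2025, 6, 15)),
]
shortlist = regenerate_shortlist(candidates, today=date(2025, 7, 1))
for c in shortlist:
    print(f"{c.title}: {c.rice():.0f}")
```

In this run the unsourced item and the one resting on year-old evidence never reach the ranking, which is the habit behind the "no backlog" claim. In practice the `confidence` value would come from a rubric such as the Confidence Meter instead of being typed in by hand.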

Two numbers carry the weight

One metric that ties product value to money. PAF calls it a monetizable North Star Metric; the North Star itself is generally credited to Sean Ellis, with the practical playbook later written up at Amplitude.¹⁸

And a confidence score per change, living in the same table as the shortlist.¹³

PAF also wants net present value (NPV) per change. I'd skip that one. NPV per feature demands you forecast and discount cash flows for each individual change, and almost no team can attribute revenue to features cleanly enough to make that honest. It's false precision dressed as rigor.
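To see why the precision is fragile, here is a small sketch of the standard NPV calculation applied to one hypothetical feature under two revenue guesses. The build cost, discount rate, and revenue figures are all invented for illustration.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now, cash_flows[t] after t periods."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


build_cost = -50_000   # spent today (made-up figure)
rate = 0.10            # annual discount rate (assumption)

# Same feature, two guesses at the yearly revenue attributable to it.
optimistic = npv(rate, [build_cost, 25_000, 25_000, 25_000])
cautious = npv(rate, [build_cost, 18_000, 18_000, 18_000])

print(round(optimistic))
print(round(cautious))
```

The first guess gives a positive NPV and the second a negative one, so the decision turns entirely on an attribution number that most teams cannot measure.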

What happens to the PM

Smaller teams owning more is Amazon's two-pizza idea¹⁹ crossed with the T-shaped specialist who goes deep in one area and broad across several.²⁰ The genuinely new — and least proven — twist PAF rides is that an AI layer now lets one builder run the entire loop end to end, so the role drifts from "product manager who coordinates" toward "product engineer who builds." Whether that generalizes across large orgs is unsettled. For a solo product it's not a theory; it's just how the work already goes.

If you wanted to actually run this

The original is deliberately silent on tooling, which is fair for a methodology but useless if you want to start Monday. Here's a concrete stack. None of it requires buying anything called a Nexus.

Knowledge base. A folder of Markdown in a git repo, one file per topic — market, segments, value prop, business model, each major feature, the metrics tree. Markdown because every model reads it natively; git because you get history and diffs for free. Put Obsidian on top for a nicer local surface, or Notion if you want relational databases and don't mind it being the store of record. Solo, the cheapest viable version is a single AI project whose knowledge is that repo.

The AI layer, built as a ladder, not a leap.

Floor: an AI project with the knowledge base loaded and a written operating prompt. Ask it to analyze, draft, and find gaps.
Next: point a coding agent like Cursor at the repo so it can read and write the files: update an entry, reconcile a contradiction, draft the next shortlist.
Next: wire live sources through MCP²¹ — analytics, GitHub, your issue tracker — so the model pulls current numbers and writes findings back instead of you copy-pasting.
Top: scheduled jobs (cron, n8n, or CI) that refresh the relevant files nightly and flag what's gone stale. That stale flag is PAF's context-freshness idea, made real.
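The top rung can start as a very small script. The sketch below assumes a convention that is mine, not the original's: every Markdown file in the repo carries a line such as `last_reviewed: 2025-06-01`. The function name, the 90-day limit, and the demo files are hypothetical.

```python
import re
import tempfile
from datetime import date
from pathlib import Path

# Assumed convention: each file carries a line like "last_reviewed: 2025-06-01".
REVIEWED = re.compile(r"^last_reviewed:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def find_stale(repo_dir, today, max_age_days=90):
    """Return (path, reason) for every Markdown file that is stale or undated."""
    stale = []
    for path in sorted(Path(repo_dir).rglob("*.md")):
        match = REVIEWED.search(path.read_text(encoding="utf-8"))
        if match is None:
            stale.append((path, "no last_reviewed date"))
            continue
        age = (today - date.fromisoformat(match.group(1))).days
        if age > max_age_days:
            stale.append((path, f"last reviewed {age} days ago"))
    return stale


# Demo on a throwaway folder standing in for the knowledge-base repo.
with tempfile.TemporaryDirectory() as tmp:
    kb = Path(tmp)
    (kb / "market.md").write_text("last_reviewed: 2025-06-20\n\n# Market\n", encoding="utf-8")
    (kb / "pricing.md").write_text("last_reviewed: 2025-01-05\n\n# Pricing\n", encoding="utf-8")
    (kb / "segments.md").write_text("# Segments\n", encoding="utf-8")
    flagged = [(p.name, why) for p, why in find_stale(kb, today=date(2025, 7, 1))]

print(flagged)
```

Run nightly from cron or CI against the real repo with `date.today()`, the returned list is what a reviewer or an agent would pick up next. An explicit review date is used here instead of file modification time because a fresh clone resets modification times.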

Shortlist and confidence. One table — a Notion database, a Linear view, or even a CSV — where every candidate links to its evidence, with Gilad's Confidence Meter as the scoring rubric.

Discovery. Opportunity solution trees, AI-assisted synthesis, real user contact.

Metrics. One North Star plus an input-metric tree, tracked in PostHog (open-source, self-hostable) or Amplitude.
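An input-metric tree is easy to keep as plain data next to the knowledge base. The sketch below is one possible shape, with invented metric names. `path_to` returns the chain from the North Star down to the lever a proposed change claims to move, so every item on the shortlist can name its lever explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """A node in the metric tree (all names below are invented examples)."""
    name: str
    children: list["Metric"] = field(default_factory=list)


def path_to(root, target, trail=()):
    """Return the chain of metric names from the North Star down to `target`."""
    trail = trail + (root.name,)
    if root.name == target:
        return list(trail)
    for child in root.children:
        found = path_to(child, target, trail)
        if found:
            return found
    return None


north_star = Metric("weekly paid active teams", [
    Metric("new teams activated", [Metric("signup-to-first-project rate")]),
    Metric("teams retained", [Metric("week-4 return rate")]),
    Metric("free-to-paid conversion"),
])

print(path_to(north_star, "week-4 return rate"))
```

A change that cannot be placed on any path in the tree is a signal that either the tree or the change needs another look.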

Delivery. Kanban.

Skip. NPV-per-feature, and the entire invented vocabulary.

What this does to the job

If the operating model changes, the job changes with it. For years the work was to operate one product: groom the backlog, write the specs, walk each release through delivery. Once the living knowledge base and the AI layer from the last sections are doing that grunt work, that's not where your hours go anymore. The job moves up a level, from running the day-to-day output to running the system that produces it.

So where does the job actually go? Mostly in three directions. They aren't three separate careers, more like three ways the same role can stretch, and at a small company one person might cover all three at once.

You own one product's brain. Instead of managing a backlog, you build and maintain the living knowledge base for a single product, and you set up the AI that works on top of it. That means writing the operating instructions, deciding which tasks you hand to the agents and which you keep for yourself, and checking what comes back. There's still a human in charge of the product. That human just spends the day on the context and the agents rather than on tickets.

You run a portfolio like an investor. Because the day-to-day shrinks, one person can look after several products at once. The question shifts from "what do we build next in this product" to "which of these products deserves more money, and which do we wind down." That's closer to running a small business unit than to classic product work, and it puts you in front of leadership owning the numbers.

You keep the whole system honest. Someone has to build the templates that every product's knowledge base follows, keep them current, and watch whether the process is actually working. This is a Product-Ops job. Think of it as a scrum master, except the thing being coached is how well the company uses AI, not how well it runs its standups.

One more split runs underneath all three. Building AI into the product, where the work is model quality and new interface patterns, is a different job from managing the AI-run operating model around a product. They look like one job today because they share a title, but they pull in different directions.

Here's how the three directions map onto the stack from the last section:

What you govern	What you actually own
One product's context	The knowledge-base repo, the operating instructions, and the agents and MCP wiring that feed them live data
A portfolio of products	A shared business-model profile across them, plus the budget call between them
The system itself	The templates, the freshness checks, and the process metrics that keep every product's knowledge base trustworthy

This isn't a lonely prediction. The people who've thought hardest about AI and product teams keep arriving at a similar place from different starting points. Marty Cagan runs the numbers from the staffing side and figures a company that needs fifteen to twenty product teams today could do the same work with three to five, each one staffed by people who are good at a handful of things and let AI cover the gaps.²² His real point isn't headcount. It's that the org gets flatter and the typical person on it gets more senior, because the junior, execution-heavy work is the first thing AI absorbs. Andrew Ng comes at it from the other end and puts it simply: when writing the software stops being the expensive part, the expensive part becomes knowing what to build, and knowing what to build is supposed to be the product manager's whole job.²³

You can already see the squeeze in the hiring data. Roughly thirty-five qualified people are chasing every open PM role right now, and the demand that's left leans senior. Companies want someone who can run the system, not someone who still has to be taught it.²⁴ That fits a pattern studies of AI at work keep turning up: it sharpens people who already know enough to catch its mistakes, and it lulls the ones who don't into shipping something confidently wrong. So the role isn't disappearing. It's moving up and out of reach of anyone who was counting on a few quiet years at the bottom to learn the craft.

What I'd keep and what I'd drop

Keep three things. The delivery-versus-discovery split, because most teams genuinely confuse the two and then blame the wrong system. Context as the competitive asset, because it's the part the AI shift makes true in a way it wasn't five years ago. And the reframe from waste to risk: you're reducing uncertainty about the future, not accounting for losses in the past.

The vocabulary I'd leave behind. The framework's stated aim is to build common language across methodologies — a reasonable goal — but the route is to mint a dozen new terms, which pulls in the other direction. That's not a criticism specific to PAF; most framework authors end up here, often despite themselves. Tikhomirov is a serious practitioner with the track record to back it,² and the substance holds up without the costume.

Read the original for the depth.¹ Read the people in the footnotes for the foundations. And if you only take one sentence away: the cheap work — analysis, first-pass research, drafting — now goes to the machine, so your edge stops being how much you produce and becomes how good your context is.

Glossary

Discovery vs. delivery — Discovery is deciding what's worth building; delivery is building it well. Different jobs, different systems.
Product/market fit (PMF) — The point where a product satisfies a strong market demand. Until you have it, scaling and monetizing are premature.
North Star Metric (NSM) — The single measure that best captures the value your product delivers to users, used to align the whole team.
Input-metric tree — The decomposition of a North Star into the smaller, ownable metrics that feed it, so each proposed change maps to a specific lever it's meant to move.
OKRs — Objectives and Key Results. A goal (the objective) paired with a few measurable outcomes that prove you reached it.
Confidence Meter — Itamar Gilad's scale for how much you should trust an idea, ranked by the strength of evidence behind it rather than how strongly someone feels.
Pre-mortem — Before committing, imagine the project has already failed and work backward to list why. Surfaces risks that optimism hides.
Opportunity solution tree — Teresa Torres's map linking a desired outcome to user opportunities and then to candidate solutions, so discovery stays tied to goals.
Kanban — A delivery method that pulls work continuously and limits work-in-progress, rather than batching it into fixed-length sprints.
MCP (Model Context Protocol) — An open standard for connecting AI models to external tools and data sources.
RAG (retrieval-augmented generation) — Letting a model fetch relevant documents at query time and answer from them, instead of relying only on what it memorized.
NPV (net present value) — The value today of a stream of future cash flows, after discounting for time and risk.

1. Sergey Tikhomirov, Product Architecture Framework: AI Product Operations. https://productframework.ru/ops/main. Distributed under CC BY-SA 4.0.
2. Sergey Tikhomirov, about page: product methodologist, founder of the GRACHI consulting agency, MBA instructor, author of PAF, and writer of the "Борода продакта" (Product's Beard) channel. https://productframework.ru/about.
3. Taiichi Ohno, Toyota Production System: Beyond Large-Scale Production (1978).
4. Ken Schwaber and Jeff Sutherland, The Scrum Guide. https://scrumguides.org.
5. David J. Anderson, Kanban: Successful Evolutionary Change for Your Technology Business (2010).
6. Marty Cagan, Inspired: How to Create Tech Products Customers Love (2008; rev. 2017). Silicon Valley Product Group, https://www.svpg.com.
7. Steve Blank, The Four Steps to the Epiphany (2005); the origin of Customer Development. https://steveblank.com.
8. Eric Ries, The Lean Startup (2011). https://theleanstartup.com.
9. Tobi Lütke, CEO of Shopify. In a 2025 internal memo that became widely circulated, Lütke framed AI fluency, and specifically the ability to construct good context, as a core expected skill at Shopify.
10. Andrej Karpathy, former Tesla AI director and OpenAI co-founder, has written and spoken extensively about "context engineering" as the practice of deliberately constructing the LLM context window, arguing that this, not prompt wording, is the real leverage point.
11. "Context engineering" as a framing gained currency in 2025, shifting the conversation from how you word a prompt to how well you curate and maintain the information you supply to the model.
12. Andy Grove originated OKRs at Intel; John Doerr popularized them in Measure What Matters (2018). https://www.whatmatters.com.
13. Itamar Gilad, GIST Planning and the Confidence Meter. https://itamargilad.com. The PAF guide cites his confidence work directly: https://itamargilad.com/how-much-product-discovery/.
14. ICE (Impact, Confidence, Ease) was popularized by Sean Ellis; RICE (Reach, Impact, Confidence, Effort) was introduced by Intercom.
15. Gary Klein, "Performing a Project Premortem," Harvard Business Review (September 2007).
16. Teresa Torres, Continuous Discovery Habits (2021); originator of opportunity solution trees. https://www.producttalk.org.
17. The phrase "product/market fit" was popularized by Marc Andreessen (2007, "The Only Thing That Matters"); the underlying concept traces to Steve Blank's Customer Development.
18. The North Star Metric is generally credited to Sean Ellis; a widely used practical playbook was later published by John Cutler at Amplitude.
19. Two-pizza teams (small enough to be fed by two pizzas) is an Amazon operating principle associated with Jeff Bezos.
20. The "T-shaped" skill profile (deep in one area, broad across many) was popularized by Tim Brown of IDEO.
21. Model Context Protocol, an open standard for connecting AI models to external tools and data. https://modelcontextprotocol.io.
22. Marty Cagan, "A Vision For Product Teams," Silicon Valley Product Group (2025). https://www.svpg.com/a-vision-for-product-teams/. The team-size collapse, the cost math, and the comb-shaped framing are his.
23. Andrew Ng, in conversation on Lenny's Podcast (2025): as the cost of building falls, product judgment becomes the constraining step rather than engineering throughput.
24. Drawn from Lenny Rachitsky's "State of the Product Job Market" tracking (2025–2026): a large surplus of candidates per PM opening, with senior demand rising while junior and associate demand stays soft.