What's the main difference between Great Question and Dovetail?

Great Question is a full research platform: recruitment, scheduling, unmoderated testing, moderated interviews, surveys, AI analysis, and a searchable repository — all in one. Dovetail is a research repository that doesn't run studies or recruit participants. Teams using Dovetail still need 3-4 other tools for the actual research.

Does Dovetail have participant recruitment?

No. Dovetail retired its Recruit beta, so teams needing recruitment use a separate tool. Great Question includes a built-in research CRM with panel management, screener surveys, scheduling, and global incentive payments.

Can I migrate from Dovetail to Great Question?

Yes. Great Question offers migration support for teams switching from Dovetail. Interviews, transcripts, highlights, and tags transfer over, and all recordings are re-transcribed using Great Question's AI. Start at least 90 days before your Dovetail contract ends.

Is Dovetail or Great Question better for AI analysis?

Great Question's AI auto-tagging learns your taxonomy over time because it captures research natively rather than importing files. Great Question also offers AI moderation for unmoderated studies — something Dovetail doesn't have.

Which one is better for ResearchOps teams?

Great Question. ResearchOps teams are usually asked to consolidate the stack. A repository-only tool like Dovetail leaves ResearchOps managing 3-4 additional tools. Great Question covers Recruitment & Admin, Tools & Infrastructure, and Data & Knowledge Management in one platform.

LIVE EXPERIMENT - PART 2 of 4

Four ways to build a synthetic user (I tried all of them)

Building our synthetic user in public, Part 2. Part 1 mapped the ways to build one. This time I built each one and put it through its paces.

Follow the experiment→

Tania

PMM · Great Question

June 2026 · ~12 min read

Four parts · One live experiment

We're building our synthetic user in public, start to finish.

The map

The vocabulary, the ways to build a synthetic version of your customer, and the priors going in.

YOU ARE HERE

Four ways to build one

I built each of the four workflows, and worked out what each is good for and where each falls apart.

Robot vs human

An experiment putting synthetic users and real humans head to head. Where do they diverge?

The recap & the skill

Concludes the live experiment. The synthetic user skill ships, and everyone on the list gets it.

Data scientists have spent years trying to predict what a user will do next from what they've done before. Now that AI is widely available, it's hard not to feel optimistic about blending past user behaviour to predict future behaviour and feedback at a much deeper level. I like to think of synthetic users as data science on steroids.

What data science & synthetic users have in common

When it comes to building a synthetic user, we're essentially taking behavioral prediction (product usage) and giving it much better raw material (interview transcripts).

The old way mostly had the numbers to work with: what people clicked and how often. A grounded synthetic user adds the why behind the numbers: the actual words customers used in interviews, and the context that only qualitative data can capture.

Caitlin Sullivan framed it for me that way in Part 1, and it reset how I thought about the whole project. It's the same goal the data teams have always chased, just with far richer input.

The ground rules (they apply to all four)

A synthetic user is only as honest as the evidence under it, so the same non-negotiables went into every version before I picked a workflow:

Evidence-backed claims only. No source, no claim.

Cite every claim inline, with the quote attached.

Flag the gaps. When the evidence is thin, the synthetic user says so instead of inventing a pattern.

Qualitative confidence threshold. When I run a qualitative study, I always want to know how many people actually said something before I trust it as a pattern. The skill does the same thing. It tells me how many interviews are behind every claim it makes, and it won't call something a pattern unless enough of them back it up. I set the bar at 8: hit 8 interviews and it states the point plainly, anything below that and it categorises the theme as medium or low confidence.
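The counting rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual code: the `Quote` record, its field names and the `confidence_by_theme` function are hypothetical. The bar of 8 comes from the rule above; the cut-off between medium and low (3 interviews) is an assumption made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

HIGH_CONFIDENCE_MIN = 8    # the bar stated in the ground rules
MEDIUM_CONFIDENCE_MIN = 3  # assumption: 3 to 7 interviews reads as "medium"


@dataclass(frozen=True)
class Quote:
    """One tagged quote. The field names are hypothetical."""
    interview_id: str
    theme: str
    text: str


def confidence_by_theme(quotes):
    """Count distinct interviews per theme and attach a confidence label."""
    interviews = defaultdict(set)
    for quote in quotes:
        # A set means five quotes from one interview still count once.
        interviews[quote.theme].add(quote.interview_id)

    report = {}
    for theme, ids in interviews.items():
        count = len(ids)
        if count >= HIGH_CONFIDENCE_MIN:
            label = "high"
        elif count >= MEDIUM_CONFIDENCE_MIN:
            label = "medium"
        else:
            label = "low"
        report[theme] = (count, label)
    return report
```

Counting distinct interviews rather than quotes is the design choice that matters here: one talkative participant shouldn't be able to turn a single opinion into a pattern.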

I've built all of these requirements into a synthetic user skill that I'll be testing for the duration of this live experiment. We'll make it available to you when Part 4 lands and concludes the experiment.

Why I'm building the synthetic user from a research repository, not a pile of transcripts

The first question I asked Jack, an AI Product Manager on the Great Question team, was why I can't just query a whole bunch of transcripts from GitHub or Google Drive.

The DIY version is to drop a folder of transcripts into Claude and start asking questions. It works for about three transcripts. Past that you hit "lost in the middle," where the model skims the middle of a long document and quietly fills the gaps with things that sound right. You won't catch it, because the invented parts read exactly like the real ones. Jack, who built our repository retrieval system, said this:

"If you don't build a RAG pipeline that knows what it's doing? It's going to be hallucinating left and right. And you won't know."

Jack · AI Product Manager, Great Question

A repository earns its place by doing the unglamorous work that keeps that from happening:

Hybrid search. Keyword and semantic together. Pure semantic search feels clever but loses the exact-string matches that let you anchor a claim to the precise sentence a customer said. You want both running.

Server-side filtering. Rather than shipping a 90-minute transcript to the model and hoping, the repo narrows to the relevant chunks first, so the model only ever reasons over material it can actually hold in context.

Structured metadata. Studies, segments, dates, participants. You can scope a query to "B2B researchers, last 18 months" instead of praying the right transcripts surface on their own.

A curated layer. Insights and highlights you've already validated sit on top of the raw transcripts, so the synthetic user draws on evidence that's been checked, not just whatever the search happened to return.

Citations that resolve. Every claim links back to the session it came from, which is the whole difference between a synthetic user you can audit and one you have to trust blind.
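To make the first three items concrete, here is a small Python sketch of scoping by metadata and then merging a keyword ranking with a semantic ranking. It is an illustration only: the article doesn't say how Great Question combines the two searches, so the sketch uses reciprocal rank fusion, one common way to do it. The `Chunk` shape and the `scope` and `fuse_rankings` names are hypothetical, and the two input rankings are assumed to come from whatever keyword and semantic search you already have.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Chunk:
    """A transcript excerpt plus the metadata used for scoping (hypothetical shape)."""
    chunk_id: str
    segment: str
    session_date: date
    text: str


def scope(chunks, segment, since):
    """Metadata filter: keep chunks from one segment, recorded on or after `since`."""
    return [c for c in chunks if c.segment == segment and c.session_date >= since]


def fuse_rankings(rankings, k=60):
    """Reciprocal rank fusion: each ranked list adds 1 / (k + rank) to a chunk's score."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))
```

You would call `scope(...)` first, run both searches over what's left, then pass the two lists of chunk ids to `fuse_rankings`. A chunk that sits near the top of both lists outranks one that tops only one of them, which is the behaviour you want when an exact quote also matches the meaning of the question.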

The DIY route can get there, but only by building your own version of all this. Anything you'd actually rely on, and especially anything high-stakes, means building your own RAG: server-side filtering, citation plumbing, metadata, the lot. That's a real engineering project before you've even started on the synthetic user. A repo is that project already finished, which is why all four workflows below run on top of one.

By the way, we have experimented with a lightweight synthetic persona before, built from a collection of 8-10 interview transcripts from a previous study. It felt thin to me. My intention with this series is to build something meatier, grounded in MUCH more data than 8-10 raw transcripts.

Jargon, decoded

Two terms worth getting straight.

RAG (retrieval-augmented generation)

Instead of relying on what the model already knows, you pull the relevant pieces of your own data (your interviews and notes) and feed them in alongside the question, so the answer is built on your evidence.

Lost in the middle

Language models read the start and end of a long document closely and skim the middle. Hand one a 90-minute transcript and the middle is exactly where it's most likely to miss something or quietly make it up.
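Both terms come together in one small pattern: split long transcripts into chunks, pull only the chunks that match the question, and put those in front of the model with an instruction to cite them. The Python below is a toy sketch of that flow, not a production retriever. All function names are hypothetical, and the word-overlap scoring is a stand-in for real keyword or semantic search.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def split_into_chunks(transcript, max_words=120):
    """Cut a long transcript into short pieces so nothing sits in a skimmed middle."""
    words = transcript.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def retrieve(question, chunks, top_k=3):
    """Return up to top_k chunks sharing words with the question, best match first."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(chunk)), index, chunk)
              for index, chunk in enumerate(chunks)]
    matches = [item for item in scored if item[0] > 0]
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in matches[:top_k]]


def build_prompt(question, evidence):
    """Assemble a prompt that contains only retrieved evidence, numbered for citation."""
    if not evidence:
        return (f"Question: {question}\n"
                "No matching evidence was found. Say so instead of guessing.")
    numbered = "\n".join(f"[{n}] {chunk}" for n, chunk in enumerate(evidence, start=1))
    return ("Answer using only the evidence below and cite it by number.\n"
            f"Evidence:\n{numbered}\n"
            f"Question: {question}")
```

The empty-evidence branch is the important part: when retrieval finds nothing, the prompt tells the model to admit the gap rather than fill it.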

The build

4 ways to build a synthetic version of your customer

Here are the four ways I experimented with, along with their pros and cons.

Workflow 1

Digital twin

How I built it

Take one real, named user you have deep data on. Strip the PII (any personal details that identify them), store what's left as a synthetic-user document, and tell the agent to answer as that person. It's the highest-fidelity option because it's grounded in one real human rather than an average.

Best data to ground it

This one goes deep on a single person, not wide. You want everything you have on them: their interview transcripts, their product-usage history (feature adoption, drop-offs, the Mixpanel trail), their support tickets, and their CRM and sales-call notes. The richer the single-person record, the more convincing the twin.

The trade-off

Its strength is fidelity. Nothing gets you closer to a specific person's perspective, which makes it ideal when a key account or a design partner needs a seat in a roadmap conversation. Its limit is that it's exactly one person, quirks and all, so it can't speak for a segment.
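The "strip the PII" step in this workflow could start with something like the Python sketch below. Treat it as a first pass under loud assumptions: the regular expressions only catch obvious email addresses and phone numbers, the list of names has to be supplied by you, and `strip_pii` and the "Participant A" handle are hypothetical. Anything headed for a real twin document would still need a human review.

```python
import re

# Deliberately simple patterns; they will miss unusual formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def strip_pii(text, known_names, handle="Participant A"):
    """Replace emails, phone numbers and known names with neutral placeholders."""
    text = EMAIL.sub("[email]", text)
    text = PHONE.sub("[phone]", text)
    # Longest names first, so "Dana Lee" is replaced before "Dana".
    for name in sorted(known_names, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", handle, text, flags=re.IGNORECASE)
    return text
```

Emails are replaced before names on purpose: otherwise a name inside an address would be swapped for the handle and the address would no longer match the email pattern.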

Workflow 2

Segment-based synthetic user

How I built it

Aggregate eight to ten (or more) real users from the same segment into one synthetic user. You're building an archetype from a cluster of evidence, then writing it up as a single coherent person whose every trait traces back to the underlying interviews.

Best data to ground it

Breadth is the whole game here. Transcripts across the segment for language and goals, the insights and highlights you've already curated for themes that are validated, candidate and demographic data so no single sub-group dominates the mix, and product-usage data so the behavior is real and not self-reported.

Step zero is an audit: do you actually have eight or more solid sessions on the segment you want? If not, your next step would be to fill that research gap, so you have solid foundations to build upon.

Types of segment-based synthetic users:

Power users: gather the data on your 'best' customers, how they're using the product, what they say, what they love, what feedback they've given in the past.

Casual users: gather product usage data on a segment of your less frequent users.

Churned users: pull out churn surveys, customer interview transcripts or closed-lost interviews.

I like segment-based synthetic users because you can run a PRD, an artifact, or a concept past all 3 and then compare the insights. I love experimenting with Perplexity's model council for the same reason. If you haven't used it yet, Perplexity's model council runs any query through 3 models so you can triangulate and sharpen your point of view based on the models arguing against each other. Fun stuff.

The trade-off

Because it's built on a few dimensions of data, the patterns are stable enough to trust…with a grain of salt of course.

Its limit is that you sand off the sharp edges of any one person, and it's only as good as your data's coverage of that segment. Thin segment = thin synthetic user.

Again, at best this could be used internally to drive customer empathy and provide some directional feedback on new concepts. We know from Part 1 that nuance is where synthetic users unfortunately fall down…today.

Workflow 3

Synthetic panel

How I built it

Sample several synthetic users from a segment (built on survey data), and run them through the same study together, the synthetic version of a recruited panel. Instead of one voice, you get a spread of them answering the same questions.

Best data to ground it

This one is built purely off a large survey, where you'd have statistical significance, plus enough demographic and behavioral variation in the source data to make the panel genuinely diverse.

The trade-off

Its strength is distribution. You get a range of responses rather than a single point estimate, which is exactly what you want for dry-running a survey or catching a broken question before a human ever sees it. Its limit is that the diversity is capped by your data, and a panel that looks varied but isn't will hand you false confidence.
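One way to keep a panel's spread honest is to sample it in proportion to the groups in the source survey. The Python sketch below shows that idea with a plain stratified sample. It is an assumption about how you might do it, not a description of the skill: `sample_panel`, the dictionary-shaped respondents and the `role` key are all hypothetical.

```python
import random
from collections import defaultdict


def sample_panel(respondents, key, panel_size, seed=0):
    """Pick respondents so each group's share of the panel mirrors the survey."""
    rng = random.Random(seed)  # fixed seed, so the same panel can be rebuilt later
    groups = defaultdict(list)
    for respondent in respondents:
        groups[respondent[key]].append(respondent)

    total = len(respondents)
    panel = []
    for name in sorted(groups):
        members = groups[name]
        # Proportional quota, with at least one seat per group.
        quota = max(1, round(panel_size * len(members) / total))
        panel.extend(rng.sample(members, min(quota, len(members))))
    return panel
```

Because of rounding and the one-seat minimum, the panel can come out slightly larger or smaller than `panel_size`. It also can't invent variety: if a group has three respondents in the survey, three is the most the panel will ever hold.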

Workflow 4

Live retrieval

How I built it

No stored user at all. A skill queries the whole repo live, contextualized to whatever artifact you feed it, a PRD or a design spec, and assembles the relevant customer evidence on the spot. This is the one Ned demoed on our recent synthetic user webinar.

Watch Ned demo it on our synthetic user webinar →

He pasted in a PRD, the skill built synthetic users from the matching screener and interview data, and each one reacted to the parts of the PRD he put through.

Best data to ground it

This leans on the whole repo rather than one stored doc, so it lives or dies on the indexing from the section above, the hybrid search and server-side filtering especially. Whatever's freshest in the repo is what it pulls, which is the point. You want it reacting to your latest evidence, not a stored persona document from last year.

The trade-off

Its strength is that it's always current and contextual. You point it at the artifact in front of you and it reacts to that, with no document to maintain. Its limit is consistency. We experimented with it a few times while prepping the webinar, using the same repository and a similar prompt, and the outputs came back consistent-ish but not the same. The themes held every run; the exact wording and the examples it reached for moved around. Useful for a directional gut-check, risky the moment two people quote their own separate runs as the source of truth.

So which one do you reach for?

It comes down to what you're holding when you start. A specific person's perspective points to a digital twin. A whole segment points to the segment-based build. A need for spread, like dry-running a survey, points to a synthetic panel. A live artifact you want reactions to points to live retrieval. And for any high-stakes go/no-go, real users still win. Synthetic users are where you can start, not where you make the final decision.

The consistency question

Does it matter if everyone gets a slightly different answer?

This is the part I keep going back and forth on. Live retrieval gives a slightly different answer each time you run it. Does that actually matter? Especially if that data is the most recent?

I saw a small version of this in our own testing. We ran live retrieval a few times while prepping the webinar, same repo, similar prompt, and the answers came back close but not identical. The big themes held every time, which was at least comforting. The exact wording and the examples moved around. That's fine if you just want a quick gut-check. It's a problem if two people each run it and quote their own version as the truth.

So my answer is yes, consistency matters, but not all the time. For a quick directional read, a slightly different answer each time is fine. For anything the whole team is building around, you want one version everyone trusts. That's why I think we'll keep one saved synthetic user as the official reference, and use live retrieval for one-off questions on top of it.
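If you want to put a number on "close but not identical", one simple option is to compare the themes from two runs as sets. The Python below is a rough sketch that assumes each run's output has already been boiled down to a list of theme labels by hand; the function names are hypothetical and this is not something we measured for the webinar.

```python
def theme_overlap(run_a, run_b):
    """Jaccard similarity: shared themes divided by all themes seen in either run."""
    a, b = set(run_a), set(run_b)
    if not a and not b:
        return 1.0  # two empty runs agree trivially
    return len(a & b) / len(a | b)


def stable_themes(runs):
    """Themes that show up in every single run."""
    theme_sets = [set(run) for run in runs]
    return set.intersection(*theme_sets) if theme_sets else set()
```

A score of 1.0 means both runs surfaced exactly the same themes, and `stable_themes` gives you the subset that is safest to write into a shared reference.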

From the webinar

The questions the webinar raised

When Ned ran the live demo, the chat filled with sharp and thoughtful questions that we wanted to outline here:

On the data behind a synthetic user

How much cleanup did it take before the output felt decent?

Less than you'd expect. Coverage matters more than polish.

Does it use the repo's transcripts? What about usability-test videos?

Yes to transcripts, they're the backbone. Video works through its transcript today, so the spoken content is in but the on-screen behavior isn't yet.

On trust and accountability

Isn't it easy to over-trust this?

Yes, which is exactly why the skill we've built cites every claim and flags every gap.

Who's accountable when the output informs a real decision?

A genuinely tough question. I think where we're landing is that people are, not AI. Its job is to make the evidence legible enough that a researcher and a PM can share that accountability.

How do you coach people to read it?

Get them looking at the citations and the gaps before the conclusions.

On replacing humans

Is this only for low-stakes calls?

Early discovery isn't always low-stakes, it can set product strategy. So the rule holds: synthetic is the floor, not the ceiling, and the higher the stakes, the faster you go to real humans.

How do I stop budget-holders swapping humans for synthetic to save money, without being called "anti-AI"?

Lean on synthetic where it's strong, like dry runs and gap-finding, and say plainly where it isn't, like anything you'd bet the roadmap on.

The hard ones I can't fully answer yet

Hyper-rationalization: real people reason messily, synthetic ones don't.

Web access: does letting it browse inflate what a synthetic user "knows"?

Model temperature: how do you get realistic variety instead of ten identical voices?

Brand knowledge: how much should a synthetic user know about you?

Language bias: AI tends to reward concrete, action-oriented wording over abstract ideas.

WEIRD bias: model training data skews Western, Educated, Industrialized, Rich and Democratic, and aggregation alone won't fix it.

I don't have clean answers to that last group. What I have is repo-only retrieval, so the model isn't pulling in web knowledge a real customer wouldn't have, plus a gap flag that fires when the evidence isn't there.

Under the hood

The skill behind all this

Everything above runs on a skill I built. Here's how it works:

Point it at the repo, and it builds profiles

It surveys what's in the Great Question repo, groups sessions into clusters by role, workflow and shared pains, and counts the evidence behind each one:

8+ sessions: a high-confidence profile, stated plainly

3 to 7 sessions: a profile, but hedged

1 or 2 sessions: a signal, not yet a finding

No sessions: a gap, flagged for you to go research

Each profile card covers who they are, the world they work in, their pains in their own words, the phrases they use, what they want, what it's safe to use them for, and what it still can't tell you.

Hand it a PRD or design, and a profile reacts

It pulls the relevant evidence with hybrid search (keyword and semantic at once), then responds in the first person as that customer, pushing back where the evidence contradicts the artifact and flagging anything the repo can't speak to.

The rules, every time

Every claim cites the real session it came from, by anonymous speaker handle, never a name. The repo is the only source. If the evidence isn't there, it says so instead of filling the gap.
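These rules are mechanical enough to check in code. The Python sketch below audits a list of claims against the transcripts they cite. It is a hypothetical illustration, not the skill itself: the `Claim` shape, the `audit_claims` name and the `P14`-style handle format are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional

HANDLE = re.compile(r"^P\d+$")  # assumed handle format, e.g. "P14"


@dataclass
class Claim:
    """A statement made by the synthetic user, plus where it came from."""
    text: str
    session_id: Optional[str] = None
    speaker: Optional[str] = None
    quote: Optional[str] = None


def audit_claims(claims, transcripts):
    """Return (claim text, problem) pairs for every claim that breaks a rule."""
    problems = []
    for claim in claims:
        if not (claim.session_id and claim.quote):
            problems.append((claim.text, "no citation"))
        elif claim.session_id not in transcripts:
            problems.append((claim.text, "session not in repo"))
        elif claim.quote.lower() not in transcripts[claim.session_id].lower():
            problems.append((claim.text, "quote not found in session"))
        elif not (claim.speaker and HANDLE.match(claim.speaker)):
            problems.append((claim.text, "speaker is not an anonymous handle"))
    return problems
```

An empty list back from `audit_claims` means every claim carries a quote that really appears in the cited session and is attributed to a handle rather than a name.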

It's not public yet, on purpose

I'm testing it in the open across this series first, so you can see where it holds up and where it doesn't before you run it yourself.

What's coming in Part 3

Part 3 is about the robot vs human experiment I'm running! I'm going to design an experiment and put both synthetic users and humans head to head. It should be interesting!

What I honestly don't know going in is where they'll diverge.

The Synthetic User Series

Catch up · Part 1

Synthetic users: we're building in public

The vocabulary, the ways to build one, and the priors I took into this experiment.

Data scientists have spent years trying to predict what a user will do next from what they've done before. Now that AI is widely available, it's hard not to feel optimistic about blending past user behaviour to predict future behaviour and feedback at a much deeper level. There are lots of similarities we can draw between data science and creating synthetic users.

In Part 1 I outlined the different ways to build some kind of synthetic version of your customer, and said I'd actually go and build them. So that's this edition. I took the four workflows, built a synthetic user with each, and worked out what each is good for and where each falls apart.

The ground rules (they apply to all four)

A synthetic user is only as honest as the evidence under it, so the same non-negotiables went into every version before I picked a workflow:

Evidence-backed claims only. No source, no claim.
Cite every claim inline, with the quote attached.
Flag the gaps. When the evidence is thin, the synthetic user says so instead of inventing a pattern.
Qualitative confidence threshold. When I run a qualitative study, I always want to know how many people actually said something before I trust it as a pattern. The skill does the same thing. It tells me how many interviews are behind every claim it makes, and it won't call something a pattern unless enough of them back it up. I set the minimum evidence bar at 8: hit 8 interviews and it states the point plainly, anything below that and it categorises the theme as medium or low confidence.
Note: this qualitative confidence threshold would not apply to synthetic panels, which need a far larger sample before the results mean anything statistically.

Jargon, decoded

Two terms worth getting straight.

RAG (retrieval-augmented generation)

RAG simply connects an LLM to your specific source documents so it can read them and answer your questions accurately. Put another way, RAG is the bridge that lets the LLM search external files, read them, and use that information to formulate its answer. The benefit is that it curbs "hallucination" (fabricated facts) by steering the model to stick to the sources you provide, in this case the evidence in your Great Question repository. It also means you can update your source documents without retraining or fine-tuning an expensive model, and connect the AI to private internal records without exposing that data to the public internet.

Lost in the middle

To build each type of synthetic user, I'm relying on Great Question's research repository and its existing data, with the MCP pulling that data in programmatically through Claude.

workflows I built

4 ways to build a synthetic version of your customer

Here are the four ways I experimented with, along with their pros and cons.

For reference, I used real customer data from past Great Question customer research to thoroughly test these workflows. The sample size was a whopping 726 transcripts!! That's why I did this through the Great Question repo: there's no way I could stitch all that data together without engineering effort.

Workflow 1

Digital twin

How I built it

Best data to ground it

This one goes deep on a single person, not wide. I queried the Great Question research repository to find a candidate who had participated in multiple studies, and found a customer who had taken part in 12 sessions. The MCP then used that multi-study data to assemble a digital twin. You want everything you have on them: their interview transcripts, their product-usage history (feature adoption, drop-offs, the Mixpanel trail), their support tickets, and their CRM and sales-call notes. The richer the single-person record, the more convincing the twin.

Tip: You could take this further by adding in product usage data, and any other customer dimension.

The trade-off

Pros: Its strength is fidelity. Nothing gets you closer to a specific person's perspective, which makes it ideal when a key account or a design partner needs a seat in a roadmap conversation.

Limitations: Its limit is that it's exactly one person, and it's only as deep as the data you have on that particular customer. I do see this being useful if you want to build a digital twin of a "skeptic" or a "best fit" customer, as a way to gather the likely negative and positive feedback.

Workflow 2

Segment-based synthetic user

How I built it

Best data to ground it

Breadth is the whole game here. Transcripts across the segment for language and goals, the insights and highlights you've already curated for themes that are validated, candidate and demographic data so no single sub-group dominates the mix, and potentially even product-usage data so the behavior is real and not self-reported.

Types of segment-based synthetic users:

By role or job title (the one I chose to build!)
Power users: gather the data on your 'best' customers, how they're using the product, what they say, what they love, what feedback they've given in the past
Casual users: gather product usage data on a segment of your less frequent users
Churned users: pull out churn surveys, customer interview transcripts or closed-lost interviews

I like segment-based synthetic users because you can run a PRD, an artifact, or a concept past all 3 and then compare the insights. I love experimenting with Perplexity's model council for the same reason. If you haven't used it yet, Perplexity's model council runs any query through 3 models so you can triangulate and sharpen your point of view based on the models arguing against each other. Fun stuff.

The trade-off

Pros: Because it's built on a few dimensions of data, the patterns are stable enough to trust…with a grain of salt of course.

Limitations: Its limit is that you sand off the sharp edges of any one person, and it's only as good as your data's coverage of that segment. Thin segment = thin synthetic user.

Where do I see this fitting? Again, at best this could be used internally to drive customer empathy and provide some directional feedback on new concepts. We know from Part 1 that nuance is where synthetic users unfortunately fall down…today.

Workflow 3

Synthetic panel

How I built it

I queried the Great Question repository to audit and gather all of the existing survey responses, and tried to understand what types of users had answered what in the past.

Best data to ground it

This one is built purely off a large survey, where you'd have statistical significance, plus enough demographic and behavioral variation in the source data to make the panel genuinely diverse.

The trade-off

Pros: Its strength is distribution. You get a range of responses rather than a single point estimate, which is exactly what you want for dry-running a survey or catching a broken question before a human ever sees it.

Limitations: Its limit is that the diversity is capped by your data, and it doesn't have the qualitative nuance that synthetic users seem better suited to. I would feel deeply uncomfortable using this without thousands of past responses.

Note: A lot of our internal research skews qualitative, so without an evidence base of thousands of survey responses, this was the synthetic user option I felt least confident in. This workflow also didn't have the dimensionality I was looking for, because we did not have a deep set of data.

Workflow 4

Live retrieval, or the average of all your data

How I built it

Ned pasted in a PRD, the skill built synthetic users from the matching screener and interview data, and each one reacted to the parts of the PRD he put through.

Best data to ground it

The trade-off

Pros: Its strength is that it's always current and contextual. You point it at the artifact in front of you and it reacts to that, with no document to maintain.

Limitations: Its limit is consistency. We experimented with it a few times while prepping the webinar, using the same repository and a similar prompt, and the outputs came back consistent-ish but not the same. The themes held every run; the exact wording and the examples it reached for moved around. Useful for a directional gut-check, risky the moment two people quote their own separate runs as the source of truth.

So which one do you reach for?

more thoughts...

Does it matter if everyone gets a slightly different answer?

This is the part I keep going back and forth on. Live retrieval gives a slightly different answer each time you run it. Does that actually matter? Especially if that data is the most recent?

A newsletter I read recently made the case that it does, and not just for synthetic users. The point was that when AI lets anyone pull their own answer whenever they want, very few people end up looking at the same picture. Everyone works off their own version, and that slowly chips away at the shared understanding a team needs to make decisions together. The writer [Chris Saad] was talking about analytics dashboards, but it applies just as well here.

So where did I land with all 4 of these workflows?

This is what Part 3 will cover…sorry to end on a cliffhanger.

I built all 4 workflows into a Claude skill to be able to draw upon them for the next phase of this experiment. Stay tuned for how they performed.

recent webinar

Watch Ned demo synthetic users live

Ned demoed the live retrieval workflow recently. We highly recommend you watch, and sign up to our series to hear when our recap of this experiment will go live!

Under the hood

The skill behind all this

Everything above runs on a skill I built. Here's how it works:

Point it at the repo, and it builds profiles

It surveys what's in the Great Question repo, groups sessions into clusters by role, workflow and shared pains, and counts the evidence behind each one:

8+ sessions: a high-confidence profile, stated plainly

3 to 7 sessions: medium confidence

1 or 2 sessions: a signal

No sessions: a gap, flagged for you to go research

Each profile card covers who they are, the world they work in, their pains in their own words, the phrases they use, what they want, what it's safe to use them for, and what it still can't tell you.

Hand it a PRD or design, and a profile reacts

The rules, every time

Every claim cites the real session it came from, by anonymous speaker handle, never a name. The repo is the only source. If the evidence isn't there, it says so instead of filling the gap.

It's not public yet, on purpose

I'm testing it in the open across this series first, so you can see where it holds up and where it doesn't before you run it yourself.

What's coming in Part 3

Part 3 is about the robot vs human experiment I'm running! I'm going to design an experiment and put both synthetic users and humans head to head. It should be interesting!

What I honestly don't know going in is where they'll diverge.

If you want it the day it drops, sign up to follow along. The skill ships at the end of the series, and everyone on the list gets it.

The build continues. Follow along.

Part 3 lands next: the robot vs human experiment, putting synthetic users and real humans head to head.

The map

Vocabulary, the ways to build one, priors going in.

Now · Four ways to build one

Built each workflow. What each is good for, where each falls apart.

Coming · Robot vs human

Synthetic users and real humans, head to head.

Coming · Recap & the skill

The full guide, and the synthetic user skill you can run yourself.

Catch up on Edition 1, or watch Ned's webinar that kicked the experiment off.
