Synthetic Market Research

In the Belle Jolie lipstick scene from Season 1, Episode 6 of Mad Men, ad agency Sterling Cooper put a room full of women behind a one-way mirror to test lipstick shades. Every woman tried the product exactly as expected, except Peggy Olson, who refused. She said, “I don’t think anyone wants to be one of a hundred colours in a box.” That throwaway comment became the campaign. The insight that mattered most came from the person who broke the methodology.

Statistics adjusts for outliers. Synthetic panels replicate the statistically relevant sample. The methodologies that find the lipstick campaign (deviant case analysis, lead user research, ethnographic immersion) are designed not to discard outliers but to follow them. Peggy is the outlier a conventional focus group is set up to discard. Following the deviant case is the part of the discipline that does not get cheaper.

Still from Mad Men, Season 1 Episode 6, “Babylon”: the Belle Jolie lipstick focus group behind a one-way mirror at Sterling Cooper. The insight that changed everything came from the person who refused to participate as expected.

Clipboard confessional.

As a young psychology student, I worked as a door-to-door market researcher, asking sympathetic faces to let me into their homes and give me fifteen to thirty minutes of their time to answer questions about fast-moving consumer goods (FMCG). I learned early how biased the data collected in this type of field research was. I got better at getting a broader range of personalities to trust me, but for every middle-aged woman who took pity on the young man earnestly asking for her opinion about ice-cream, a representative sample also needed the battle-axe who slammed the door in my face.

The fallibility of field work stayed with me. Field research is bounded by who agrees to participate. The calibrated population diversity of a synthetic research panel can sometimes be more representative than a sample of self-selected real humans, although it brings its own inherited biases.
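The arithmetic behind that self-selection problem can be sketched in a few lines. This is a minimal illustration, not field data: the two segments, their response rates and their opinion scores are all invented, and the `Segment`, `population_mean` and `respondent_mean` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    population_share: float  # share of the street's true population
    response_rate: float     # chance that this kind of household opens the door
    mean_score: float        # average opinion score (0-10) within the segment


def population_mean(segments):
    """Average score if every household answered."""
    return sum(s.population_share * s.mean_score for s in segments)


def respondent_mean(segments):
    """Average score among the households that actually answer."""
    weights = [s.population_share * s.response_rate for s in segments]
    total = sum(weights)
    if total == 0:
        raise ValueError("nobody answered the door")
    return sum(w * s.mean_score for w, s in zip(weights, segments)) / total


# Hypothetical street: every number here is invented for illustration.
STREET = [
    Segment("sympathetic", population_share=0.5, response_rate=0.6, mean_score=8.0),
    Segment("door-slammers", population_share=0.5, response_rate=0.1, mean_score=4.0),
]

if __name__ == "__main__":
    print(f"true population mean:   {population_mean(STREET):.2f}")
    print(f"mean among respondents: {respondent_mean(STREET):.2f}")
```

On these invented numbers the respondents average about 7.4 against a true figure of 6.0, purely because one half of the street answers six times as often as the other.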

The door-knocker wager.

A door-to-door researcher walks the street. Some doors open. Some slam shut. The sample is bounded by who answers. The data is bounded by who tolerates a clipboard. Close-to-accurate insight at a fraction of the cost may be worth the wager. That was true at eighteen with a clipboard. It is true now at a terminal.

Illusory superiority.

Each of us, isolated, is a wonderful deviant, freak, outsider. But as a sample, can our collective opinions be replicated?

“Synthetic data is as good as real.” Mark Ritson made that case in Marketing Week in June 2024. He grounded it in a number from EY Americas. CMO Toni Clayton-Hine had run her firm’s annual senior-executive brand questionnaire through Evidenza, a synthetic-respondent platform, and compared the AI answers to a fresh human sample. The two data sets returned a 95% correlation.¹ Ritson’s reading of that result was sharper still: “It’s more likely that the inherent sampling errors, subject distractions and signalling biases mean it is the human subjects who were off the pace. Actual executives get bored (quickly) with a survey of more than 20 questions, for example. Synthetic customers never falter.”¹

The academic work is converging. A Harvard Business School study found that large language models replicate human consumer behaviour well enough to be useful for real product and pricing decisions, not as a novelty, but as a working methodology.² Stanford researchers used large language models to predict outcomes across seventy pre-registered social-science experiments and reported a strong correlation (0.85) with actual human results.³
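Correlation figures like these are easy to compute and easy to over-read. As a minimal sketch, assuming hypothetical per-question average scores from a human sample and a synthetic panel (the numbers are invented for illustration, not taken from EY, Evidenza or the cited studies), a Pearson correlation can be calculated like this:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean scores per survey question (1-5 scale), invented for illustration.
HUMAN_MEANS = [4.1, 3.2, 2.5, 4.6, 3.8]
SYNTHETIC_MEANS = [4.0, 3.4, 2.7, 4.4, 3.9]

if __name__ == "__main__":
    r = pearson(HUMAN_MEANS, SYNTHETIC_MEANS)
    print(f"correlation between human and synthetic means: {r:.2f}")
```

A high coefficient says the two panels rank the questions similarly. It does not say the absolute scores match, so a synthetic panel that is consistently more generous than the human one can still correlate almost perfectly.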

Beyond empirical data.

The Market Research Society’s Delphi Report names the limit: synthetic panels cannot replace the human moments that surface what data alone will never reach.⁴

The same caution comes from the people running the world’s largest research firms. Ray Poynter, President of ESOMAR, the global research body representing twelve thousand members across one hundred and thirty-five countries, was direct: “I think synthetic data will be big, but at the moment it can’t reliably match the results from real people. Sometimes it does, sometimes it doesn’t, and we need to find out more about when it works and when it doesn’t before it moves mainstream.”⁵

Ben Page, global CEO of Ipsos, was sharper: “We have been using synthetic data in the form of imputation for decades. AI lets us do it more quickly and accurately. Where it is unproven is in moving beyond past empirical data to measure reactions to things that have not existed before.”⁵

Cutting-edge research can do these things in lab conditions. It cannot yet do them in commercial practice. An AI in a real focus group, on real footage, still misses the body language that contradicts a survey answer and the throwaway comment that turns out to be the real insight. The research that changes strategy still comes from the parts that resist commercial automation.

The dehumanising charge.

In June 2024, Jason Dunstone of the Research Society of Australia put the strongest dissent on the record: “Replacing research from real people with synthetic data seems counter-productive and dehumanising as a marketing strategy.”⁵

Steelman first. Dunstone’s case, taken seriously, is not that synthetic respondents are inaccurate. Grant that they can be close. His argument is that the strategy itself corrodes the discipline. Qualitative research, at its best, is not a data-collection step but an encounter with real people. Take the encounter away as a habit, and you get a closed loop of executives consulting models trained on executives consulting models. The strategy is dehumanising not because the output is wrong but because the practitioner stops practising the human part of the craft.

He is right. But for most small-to-medium enterprises, the choice is not between synthetic data and good research. It is between synthetic data and no research at all. Customer conversations matter, but they are bounded by who agreed to take the meeting. The insights that matter from larger customers can be locked away behind office security. Field research is prohibitively expensive. Synthetic data can sanity-check what the sales team is telling you.

One-tenth the cost.

Synthetic data is not a research-industry argument any more. It is already in production at the frontier of code. Cursor, an AI company run by a bunch of MIT graduates in their mid-twenties, released Composer 2.5 in May 2026. Twenty-five times more synthetic training tasks than their previous model. Results equivalent to Claude Opus 4.7, at one-tenth the per-token cost. The Cursor result is engineering trade reporting, not peer-reviewed market research, and the analogy is imperfect. The broader point still holds. Underdogs can compete on synthetic data. They can do the same in business and marketing.

Mark Ritson predicts the next move. In his June 2024 Marketing Week column, he argues the engines generating synthetic respondents will move on to produce strategic outputs: segmentation, positioning, perceptual maps, media mix modelling, pricing analysis, briefing. Marketers will eventually run thousands of strategic permutations side by side, each one scored by synthetic data, with the optimum paths surfaced as recommendations. He puts it this way: “The missing element is time, and it’s the 2030s that will usher in this grand new era of automation.”¹

Experimentation.

Serious practice tracks, closely and honestly, which applications are proven and which are still experimental. The distinction matters. It changes week by week. What frontier models could do six months ago has been superseded. What is being tested today will not all survive contact with real client data. That is how the work goes.

Decades of qualitative and quantitative discipline do not get abandoned because a language model can generate ten thousand survey responses in an afternoon. The proven approaches stay. New capabilities earn their place.

The single largest failure mode of a synthetic panel is the one Dunstone is, in effect, warning about: a monoculture wearing a thousand faces. Other failure modes are more specific: training data that leans Western, English-heavy and online-native; responses that average toward the eager-to-please; weaker performance the further the model is pushed outside the distribution it was trained on. For Australian research, the offshore centre of gravity in training data is a live concern. But synthetic data experiments are uncovering broader applications in marketing technology.

Customer experience teams have been creating personas for mapping customer journeys for decades. A dozen handpicked psychographic consumer segments of “Savvy Shoppers” and “Self-made Lifestylers” stuck to a meeting room wall can be eclipsed by multitudes of synthetic customer personas, linked to census data and other publicly available data sets. Running those personas through cognitive diversity gates, reasoning diversity (Hong-Page theorem), pre-mortem analysis (Klein), adversarial challenge, Delphi convergence and stakeholder perspective-taking forces a diversity of response that the customer-experience assumptions blu-tacked to the wall systematically miss. The difference is between a synthetic panel that confirms what the prompt already implies, and one that surfaces the close-to-accurate insights the wager depends on.
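One simple way to picture a diversity gate is as a check that a panel's answers are not collapsing onto a single response. The sketch below is an assumption about what such a gate could look like, not a description of any particular platform: it scores a panel's categorical answers with normalised Shannon entropy and flags panels that fall below a threshold. The function names, the 0.5 threshold and the sample answers are all hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(answers, options):
    """Shannon entropy of the answers, scaled to 0..1 by the number of options."""
    if not answers:
        raise ValueError("panel returned no answers")
    if len(options) < 2:
        raise ValueError("need at least two answer options")
    counts = Counter(answers)
    unknown = set(counts) - set(options)
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    n = len(answers)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    # max() avoids returning -0.0 when every persona gives the same answer.
    return max(0.0, entropy / math.log(len(options)))


def passes_diversity_gate(answers, options, threshold=0.5):
    """True when the panel's answers are spread widely enough (threshold is arbitrary)."""
    return normalized_entropy(answers, options) >= threshold


OPTIONS = ["yes", "no", "unsure"]
# Hypothetical panels of 100 synthetic personas, invented for illustration.
MONOCULTURE = ["yes"] * 95 + ["no"] * 5
MIXED = ["yes"] * 40 + ["no"] * 35 + ["unsure"] * 25

if __name__ == "__main__":
    for label, panel in (("monoculture", MONOCULTURE), ("mixed", MIXED)):
        score = normalized_entropy(panel, OPTIONS)
        print(f"{label}: entropy={score:.2f} passes={passes_diversity_gate(panel, OPTIONS)}")
```

A gate like this only measures spread, not accuracy: a panel can be diverse and still wrong, which is why the other challenges in the list above do different work.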

Underdogs and leapfrogs.

Synthetic respondent panels compress weeks of recruitment and fieldwork into hours. Brand positioning gets pressure-tested against virtual consumer segments before you commit real budget. Creative concepts get evaluated by AI-simulated panels before they reach production. Competitive intelligence gets gathered and synthesised by agents that never stop working.

Go-to-market cycles shorten because you are not waiting three months for a research phase to finish before you can move. Face-to-face research budgets shrink because the expensive qualitative phases get supplemented, not replaced, by synthetic data. You get more confidence to back your instincts, because you can validate a hunch in hours rather than commissioning a six-figure study to confirm what you already suspected.

SMEs have historically been priced out of the research that de-risks product development, service innovation and market entry. Global firms serve enterprise at enterprise prices. Smaller agencies lack methodological depth. Critical decisions get made on gut feel. These tools change that equation. Not by replacing human judgment, but by augmenting it and making it affordable.

The Eureka question.

The question may be less what synthetic data produces than the Eureka moments it sparks. Does it produce inspiration, or misinformation? Will it open up new markets, or direct marketers down disastrous paths?

A synthetic panel would have given Sterling Cooper a perfect read on lipstick shades and missed the line that built the brand. The honest position is that the same tool that compresses six months into six hours will compress a wrong instinct into a wrong launch at the same speed. Humans and AI both respond to fresh insights and additional data sources. Synthetic data can still surprise: market opportunities, customer segments and use cases that spark Eureka moments. The trick is knowing when you’ve found genuine inspiration and when the synthetic data breaks.

Predictive insight.

We have been building a proprietary platform called Market Personality that is already showing promise across a wide range of synthetic data applications in marketing technology, and more broadly in AI retrieval-augmented generation (RAG) and fine-tuning. Synthesised data can help companies avoid breaching privacy and data-handling regulations, and lets them rapidly build the comprehensive knowledge bases that campaign workflows, content composition and sales support tools such as voice agents depend on.

Every insight we produce still gets validated by human researchers who understand how findings land in the real world. AI generates the hypotheses at speed. Humans make the final call. Market Personality is generating unexpected insights that trigger the Eureka moments.

Personality test.

I am looking for clients who want to put Market Personality to work with us. Not as a sales pitch, but as a genuine collaboration. If you have qualitative or quantitative data that matters to your business and you are curious what frontier AI could do with it, I would welcome that conversation. We can prototype something together with your actual data, at our risk, and see what we learn. That is how we intend to formalise these approaches: by doing the work, not by waiting for someone else to prove it first.

Dunstone calls synthetic data dehumanising. Ritson calls it the start of “this grand new era of automation.” The wager we are making is narrower than either: for SMEs priced out of conventional research, a calibrated synthetic panel can deliver close-to-accurate insight at a fraction of the cost. I keep knocking. More doors open.

Introducing Market Personality

Market Personality goes beyond market research. Realistic customer responses for your target markets, enhanced with predictive insights traditional research cannot reach.


Footnotes

1. Ritson, M. (2024). “Synthetic data is as good as real: next comes synthetic strategy.” Marketing Week, 13 June 2024. EY 95% correlation figure attributed by Ritson to AdWeek’s interview with Toni Clayton-Hine. Ritson discloses a minority shareholding in Evidenza.
2. Brand, J., Israeli, A. & Ngwe, D. (2023). “Using GPT for Market Research.” Harvard Business School Working Paper 23-062.
3. Hewitt, L., Ashokkumar, A., Ghezae, I. & Willer, R. (2024). “Predicting Results of Social Science Experiments Using Large Language Models.” Stanford Polarization and Social Change Lab. Correlation of 0.85 across 476 measured treatment effects from seventy pre-registered experiments.
4. Market Research Society Delphi Group (2024). “Using Synthetic Respondents for Market Research.” MRS Delphi Report.
5. Dunstone, J. (2024). “Embracing Synthetic Data’s Potential While Valuing Real People.” The Research Society, 27 June 2024. Highly Commended, 2024 Luminate Research Article of the Year. The piece independently notes Ritson’s disclosed shareholding in Evidenza.