Schema vs Content Signals: What Actually Drives AI Citations
On 11 May 2026, Ahrefs published a controlled study tracking 1,885 pages that added JSON-LD schema between August 2025 and March 2026. Citations across Google AI Overviews, Google AI Mode and ChatGPT barely moved. The schema-first positioning that anchors a category of AEO and GEO tools just lost its strongest claim. Here's what the study means, what it doesn't, and what actually drives AI citation decisions.
If you've been told that adding schema is the way to get cited by ChatGPT or Perplexity, the controlled evidence as of May 2026 says: not on pages already in the consideration set. Schema still has real upstream value. It's just not the citation lever it's been sold as.
What Ahrefs actually found
Ahrefs (Louise Linehan and Xibeijia Guan, reviewed by Ryan Law) ran a two-stage study. Stage one looked at 6 million URLs and found AI-cited pages were almost three times more likely to have JSON-LD than non-cited pages. That's the correlation everyone has been quoting. Stage two was the controlled follow-up: 1,885 pages that added JSON-LD between August 2025 and March 2026, matched against 4,000 control pages, measured with difference-in-differences analysis across four separate statistical tests.
The headline finding is this.
All four statistical tests pointed the same way: no meaningful citation growth from adding schema. On AI Overviews the treated pages dropped 4.6% more than matched controls. Both groups were already on a downward AIO trajectory before schema was added, so the small decline can't be cleanly attributed to schema. But equally, if schema were doing the work people claim, you'd expect treated pages to outperform controls. They didn't.
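For readers unfamiliar with the method, here is a minimal sketch of the difference-in-differences calculation the study relies on. The numbers are illustrative, not Ahrefs' data; the point is only to show how the control group's trend is subtracted out before any effect is attributed to schema.

```python
# Illustrative difference-in-differences estimate. Values are made up,
# not taken from the Ahrefs dataset.
def diff_in_diff(treated_before: float, treated_after: float,
                 control_before: float, control_after: float) -> float:
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    # Subtracting the control trend isolates the effect of the treatment
    # (adding schema) from the decline both groups share.
    return treated_change - control_change

# Both groups decline; the treated group declines slightly more,
# so the estimate is a small negative number.
print(diff_in_diff(treated_before=100, treated_after=80,
                   control_before=100, control_after=84))  # -4
```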
The mechanistic confirmation
There's a second piece of evidence worth pairing with the Ahrefs controlled study. A searchVIU experiment tested whether five AI systems (ChatGPT, Claude, Perplexity, Gemini, and Google AI Mode) use schema markup when fetching a page in real time. None of them did. Every system extracted only visible HTML content during direct retrieval. JSON-LD, hidden Microdata, and hidden RDFa were all ignored.
That's mechanistic supporting evidence. Ahrefs' study shows adding schema doesn't move citation outcomes on already-cited pages. searchVIU shows why: the AI systems doing the citing aren't reading schema during the retrieval loop. They're reading what users see.
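As a rough illustration of that split, the sketch below fetches a page and separates what renders as visible text from what sits in JSON-LD script tags. It assumes the requests and beautifulsoup4 packages; the actual retrieval pipelines of these systems aren't public, so this mirrors the visible-versus-hidden distinction, not their implementations.

```python
# Sketch: what a retrieval-time fetcher "sees" vs. what lives in JSON-LD.
# Assumes requests and beautifulsoup4; not any AI vendor's actual pipeline.
import requests
from bs4 import BeautifulSoup

def visible_text_and_schema(url: str) -> tuple[str, list[str]]:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # JSON-LD lives in script tags that never render for users.
    json_ld = [tag.get_text() for tag in soup.find_all("script", type="application/ld+json")]

    # Remove everything a reader never sees, then collect the visible text.
    for tag in soup(["script", "style", "noscript", "template"]):
        tag.decompose()
    visible = " ".join(soup.get_text(separator=" ").split())
    return visible, json_ld

# If the searchVIU finding holds, only `visible` informs the citation decision
# during direct retrieval; `json_ld` is ignored at that stage.
visible, json_ld = visible_text_and_schema("https://example.com/")
print(len(visible.split()), "visible words,", len(json_ld), "JSON-LD blocks")
```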
What this doesn't mean
It's worth being precise about the scope, because the wrong conclusion is "schema is useless" and that isn't what the study supports. Three things the Ahrefs finding does not say:
- It doesn't say schema is useless. Schema produces rich results in traditional search, supports voice assistants and shopping features, helps knowledge graph entity recognition, and provides semantic context for crawlers. None of that is in dispute.
- It doesn't apply to pages not yet visible to AI. The Ahrefs study population had 100+ AI Overview citations as a February 2025 baseline. These pages were already inside the consideration set. Ahrefs explicitly carves this out: "For pages that aren't being seen by AI systems at all, schema markup might still play a role in helping them get crawled, parsed, or indexed in the first place."
- It doesn't disprove that 53% of AI-cited pages have schema. They do. But the study explains why: sites that add structured data also tend to invest in technical SEO, publish authoritative content, build links and maintain their pages. Schema rides the same quality wave as every other signal. Strip schema out and the rest of the stack likely still carries the page.
What this does mean
The claim that's now hard to defend is the narrower one: that adding schema markup to an already-visible page will measurably increase AI citations. That's the claim a category of AEO and GEO tools has been leading with. The controlled data says it isn't true. Or if it is true, the effect is small enough to be lost in statistical noise across thousands of URLs.
If your ASEO strategy treats schema as the primary citation lever, the strategy needs updating. Schema belongs in the technical readiness layer, alongside crawlability, robots.txt, llms.txt, and entity recognition signals. It's not the lever that moves citation outcomes once you're visible.
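As a quick illustration of that readiness layer, here is a minimal sketch that checks whether common AI crawlers are allowed by robots.txt and whether an llms.txt file is present. It uses Python's standard robotparser plus the requests package; the crawler tokens are the publicly documented user-agent names, and the check is a starting point, not a full readiness audit.

```python
# Minimal technical-readiness check: AI crawler access and llms.txt presence.
# Not a citation-probability signal; it only answers "can the bots get in?"
import requests
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

def readiness_check(site: str) -> dict:
    robots = RobotFileParser()
    robots.set_url(urljoin(site, "/robots.txt"))
    robots.read()

    crawler_access = {bot: robots.can_fetch(bot, site) for bot in AI_CRAWLERS}
    llms_txt_present = requests.get(urljoin(site, "/llms.txt"), timeout=10).status_code == 200
    return {"crawler_access": crawler_access, "llms_txt_present": llms_txt_present}

print(readiness_check("https://example.com/"))
```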
Two ways to think about the signals
The cleanest way to internalise this is to split the AI citation problem into two stacks: the technical readiness stack (does the AI know your page exists and what entity it represents?) and the citation decision stack (when the AI is choosing what to cite from its consideration set, what makes it pick you?). Different stacks. Different signals. Different priority.
| Technical readiness stack | Citation decision stack |
| --- | --- |
| AI crawler access (robots.txt, firewall rules) | Block size in the 134–167 word RAG range |
| llms.txt presence and structure | Declarative opening sentences per block |
| JSON-LD schema for entity recognition | Fact density: named entities and statistics |
| Knowledge graph signal coverage | Self-containment in isolation |
| Sitemap quality, indexability | Visible freshness markers and recency cues |
| Brand entity establishment | Semantic segmentation at chunk level |
Schema sits on the left. The Ahrefs study tells us the left column gets pages into the room but doesn't decide who gets quoted. The right column is what decides who gets quoted. That's where attention belongs once a page is visible.
The five content signals that do the work
If schema isn't the citation lever, what is? Cited By AI®'s Citation Probability Score® (CPS®) framework scores five content-level signals at the block level, not the page level. We didn't pick these signals because we liked them. We picked them because peer-reviewed retrieval research and large-scale citation studies kept pointing at them.
The CPS® five-pillar framework
- Content Structure: Block size aligned to the 134–167 word RAG retrieval chunk. Declarative opening sentence. Semantic segmentation that lets a retrieval model extract a passage cleanly. The GEO: Generative Engine Optimization study (KDD 2024) showed structural optimisation produced up to 40% visibility uplift in controlled benchmarks.
- Fact Density: Named entities, statistics and verifiable claims per 100 words. The GEO study found adding statistics increased AI visibility by approximately 41%. AI retrieval models weight fact-rich passages 2–3× higher than descriptive prose.
- Answer Architecture: The declarative pattern AI retrieval systems are designed to surface: "[Topic] is/provides/enables [specific outcome]." Blocks that open this way self-answer the implied query and don't need surrounding context. Most brand content opens with scene-setting and pays the price in retrieval ranking.
- Self-Containment: Does the block make complete sense in isolation, without the paragraph before it or the heading above it? AI systems extract blocks in isolation. Dangling pronouns, "as mentioned above," "see below" all sink retrieval scores. Self-contained blocks get cited; context-dependent blocks get skipped.
- Freshness: Visible date markers, "as of [year]" language, "updated" and "latest" cues within the block itself. Carries the lowest weighting overall but is disproportionately important for Perplexity and Bing-powered AI systems that weight recency heavily.
Each pillar is scored on the unit AI retrieval actually operates on: the content block, typically 134–167 words. Page-level scoring averages strong blocks with weak ones and tells you the page is fine. Block-level scoring tells you which paragraph is being skipped and what specifically to rewrite. The full methodology is published at citedbyai.info/citation-probability-score-framework, and the evidence base is at citedbyai.info/cps-research-foundation, which now includes a "What the evidence says doesn't directly drive citations" section explicitly citing the Ahrefs study.
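To make "the block" concrete, here is a rough sketch that splits plain paragraph text into chunks landing near the 134–167 word window. The greedy splitting heuristic and the file path are illustrative only; this is not the CPS® segmentation method.

```python
# Rough block segmentation: greedily group paragraphs until a chunk reaches
# the 134-167 word window cited above. Illustrative heuristic only.
def segment_into_blocks(paragraphs: list[str], window_floor: int = 134) -> list[str]:
    blocks, current, count = [], [], 0
    for para in paragraphs:
        current.append(para)
        count += len(para.split())
        if count >= window_floor:        # close the block once it enters the window
            blocks.append(" ".join(current))
            current, count = [], 0
    if current:                          # keep a trailing short block rather than drop it
        blocks.append(" ".join(current))
    return blocks

text = open("page.txt", encoding="utf-8").read()  # hypothetical plain-text export of a page
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
for i, block in enumerate(segment_into_blocks(paragraphs), start=1):
    print(f"block {i}: {len(block.split())} words")
```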
What to update in your ASEO strategy
Three concrete moves, in priority order.
One: stop treating schema as the primary citation lever. Keep it for crawlability, voice assistants, rich results, and knowledge graph signal. Drop it from the top of your AI visibility priority list. It belongs in technical readiness, not citation strategy.
Two: audit your content at the block level. Pull a representative sample of your pages. Read the first paragraph of each one. Does it open with a declarative answer or with brand narrative? Is it in the 134–167 word range? Does it contain named statistics and verifiable claims? Does it make sense without the heading above it? If you're failing those four checks, you have a citation problem schema can't fix. A rough heuristic version of the checks is sketched below.
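The sketch applies the four checks to an opening block. The regexes are illustrative stand-ins for a manual read, not the CPS® scoring model, and the example block is hypothetical.

```python
# Rough heuristics for the four first-paragraph checks. Stand-ins for a manual
# read, not the CPS scoring model.
import re

def audit_opening_block(block: str) -> dict[str, bool]:
    words = block.split()
    first_sentence = re.split(r"(?<=[.!?])\s+", block.strip(), maxsplit=1)[0]
    return {
        "declarative_opening": bool(
            re.search(r"\b(is|are|provides|enables|means)\b", first_sentence.lower())
        ),
        "in_rag_word_range": 134 <= len(words) <= 167,
        "has_named_numbers": bool(re.search(r"\d", block)),
        "self_contained": not (
            re.match(r"^(it|this|these|they|those)\b", block.strip(), re.IGNORECASE)
            or re.search(r"as mentioned above|see below", block, re.IGNORECASE)
        ),
    }

opening = "Acme Route is a delivery planning tool that cuts fleet mileage by 18%."  # hypothetical
print(audit_opening_block(opening))
```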
Three: treat schema-first AI visibility tools as solving a different problem. They're useful as publish-time signal checks inside their respective CMSes. They're not measuring what drives citation decisions on already-visible pages. The category that leads with schema scoring is solving the technical readiness layer. Useful, but not sufficient. We covered the methodology distinction in detail in Citability Score vs CPS®: What Each Actually Measures.
The intellectually honest framing is this: schema is a real signal with real upstream value, but it is not a citation driver on already-visible pages. The Ahrefs controlled study, combined with the searchVIU mechanistic finding, makes that the most defensible position as of May 2026. Tools, agencies and frameworks that update their advice to match the evidence are the ones a serious buyer should trust. Tools that keep selling schema as the citation lever after this study aren't reading the research.
See what AI bots can read on your site right now
Free tool. Enter your URL, choose an AI crawler identity (GPTBot, ClaudeBot, PerplexityBot and 12 others), see what's actually visible. Surfaces robots.txt blocks, llms.txt presence and content gaps in under 30 seconds.
Run the AI Crawler Simulator →
The bottom line
The Ahrefs study doesn't tell us schema is useless. It tells us schema isn't doing what a category of AI visibility tools has been claiming it does. On pages already cited by AI, adding schema produces no measurable citation uplift on AI Mode or ChatGPT, and a small decline on AI Overviews that can't be cleanly attributed to schema either way. The mechanism is consistent with searchVIU's finding that AI systems ignore schema during direct retrieval and extract only visible HTML.
What this means for your strategy: schema is technical readiness, not citation strategy. The signals that drive citation decisions live in the content itself, scored at the block level. Block structure, fact density, declarative answer architecture, self-containment and freshness. Those are the five pillars of CPS®, and they're where the lever actually is.
If your current ASEO programme is built around schema scoring, this is the moment to rebalance. The evidence has moved. The advice should move with it.
Score your content the way AI actually scores it
Free 28-module ASEO audit with block-level CPS® scoring across all five content pillars, hallucination detection across five AI platforms, funnel-stage SOV, and GA4 revenue attribution. Results in 48 hours.
Get Your Free ASEO Audit →