Why Your GEO Score Is Wrong (And What Block-Level Scoring Fixes)
Most GEO dashboards give you a number. Some give you a trend line. A few give you a sentiment breakdown. None of them tell you which paragraph got cited - or why the one next to it didn't.
That's the gap. And it's not a minor oversight in how AI visibility tools are built. It's a fundamental misunderstanding of how AI retrieval actually works.
The unit of AI retrieval is not your page
When a large language model generates an answer that cites your content, it doesn't read your page the way a human does. It doesn't absorb the narrative arc, pick up the thesis in paragraph one, and carry it through to your conclusion.
It retrieves chunks.
Specifically: RAG (Retrieval-Augmented Generation) systems - the architecture behind Perplexity, ChatGPT's web search mode, Gemini, and most AI search surfaces you care about - break web content into discrete text segments before they ever generate a response. Each segment gets embedded, scored for relevance, and either pulled into the answer or discarded.
The operative chunk length, based on how most production RAG pipelines are tuned and validated against real citation behaviour, sits in the 134–167 word range. That's one strong paragraph, three tight ones, or a complete standalone answer with a fact and a conclusion.
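For intuition, here's a minimal sketch of the kind of word-budget chunking a RAG pipeline might apply to a page. The 167-word ceiling and paragraph-level packing are illustrative assumptions, not any specific engine's implementation; production systems typically count tokens, add overlap, and split HTML-aware.

```python
def chunk_page(paragraphs: list[str], max_words: int = 167) -> list[str]:
    """Greedily pack paragraphs into retrieval-sized chunks.

    Illustrative only: real pipelines usually work in tokens, overlap
    their windows, and respect document structure. The point is that
    the unit that gets embedded and scored is the chunk, not the page.
    """
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A paragraph that only makes sense next to its neighbours can easily land in a chunk without them, which is where the self-containment problems described below come from.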
Your page might have twelve of these chunks. Four might be strong. Eight might be invisible to AI retrieval - not because your content is bad, but because those sections aren't structured to survive the chunking process.
The core problem: A page with a Citation Probability Score® (CPS®) of 68 might contain three blocks scoring above 80 and five blocks scoring below 30. The high-scoring ones get cited. The low-scoring ones get ignored. Your dashboard shows you a 68 and tells you things are fine.
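The masking is easy to demonstrate with invented numbers. The simple average below is only a stand-in for whatever aggregation a given dashboard uses; the point is that any single page number flattens the distribution the retrieval system actually sees.

```python
# Hypothetical per-block scores for one page (invented for illustration)
block_scores = [84, 82, 81, 29, 27, 26, 24, 22]

page_score = sum(block_scores) / len(block_scores)  # a stand-in aggregate
print(f"page score: {page_score:.0f}")              # the one number you see
print(f"blocks above 80: {sum(s > 80 for s in block_scores)}")  # the cited ones
print(f"blocks below 30: {sum(s < 30 for s in block_scores)}")  # the invisible ones
```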
What page-level scoring actually measures
To be fair to the tools that report page-level scores: they're measuring something real. Domain authority signals, structured data presence, crawlability, freshness - these matter. Getting the technical foundations right is a precondition for citation, not an afterthought.
But page-level scoring is a blunt instrument when the question you're trying to answer is: which parts of my content is AI actually using?
Here's a concrete example. Say you're a B2B SaaS company. You have a 1,200-word service page covering what you do, who you serve, your pricing model, and a case study. A user asks Perplexity: "Which B2B SaaS companies offer transparent usage-based pricing?"
Your page is crawlable. It has schema. It's indexed. Your GEO score is respectable. But the section about pricing is 90 words buried between a header and a CTA. It doesn't open with a direct answer. It references a table that's image-rendered and therefore invisible to the retrieval system. The fact density - named figures, specific percentages - is low.
Perplexity retrieves the chunk. It scores low for the query. Your competitor's shorter, denser, more direct answer gets pulled instead. Your GEO dashboard doesn't show you this. It shows you the page score. You don't know the gap is there.
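Mechanically, that scoring step looks something like the sketch below: every chunk is compared to the query on its own, and only the top few survive. The `embed` argument is a placeholder for whatever embedding model the engine runs; the per-chunk decision is the part that matters.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_chunks(query: str, chunks: list[str], embed, k: int = 3):
    """Rank chunks against a query; discard everything below the cut.

    `embed` is a placeholder for an embedding model (text -> vector).
    Note what's absent: no page score, no domain authority. Each
    chunk wins or loses on its own similarity to the query.
    """
    q_vec = embed(query)
    scored = sorted(
        ((cosine(q_vec, embed(c)), c) for c in chunks), reverse=True
    )
    return scored[:k]
```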
What the five block-level pillars actually capture
The Citation Probability Score® evaluates each content block - each retrievable chunk - across five pillars. Here's what each one measures and why it only makes sense at the chunk level, not the page level.
1. Content Structure
Is the chunk in the 134–167 word optimal range? Does it open with a direct answer rather than a throat-clearing setup? AI retrieval systems favour content that gets to the point immediately. A block that spends its first two sentences restating the question it's about to answer loses roughly 40 words to setup that contributes nothing to retrieval scoring.
2. Fact Density
How many named entities, statistics, and specific claims appear per 100 words? AI systems disproportionately retrieve content that contains verifiable, specific information. The same information can carry a completely different retrieval probability depending on how specifically it's stated (the comparison under "The specificity gap in practice" below makes this concrete).
3. Answer Structure
Does the block open with a declarative statement that directly addresses the query it might be matched to? Retrieval systems match your content to user queries at the chunk level. A block that buries its conclusion scores lower than one that states it upfront.
4. Self-Containment
Can this block stand alone — without the surrounding page — and still communicate a complete idea? If a chunk says "as mentioned above" or "see the table below" or "which we covered in the previous section," it fails in isolation. AI retrieval doesn't know what "above" is. The chunk is all it has.
5. Freshness Signals
Does the block contain date markers or recency language? This matters especially for Perplexity, which actively weights content freshness in its retrieval ranking. A statistic without a year attached is less citable than the same statistic with "as of Q1 2026."
These aren't abstract criteria. Each one maps to a specific, observable behaviour in how RAG pipelines score and select content. And each one only makes sense measured at the chunk level — because that's where the retrieval decision actually happens.
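As a rough illustration, here's what heuristic proxies for the five pillars might look like. The thresholds, phrase lists, and regexes are assumptions made up for this sketch; CPS® scoring itself is more involved, but these are the kinds of signals a block-level check has to look for.

```python
import re

CROSS_REFS = ("as mentioned above", "see the table below", "the previous section")
THROAT_CLEARING = ("in this section", "before we dive in", "as we will see")

def pillar_flags(block: str) -> dict:
    """Crude, illustrative proxies for the five block-level pillars."""
    n = len(block.split())
    lower = block.lower()
    first_sentence = lower.split(".")[0].strip()
    # Numbers, currency, and percentages as a crude stand-in for fact density
    facts = re.findall(r"[£$€]?\d[\d,.]*%?", block)
    return {
        "structure: word count in 134-167": 134 <= n <= 167,
        "fact density: specifics per 100 words": round(100 * len(facts) / max(n, 1), 1),
        "answer structure: opens with a claim": not first_sentence.startswith(THROAT_CLEARING),
        "self-containment: no cross-references": not any(r in lower for r in CROSS_REFS),
        "freshness: date or quarter marker": bool(re.search(r"\b(20\d{2}|Q[1-4])\b", block)),
    }
```

Run against a real page, even a crude checker like this surfaces the gap the pricing example above fell into: a block can pass a page-level audit and still fail four of five pillar checks.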
The specificity gap in practice
The Fact Density pillar is worth dwelling on because the gap between high-scoring and low-scoring content is most visible here.
"Pricing varies by usage and can be customised for enterprise clients."
"Pricing starts at $0.008 per API call, with volume discounts applied above 500,000 requests monthly."
Both may rank on Google. Only the second gets cited by ChatGPT. The difference isn't quality — it's specificity. Adding one named statistic per paragraph is the fastest Fact Density improvement available to most content teams.
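Even a crude specificity counter makes the gap visible. The regex below (currency amounts and bare numbers) is a simplistic stand-in for real entity and statistic extraction:

```python
import re

SPECIFICS = re.compile(r"[£$€]\d[\d,.]*|\b\d[\d,.]*%?\b")

vague = "Pricing varies by usage and can be customised for enterprise clients."
dense = ("Pricing starts at $0.008 per API call, with volume discounts "
         "applied above 500,000 requests monthly.")

print(SPECIFICS.findall(vague))  # []
print(SPECIFICS.findall(dense))  # ['$0.008', '500,000']
```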
The practical consequence
If you're using a GEO tool that scores your pages, you're optimising for the wrong unit. You might be improving overall page authority while the specific blocks matched to high-value queries stay broken.
This creates a pattern that's hard to diagnose without block-level data: your AI visibility metrics look reasonable, but your citation rate in competitive queries stays flat. You're not missing because you're invisible. You're missing because the wrong paragraphs are doing the work.
The fix isn't a complete content rewrite. Often the lift comes from restructuring three or four underperforming blocks per page: leading with a direct answer, tightening the word count into the optimal range, adding one specific statistic, cutting the cross-references that make the block context-dependent. Those changes don't move your page-level GEO score much. They move your actual citation rate significantly.
Why this matters more as AI search matures
Right now, most brands are optimising for presence — appearing in AI-generated responses at all. That's the right first step. But the category is maturing fast.
Perplexity already shows named citations with source attribution. ChatGPT's web search mode selects sources at the passage level. Google's AI Mode pulls specific excerpts, not whole pages. The precision of retrieval is increasing, which means the margin between a block that gets cited and one that doesn't is narrowing.
Page-level optimisation gets you into the game. Block-level optimisation determines whether you win the specific query that matters — the one a buyer is asking at the moment they're deciding between you and a competitor.
The brands building block-level granularity into their content now are the ones who'll own specific query clusters in AI responses twelve months from now. The ones relying on page-level scores will know their overall GEO health and not much else.
What to do with this
You don't need to rebuild your entire content library. Start with your highest-value pages — the ones that should be cited when someone asks a purchase-intent query in your category. Run each one through a block-level audit. Find the chunks that are underperforming. Fix the structure, density, and self-containment issues in those specific blocks.
Then measure citation rate on those queries, not page authority.
That's the feedback loop that actually tells you whether your GEO work is doing anything. Not a dashboard number. Not a trend line. A specific query, a specific block, a specific citation.
The test: Pick your three most important purchase-intent queries. Ask Perplexity each one. Note which paragraph on your site it cites — if it cites you at all. That paragraph is your highest-performing block. The ones it skips are your audit backlog.
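If you'd rather script that test than run it by hand, a sketch along these lines works against Perplexity's API, which follows the OpenAI chat-completions shape and returns cited source URLs. Treat the endpoint, model name, and `citations` field as assumptions to verify against the current docs; the domain and queries are placeholders.

```python
import requests

API_KEY = "YOUR_PERPLEXITY_API_KEY"  # placeholder
YOUR_DOMAIN = "example.com"          # placeholder: the site you're auditing

queries = [
    "Which B2B SaaS companies offer transparent usage-based pricing?",
    # ...your other purchase-intent queries
]

for q in queries:
    resp = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "sonar", "messages": [{"role": "user", "content": q}]},
        timeout=60,
    ).json()
    # `citations` is a list of source URLs in Perplexity's responses
    # (verify the field name against the current API reference)
    cited = [u for u in resp.get("citations", []) if YOUR_DOMAIN in u]
    print(f"{q!r} -> {cited or 'not cited'}")
```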
Get a block-level CPS® audit
Free instant check at citedbyai.info. Full audits from £49.
Get Your Free Audit →