ASEO Fundamentals

Machine-Readability Is the Floor, Not the Ceiling

Published: 13 April 2026 | Author: Cited By AI® | Reading time: 6 min
Version 1.0 | Last verified: 13 April 2026 | Source: citedbyai.info AI Visibility Intelligence

Getting your site machine-readable is necessary. Schema markup, llms.txt, clean crawler access, structured entity data: all of it matters, and any business that hasn't done it yet is starting from a deficit. But machine-readability is eligibility. It's not selection. And the difference between those two things is where most ASEO advice currently stops.

There's a growing category of ASEO advice, and a growing number of products built around it, that positions machine-readability as the core problem to solve. Get the infrastructure right, build the structured signals layer, make yourself legible to AI systems, and you'll start appearing in AI answers. The argument is coherent. It's also incomplete.

A site can be perfectly machine-readable and still get skipped at retrieval. That's not a technical failure. It's a content failure, at the paragraph level. And it's the failure that most infrastructure-focused approaches don't touch.

Two layers. One problem is visible. One isn't.

AI citation involves two distinct decisions, made at two different stages. Most current advice focuses on the first and skips the second.

Layer 1 (the floor): Machine-Readability
Can AI crawlers access, parse, and verify your site? Do they know who you are as an entity?
Signals: schema markup, llms.txt, robots.txt, entity data, structured signals, clean rendering

Layer 2 (above the floor): Block-Level Citability
When AI runs a retrieval query, which specific paragraphs get selected over your competitor's?
Signals (measured by CPS®): answer structure, fact density, self-containment, freshness signals, declarative opening

Layer 1 is the infrastructure problem. It's well-understood, well-documented, and there are now a reasonable number of services that address it. You need it. Skipping it means AI systems can't confidently identify you as an entity, can't access your content, and won't cite you regardless of quality.

Layer 2 is the content problem. It operates at the level of individual 134-167 word content chunks (the size AI retrieval systems embed and evaluate when matching a query to candidate sources). Once a site passes the machine-readability threshold, the retrieval decision comes down to which blocks score highest on a set of citability signals. Infrastructure doesn't affect that score. The words in the block do.
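To make the retrieval unit concrete, here is a toy chunker that groups sentences into blocks of roughly that size. The 134-167 word window is the figure this article uses; the splitting logic itself is an illustrative simplification (real pipelines typically split on tokens, with overlap, not on whole sentences):

```python
import re

CHUNK_MIN, CHUNK_MAX = 134, 167  # word window cited in this article

def chunk_paragraphs(text):
    """Group sentences into blocks of roughly CHUNK_MIN-CHUNK_MAX words.

    A toy stand-in for the chunking step a RAG pipeline performs before
    embedding; real systems split on tokens and often overlap chunks.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        # Flush the current chunk before it would exceed the window.
        if current and count + words > CHUNK_MAX:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

The practical implication: whatever falls inside one of these windows is evaluated on its own, which is why the article treats the paragraph, not the page, as the unit of optimisation.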

What each layer actually covers

Machine-Readability covers
  • Schema markup and structured data
  • llms.txt and AI crawler access
  • Structured entity identity signals
  • Clean HTML rendering (no JS blind spots)
  • robots.txt permissions for AI bots
  • Geographic and service area signals
Block-Level Citability covers
  • Declarative answer structure per paragraph
  • Fact density per 100 words
  • Self-containment without surrounding context
  • Freshness signals and date markers
  • RAG chunk size compliance (134-167 words)
  • Opening pattern that matches retrieval expectations

These are different problems. You can solve Layer 1 completely and still score poorly on Layer 2. A site with perfect schema, a clean llms.txt, and flawless entity data can have every paragraph opening with brand narrative, containing no verifiable facts, and referencing context that AI retrieval can't see. That site is machine-readable. It isn't citable.
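The Layer 2 signals above are textual properties, so crude proxies for them can be computed directly from a paragraph. This sketch is not the CPS® methodology (which is not published here); every regex and threshold is an invented illustration of the kind of check involved:

```python
import re

def citability_signals(block: str) -> dict:
    """Crude, illustrative proxies for the block-level signals listed above.

    These are NOT the CPS formulas, just plausible textual checks.
    """
    n = len(block.split())
    # Fact density: numbers, percentages, and years per 100 words.
    facts = len(re.findall(r"\b\d[\d,.%]*\b", block))
    # Freshness: an explicit year marker anywhere in the block.
    has_date = bool(re.search(r"\b20\d{2}\b", block))
    # Declarative opening: the block does not start with filler phrasing.
    filler = ("There is", "There are", "It is", "In today's")
    declarative_open = not block.lstrip().startswith(filler)
    # Self-containment: no dangling references to unseen context.
    dangling = bool(re.search(
        r"\b(as mentioned above|see below|the former|the latter)\b",
        block, re.I))
    return {
        "word_count": n,
        "chunk_size_ok": 134 <= n <= 167,
        "facts_per_100_words": round(facts / max(n, 1) * 100, 1),
        "has_freshness_marker": has_date,
        "declarative_opening": declarative_open,
        "self_contained": not dangling,
    }
```

Note that nothing in this function looks at schema, llms.txt, or entity data: the inputs are the words of one paragraph, which is the point of the layer distinction.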

Where the infrastructure argument stops

The clearest version of the machine-readability argument appears in content like Surfaced's recent post, "The End of Search as We Know It: Why Machine-Readability Is the New Competitive Edge." The framing is honest and largely correct at Layer 1: most businesses haven't built the structured signals layer that tells AI systems who they are and what they do. That gap is real. The article is worth reading.

Where it stops is exactly where Layer 2 begins. Surfaced's argument is that building the infrastructure is what gets you cited. The infrastructure gets you eligible. Two different outcomes.

Consider what actually happens when someone asks ChatGPT "what's the best ASEO consultancy in the UK?" The model doesn't simply check which sites have clean schema and return those. It runs a retrieval process over an index of candidate content, pulls the content blocks that best match the query semantics, and generates an answer from those blocks. The blocks it selects are the ones with the highest citability score at query time. A perfectly structured entity with low-citability paragraphs doesn't appear. A less polished entity with high-citability paragraphs does.
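That selection step can be sketched in miniature. The toy below ranks candidate blocks against a query using a bag-of-words cosine similarity; production systems use dense neural embeddings, but the competitive dynamic, blocks competing at query time with only the top few surviving, is the same:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; real systems use dense neural vectors."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, blocks, k=3):
    """Rank candidate content blocks against the query and keep the top k.

    This is the selection step described above: only the best-matching
    blocks reach the generated answer, regardless of site infrastructure.
    """
    q = embed(query)
    scored = sorted(blocks, key=lambda b: cosine(q, embed(b)), reverse=True)
    return scored[:k]
```

Notice that `retrieve` never sees schema markup or robots.txt; by this stage, eligibility is settled and only the words in each block matter.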

Machine-readability determines whether you're in the room. Block-level citability determines whether you're the answer.

The five signals that determine selection

The Citation Probability Score® (CPS®) framework measures citability at block level across five pillars. Each one governs a distinct aspect of how AI retrieval systems evaluate a 134-167 word chunk:

  • Answer structure
  • Fact density
  • Self-containment
  • Freshness signals
  • Declarative opening

None of these are infrastructure signals. Schema markup doesn't affect fact density. llms.txt doesn't change whether a paragraph opens with a declarative answer. Entity data doesn't determine whether a content block is self-contained. These are content decisions, made at the paragraph level, that sit entirely above the machine-readability layer.

The sequence that actually works

Infrastructure first. Content second. That's the right order, and both layers are required.

A business that skips Layer 1 and writes highly citable content is building on an unstable foundation. AI systems that can't reliably identify and access a site won't cite it consistently regardless of content quality. The infrastructure work isn't optional; it's prerequisite.

But a business that completes Layer 1 and assumes the job is done will keep wondering why their AI visibility isn't improving. They've solved eligibility. They haven't solved selection. The paragraphs are machine-readable. They're not citable. And no amount of additional schema is going to change that.

The full sequence: Build the machine-readability foundation (schema, llms.txt, entity signals, crawler access), then score every page at block level using CPS® to identify which paragraphs are failing at retrieval and rewrite them to Grade B minimum.

Both layers have a measurable output. Machine-readability has a binary pass/fail: either AI crawlers can access and understand your site, or they can't. Block-level citability has a continuous score: 0-100 per page, with a pillar breakdown showing exactly which signal is causing the underperformance and what to change. The free CPS® Block Scorer at citedbyai.info/cps-scorer gives you an instant score on any paragraph with no signup required. Paste your current best paragraph. If it's below Grade B, you've found the problem.
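A continuous score with a pillar breakdown could be produced by a weighted aggregate along these lines. The weights, grade cutoffs, and pillar names below are invented for illustration; the actual CPS® weighting is not public:

```python
# Hypothetical weights -- the real CPS weighting is not published.
PILLAR_WEIGHTS = {
    "answer_structure": 0.25,
    "fact_density": 0.25,
    "self_containment": 0.20,
    "freshness": 0.15,
    "chunk_size": 0.15,
}

def cps_like_score(pillars: dict) -> tuple:
    """Combine per-pillar scores (each 0-100) into a 0-100 total and a grade.

    Also returns the weakest pillar, mirroring the 'which signal is
    failing' breakdown described above.
    """
    total = sum(PILLAR_WEIGHTS[p] * pillars[p] for p in PILLAR_WEIGHTS)
    grade = ("A" if total >= 85 else
             "B" if total >= 70 else
             "C" if total >= 55 else "D")
    weakest = min(PILLAR_WEIGHTS, key=lambda p: pillars[p])
    return round(total, 1), grade, weakest
```

The useful property of any scheme like this is diagnostic, not just evaluative: the weakest pillar tells you which rewrite to attempt first.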

Check your citability layer, not just your infrastructure

Paste any paragraph and get a 0-100 Citation Probability Score® with a five-pillar breakdown. Free, instant, no signup. Tells you exactly which signal is failing and what to fix.

Score a Paragraph Now →

Want both layers audited in one report?

Free audit. 27 modules. 5 platforms. Machine-readability plus block-level CPS® scoring across your entire site. Results in 48 hours.

Get Your Free Audit →