Retrieval-Augmented Generation (RAG) is the technique AI search engines use to ground their answers in real, current web content instead of relying only on what the model memorized during training. When you ask ChatGPT, Perplexity, or Google AI Overviews a question, the system retrieves a small set of relevant documents from a live index, feeds them to the language model as context, and generates an answer based on that retrieved material, citing the sources it drew from. Understanding this pipeline explains, mechanically, why some pages get cited and others don't.
The three stages of RAG
1. Retrieval
The system converts your query into a numerical representation (an embedding) and searches an index of pre-processed web content for documents whose embeddings are similar, which means conceptually similar content, not just matching keywords. That's why a page can be retrieved for a question whose exact phrasing it never uses, as long as the underlying meaning is close enough in vector space. Most RAG systems retrieve at the passage level, not the whole-page level. Your page typically gets chunked into smaller sections before being indexed, which is why one paragraph can get cited while the rest gets ignored.
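To make the chunk-then-match idea concrete, here is a minimal sketch of passage-level retrieval. It rests on loud assumptions: the `embed` function is a toy bag-of-words counter, not a learned embedding model, so it only captures word overlap and not the conceptual similarity real systems rely on. The blank-line chunking rule, the function names, and the sample page are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk_passages(page_text: str) -> list[str]:
    """Split a page into passages on blank lines (one simple chunking rule)."""
    return [p.strip() for p in re.split(r"\n\s*\n", page_text) if p.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words count vector.

    Assumption: real systems use learned dense vectors that capture meaning;
    this version only captures shared words.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        passages,
        key=lambda p: cosine_similarity(query_vec, embed(p)),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical three-paragraph page; each paragraph becomes one passage.
page = """Our bakery opened in 1998 and is family owned.

Sourdough starter needs feeding with flour and water every day.

We ship bread across the country every week."""

passages = chunk_passages(page)
best = retrieve("how often should I feed a sourdough starter", passages)
print(best[0])
```

Only the middle paragraph is returned, which mirrors how a single passage can be surfaced while the rest of the page is left out.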
2. Augmentation
The retrieved passages get inserted into the model's context window alongside the user's original question, effectively becoming part of the prompt. The model now has the actual current text in front of it rather than relying on a possibly outdated memory of the topic. That gives fresh content a structural advantage in RAG systems that it would not have if the model answered from training data alone. A page published yesterday can be retrieved and cited even though no model was ever trained on it.
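The augmentation step can be sketched as plain string assembly. This is an illustration under stated assumptions, not how any particular product builds its prompt: the `Passage` type, the `build_augmented_prompt` function, the instruction wording, and the example URLs and passage text are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    url: str
    text: str


def build_augmented_prompt(question: str, passages: list[Passage]) -> str:
    """Place numbered source passages ahead of the user's question.

    Assumption: the prompt layout and wording here are hypothetical.
    """
    sources = "\n".join(
        f"[{i}] {p.url}\n{p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by their bracketed number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


# Made-up retrieved passages for illustration only.
retrieved = [
    Passage(
        "https://example.com/sourdough",
        "Feed a sourdough starter once a day at room temperature.",
    ),
    Passage(
        "https://example.com/storage",
        "A refrigerated starter can be fed about once a week.",
    ),
]
prompt = build_augmented_prompt(
    "How often should I feed a sourdough starter?", retrieved
)
print(prompt)
```

Numbering the sources is one simple way to give the generation stage something to cite; the passage text reaches the model verbatim, which is why the wording of an individual paragraph matters.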
3. Generation and citation
The model synthesizes a coherent answer from the retrieved passages, typically attributing claims to specific sources. If multiple retrieved passages disagree, the model has to adjudicate between them. It generally favors passages that state claims more directly, with supporting specifics like numbers, dates, and named entities, over vague or hedged language.
What this means for how you write
| RAG mechanic | Content implication |
|---|---|
| Retrieval works on passages, not full pages | Make every section self-contained, so a reader (or model) can understand a paragraph without needing the three paragraphs before it |
| Embedding similarity matches meaning, not exact keywords | Write naturally and directly; keyword-stuffing doesn't help and can hurt the semantic clarity that retrieval depends on |
| Fresh content gets retrieved over stale training-data recall | Visible, accurate `dateModified` and genuinely updated content (not just a changed timestamp) carry real weight |
| Generation favors specific, direct claims over hedged language | State your thesis plainly; lead with the conclusion, not a build-up to it |
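
For the `dateModified` row above, one common way to expose the date is schema.org `Article` markup embedded as JSON-LD. The sketch below generates such a block; the function name, headline, and dates are hypothetical, and whether a given AI search system actually reads this markup is an assumption this example cannot confirm.

```python
import json
from datetime import date


def article_json_ld(headline: str, published: date, modified: date) -> str:
    """Build schema.org Article JSON-LD with publish and modify dates."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    return json.dumps(data, indent=2)


# Hypothetical headline and dates, for illustration only.
markup = article_json_ld("How RAG works", date(2024, 1, 15), date(2024, 6, 3))
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The date should change only when the content itself changes, in line with the table's point about genuinely updated content.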
Why technical SEO still matters here
Before any of this happens, a crawler has to have fetched your page so it can enter the retrieval index at all. A site that blocks crawlers, depends on client-side rendering the crawler doesn't execute, or responds too slowly may never enter the retrieval pool in the first place. No amount of well-structured content matters if the page never gets indexed. This is the literal mechanical reason "technical SEO" and "GEO" aren't separate disciplines: RAG retrieval depends on the same crawlability fundamentals traditional search has always required.
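One of these checks, whether robots.txt blocks a given crawler, can be sketched with Python's standard `urllib.robotparser`. The user-agent name `ExampleAIBot`, the robots.txt contents, and the URL are hypothetical; substitute the agent names documented by the crawlers you care about. This covers robots.txt rules only, not rendering or response time.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks one crawler and allows the rest.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

url = "https://example.com/guides/rag"
print(is_crawlable(robots_txt, "ExampleAIBot", url))  # False: blocked site-wide
print(is_crawlable(robots_txt, "SomeOtherBot", url))  # True: falls under "*"
```

A page that returns `False` here for a given crawler cannot enter that crawler's index, no matter how well its passages are written.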
The takeaway
RAG isn't a black box. It's a three-stage pipeline (retrieve, augment, generate) with mechanical reasons behind which pages get cited. Self-contained passages, semantic clarity, fresh and verifiably updated content, and basic crawlability aren't abstract "best practices." They map directly to specific steps in how these systems decide what to put in front of a user.
