Technical foundations of LLMs: understanding the infrastructure behind AI Search

Alexandre Hoffmann 28/03/2025 4 minutes

For digital marketers and SEO professionals, focusing on rankings alone is no longer enough. Understanding the infrastructure that powers large language models (LLMs) is the key to future-proofing visibility in this new search ecosystem.

Let’s break down how these models process content, where Retrieval-Augmented Generation (RAG) fits in, and what it all means for the shift from traditional SEO to Generative Engine Optimisation (GEO).


How LLMs actually process content

If you’re used to optimising for traditional search engines, LLMs will feel like a different beast. While search engines crawl and index, LLMs read, understand and generate.

Step one: tokenisation

LLMs don’t read like we do. They break down content into bite-sized bits called “tokens” – think parts of words, punctuation, or spaces. For example, “optimisation” might be tokenised as “optim” + “isation”.

Why does this matter? Because tokenisation powers everything. It feeds into how meaning is constructed, how context is retained, and how long a model can “hold a thought.”

The standard process:

  1. Break the input into digestible chunks

  2. Convert those chunks into tokens

  3. Turn tokens into numbers (vectors, for the techy folk)

  4. Feed these numbers into the neural network for processing
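
To see tokenisation in action, here’s a minimal sketch using OpenAI’s open-source tiktoken library. Other models use different tokenisers, so the exact splits will vary – this is illustrative, not a universal rule.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by several OpenAI models;
# other LLMs tokenise differently
enc = tiktoken.get_encoding("cl100k_base")

text = "Generative Engine Optimisation"
token_ids = enc.encode(text)

print(token_ids)                             # the numbers fed to the model
print([enc.decode([t]) for t in token_ids])  # the text behind each token
print(f"{len(text)} characters -> {len(token_ids)} tokens")
```

Steps 3 and 4 – turning tokens into vectors and running them through the network – happen inside the model itself; the tokeniser is the part you can observe from the outside.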

Beyond keywords: semantic understanding

Traditional search hinges on keywords. LLMs? They’re all about meaning. These models position words in a massive multi-dimensional space through a method called embeddings. Words and phrases that “mean” similar things are clustered close together.

So, while “automobile purchase guidance” and “buying advice for cars” look nothing alike to SEO tools, they’re neighbours to an LLM.
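
To make that concrete, here’s a minimal sketch using the open-source sentence-transformers library. This isn’t the embedding model behind any particular AI search product – just an illustration of the principle.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# A small open-source embedding model, chosen for illustration only
model = SentenceTransformer("all-MiniLM-L6-v2")

a = model.encode("automobile purchase guidance")
b = model.encode("buying advice for cars")
c = model.encode("chocolate cake recipe")

# Cosine similarity: closer to 1.0 means closer neighbours in embedding space
print(util.cos_sim(a, b))  # high - similar meaning, no shared keywords
print(util.cos_sim(a, c))  # low - unrelated topic
```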

This means keyword stuffing is yesterday’s news. Content must now show depth, clarity and context across a topic – not just repeat the right phrase.

The context window

LLMs operate within a “context window” – a limit on how much text they can consider in one go. Depending on the model, this might be anywhere from 8,000 to 128,000 tokens or more.

What’s the implication? Structure matters. If your key messages are buried deep or scattered, they might get missed entirely. So, front-load the good stuff, use clear headings, and build a logical content flow even in snippets.
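
One practical habit this suggests: check how much of the window your content consumes. The sketch below uses tiktoken again; the window size and reply buffer are illustrative numbers, not any specific model’s limits.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

CONTEXT_WINDOW = 8_000       # illustrative - real limits vary by model
RESERVED_FOR_REPLY = 1_000   # leave headroom for the generated answer

def fits_in_context(text: str) -> bool:
    """Check whether the text fits the window with room left for a reply."""
    return len(enc.encode(text)) <= CONTEXT_WINDOW - RESERVED_FOR_REPLY

page_copy = "Your page content here..."
print(fits_in_context(page_copy))
```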

Enter RAG: Retrieval-Augmented Generation

RAG is the tech that brings real-time information into AI conversations. It powers tools like ChatGPT with web browsing, Perplexity and Google AI Overviews, helping LLMs cite up-to-date info from the web instead of relying solely on training data.

How RAG works

A typical RAG system includes:

  • Retriever: Searches the web or a database for relevant info

  • Generator: Uses an LLM to respond using that info

  • Augmentation Layer: Weaves it all together into a coherent answer

Here’s the flow:

  1. Understand the query

  2. Retrieve external info

  3. Assemble context

  4. Engineer the prompt

  5. Generate the response

  6. Link citations back to sources

Each platform handles this differently, but the end goal is the same: accurate, timely, contextual answers.
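
To make the flow tangible, here’s a toy end-to-end sketch. The keyword-overlap retriever and the call_llm placeholder are stand-ins for real vector search and a production model – every name and document below is made up for illustration.

```python
# A toy RAG pipeline mirroring the six steps above

DOCUMENTS = [
    {"url": "https://example.com/geo-guide",
     "text": "Generative Engine Optimisation adapts content for AI search."},
    {"url": "https://example.com/rag-explained",
     "text": "Retrieval-Augmented Generation feeds live sources to an LLM."},
]

def retrieve(query: str, k: int = 2) -> list[dict]:
    # Step 2: score documents by naive keyword overlap with the query
    q_words = set(query.lower().split())
    scored = sorted(DOCUMENTS,
                    key=lambda d: len(q_words & set(d["text"].lower().split())),
                    reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Step 5: placeholder - a real system would call an actual model here
    return "A generated answer grounded in the retrieved sources."

def answer(query: str) -> str:
    sources = retrieve(query)                           # Steps 1-2: understand and retrieve
    context = "\n\n".join(d["text"] for d in sources)   # Step 3: assemble context
    prompt = (f"Answer using only these sources:\n"     # Step 4: engineer the prompt
              f"{context}\n\nQuestion: {query}")
    response = call_llm(prompt)                         # Step 5: generate
    citations = ", ".join(d["url"] for d in sources)    # Step 6: link citations back
    return f"{response}\n\nSources: {citations}"

print(answer("What is Generative Engine Optimisation?"))
```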

The technical hurdles

RAG isn’t flawless. The biggest technical challenges include:

  • Sifting truly relevant info from the noise

  • Condensing retrieved text to fit within the model’s limits

  • Reconciling contradictions between live data and pre-trained knowledge

  • Accurately citing sources

  • Balancing freshness vs authority

From a content creation perspective, understanding these hurdles helps you create content that’s more likely to be surfaced and cited by AI systems.

SEO vs GEO: what’s technically different?

This isn’t just a new flavour of SEO – it’s a different operating system. Here’s how traditional search and LLM-driven search diverge on a technical level.

Crawling and indexing

  • Traditional SEO: Relies on crawlers like Googlebot to read and index everything from HTML to JavaScript.

  • GEO: Most AI crawlers don’t fully render JavaScript. If your key content is client-side only, it might never be seen.

Keep essential information in static HTML. If it isn’t there at page load, it may never make it into the AI’s view.
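
A quick way to approximate what a non-rendering crawler sees is to fetch the raw HTML without executing any JavaScript and search it for your key copy. A minimal sketch with the requests library – the URL and phrase are placeholders:

```python
# pip install requests
import requests

URL = "https://example.com/your-page"       # placeholder
KEY_PHRASE = "your most important message"  # placeholder

# requests returns the raw HTML only; no JavaScript runs,
# which roughly mirrors what many AI crawlers see
html = requests.get(URL, timeout=10).text

if KEY_PHRASE.lower() in html.lower():
    print("Key content is present in the static HTML.")
else:
    print("Key content is missing - it may only exist after client-side rendering.")
```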

Structured data & schema

  • Traditional SEO: Schema.org markup (like JSON-LD) helps trigger rich results in the SERP.

  • GEO: LLMs might ignore schema but will understand well-written, semantically clear copy.

That means clear headings, ordered lists, and logical grouping aren’t just nice to have – they’re vital for LLM comprehension.

Authority signals

  • Traditional SEO: Backlinks, domain authority, site speed, mobile usability

  • GEO: Mentions, context, and relationships matter more than link juice

LLMs look for how your brand or topic fits into broader semantic ecosystems. Who are you associated with? What context surrounds your name?

Query handling

  • Traditional SEO: Match queries to ranked pages

  • GEO: Interpret conversational intent and generate answers directly

Your content isn’t showing up as a blue link anymore. It’s being paraphrased, summarised, and cited – or not.

GEO implementation strategies: what you can actually do

Here’s how to technically optimise for generative search systems.

1. Map entity relationships

Think beyond keywords. Define relationships between your brand and:

  • Related topics

  • Industry challenges

  • Specific tools, features or use cases

  • Competitive comparisons

Contextualise your offering within an ecosystem. Don’t just describe what you do – explain how it connects to your audience’s problems.

2. Engineer for citation

Make your content citation-friendly:

  • Include unique research or stats

  • Format insights clearly with proper attribution

  • Highlight definitional or explanatory paragraphs

  • Use structured but readable formats

Aim to be the source that LLMs want to quote. That’s how you show up in AI search results.

3. Prioritise preprocessing

Make it easy for AI crawlers to process your content:

  • Avoid JavaScript-dependent elements for key content

  • Use semantic HTML elements: headings, bullets, blockquotes

  • Create clear visual and logical content hierarchies

  • Maintain a healthy text-to-code ratio for technical pages (a rough check is sketched below)
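
That last point is easy to sanity-check. Here’s a rough sketch using requests and BeautifulSoup – note there’s no universally agreed threshold for a “healthy” ratio, so treat the number as a relative signal rather than a pass/fail mark.

```python
# pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

URL = "https://example.com/your-page"  # placeholder
html = requests.get(URL, timeout=10).text

soup = BeautifulSoup(html, "html.parser")
for tag in soup(["script", "style"]):  # drop non-visible code
    tag.decompose()

visible_text = soup.get_text(separator=" ", strip=True)
ratio = len(visible_text) / len(html)
print(f"Text-to-code ratio: {ratio:.0%}")
```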

4. Manage context windows

Structure content with LLM constraints in mind:

  • Front-load essential information

  • Build standalone sections with complete ideas

  • Use repetition (not redundancy) to reinforce concepts

  • Maintain consistent terminology for clarity

In short, say what you mean early and clearly. Don’t leave your best insights buried at the bottom.
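
One way to act on this is to make every section self-contained before it leaves your CMS. The sketch below assumes markdown-style “## ” headings and simply prefixes each section with the page title, so a chunk still makes sense if an LLM only ever sees that one piece.

```python
def standalone_sections(page_title: str, content: str) -> list[str]:
    """Split on '## ' headings and prefix each chunk with the page title,
    so each section carries its own context when read in isolation."""
    chunks = []
    for section in content.split("\n## "):
        section = section.strip()
        if section:
            chunks.append(f"{page_title} - {section}")
    return chunks

page = (
    "The key takeaway, stated up front.\n"
    "## What GEO is\nDefinition and context...\n"
    "## How to implement it\nConcrete steps..."
)

for chunk in standalone_sections("GEO guide", page):
    print(chunk)
    print("---")
```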

What does Passion Digital make of all of this?

The move from SEO to GEO isn’t just technical; it’s transformational. LLMs are changing not just what gets surfaced in search but also how it’s evaluated and delivered.

If SEO is about getting found, GEO is about getting referenced.

By understanding how LLMs think, what they struggle with, and how they choose what to cite, we can create content that performs brilliantly in both traditional and generative search environments.

This isn’t a “one or the other” situation. It’s a case of expanding our toolkit. At Passion Digital, we combine performance and imagination to help brands stay visible, relevant, and ready for whatever comes next.

Because the future of search isn’t coming – it’s already here.