SEO for Large Language Models and Generative Search

dou.eu, 2 hours ago

The development of Large Language Models (LLMs) and their integration with search engines and chatbots is changing the principles of SEO. It is no longer enough to rank highly in search results: content also needs to appear in AI-generated answers from services like ChatGPT, Claude, Gemini, or Google's new Search Generative Experience (SGE), also known as AI Overview.

This new field has been dubbed LLM SEO: optimising content for language models. Together with the UniRidge team, I want to share which SEO strategies are effective today:

  • what Google's Search Generative Experience (SGE) is and how to prepare for it;
  • how SGE impacts traditional SEO;
  • how to create content that AI will quote;
  • which sites appear most often in AI answers;
  • the role of text structure and structured data (schema.org);
  • whether backlinks are still important.

What is Google's Search Generative Experience?

Search Generative Experience (SGE) is an experimental Google feature that integrates generative AI directly into search results. Launched in 2023, SGE expands on the capabilities of the Featured Snippet by providing users with more comprehensive and interactive information without them needing to click on links. It's worth noting that SGE is often confused with Featured Snippets. Both elements appear at the very top of the Google results page, providing a quick answer, but they are fundamentally different in their nature and technology.

A Featured Snippet, or "position zero," is a block that appears above the traditional search results. For example, if you search for "graphic design laptop requirements," the Featured Snippet might show a list of laptops from a single website.

Featured Snippet

Search Generative Experience (SGE), on the other hand, is a more detailed, AI-generated overview that draws on multiple sources. Instead of classic links, the user can immediately get an AI-generated answer in the form of a summary, facts, graphics, or a list of sources. For the same query, "graphic design laptop requirements," SGE might generate several paragraphs of comparison directly on the results page. To the right, there are photos and links to various sources. Below, there are suggestions for specific models and tips on how to choose. The traditional search results are shifted below this large AI block.

SGE — AI Overview

Google is still experimenting with where to place ad blocks in SGE, so we'll likely see other placements in the future. It's important to monitor these changes, as Google is continuously testing and modifying how SGE works, including how often it appears and what it looks like.

SGE creates its answer by combining data from several websites simultaneously. Icons or source names appear next to fragments of the answer, allowing the user to click and go to the original page. Below the AI block, there are often suggestions for follow-up questions that the user can click on. This creates a conversational search experience – the user can ask subsequent questions without leaving the Google page.

To avoid confusing SGE and Featured Snippet, let's compare them:

Featured Snippet vs SGE

Test results for Google's AI Overview suggest that SGE's accuracy is on par with Featured Snippets. In those tests, SGE was also less prone to "hallucinating", or inventing things, than other LLM products can be.

I would say that SGE is a logical continuation and a more advanced version of the Featured Snippet. It's an evolution of search that shows Google is moving more and more from simply displaying links to providing more comprehensive answers.

How SGE Impacts Traditional SEO

Some people say SEO has been dead for a long time. If so, a new reincarnation is here. SGE will undoubtedly lead to a drop in traffic from classic search results because users get their answers without having to click. So far, this is a regionally inconsistent phenomenon and doesn't affect all topics. According to data from BrightEdge, the full rollout of SGE could impact queries that generate over 40 billion visits. Coverage varies by category: SGE is expected to appear for 76% of queries in the "health" category, 17% in "finance", 49% in "e-commerce", 48% in "B2B technology", and 30% each in "tourism" and "entertainment".

With the advent of SGE, organic traffic may decline even further, but this doesn't mean SEO is failing. A user might get an answer without needing to click, but your brand can still be mentioned in that answer. So it's worth analysing other metrics: has the number of brand mentions, direct traffic, or branded search queries increased? This could be a sign that your website is appearing in AI answers. There are also now SGE trackers available, such as AIO Research from SE Ranking, that show whether a site is getting mentions in AI Overview snippets.
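One way to check whether branded search queries have increased is to compare the share of clicks coming from queries that contain your brand name across two periods. The following is a minimal sketch, not a definitive method: the `QueryRow` type, the `branded_share` function, the brand term "acme", and all the numbers are hypothetical sample data. In practice the rows would come from an export of your search analytics tool.

```python
from dataclasses import dataclass


@dataclass
class QueryRow:
    """One search query and the clicks it brought (hypothetical export format)."""
    query: str
    clicks: int


def branded_share(rows, brand_terms):
    """Return the fraction of clicks that came from queries containing a brand term."""
    terms = [t.lower() for t in brand_terms]
    total = sum(r.clicks for r in rows)
    if total == 0:
        return 0.0
    branded = sum(
        r.clicks for r in rows if any(t in r.query.lower() for t in terms)
    )
    return branded / total


# Made-up sample data for two periods; "acme" is a placeholder brand name.
before = [QueryRow("ct scan viewer", 120), QueryRow("acme viewer download", 30)]
after = [
    QueryRow("ct scan viewer", 80),
    QueryRow("acme viewer download", 60),
    QueryRow("acme dicom", 20),
]

print(f"before: {branded_share(before, ['acme']):.0%}")  # before: 20%
print(f"after:  {branded_share(after, ['acme']):.0%}")   # after:  50%
```

A rising branded share while generic clicks fall is only a hint, not proof, that the brand is being surfaced in AI answers. It is worth reading alongside direct traffic and brand mentions.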

An interesting piece of analysis comes from a study by Authoritas, which found that in 93.8% of cases, AI references sources that are not in the top 10 Google search results. This research suggests that when an AI creates an answer, it almost never takes information from the same websites that are on the first page of Google's organic results.

As a marketer, I would also recommend investing more in your blog (if you're in an information-heavy industry with extensive SGE coverage) and in other user acquisition channels: if you expect AI answers to reduce your search traffic, compensate with activity in, for example, social media or newsletters.

Which Websites Are Most Often Cited by LLMs?

To appear in generative answers, your content must be formulated in a way that directly answers specific user questions. Concise paragraphs, clear explanations, and well-marked key information all increase the chances of being cited by AI.

Structured data (schema.org) makes it easier for algorithms to analyse and classify content, while the E-E-A-T principles (experience, expertise, authoritativeness, and trustworthiness) help build trust. It's also crucial that the content author is clearly presented and the information is supported by sources and evidence. This means the website should include the author's name and biography, data sources, up-to-date information, and any relevant certifications.

AI most often quotes content from authoritative, well-known, and frequently visited sources. Language models generate answers by combining information from many places – from official information sites to internet forums. What does this mean for an SEO specialist? It means that now, more than ever, you should focus on PR strategies and work on the quality of your backlinks.

The main types of websites that AI most willingly quotes or paraphrases in SGE results are:

  • Official sources and knowledge bases. This includes well-known encyclopaedias (e.g., Wikipedia), technical documentation, government websites (.gov), and scientific publications. Models often fact-check using knowledge bases like Google's Knowledge Graph or Wikipedia, so content verified there has the best chance of influencing an AI's response.
  • Informational portals and mainstream media. Content from well-known media – news, reports, and analysis – is often used by AI. Language models "read" huge volumes of this material and extract facts, expert quotes, and statistical data. If your brand is mentioned on popular media platforms (e.g., Forbes, Financial Times, industry portals), there's a high chance that AI will "see" this data and use it in its answer.
  • Popular blogs and expert articles. Blogs with a large readership or recognition in a specific niche also become a content source for AI models. Well-organised "content hubs" (i.e., collections of articles around a single topic, for example, dev.ua) signal to the model that the site is an expert on the subject. Models are happy to quote experts from well-regarded blogs, especially when these texts are frequently shared and linked.
  • Forums, communities, and review sites. User-generated content from forums (Reddit, StackExchange), Q&A services (Quora), and review sites is a valuable data source for models. AI models are trained on large amounts of this data and often quote user statements as "lived experience." For example, Google is increasingly showing not just facts, but also discussions from Reddit or reviews from forums in its AI answers.
  • Social media and video platforms. Some models (especially those integrated with search engines) may also consider content from social media, such as posts from X or videos from YouTube. Although AI rarely quotes such sources directly at the moment, a brand's presence on social media builds recognition and trust, which indirectly influences the AI's decision.
  • Local directories and listings. For location-oriented queries, it's also important to have your website and company details in public databases (Google Maps, Yelp, Clutch, etc.). Being included in rankings, "Top 10 Best" lists, and industry awards helps not only with brand visibility but also with the spread of mentions.

When a brand is present in industry directories, has consistent social media profiles, and a visible reputation, algorithms recognise it better and are more likely to include it in their generated answers.

Practical Recommendations for Optimising Content for LLMs

LLMs learn from vast amounts of text and, when generating answers, select fragments that most accurately and fully respond to a user's query. Therefore, for your content to be quoted or paraphrased by AI chatbots, it must be as clear, accurate, and machine-readable as possible.

  1. Text structure makes analysis easier. Use headings (h1, h2, h3), divide the text into thematic sections, and present information in short paragraphs. Have one key idea per paragraph or section. A clear hierarchy of headings and subheadings helps both readers and LLMs understand the material's structure. Use lists, create tables, and use bold text for important information – these all serve as "structural clues" for the AI. Models pay attention to bullet points and numbered steps, as these elements signal well-organised information (a set of facts, a list of tips) that is easy to include in an answer.
  2. Answer common questions directly in the text. Frame headings as questions ("What to do if...", "How to choose...") and place a clear answer immediately below them. This increases the chance that your answer will be used as a ready-made fragment in an AI-generated response. For example, if an article has a section "How to create a website?" with detailed step-by-step instructions, the language model can "look into" this section and relay the most important steps to the user. Focus on user intent: understand what they're asking and provide content that directly addresses it. The FAQ block is one of the most effective tools for AI optimisation, as it integrates easily into language model responses.
  3. Write in natural, understandable language. Models like ChatGPT better understand and reproduce a natural, conversational tone than texts full of keywords. There's no need to overdo the SEO – it's better to use synonyms and phrases that people actually use in their questions. If a sentence is too "confusing" for the average reader, there's a risk that the AI will also skip it. This is where the basics of syntax come to mind: avoid complex sentences and break large, complicated sentences into smaller fragments.
  4. Put the most important information first and use your own data. Since an AI might only use a single paragraph or a few sentences from your text, you should use so-called front-loading, which means placing key information at the very beginning of a section. This improves the visibility of key facts. You should also add unique data, figures, and conclusions from your own research, as models love specifics and often quote numbers or statistics, especially if they are presented in a clear format (e.g., a table, a list).
  5. Prioritise a high level of readability. Content written "for people," with short sentences and without a formal, business-like style, is suitable for both users and AI. Tools like Yoast SEO recommend that text be easy to read (short paragraphs, etc.) – this kind of content is also easier for language models to analyse. In short: what's good for the reader is good for the AI.
  6. Check how AI sees you. From time to time, ask ChatGPT or other AI models about a topic that concerns you or about your brand. For example: "How to open CT scan files?" – and see if the model mentions your product. If not, it's a signal that, for example, you need to increase your brand's visibility online. This kind of monitoring helps you understand what information the AI "knows" about you and how it presents it, allowing you to better adapt your strategy.

Traditional SEO vs LLM SEO and Generative Search
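
The structural recommendations above (a clear heading hierarchy, headings framed as questions) can be checked automatically. Below is a minimal sketch using only Python's standard library. The names `HeadingCollector` and `audit_headings` and the sample HTML are my own, and the two rules it applies (exactly one h1, no skipped heading levels) are a simplification I'm assuming here, not an official checklist.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text)
        self._level = None   # level of the heading currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def audit_headings(html):
    """Return (headings, issues) for an HTML string."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    issues = []
    h1_count = sum(1 for level, _ in parser.headings if level == 1)
    if h1_count != 1:
        issues.append(f"expected exactly one h1, found {h1_count}")
    previous = 0
    for level, text in parser.headings:
        # A jump such as h1 -> h3 skips a level in the hierarchy.
        if previous and level > previous + 1:
            issues.append(f"h{previous} is followed by h{level}: {text!r}")
        previous = level
    return parser.headings, issues


sample = """
<h1>How to create a website?</h1>
<p>Pick a platform, choose a domain, publish.</p>
<h3>Choosing a domain</h3>
<h2>How to choose hosting?</h2>
"""

headings, issues = audit_headings(sample)
questions = [text for _, text in headings if text.endswith("?")]
print(issues)     # ["h1 is followed by h3: 'Choosing a domain'"]
print(questions)  # ['How to create a website?', 'How to choose hosting?']
```

The list of question-style headings is a quick way to see which sections already offer a ready-made answer to a user's question.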

Technical Aspects of Your Website for SGE and AI Overview

Optimising for LLMs doesn't override the basic principles of SEO – website loading speed, mobile optimisation, proper meta tags, sitemaps, etc. In fact, content optimised for language models usually works well in traditional search too.

You should periodically audit your website to ensure there are no technical issues. Online tools for a quick SEO audit can help you check key aspects. There are many nuances, but let's focus on two you should pay attention to: first, how your website is scanned by bots, and second, its structure.

Scanning

The first step of any technical audit is to check your robots.txt (a file at your-domain/robots.txt with a set of directives for crawlers). Important pages and sections should not be blocked in robots.txt for Googlebot and AI crawlers. Some specialists are even proposing a separate llms.txt file, similar in spirit to robots.txt. In any case, a site that is correctly open for scanning via robots.txt is a prerequisite for getting into any AI system.

Popular AI crawlers (user agents)

If bots can't access your website, they won't be able to use its content in their answers.
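
A quick way to verify this is to run your robots.txt rules against the user agents you care about. The sketch below uses `urllib.robotparser` from Python's standard library. The `crawler_access` helper, the sample robots.txt, and the example.com URL are hypothetical, and the user-agent tokens in `CRAWLERS` are examples only, so check each vendor's documentation for the current names.

```python
from urllib.robotparser import RobotFileParser

# Example user-agent tokens; verify the current names with each vendor.
CRAWLERS = ["Googlebot", "GPTBot", "ClaudeBot", "PerplexityBot"]


def crawler_access(robots_txt, url, agents=CRAWLERS):
    """Return {agent: True/False} saying whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks one AI crawler entirely and
# keeps every other bot out of /admin/ only.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

report = crawler_access(SAMPLE_ROBOTS, "https://example.com/blog/llm-seo")
for agent, allowed in report.items():
    print(f"{agent:15} {'allowed' if allowed else 'BLOCKED'}")
```

With this sample file, GPTBot is reported as blocked while the other three agents are allowed. To test a live site instead of a string, `RobotFileParser` also offers `set_url()` and `read()` to download the file first.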

Content Structure and Structured Data

Site structure (i.e., the semantic arrangement of HTML and the visual organisation of text) and structured data (schema.org, microdata) can both affect whether your content is used by AI. However, their significance is different.

For LLMs, the key is the site's content and how easily it can be "extracted" as plain text. LLMs analyse a site much like a human would, but faster, paying attention to headings, paragraphs, and lists. Therefore, proper formatting and a logical structure significantly increase the chances that your content will be used in an AI answer.
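
To get a rough feel for what "extracted as plain text" means, you can strip a page down to its visible text yourself. This is a minimal sketch with Python's standard `html.parser`. The `TextExtractor` class, the `visible_text` function, and the sample page are my own illustration, and real AI systems will use their own, more sophisticated extraction pipelines.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring script, style and noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            # Collapse runs of whitespace so each chunk is one clean line.
            self.chunks.append(" ".join(data.split()))


def visible_text(html):
    """Return the page's text content, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = """
<html><head><style>p{color:red}</style><script>var x = 1;</script></head>
<body>
  <h2>How to choose hosting?</h2>
  <p>Compare uptime,   price and support.</p>
</body></html>
"""

print(visible_text(page))
# How to choose hosting?
# Compare uptime, price and support.
```

Note that this parser does not execute JavaScript, so content that only appears after scripts run will not show up in its output. If key information is missing from the result, that is a reason to check how the page is rendered.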

Structured data, meaning special markup based on the schema.org vocabulary and written as JSON-LD or microdata, is invisible to the user but gives machines additional context about the site's content. For example, the FAQPage type for a Q&A section, or the Product type for product details, price, and availability. But there are many other types:

  • RealEstateListing – for property listings.
  • JobPosting – for job vacancy pages. It includes the job title, company, location, requirements, and salary.
  • Event – for event pages. It contains information about the name, date, location, and tickets (for conferences, concerts, webinars).
  • Recipe – suitable for cooking websites and blogs.
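
As an illustration of what such markup looks like, here is a minimal sketch that generates FAQPage structured data in JSON-LD from question and answer pairs. The `faq_jsonld` and `as_script_tag` helpers and the sample question are hypothetical. The property names (`mainEntity`, `acceptedAnswer`, `text`) come from the schema.org vocabulary.

```python
import json


def faq_jsonld(pairs):
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def as_script_tag(jsonld):
    """Wrap JSON-LD in the script tag that goes into the page's HTML."""
    # Escape "</" so answer text can never close the script element early;
    # "<\/" is still valid JSON and decodes back to "</".
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


faq = faq_jsonld([
    ("How to choose hosting?", "Compare uptime, price and support."),
])
print(as_script_tag(faq))
```

The same pattern should work for the other types in the list: change `@type` and fill in the properties that schema.org defines for that type.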

In traditional SEO, schema.org is very useful: it allows you to get so-called rich search results (rating stars, questions, product cards, etc.). In the context of LLMs, the impact of structured data is indirect, as LLMs don't analyse schema data directly. Language models operate on content that has been indexed or is in their memory (or that they get via a search engine). As experts note, schema.org is not a ranking factor for LLMs. The most important thing is the quality and readability of the text itself.

In summary, I'd say that schema.org doesn't guarantee that ChatGPT will choose your website, but it won't hurt, and it can even help if you take care of the rest.

In Conclusion

The era of AI-powered search has already begun, and SEO specialists must stay ahead of the curve. Keep an eye on new SGE formats and treat them as opportunities for growth. SGE is changing the way users interact with search engines. This isn't the end of SEO – it's a new phase where what matters isn't just your position on Google, but also your presence in AI-generated answers.

SEO in the age of AI means:

  • content written with user questions in mind;
  • a clear structure;
  • specific and trusted sources;
  • a strong online brand reputation.

The first to adapt will gain an advantage. So don't stop creating high-quality content and looking after your website's technical health – this is an approach that "kills two birds with one stone," supporting both SEO and visibility in AI answers.
