If you have spent any time reading about AI search lately, you have probably bumped into a small text file called `llms.txt`. The pitch is simple: drop a tidy Markdown file at the root of your domain, list your most important pages with one line of context each, and large language models get a clean map of your site instead of wrestling with your full HTML.
That is the idea. Writing one takes about an hour. The harder questions are what goes in it, where it lives, how you keep it current without it rotting into a stale list, and whether any AI engine actually reads it yet. We are going to answer all of those, and we are going to do it with a real, copy-ready example rather than a vague description.
This guide is written from the inside. We run an auto-refreshed `llms.txt` on our own domain, and we regenerate it at the end of every publishing run, so the file you see below is close to the shape of the one we actually ship.
AIToolsBakery is an independent AI-tools review site. We are not a hosting company upselling a feature, and we take no vendor money to move a verdict, so when we tell you adoption is early and the evidence is mixed, that is the honest read and not a hedge to protect a partner. Treat `llms.txt` as cheap insurance, not a growth hack.
The short version: an `llms.txt` file is a Markdown index at your site root that lists key pages with short descriptions, so AI models can navigate your content. Write an H1 name, a blockquote summary, then H2 sections of links. It is easy to ship, but real AI-engine adoption is still early and unproven.
What an llms.txt File Is and Why It Exists
An `llms.txt` file is a plain Markdown document that you place at the root of your domain, at `https://yourdomain.com/llms.txt`. It is meant to give language models a curated, low-noise summary of your site: who you are, what matters, and where the good content lives.
The format was proposed in late 2024 by Jeremy Howard of Answer.AI and is documented at llmstxt.org. The problem it tries to solve is real. When a model lands on a typical web page, it has to chew through navigation menus, cookie banners, sidebars, ads, and scripts to find the few hundred words that actually matter. Context windows are finite, so all that clutter is wasted budget. An `llms.txt` file hands the model a clean shortlist instead, in Markdown, which models parse easily.
It is worth being precise about what this standard is, because the marketing around it has gotten loud. As of mid-2026, `llms.txt` is a community proposal, not an official standard ratified by the IETF or the W3C. It is closer in spirit to `robots.txt` than to anything baked into a browser. That does not make it worthless. It just means you should size your expectations accordingly.
The Format and Spec, With a Worked Example
The spec is short, which is part of its appeal. There is exactly one required element (the H1), and everything else is convention. Here is the structure in order:
- An H1 with your site or project name. This is the only required line.
- A blockquote (`>`) holding a one-paragraph summary. Technically optional, but you should always include it.
- Optional free-text details (paragraphs or lists, but no headings).
- One or more H2 sections, each holding a Markdown list of links in the form `[name](url): optional note`.
- An optional H2 literally named `Optional`, whose links a model can skip when it needs a shorter context.
Here is a representative `llms.txt` for a content and review site like ours. Copy the shape, swap in your own pages.
```markdown
# AIToolsBakery

> Independent reviews, comparisons, and how-to guides for AI search and AI productivity tools. We test tools hands-on, take no vendor money to move a verdict, and publish honest verdicts for buyers and builders.

AIToolsBakery is run by a small independent team. Our focus areas are
AI search engines, generative engine optimization, and AI writing and
research tools. All reviews are first-hand and editorially independent.

## Core guides

- [Generative Engine Optimization](https://yourdomain.com/generative-engine-optimization): How to get cited by AI answer engines.
- [How to Rank in ChatGPT](https://yourdomain.com/how-to-rank-in-chatgpt): Practical playbook for visibility inside ChatGPT.
- [How to Write an llms.txt File](https://yourdomain.com/how-to-write-llms-txt): This guide, with a copy-ready example.

## Tool reviews

- [Best AI Search Engines](https://yourdomain.com/best-ai-search-engines): Ranked comparison of AI answer engines.
- [Perplexity Review](https://yourdomain.com/perplexity-review): Hands-on verdict, pricing, and limits.

## About

- [About AIToolsBakery](https://yourdomain.com/about): Who we are and how we test.
- [Editorial and independence policy](https://yourdomain.com/editorial-policy): How we keep reviews unbiased.

## Optional

- [Full archive](https://yourdomain.com/archive): Every post, newest first.
- [Contact and licensing](https://yourdomain.com/contact): Press and content licensing.
```
Notice the discipline here. The summary is one tight paragraph. Each link carries a short, plain description of what the reader (or model) gets. The `Optional` section holds the nice-to-haves that a model can drop when it is tight on context. Keep the whole file lean. A good `llms.txt` is usually well under 10 kilobytes, because the point is signal, not a sitemap dump.
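To make the structure concrete, here is a minimal sketch of how a consumer might read a file in this shape and drop the `Optional` section when context is tight. The function names `parse_llms_txt` and `without_optional` are our own illustration, not part of the spec or of any official tooling, and the parser only handles the simple layout described above.

```python
import re

# One list item: "- [name](url)" with an optional ": note" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt document into title, summary, and H2 link sections."""
    title = None
    summary_lines = []
    sections = {}  # section heading -> list of (name, url, note)
    current = None  # heading of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith(">") and current is None:
            summary_lines.append(line.lstrip("> ").strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["name"], match["url"], match["note"] or "")
                )
    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }


def without_optional(parsed: dict) -> dict:
    """Return a copy with the section literally named 'Optional' removed."""
    kept = {k: v for k, v in parsed["sections"].items() if k != "Optional"}
    return {**parsed, "sections": kept}
```

Free-text details between the summary and the first H2 are deliberately ignored here; a fuller parser would keep them.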
Faz says: The single most common mistake we see is people pasting their entire sitemap into `llms.txt`. That defeats the purpose. This file is a curated menu, not the whole kitchen. If a link would not earn a spot in your main navigation, it probably does not belong here either.
llms.txt vs llms-full.txt vs robots.txt
These three files get muddled constantly, so here is the clean breakdown. They do different jobs and you can ship all three.
| File | What it contains | Job it does | Who it is for |
|---|---|---|---|
| `llms.txt` | A short Markdown index: H1, summary, curated links with notes | Points models to your best content | AI models reading at answer time |
| `llms-full.txt` | The actual full text of your key pages, concatenated into one big Markdown file | Lets a model ingest your whole content in one pass, no extra fetches | AI models and IDE agents that want everything inline |
| `robots.txt` | Crawl rules: allow and disallow paths, plus AI-bot directives | Tells crawlers what they may and may not fetch | Search and AI crawlers deciding access |
The key thing to understand: `llms.txt` is a map, while `llms-full.txt` is the territory. The `llms-full.txt` variant is typically generated by documentation tooling rather than written by hand, and it can be twenty-plus times larger because it inlines the actual content rather than just linking to it. The original spec at llmstxt.org does not define `llms-full.txt` as a separate standard; it emerged as a tooling convention. (The spec's own `llms_txt2ctx` tool does something related: it expands an `llms.txt` into a single context file for a model.) And `robots.txt` is a different animal entirely. It governs access and crawling, not curation. None of these replaces the others.
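The map-versus-territory difference is easy to see in code. The sketch below builds both files from the same list of pages. The `Page` class, the function names, and the layout chosen for the full file are assumptions for illustration only, since `llms-full.txt` has no formal layout. It also assumes each page body has already been converted to Markdown.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    note: str
    body_md: str  # page content, assumed to be Markdown already


def build_index(site: str, pages: list[Page]) -> str:
    """llms.txt style: links plus one-line notes only."""
    lines = [f"# {site}", "", "## Pages"]
    lines += [f"- [{p.title}]({p.url}): {p.note}" for p in pages]
    return "\n".join(lines) + "\n"


def build_full(site: str, pages: list[Page]) -> str:
    """llms-full.txt style: page bodies inlined one after another."""
    parts = [f"# {site}"]
    for p in pages:
        parts.append(f"## {p.title}\n\nSource: {p.url}\n\n{p.body_md.strip()}")
    return "\n\n".join(parts) + "\n"
```

The index grows by one line per page, while the full file grows by the length of every page body, which is where the size gap comes from.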
Saru says: Do not delete your `robots.txt` and assume `llms.txt` covers it. They are not substitutes. `robots.txt` is the bouncer at the door deciding who gets in. `llms.txt` is the menu you hand the guests who made it inside. You want both.
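To see the bouncer half of that analogy in action, here is a small sketch using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot` and the rules are made up for the example; real AI crawlers publish their own user-agent strings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named AI bot is kept out of /drafts/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /drafts/

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Answer the access question that robots.txt exists to answer."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allowed(ROBOTS_TXT, "ExampleAIBot", "https://yourdomain.com/drafts/post"))  # False
print(allowed(ROBOTS_TXT, "ExampleAIBot", "https://yourdomain.com/llms.txt"))     # True
```

Nothing in `llms.txt` can express that first answer, which is why the two files are not interchangeable.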
Where to Put It and How to Validate It
Placement is simple and non-negotiable: the file goes at the root of your domain, served at `https://yourdomain.com/llms.txt`. Not in a subfolder, not in `/docs/`. Root. Serve it as plain text or Markdown with a `200` response, and make sure it is publicly reachable without a login.
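If you would rather check those serving rules with a script than by eye, a minimal sketch could look like the following. The function names are our own, and the accepted content types are our reading of "plain text or Markdown", not something the spec mandates.

```python
from urllib.request import urlopen

# Assumption: either of these counts as "plain text or Markdown".
ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def check_response(expected_url: str, final_url: str, status: int, content_type: str) -> list[str]:
    """Compare what the server returned against the placement rules."""
    problems = []
    if status != 200:
        problems.append(f"expected 200, got {status}")
    if final_url != expected_url:
        problems.append(f"redirected to {final_url}")
    if content_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected content type {content_type}")
    return problems


def check_live(url: str) -> list[str]:
    """Fetch the URL and run the checks above.

    urlopen follows redirects and raises HTTPError on 4xx/5xx responses,
    so a missing file surfaces as an exception rather than a list entry.
    """
    with urlopen(url, timeout=10) as resp:
        return check_response(
            url, resp.url, resp.status, resp.headers.get_content_type()
        )
```

Calling `check_live("https://yourdomain.com/llms.txt")` against your own domain should return an empty list when the file is served as described.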
To validate it:
- Open `https://yourdomain.com/llms.txt` in a browser and confirm it loads as readable text, not a download or a 404.
- Check that every link resolves. Dead links in this file are worse than no file, because they signal neglect.
- Run it through a Markdown linter or one of the free `llms.txt` validators that appeared through 2025 and 2026. They flag a missing H1, malformed link syntax, and oversized files.
- Confirm your server returns the correct content type and does not redirect the URL somewhere unexpected.
That is the whole checklist. There is no registration step and nobody to submit it to. You publish it and it is live.
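If you prefer not to rely on a third-party validator, the structural checks are small enough to sketch yourself. The version below is an illustration under our own assumptions: the 10,000-byte ceiling is simply the "well under 10 kilobytes" guidance turned into a number, and `fetch_status` is a hypothetical callable you supply that returns an HTTP status code for a URL.

```python
import re

MAX_BYTES = 10_000  # soft ceiling; an assumption, not a spec limit
LINK_ITEM = re.compile(r"^-\s*\[[^\]]+\]\((https?://[^)\s]+)\)(:\s*.*)?$")


def validate_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    lines = [ln.rstrip() for ln in text.splitlines()]
    non_empty = [ln for ln in lines if ln.strip()]

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line must be an H1 ('# Name')")
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")

    in_section = False  # list items are only checked inside H2 sections
    for number, ln in enumerate(lines, start=1):
        if ln.startswith("## "):
            in_section = True
        elif in_section and ln.lstrip().startswith("-"):
            if not LINK_ITEM.match(ln.strip()):
                problems.append(f"line {number}: list item is not '[name](url): note'")
    return problems


def dead_links(text: str, fetch_status) -> list[str]:
    """Return linked URLs whose status, as reported by fetch_status, is not 200."""
    urls = re.findall(r"\]\((https?://[^)\s]+)\)", text)
    return [u for u in urls if fetch_status(u) != 200]
```

Keeping the network call behind `fetch_status` means the structural checks can run anywhere, while the link check can use whatever HTTP client you already have.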
How to Generate and Auto-Refresh It (The AITB Approach)
The biggest practical risk with `llms.txt` is not writing it. It is letting it go stale. You publish a clean file, then over the next three months you ship twenty new posts and retire five old ones, and now your `llms.txt` is a misleading snapshot of a site that no longer exists. A stale index is arguably worse than none.
You have two ways to keep it current.
The manual route is fine for a small, slow-moving site. You keep the file in your repo or your CMS, and every time you publish something worth featuring, you hand-edit the relevant H2 section. Set a recurring reminder to prune dead entries. If you publish a couple of times a month, this is sustainable.
The automated route is what we run, and we recommend it for any site publishing at pace. Our `llms.txt` is generated, not hand-written. At the end of every publishing run, our pipeline rebuilds the file from the same data that drives the site (the CMS and the sitemap) and republishes it to the root. Because it is the last step in the pipeline, the file can never drift more than one publish cycle behind reality.
If you want to build the same thing, the recipe is straightforward:
- Pull your published pages from your CMS API or parse your XML sitemap.
- Filter to the pages that deserve a spot. We use cluster pillars, top reviews, and core guides, not every tag page.
- Group them into H2 sections by cluster or content type.
- Pull each page’s short description from its meta description or a custom field, so the one-line notes write themselves.
- Assemble the Markdown (H1, blockquote, sections) and write it to the web root.
- Call that script as the final step of your deploy or publish process.
Once that exists, the file maintains itself. That is the entire point of automating it: you stop thinking about `llms.txt` as a document and start treating it as build output.
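As a rough illustration of that recipe, not our production pipeline, the steps could be sketched like this. It assumes a standard XML sitemap and a hypothetical `metadata` mapping (URL to title, description, and section) standing in for whatever your CMS exposes. Pages without an entry in that mapping are treated as not curated and are left out.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Extract every <loc> URL from a standard XML sitemap."""
    root = ET.fromstring(xml_text)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


def render_llms_txt(site: str, summary: str, sections: dict[str, list[dict]]) -> str:
    """Assemble H1, blockquote summary, and one H2 list per section."""
    out = [f"# {site}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        if not pages:
            continue  # skip empty sections rather than emit a bare heading
        out.append(f"## {heading}")
        out += [f"- [{p['title']}]({p['url']}): {p['description']}" for p in pages]
        out.append("")
    return "\n".join(out).rstrip() + "\n"


def build(xml_text: str, metadata: dict, site: str, summary: str, web_root: str) -> Path:
    """Filter sitemap URLs to curated pages, group them, and write llms.txt."""
    sections: dict[str, list[dict]] = {}
    for url in urls_from_sitemap(xml_text):
        meta = metadata.get(url)  # only pages with curated metadata are featured
        if meta is None:
            continue
        sections.setdefault(meta["section"], []).append({"url": url, **meta})
    target = Path(web_root) / "llms.txt"
    target.write_text(render_llms_txt(site, summary, sections), encoding="utf-8")
    return target
```

Calling `build(...)` as the last step of a deploy script is what turns the file into build output. Sections appear in the order their first page shows up in the sitemap, so a real pipeline would probably want an explicit ordering.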
Does It Actually Work Yet? An Honest Answer
Here is where we have to be straight with you, because a lot of the content out there is not. As of mid-2026, no major AI provider has publicly committed to reading and acting on `llms.txt` in their production answer surfaces. Not OpenAI, not Google, not Anthropic, not Perplexity. Having the file does not measurably move your odds of being cited in those tools today. Anyone who tells you `llms.txt` is a guaranteed ranking lever for ChatGPT or Gemini right now is selling something.
That said, the picture is not nothing. A few honest data points:
- IDE-based coding agents (Cursor, Cline, Continue, Aider and similar) increasingly look for `llms.txt` when you point them at a documentation site, and MCP servers for docs sometimes consume it directly. If you run a developer-facing site, this audience is real today.
- Google appears to be crawling these files. Observers tracked the number of `llms.txt` files seen in Google’s index climbing into the low hundreds of thousands through early 2026. Crawling is not the same as ranking, but it shows the files are being noticed.
- Adoption among serious sites has spread fast. Companies like Anthropic, Stripe, Cloudflare, and Vercel ship one. Estimates still put overall adoption in the single-digit-to-low-teens percentage of sites, so this is early.
So why bother? Because the cost is roughly a half day to build and near zero to maintain once automated, the file does no harm to your existing SEO, and the day a major engine decides to respect it, you are already correct instead of scrambling. That is the calculus. It is insurance with a tiny premium, not a miracle. We ship one for exactly that reason, and we are upfront that the upside is currently potential rather than proven.
Common Mistakes to Avoid
A short list of the errors we see most often:
- Dumping the whole sitemap. Curate. This is a highlight reel, not an archive.
- Putting it in the wrong place. It must be at the domain root, not in a subfolder.
- Letting it go stale. A file that describes last quarter’s site erodes trust. Automate the refresh.
- Broken or redirecting links. Validate every URL. Dead links here look like neglect.
- Skipping the descriptions. A bare list of URLs throws away the one job this file does well, which is adding context.
- Treating it as a replacement for `robots.txt` or real on-page content. It is an addition, not a substitute. Your pages still need to be good.
- Overpromising internally. If your team thinks shipping this will spike AI citations next week, set expectations now.
Conclusion
Writing an `llms.txt` file is genuinely easy. Open a Markdown file, add an H1 with your site name, a one-paragraph blockquote summary, then a handful of H2 sections listing your best pages with a short note on each. Put it at your domain root, validate that it loads and that the links work, and if you publish regularly, generate it from your CMS or sitemap so it refreshes itself instead of rotting.
The honest framing is the one we hold ourselves to: adoption is early, no major answer engine has committed to it yet, and the proven payoff today is modest. But the cost is small, the file does no harm, and it positions you well if the standard catches on. We run an auto-refreshed one on our own site for precisely that reason. Ship yours, keep it current, and do not expect it to do the work that good content and clean structure already have to do.
To go deeper on the surrounding playbook, read our guide to generative engine optimization and our practical walkthrough on how to rank in ChatGPT.



