What is llms.txt?

A proposed standard file at a website's root that gives AI engines a structured Markdown guide to the site's content. It sits at the root like robots.txt, but where robots.txt tells crawlers what they may fetch, llms.txt points LLMs to the content worth reading.

llms.txt is a proposed standard (introduced by Jeremy Howard in 2024) for a Markdown file at the root of a website that gives AI engines a structured, human-readable guide to the site's most important content. The format is intentionally simple: an H1 with the site name (the only required element), a blockquote summary, optional free-form context, and H2 sections containing curated link lists. A common companion is llms-full.txt (or llms_full.txt), which contains the full long-form Markdown content so that an LLM can ingest it directly without crawling the site.
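The structure described above can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: the sample file, its site name, and its example.com URLs are placeholders, and the function name parse_llms_txt and the link pattern are assumptions made for illustration.

```python
import re

# Hypothetical sample file; the site name, summary, and URLs are placeholders.
SAMPLE = """\
# Example Site

> One-paragraph summary of what the site offers.

Optional free-form context about the site.

## Docs

- [Quickstart](https://example.com/quickstart.md): Install and first run
- [API reference](https://example.com/api.md)

## Guides

- [Changelog](https://example.com/changelog.md)
"""

# Matches "- [title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and H2 link sections."""
    title = None
    summary = None
    sections = {}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None and current is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return {"title": title, "summary": summary, "sections": sections}
```

Calling parse_llms_txt(SAMPLE) returns a dictionary with the site title, the blockquote summary, and one list of (title, url, note) tuples per H2 section. Free-form context between the summary and the first H2 is ignored by this sketch.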

In Detail

The proposal emerged because language models that process web content, whether at training time or at inference time through tools such as browsing, struggle with sites that are JavaScript-heavy, that bury their main content, or that are otherwise designed for human navigation rather than machine ingestion. llms.txt gives sites a vendor-neutral, Markdown-based way to expose their important content explicitly to AI engines. Whether AI engine vendors will adopt the standard at scale is still an open question as of 2026; so far, adoption is growing mainly among AI-first companies and developer tools.

Why It Matters

For brands that care about how they're represented in AI engines, llms.txt is a low-cost, standards-aligned way to give AI engines a curated view of the site's most important content. Even before broad vendor adoption, having llms.txt and llms-full.txt in place positions the brand for the moment vendors begin honoring the standard.

Real-World Examples

https://huper.technology/llms.txt — Huper's own llms.txt covering products, programmatic SEO hubs, and entity grounding

https://huper.technology/llms-full.txt — Huper's long-form Markdown export of all key content

Anthropic, Vercel, Cloudflare, and many other AI-first companies have shipped llms.txt files

Many open-source documentation sites have adopted the standard

How Huper Implements This

Huper Technology ships both llms.txt and llms-full.txt at the root of huper.technology. The llms-full.txt file is auto-generated at build time from the data files behind the programmatic SEO surface, so it stays in sync with the content as the site evolves. GEO Audit (currently being built) measures whether customer sites have llms.txt in place as part of the broader GEO posture audit.
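Build-time generation of llms-full.txt from data files can be sketched as follows. This is an illustration under assumptions, not Huper's actual build step: the Page record, its fields, and the render_llms_full function are hypothetical, and a real pipeline would load the entries from the site's own data files.

```python
from typing import NamedTuple


class Page(NamedTuple):
    # Hypothetical shape of one entry in the site's data files.
    title: str
    url: str
    body_markdown: str


def render_llms_full(site_name, summary, pages):
    """Concatenate page bodies into a single Markdown document."""
    parts = [f"# {site_name}", "", f"> {summary}", ""]
    # Sorting by URL makes the output deterministic, so rebuilds only
    # change the file when the underlying content changes.
    for page in sorted(pages, key=lambda p: p.url):
        parts += [
            f"## {page.title}",
            "",
            f"Source: {page.url}",
            "",
            page.body_markdown.strip(),
            "",
        ]
    return "\n".join(parts).rstrip() + "\n"
```

A build script would write the returned string to llms-full.txt in the site's output directory. Because the file is derived from the same data as the pages themselves, it cannot drift out of date independently.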

Frequently Asked Questions

Do AI engines actually read llms.txt today?

Adoption is growing but uneven as of 2026. Some AI engines that browse the web at inference time may consume llms.txt; others may not yet. The standard is still relatively new. Publishing llms.txt is a low-cost positioning move that may yield material upside as vendor adoption grows.

Is llms.txt the same as a sitemap.xml?

No. sitemap.xml is a machine-readable list of URLs designed for traditional search-engine crawlers. llms.txt is a Markdown-formatted curated guide designed for AI-engine ingestion. The two complement each other; sites that care about both AI and traditional search ship both.

What's the difference between llms.txt and llms-full.txt?

llms.txt is a curated, link-list-style guide to the site's most important content (small, typically a few kilobytes). llms-full.txt is the full long-form content of the site exported as Markdown (large, typically tens to hundreds of kilobytes), suitable for direct AI ingestion without crawling. Many sites ship both.
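Since many sites ship both files, a presence check for the pair can be sketched with the Python standard library. The function names check_llms_files and http_status are illustrative assumptions, and the sketch treats only an HTTP 200 response as "present". Some servers reject HEAD requests, so a real audit might fall back to GET.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

FILES = ("llms.txt", "llms-full.txt")


def http_status(url, timeout=10.0):
    """Return the HTTP status of a HEAD request, or None if unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code  # e.g. 404 when the file is missing
    except OSError:
        return None  # DNS failure, refused connection, timeout


def check_llms_files(site_root, get_status=http_status):
    """Map each expected file name to True if it is served with HTTP 200."""
    base = site_root.rstrip("/") + "/"
    return {name: get_status(urljoin(base, name)) == 200 for name in FILES}
```

The get_status parameter is injectable so the check can be exercised without network access, for example by passing a function that looks up canned status codes.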
