A proposed standard file at a website's root that gives AI engines a structured Markdown guide to the site's content — analogous to robots.txt for search crawlers, but designed for LLM ingestion.
llms.txt is a proposed standard (introduced by Jeremy Howard in 2024) for a Markdown file at the root of a website that gives AI engines a structured, human-readable guide to the site's most important content. The format is intentionally simple: an H1 with the site name, a blockquote summary, optional body text that adds context, and H2 sections containing curated link lists. A common companion is llms-full.txt (or llms_full.txt), which contains the full long-form Markdown content for direct LLM ingestion, so the model does not need to crawl the site.
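The four-part layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation of the proposal: the `Link` and `Section` types, the `render_llms_txt` function, and the `example.com` data are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short description shown after the link


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, context: str,
                    sections: list[Section]) -> str:
    """Render the parts in order: H1, blockquote, optional body, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    if context:  # the body text is optional
        lines += [context, ""]
    for section in sections:
        lines += [f"## {section.heading}", ""]
        for link in section.links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
EXAMPLE = render_llms_txt(
    site_name="Example Co",
    summary="Example Co builds widgets for developers.",
    context="Start with the docs; the blog is secondary.",
    sections=[
        Section("Docs", [Link("Quickstart",
                              "https://example.com/docs/quickstart.md",
                              "Install and first run")]),
        Section("Blog", [Link("Launch post", "https://example.com/blog/launch.md")]),
    ],
)
print(EXAMPLE)
```

Keeping the site data in plain structures like these means the same records could feed both the curated file and any other export the site produces.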
The standard emerged because language models that process web content (either at training time or at inference time via tools like browsing) struggle with sites that rely heavily on JavaScript, bury their content behind navigation, or are otherwise designed for human browsing rather than machine ingestion. llms.txt gives sites a simple, vendor-neutral, Markdown-based way to expose their important content explicitly to AI engines. Whether AI engine vendors will adopt the standard at scale is still an open question as of 2026; adoption is growing among AI-first companies and developer tools.
For brands that care about how they're represented in AI engines, llms.txt is a low-cost, standards-aligned way to give AI engines a curated view of the site's most important content. Even before broad vendor adoption, having llms.txt and llms-full.txt in place positions the brand for the moment vendors begin honoring the standard.
https://huper.technology/llms.txt — Huper's own llms.txt covering products, programmatic SEO hubs, and entity grounding
https://huper.technology/llms-full.txt — Huper's long-form Markdown export of all key content
Anthropic, Vercel, Cloudflare, and many AI-first companies have shipped llms.txt files
Many open-source documentation sites have adopted the standard
Huper Technology ships both llms.txt and llms-full.txt at the root of huper.technology. The llms-full.txt is auto-generated at build time from the data files for the programmatic SEO surface, so it stays in sync with content as the site evolves. GEO Audit (in build) measures customer sites' llms.txt presence as part of the broader GEO posture audit.
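A build-time export of this kind might look roughly like the sketch below. It is not Huper's actual pipeline, which the text does not describe in detail. It assumes, purely for illustration, that each page lives in a JSON data file with `title`, `url`, and `body` keys; the function names and the directory layout are hypothetical.

```python
import json
from pathlib import Path


def build_llms_full(site_name: str, pages: list[dict]) -> str:
    """Concatenate page records into one long Markdown document."""
    parts = [f"# {site_name}", ""]
    # Sort by URL so the output is stable across builds and diffs stay small.
    for page in sorted(pages, key=lambda p: p["url"]):
        parts += [
            f"## {page['title']}", "",
            f"Source: {page['url']}", "",
            page["body"].strip(), "",
        ]
    return "\n".join(parts).rstrip() + "\n"


def write_llms_full(data_dir: Path, out_path: Path, site_name: str) -> int:
    """Load every *.json record in data_dir and write the export.

    Returns the number of pages written. The one-record-per-file layout
    is an assumption made for this sketch.
    """
    pages = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(data_dir.glob("*.json"))
    ]
    out_path.write_text(build_llms_full(site_name, pages), encoding="utf-8")
    return len(pages)
```

Calling something like `write_llms_full(Path("data"), Path("public/llms-full.txt"), "Example Co")` from the build script would regenerate the file on every deploy, which is what keeps it in sync with the underlying data.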
Adoption is growing but uneven as of 2026. Some AI engines that browse the web at inference time may consume llms.txt; others may not yet. The standard is still relatively new, so having llms.txt in place is a low-cost positioning move: it costs little and may yield material upside as vendor adoption grows.
No, llms.txt is not a substitute for sitemap.xml. sitemap.xml is a machine-readable list of URLs designed for traditional search-engine crawlers. llms.txt is a Markdown-formatted curated guide designed for AI-engine ingestion. The two complement each other; sites that care about both AI and traditional search ship both.
llms.txt is a curated, link-list-style guide to the site's most important content (small, ~kilobytes). llms-full.txt is the full long-form content of the site exported as Markdown (large, ~tens to hundreds of kilobytes), suitable for direct AI ingestion without crawling. Many sites ship both.
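Because llms.txt is a small link-list file, a consumer can read it with very little parsing. The sketch below shows one way a client might pull out the title, summary, and per-section links. It is a simplified reading of the format described above, not an official parser, and `parse_llms_txt` and the sample text are hypothetical.

```python
import re

# Matches Markdown list items of the form "- [Title](url)" with optional trailing text.
LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)")


def parse_llms_txt(text: str) -> dict:
    """Extract the H1 title, blockquote summary, and links grouped by H2 heading."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section being read, if any
    for line in text.splitlines():
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append((match.group(1), match.group(2)))
    return result


# Hypothetical sample file, for illustration only.
SAMPLE = """# Example Co

> Widgets for developers.

## Docs

- [Quickstart](https://example.com/quickstart.md): First run
- [API](https://example.com/api.md)
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"], len(parsed["sections"]["Docs"]))
```

An llms-full.txt file, by contrast, would normally be passed to the model as-is rather than parsed for links, since it already contains the content itself.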