Why does AI show outdated information about your business?
And how that bug is changing the world as we know it.
AI search shows outdated information about your business because every AI chatbot pulls from a search index, usually a third party's, with its own crawl schedule. The index lags the live web by days or weeks. The gap is widening, not closing, because the web AI is reading from is increasingly written by other AI systems reading older versions of the web.
I made a fix on a client site. I verified it. I requested a reindex. I waited. Weeks later, Claude was still telling me to fix things I had already fixed.
Not a day. Not two. Weeks.
When I finally pushed back, Claude told me its search results were running through a third-party index with its own refresh cadence on top of Google’s. It told me the lag was its problem, not mine. It told me to skip the AI and verify at google.com directly.
That answer is the architecture of every AI search product, briefly visible.
How AI search actually works
When an AI chatbot answers a question by searching the web, it isn’t searching the web. It’s searching one specific search engine’s index of the web. The UI never says so. The distinction is invisible. It also determines everything the AI can and cannot see.
Claude reads through Brave Search. Anthropic confirmed it in March 2025 by adding Brave to its subprocessor list, and software engineer Simon Willison verified it independently by running matching queries through Claude and Brave and comparing the citations.
ChatGPT reads through Bing. OpenAI’s VP of Engineering said as much in a public AMA: “we use a set of services and Bing is an important one.” Independent log analysis shows Bing handling most of ChatGPT’s web retrieval, supplemented by OpenAI’s own crawler, OAI-SearchBot.
Gemini and AI Overviews read through Google’s index. The same pipeline that powers Search. Google documents the API in Gemini’s developer documentation: a tool called “Grounding with Google Search” returns search results as structured citations that the model writes around.
Perplexity is the messy one. It maintains its own crawler, PerplexityBot, with its own index. It also uses third-party search APIs.
Three separate webs. Plus Perplexity standing somewhere between them.
When something on a page changes, the relevant search engine’s crawler has to re-crawl the page on its own schedule before any of these AIs see the change. Crawl frequency varies by site authority and traffic. Low-traffic pages wait longer at every engine. The AI inherits whatever lag its underlying search provider has, and the user sees the lag without ever seeing the provider.
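The dependency described above can be sketched in a few lines of Python. This is a minimal illustration, not a measurement tool: the assistant-to-index mapping comes from this article, while the function names and every date are hypothetical. In practice you would estimate the last crawl date yourself, for example from each engine's cached snippet or webmaster tools.

```python
from datetime import date

# Which index each assistant reads through, as described in this article.
# Perplexity is left out because it mixes its own index with third-party APIs.
INDEX_BEHIND = {
    "Claude": "Brave Search",
    "ChatGPT": "Bing",
    "Gemini": "Google",
}


def change_visible(assistant: str, changed_on: date, last_crawled: dict[str, date]) -> bool:
    """True if the assistant's index re-crawled the page on or after the change."""
    return last_crawled[INDEX_BEHIND[assistant]] >= changed_on


def lag_days(assistant: str, today: date, last_crawled: dict[str, date]) -> int:
    """Days since the assistant's index last saw the page."""
    return (today - last_crawled[INDEX_BEHIND[assistant]]).days


# Hypothetical dates, for illustration only.
changed_on = date(2025, 6, 1)
today = date(2025, 6, 15)
last_crawled = {
    "Google": date(2025, 6, 3),
    "Bing": date(2025, 5, 20),
    "Brave Search": date(2025, 5, 12),
}

for assistant in INDEX_BEHIND:
    seen = change_visible(assistant, changed_on, last_crawled)
    print(f"{assistant}: sees the fix = {seen}, index age = {lag_days(assistant, today, last_crawled)} days")
```

With these made-up dates, only the assistant reading through Google sees the fix. The other two are still describing the page as it stood before the change, which is the situation in the opening anecdote.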
That’s the architecture problem. The deeper problem is what the architecture is now reading from.
Why it’s getting worse, not better
In April 2025, Ahrefs analyzed 900,000 newly created web pages and found that 74.2% of them contained AI-generated content. Roughly three out of four. That’s not a forecast. That’s what newly published pages actually looked like a year ago. The percentage is likely higher now.
The training data for the next generation of these models is the same pool. The next generation reads a pool more synthetic than this generation read. The generation after that reads a pool more synthetic still. The arrow points one direction.
This is not just a cache lag problem. This is a representation system feeding on itself.
AI eats stale and produces branches of stale. The next AI eats those branches and produces branches of branches. The image of the image of the image. Each generation sees less of the live web and more of the previous generation’s interpretation of the live web.
Baudrillard called this the procession of simulacra: the image leaves the thing behind and starts standing in for it. The endpoint is a representation that bears no relation to the thing it once represented. We have a working example now, deployed at scale, with marketing budgets pointed at it.
What this means for your marketing budget
Marketing spend is shifting from search ads to AI visibility. Buyers are asking ChatGPT and Claude and Perplexity instead of Google. The answers they get are produced by a model reading a third-party index of a web where roughly three-quarters of new pages contain text written by other models reading older third-party indexes. The further along the chain you go, the further the answer drifts from anything anyone could verify on a live page.
What this means in practice: the AI’s version of your business is starting to substitute for your business in the moments when buyers decide.
How to verify what’s actually live about your business
Skip the AI when you want to know what’s currently visible to a buyer. Go to google.com, bing.com, and search.brave.com directly. Run the same query at each. Each engine shows you exactly what it currently has. The AI is one cache layer downstream of that.
That’s the page-level fix. It works for verifying that a specific change has propagated and for catching directories or listings that have drifted.
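If you run this check often, the three searches can be generated from one query. The sketch below only builds the URLs to open in a browser. It does not fetch or scrape results. The `q` parameter and the `/search` paths are the commonly used public URL patterns for these engines rather than anything stated in this article, so treat them as assumptions, and the business name is hypothetical.

```python
from urllib.parse import urlencode

# Public search-results pages for the three engines named above.
# The "/search?q=" pattern is an assumption based on common usage.
ENGINES = {
    "Google": "https://www.google.com/search",
    "Bing": "https://www.bing.com/search",
    "Brave": "https://search.brave.com/search",
}


def verification_urls(query: str) -> dict[str, str]:
    """Build one search URL per engine for the same query."""
    encoded = urlencode({"q": query})
    return {name: f"{base}?{encoded}" for name, base in ENGINES.items()}


# Hypothetical business query, for illustration only.
for name, url in verification_urls("Example Plumbing Austin hours").items():
    print(f"{name}: {url}")
```

Open each URL and compare the snippet for your page against what is live. Whichever engine still shows the old version tells you which assistants are likely still repeating it.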
The system-level problem has no page-level fix. The gap between the cache and the live page is where your customer is making decisions today. The gap between the simulacrum and the live world is where your customer will be making decisions soon. At that point the live page is no longer the thing. The image of the image of the image is.
That’s how the world as we know it changes.