A study by Search Atlas has found that six major large language model platforms show no leakage of sensitive user information across sessions, addressing widespread privacy concerns while highlighting the persistent problem of AI hallucination. The research, which evaluated OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode through controlled experiments simulating worst-case data exposure scenarios, offers significant reassurance for businesses and individuals concerned about confidentiality when using AI tools.
The study's methodology involved introducing unique, non-public facts to each model through direct prompts and simulated web search results, then testing whether those facts could be retrieved in subsequent interactions without search access. Across all platforms, researchers found no evidence that models retained or replayed the sensitive information: no model produced a correct answer after the initial exposure. The complete study details are available at https://searchatlas.com.
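The shape of this test can be illustrated with a small, runnable sketch. The mock model, the planted fact, and the session logic below are illustrative assumptions, not the study's actual tooling; the point is the structure of the probe: expose a unique fact in one session, then check whether a fresh session can recall it.

```python
# Illustrative sketch of the leakage test, using an in-memory mock model
# so the harness is runnable. A real harness would call each platform's
# chat API instead; MockModel stands in for a model with per-session
# context and no cross-session memory.

class MockModel:
    """Simulates per-session context with no memory across sessions."""
    def __init__(self):
        self.sessions = {}  # session id -> list of messages seen

    def query(self, prompt, session):
        history = self.sessions.setdefault(session, [])
        history.append(prompt)
        # Answer only from facts present in *this* session's history.
        for msg in history:
            if "Bluefin" in msg and "14 March" in msg:
                return "Project Bluefin launches on 14 March."
        return "I don't know."

SECRET = "Project Bluefin launches on 14 March."  # unique, non-public fact

model = MockModel()
model.query(f"For context: {SECRET}", session="A")             # expose fact
probe = model.query("When does Bluefin launch?", session="B")  # fresh session
leaked = "14 March" in probe
print(leaked)  # False: the fact does not cross session boundaries
```

A positive result in the real study would have looked like the planted fact reappearing, verbatim or paraphrased, in session "B"; across all six platforms, it did not.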
One key experiment revealed behavioral differences among platforms when handling unknown information. OpenAI, Perplexity, and Grok tended to acknowledge uncertainty, frequently answering "I don't know" when reliable information was lacking. In contrast, Gemini, Copilot, and Google AI Mode were more inclined to generate confident yet incorrect answers. Crucially, none of these incorrect responses matched the previously provided private information, demonstrating that hallucination—the fabrication of incorrect information—is distinct from data leakage.
The second experiment examined whether information retrieved via live web search would persist once search access was disabled. Researchers selected a real-world event occurring after all models' training cutoffs to ensure correct answers could only originate from live retrieval. When search was enabled, models answered most questions correctly, but once search was disabled, those correct answers largely disappeared. This indicates that models do not store or carry forward facts obtained during prior interactions through retrieval mechanisms.
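The second experiment's logic can be sketched in a few lines. The dictionary standing in for a search index and the `answer` function are illustrative placeholders, not the study's real tooling; the sketch shows why correct answers appear only while retrieval is available.

```python
# Illustrative sketch of the search on/off experiment: facts that depend
# on live retrieval vanish once search access is disabled, because the
# model has no mechanism to carry retrieved facts forward.

WEB = {"2024 eclipse path": "Totality crossed Texas through Maine."}

def answer(question, search_enabled):
    if search_enabled and question in WEB:
        return WEB[question]   # fact originates from live retrieval
    return "I don't know."     # no built-in memory of past lookups

with_search = answer("2024 eclipse path", search_enabled=True)
without_search = answer("2024 eclipse path", search_enabled=False)
print(with_search)     # prints the retrieved fact
print(without_search)  # prints "I don't know."
```

The study's choice of an event dated after every model's training cutoff plays the same role as the lookup table here: it guarantees a correct answer can only come from retrieval, so its disappearance isolates the absence of carried-forward memory.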
For businesses and privacy-conscious users, these findings suggest that sensitive information shared during a single AI session acts more like temporary "working memory" than like lasting memory absorbed into the model and exposable to other users. This addresses a primary concern in enterprise AI adoption—the fear that proprietary business strategies or private details might be leaked to other users through the AI system.
The study emphasizes that while data leakage concerns appear unfounded based on this research, hallucination remains a genuine challenge. Platforms exhibiting lower accuracy—Gemini, Copilot, and Google AI Mode—did not achieve this by repeating previously received information but by generating plausible-sounding yet incorrect answers. This distinction is crucial for risk assessment, as it shifts focus from privacy concerns to accuracy verification requirements.
For developers and AI builders, the research underscores the importance of retrieval-based systems like Retrieval-Augmented Generation (RAG), which connect models to live databases or search systems. These approaches remain the most dependable method for ensuring accurate responses for current events, proprietary information, or frequently updated data, as models lack built-in mechanisms to retain facts discovered during earlier interactions without such systems.
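A minimal RAG loop looks like the sketch below. The keyword-overlap retriever, the sample documents, and the prompt-assembly step are all simplifying assumptions; production systems typically use a vector store and embedding similarity, and the assembled prompt would be sent to a model API rather than returned directly.

```python
# Minimal RAG sketch: retrieve the most relevant documents, then place
# them in the prompt so the model answers from current sources rather
# than its frozen training data. The toy retriever below scores by
# shared words; real systems use embeddings and a vector database.

DOCS = [
    "Q3 revenue was $4.2M, up 12% year over year.",
    "The v2.1 release ships on 30 September.",
]

def retrieve(query, docs, k=1):
    # Toy relevance score: number of lowercase words shared with the query.
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def rag_answer(query, docs):
    context = "\n".join(retrieve(query, docs))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {query}"
    return prompt  # in practice: return llm(prompt)

print(rag_answer("When does the v2.1 release ship?", DOCS))
```

Because the context is re-retrieved on every query, updating the document store immediately updates the model's answers—exactly the property the study found missing from the models' own memory.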
The implications extend to researchers and fact-checkers, highlighting that LLMs cannot "learn" from corrections provided in previous conversations. If a model's underlying training data contains errors, it may keep repeating those mistakes unless the model is retrained or correct sources are provided anew. This limitation underscores the need for ongoing verification of AI-generated content, particularly in contexts where accuracy is paramount.
Manick Bhan, Founder of Search Atlas, noted that much of the concern around enterprise AI adoption stems from untested assumptions about data leakage, and that the study aimed to test those assumptions rigorously under controlled conditions. While AI is not risk-free—hallucination remains a documented issue—the specific fear that data may be leaked to another user was not supported by evidence on any platform evaluated.
These findings could accelerate AI adoption in sectors where data sensitivity has been a barrier, such as healthcare, finance, and legal services. Organizations can now engage with AI tools with greater confidence regarding data privacy, though they must maintain robust verification processes to address hallucination risks. The study provides a clearer framework for understanding actual versus perceived AI risks, enabling more informed decision-making about AI implementation strategies.


