DataGrail report finds your vendor may be sending data to AI models you never approved

Contents

How researchers uncovered the growing gap between AI vendor contracts and reality
One-third of AI systems also process sensitive data, and the true number is likely higher
Why consent management became 2025’s most punished privacy failure
Data deletion requests surge 567% as the cost of manual processing hits $1.5 million a year
State regulators issued $3.4 billion in privacy fines last year, and both parties want more
Privacy teams are losing a third of their staff just as AI governance demands explode
Can a vendor-produced report be trusted to diagnose the problems that vendor sells solutions for?
The next frontier: agentic AI could spread unvetted data across entire organizations autonomously

The data processing agreement (DPA), a fundamental contract that companies rely on to evaluate how vendors handle personal data, is losing its reliability. This alarming conclusion comes from DataGrail’s Privacy and AI Trends Report 2026, released today.

DataGrail, a San Francisco-based privacy platform, analyzed 2,400 popular business software providers and discovered that 63.6% of vendors advertising AI capabilities fail to disclose a third-party AI subprocessor in their legal documentation. This suggests that many companies purchasing AI-enabled software might inadvertently expose their customers’ data to unreviewed and unknown AI models and pipelines.

“All software vendors are trying to move to become AI vendors, which makes sense, but the technologies are moving faster than AI governance can actually keep up,” said DataGrail co-founder and CEO Daniel Barber in an exclusive interview with VentureBeat prior to the report’s release. “The DPA should be the reliable document that teams use to evaluate AI risk, but based on that number, that’s not enough in 2026.”

The findings land in an enterprise environment where shadow AI is already expensive: organizations with significant shadow AI activity face average breach costs of $4.63 million, $670,000 more than those with little or no shadow AI, according to IBM’s 2025 Cost of a Data Breach Report. U.S. states also issued $3.425 billion in privacy-related fines in 2025, more than the previous five years combined, and Gartner predicts the trend will continue through 2028.

How researchers uncovered the growing gap between AI vendor contracts and reality

DataGrail arrived at the 63.6% figure through more than contract analysis. For each of the 2,400 tracked vendors, the research team cross-referenced DPA disclosures with product documentation, GitHub environments, API connections, and marketing materials.

Barber explained the process to VentureBeat: “We looked at the DPA as the baseline, but then what we also looked at is the GitHub environment, the API connections that a particular vendor has, the product documentation, the marketing documentation, and triangulate that information to discern — okay, so the DPA document says use OpenAI, but actually you’ve got these three AI subprocessors over here in your product documentation outlining features and functionality, but that is not reflected in your DPA.”
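The triangulation Barber describes amounts to a set comparison: whatever appears in the other evidence sources but not in the DPA is a candidate disclosure gap. The following is a minimal sketch of that idea, not DataGrail's actual method. The class, the field names, the source labels, and the placeholder provider names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VendorEvidence:
    """Evidence gathered for one vendor (all field names are hypothetical)."""
    name: str
    dpa_subprocessors: set[str]
    # Maps an evidence source label to the AI subprocessors seen there.
    observed: dict[str, set[str]] = field(default_factory=dict)


def normalize(name: str) -> str:
    """Fold case and whitespace so 'OpenAI' and 'openai' compare equal."""
    return name.strip().lower()


def undisclosed_subprocessors(vendor: VendorEvidence) -> dict[str, list[str]]:
    """Map each subprocessor missing from the DPA to the sources that mention it."""
    disclosed = {normalize(n) for n in vendor.dpa_subprocessors}
    gaps: dict[str, set[str]] = {}
    for source, names in vendor.observed.items():
        for raw in names:
            key = normalize(raw)
            if key not in disclosed:
                gaps.setdefault(key, set()).add(source)
    return {key: sorted(sources) for key, sources in sorted(gaps.items())}


# Illustrative data only: a DPA that lists one provider while other
# sources point to three more.
vendor = VendorEvidence(
    name="ExampleRecruitingTool",
    dpa_subprocessors={"OpenAI"},
    observed={
        "product_docs": {"OpenAI", "Provider B", "Provider C"},
        "api_connections": {"openai", "Provider B"},
        "github": {"Provider D"},
    },
)

gaps = undisclosed_subprocessors(vendor)
for subprocessor, sources in gaps.items():
    print(f"{subprocessor}: seen in {', '.join(sources)} but not in the DPA")
```

Recording which sources mention each undisclosed provider, rather than only the provider name, leaves a reviewer with something concrete to bring back to the vendor.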

When asked about his confidence in these gaps indicating actual shadow AI risk rather than proprietary technology use, Barber was clear: “Very confident, because we looked at the sample of the 2,400 systems, and we spent a substantial amount of time actually looking at product documentation, GitHub environments, looking at actual API connections, because we integrate with these systems as well, so we know how they process personal information. It is from primary research.”

This disclosure gap undermines the chain of trust that privacy programs depend on. Barber illustrated the problem with a scenario: a company adopts an AI recruiting tool whose DPA lists Claude as its foundation model. The company completes a security review of Anthropic’s AI, then learns that the tool also uses OpenAI and Gemini, models it never evaluated.

These undisclosed models process resumes and make automated hiring decisions, potentially exposing sensitive personal data like home addresses and financial details to AI systems not vetted by the company. This could lead to FTC regulation violations regarding automated employment decision-making. “How those vendors are evaluating and performing that automated decision making could be really disastrous for a business,” Barber noted.

One-third of AI systems also process sensitive data, and the true number is likely higher

The disclosure gap is alarming on its own, but DataGrail’s report reveals a more serious issue: 32.8% of systems that disclose AI capabilities also engage in at least one other high-risk activity, such as processing sensitive personal information or enabling automated decision-making. Among systems with self-reported risk factors, 47.1% handle personal data, 20.7% could facilitate automated decision-making, 16.5% manage sensitive categories like health or financial data, and 7.5% deal with biometric data.

The report suggests these figures likely underestimate actual exposure, as they only account for formally disclosed activities. Vendors might underreport access to personal data and may not foresee riskier applications of their tools, even with good intentions.

This has immediate regulatory consequences. The CCPA’s new risk assessment requirement, effective January 1, 2026, mandates that businesses conduct and document risk assessments for activities posing significant privacy risks, with submission to CalPrivacy required by April 2028, under penalty of perjury.

Activities like processing sensitive personal information with AI or using AI for automated decision-making are precisely what trigger this obligation. Citing S&P Global research, the report notes that 42% of companies abandoned AI projects in 2025, with data privacy concerns named as a primary obstacle. Barber argues that privacy teams that engage early with AI projects can prevent that waste by implementing safeguards before launch, with AI risk assessments as the starting point.

Why consent management became 2025’s most punished privacy failure

Although shadow AI is a newer threat, traditional privacy issues have intensified, as seen with consent management becoming the most enforced privacy topic of 2025. California reported $4.3 million in CCPA consent settlements, and there were over 1,400 class action wiretapping suits, driven by private firms investigating tracking pixels and session replay software.

Despite this enforcement surge, 63% of the 5,000 websites audited by DataGrail still fail to comply with universal opt-out mechanisms like the Global Privacy Control signal. This figure, while an improvement from 75% non-compliance in 2023, shows slow progress relative to the rapid increase in enforcement.
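For readers unfamiliar with the mechanism, the Global Privacy Control specification has a participating browser send the request header Sec-GPC with the value 1. The sketch below shows one way a server might honor it by withholding non-essential trackers. It is an illustration under stated assumptions, not legal guidance: the tracker inventory, the purpose labels, and the choice of which purposes to suppress are all hypothetical, and consent-banner choices are ignored for brevity.

```python
# Purposes this sketch suppresses on opt-out. Which purposes legally count
# is a compliance question; this set is an assumption for illustration only.
OPT_OUT_PURPOSES = {"analytics", "targeted_advertising"}


def gpc_opt_out(headers: dict[str, str]) -> bool:
    """Return True when the request carries the Sec-GPC header with value 1."""
    for name, value in headers.items():
        if name.lower() == "sec-gpc":  # HTTP header names are case-insensitive
            return value.strip() == "1"
    return False


def trackers_to_load(headers: dict[str, str], trackers: list[dict[str, str]]) -> list[str]:
    """Names of trackers that may load for this request."""
    opted_out = gpc_opt_out(headers)
    return [
        t["name"]
        for t in trackers
        if not (opted_out and t["purpose"] in OPT_OUT_PURPOSES)
    ]


# Hypothetical tracker inventory for one page.
TRACKERS = [
    {"name": "session_cookie", "purpose": "essential"},
    {"name": "web_analytics", "purpose": "analytics"},
    {"name": "ad_pixel", "purpose": "targeted_advertising"},
]

print(trackers_to_load({"Sec-GPC": "1"}, TRACKERS))  # signal present
print(trackers_to_load({}, TRACKERS))                # no signal
```

Making the decision server-side, before any tag is emitted, is one way to avoid the failure the audit measures: trackers that still fire after the signal has been sent.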

Barber cited the case of Todd Snyder, a menswear retailer fined $345,178 in May 2025 by the California Privacy Protection Agency, as an example of increased enforcement beyond big tech. “This is a business that has two or three stores across the U.S. They have 300 employees,” he said. “They run tight margins because they’re a consumer menswear clothing store.”

The California Attorney General also reached a $2.75 million settlement with Disney for not honoring opt-out signals, while the California Privacy Protection Agency took action against PlayOn Sports and Ford. This demonstrates the range and depth of regulatory activity. The report found that 27.1% of trackers firing after a GPC signal come from Google Analytics, while 43.8% are for targeted advertising via platforms like Meta and Microsoft.

Among users engaging with consent banners, 48.3% click “Accept all,” while only 12.4% choose “Essential only” and 2.3% customize preferences. A full 37% leave the banner without making a choice. The takeaway: fewer than 15% of users consciously opt out of tracking, meaning consent banners pose relatively low business risk when properly configured but significant regulatory risk when they are not.

Data deletion requests surge 567% as the cost of manual processing hits $1.5 million a year

Data subject request (DSR) volume reached a record high for the fifth consecutive year. Deletion requests have surged 567% since 2021 and now account for 87% of all data subject requests. Access requests, by contrast, have gradually declined as consumers skip the step of viewing their data and go straight to deletion.

The cost is substantial. For a mid-sized organization with 5 million annual web visitors, the report estimates manual DSR management costs around $1.5 million annually, based on Gartner’s estimated cost of $1,524 per manual DSR. The average cost has risen from $238,000 in 2021 to $1.51 million in 2025, making manual processing not just inefficient but “irresponsible,” as the report argues.
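As a rough check on how those figures relate, the sketch below divides the annual cost by the per-request cost to get an implied request count. It assumes the same $1,524 unit cost applies to both years, which the article does not confirm, so it should not be read as the report's methodology. The function names are illustrative.

```python
COST_PER_MANUAL_DSR = 1_524  # Gartner estimate cited in the report, in USD


def implied_request_count(annual_cost: float, unit_cost: float = COST_PER_MANUAL_DSR) -> float:
    """Requests per year implied by an annual cost at a fixed per-request cost."""
    return annual_cost / unit_cost


def percent_increase(old: float, new: float) -> float:
    """Percentage growth from old to new."""
    return (new - old) / old * 100


cost_2021 = 238_000
cost_2025 = 1_510_000

requests_2021 = implied_request_count(cost_2021)
requests_2025 = implied_request_count(cost_2025)
cost_growth = percent_increase(cost_2021, cost_2025)

print(f"Implied requests in 2021: {requests_2021:.0f}")
print(f"Implied requests in 2025: {requests_2025:.0f}")
print(f"Growth in annual cost: {cost_growth:.0f}%")
```

Under that assumption, the 2025 figure works out to a little under 1,000 manual requests a year for the mid-sized organization the report describes. The cost growth computed here is a different measure from the 567% rise in deletion requests, so the two percentages are not expected to match.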

Barber stressed that these figures reflect verified human requests, excluding bot and spam traffic, and that data broker scenarios, which will see a massive influx of requests under California’s Delete Act, are reported separately. “That is a natural increase,” Barber told VentureBeat. “If you’ve now got 20-plus U.S. states with privacy regulation, it’s unlikely that we see a federal bill passed, even though we’ve seen one proposed. And while we don’t see federal awareness and regulation, we do see at the state level over 20 states, and that may actually increase awareness for the consumer even more.”

He added an important point on business response: “99% of DataGrail customers do process that deletion” even for residents of states without privacy laws, “simply because it’s too hard at this point. Discerning and even communicating to the person, ‘Hey, you live in Montana, sorry, you’re just in an unfortunate state without regulation’ — you just can’t do that.” Data brokers saw a 398% increase in deletion requests compared to 2024, averaging over 2,000 handled per month.

State regulators issued $3.4 billion in privacy fines last year, and both parties want more

The regulatory landscape has shifted from education to punishment. Nearly half of U.S. states now have a comprehensive privacy law in effect, alongside over 160 AI-specific laws. In 2025 alone, state legislatures enacted 145 AI-related laws, with another thousand introduced or reworked. According to Gartner, over 50% of the U.S. population is now covered by a comprehensive state privacy law, with 24 additional states expected to pass laws within five years. States have also begun pooling resources, with ten forming the Consortium of Privacy Regulators last year, pledging to coordinate investigations across state lines.

Barber noted that privacy enforcement is largely bipartisan, shielding it from the political winds of the current administration. “Privacy overall is a pretty bipartisan issue,” he said. “It’s easy to pass privacy regulation because constituents somewhat expect privacy in their day-to-day living. If you were flying on an airline and they said, ‘Okay, this seat, if you want your privacy, you’re going to have to pay $6 more,’ you’re like, ‘I’m going to go to another airline.’ It’s an expected part of a transaction at this stage.”

He predicted other states will follow California’s enforcement model. “California has their enforcement division, CalPrivacy. That group has one task: to ensure enforcement of privacy throughout businesses. Is it likely that we see other states get funding and support to fund these types of groups? Highly likely. The enforcement fines — the actual payments — go back to us as constituents. That type of model, you could imagine, being very popular across the country.”

Privacy teams are losing a third of their staff just as AI governance demands explode

One of the report’s most puzzling findings is that privacy teams lost up to 33% of their workforce last year, even as their workloads increased across every tracked metric. Cisco data in the report shows that 90% of privacy programs expanded in 2025 because of AI, while only 12% of AI governance programs are deemed mature. Meanwhile, 74% of privacy teams planned to use AI for privacy-related tasks in 2026, according to ISACA’s State of Privacy 2026 survey.

Barber attributes this to a broader macroeconomic trend rather than a lack of value placed on privacy. “It’s actually a fascinating macro trend, and probably one you’ve seen across all functions,” he said. “Businesses are driving more efficiency in all parts of the business. Privacy teams, five years ago, we would have said, ‘Well, there’s more regulation, the volume of deletions have increased 500%, we need more humans.’ It’s become clear that AI provides capabilities that can do the work for privacy individuals.” He compared this to design teams: “They might have had a design team of 20 people five years ago, now they have a design team of five, courtesy of Claude Design or Gamma or whatever the tool may be. I think that’s what we’re seeing here as well.”

DataGrail has introduced its AI agent, Vera, launched in March 2026, as part of the solution. Vera is integrated into DataGrail’s existing platform to automate privacy workflows across jurisdictions. DataGrail’s platform was also named the first production-ready Model Context Protocol (MCP) server for privacy; it implements the standard developed by Anthropic so that customers can use DataGrail tools from applications like Slack, email, or Claude.

Can a vendor-produced report be trusted to diagnose the problems that vendor sells solutions for?

DataGrail directly benefits from the challenges its report identifies. The company has raised $84.2 million over five funding rounds, the largest a $45 million Series C in October 2022 led by Third Point Ventures, and its platform addresses precisely the data mapping, DSR automation, consent management, and risk assessment issues highlighted in the report.

Barber acknowledged this potential conflict of interest. “It’s a fair statement,” he said when asked about possible skepticism. “DataGrail doesn’t provide a service to keep DPAs up to date — that’s on a business to evaluate how they work with a vendor. What DataGrail does help to do is assessments, and automate those assessments using our AI agent, Vera, to assess that increased risk.”

He argued that a structural reading of the data is more neutral: “This is evidence to show that the DPA unfortunately is not keeping up with technology and the speed at which technology is innovating. That’s both exciting but also we need to accept that’s where we are.” The methodology adds some credibility to this claim.

The report uses anonymized privacy operations data from hundreds of enterprise customers, the 2,400-system AI tracking database, and the 5,000-website consent audit — sources at least partially independent of DataGrail’s commercial interests. Furthermore, the findings on enforcement spending, DSR volume trends, and regulatory expansion align with independently published data from Gartner, Cisco, and state enforcement agencies.

The next frontier: agentic AI could spread unvetted data across entire organizations autonomously

When asked about the most important trend not included in the report, Barber pointed to a next-generation risk that extends the shadow AI problem into more dangerous territory: agentic AI workflows. Gartner predicts 40% of enterprise applications will feature task-specific AI agents by end of 2026, up from under 5% in 2025 — a pace of adoption that could rapidly outstrip the governance mechanisms companies are only now beginning to build.

“Where we go next with this research is agent processing,” Barber said. “How are agents then leveraging that information? Because the downstream ramifications would be far more concerning for a business. One particular system is using shadow AI, the business has no idea that that’s happening, and then an agent is propagating that information across a whole bunch of other places. The guardrails of you and I checking the system will be lower than maybe what we’ve seen in the past with agentic workflows.”

He described the difference in human terms: “The identity of an agent is different than a human. There is thought that goes into what am I about to use here, where did this information come from, how was it collected — that may not be considered in the same way for an agentic workflow. We need to solve the root of the problem, which is how are these businesses leveraging AI subprocessors. But this quickly becomes an agentic problem that could be far more concerning.”

For enterprise privacy and security leaders reviewing this report today, the unsettling truth is that the foundational documents and processes they have depended on to manage vendor risk are deteriorating in real time. The DPA is losing its reliability as an instrument. State enforcement is ramping up on a bipartisan basis. Privacy teams are shrinking even as their responsibilities grow. The next wave of agentic AI systems could further distribute unchecked data processing across networks of autonomous agents operating with even less human oversight than current tools.

Five years ago, when DataGrail released its first trends report, deletion requests were a fraction of today’s volume, only a few states had privacy laws, and “shadow AI” was an unknown term. Each year since, the report has warned of worsening problems, and the data has consistently supported this. The companies that will endure the upcoming challenges will not be those with the largest compliance teams or the most extensive policy binders. They will be the ones adapting to a disorienting new reality: by 2026, the AI processing your customers’ data may not match the contracts you signed — and by 2027, autonomous agents might be making decisions about that data.
