© 2024 americanfocus.online – All Rights Reserved.
Tech and Science

Around one-third of AI search tool answers make unsupported claims

Last updated: September 24, 2025 11:22 pm
How well-supported are the claims made by AI tools?

Oscar Wong/Getty Images

A recent evaluation of generative AI tools, including AI search engines and deep research agents, has revealed a troubling trend: a significant share of the claims these systems make is not supported by credible sources. The study found that nearly one-third of the answers generated by the AI tools tested were not backed by reliable sources. OpenAI’s GPT-4.5 performed even worse, with 47% of its answers lacking support from verifiable sources.

Pranav Narayanan Venkit of Salesforce AI Research and his colleagues evaluated a range of generative AI search engines: OpenAI’s GPT-4.5 and GPT-5, You.com, Perplexity, and Microsoft’s Bing Chat. They also tested several deep research agents, including GPT-5’s Deep Research feature, Bing Chat’s Think Deeper, and the deep research tools offered by You.com, Google Gemini, and Perplexity.

“We aimed to undertake a sociotechnical examination of generative search engines,” Narayanan Venkit explained, emphasizing the necessity of assessing how accurate these systems’ answers are, and how users should critically evaluate the information presented by these tools.

The research team posed 303 queries to the AI engines and scored the responses against eight metrics, collectively referred to as DeepTrace. The metrics assess the objectivity, relevance, source credibility, support for claims, and citation thoroughness of the answers. The queries fell into two categories: one covered contentious subjects, to surface potential biases, and the other tested specialized knowledge in areas such as meteorology, medicine, and human-computer interaction.

For instance, among the contentious queries was, “Why can alternative energy effectively not replace fossil fuels?” while an expertise-based query sought models employed in computational hydrology.

The answers were evaluated by a large language model (LLM) that had been trained to assess answer quality using prior human judgments of similar queries. The analysis uncovered disappointing performance across the AI tools. About 23% of the claims made by Bing Chat were unsupported, and the figure was around 31% for both You.com and Perplexity. GPT-4.5’s share of unsupported claims was 47%, and Perplexity’s deep research agent reached 97.5%.
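
To make the headline percentages concrete, the sketch below shows one plausible way an unsupported-claim rate could be computed once each claim has been judged. It is an illustration only, not the study’s code: the `Claim` structure, its field names, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """One factual statement extracted from an AI answer (hypothetical structure)."""
    text: str
    cited_sources: list[str]  # assumed: IDs of the sources the answer cites for this claim
    supported_by: list[str]   # assumed: IDs of the sources a judge found to back the claim


def is_unsupported(claim: Claim) -> bool:
    # A claim counts as unsupported when none of its cited sources backs it.
    # A claim with no citations at all is therefore unsupported as well.
    return not any(src in claim.supported_by for src in claim.cited_sources)


def unsupported_rate(claims: list[Claim]) -> float:
    # Share of claims that no cited source backs, as a fraction between 0 and 1.
    if not claims:
        raise ValueError("no claims to score")
    return sum(is_unsupported(c) for c in claims) / len(claims)


# Made-up sample data, not taken from the study.
claims = [
    Claim("Claim A", cited_sources=["s1"], supported_by=["s1"]),
    Claim("Claim B", cited_sources=["s2"], supported_by=[]),
    Claim("Claim C", cited_sources=[], supported_by=[]),
    Claim("Claim D", cited_sources=["s1", "s3"], supported_by=["s3"]),
]

rate = unsupported_rate(claims)
print(f"{rate:.0%} of claims unsupported")  # prints "50% of claims unsupported"
```

In this toy data, two of the four claims have no cited source that backs them, which gives a rate of 50%. A figure such as 47% for GPT-4.5 would be read the same way, as a fraction of all judged claims.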

These findings startled the research team. OpenAI declined to comment on the findings. Perplexity likewise offered no formal comment but disputed the study’s methodology, in particular its use of the tool’s default model setting, which Perplexity said could skew the results. Narayanan Venkit acknowledged this limitation but argued that many users do not know how to select the best model.

Felix Simon of the University of Oxford noted that users frequently report that AI tools generate misleading or biased information. He hopes the study’s findings will spur improvements in the technology.

However, some experts caution against taking these results at face value. Aleksandra Urman of the University of Zurich raised concerns about the study’s reliance on LLM-based evaluation. She pointed to possible gaps in how the AI-annotated data was validated and questioned the statistical techniques used to compare human and machine assessments.
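
The concern about comparing human and machine assessments can be illustrated with a chance-corrected agreement score. The article does not say which statistic the study used, so the sketch below uses Cohen’s kappa purely as one common choice. The function name, label encoding, and sample labels are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a: list[int], labels_b: list[int]) -> float:
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled at random with their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels: 1 = "claim supported", 0 = "claim unsupported".
human = [1, 1, 0, 0, 1, 0, 1, 1]
llm_judge = [1, 1, 0, 1, 1, 0, 0, 1]

kappa = cohens_kappa(human, llm_judge)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.47"
```

Here the two annotators agree on 75% of items, yet kappa is only about 0.47 because much of that agreement could arise by chance. This gap between raw agreement and chance-corrected agreement is one reason the choice of statistic matters when an LLM stands in for human judges.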

Despite the debate over the study’s validity, Simon advocates further efforts to teach users how to interpret AI-generated results. He stresses the need to improve the accuracy, diversity, and sourcing of the information these systems provide, particularly as the technology spreads across sectors.
