Around one-third of AI search tool answers make unsupported claims

Last updated: September 24, 2025 11:22 pm


How well-supported are the claims made by AI tools?

Oscar Wong/Getty Images

Recent assessments of generative AI tools, including AI search engines and deep research agents, have revealed a troubling trend: a significant share of the claims these systems make is not backed by reliable sources. One detailed study found that around one-third of the answers generated by various AI tools lack reliable citations. OpenAI’s GPT-4.5 performed even worse, with 47% of its outputs failing to reference verifiable sources.

Pranav Narayanan Venkit from Salesforce AI Research and his team conducted an extensive evaluation of generative AI search engines. The evaluation covered OpenAI’s GPT-4.5 and GPT-5, You.com, Perplexity, and Microsoft’s Bing Chat, as well as several deep research agents: GPT-5’s Deep Research feature, Bing Chat’s Think Deeper, and deep research tools from You.com, Google Gemini, and Perplexity.

“We aimed to undertake a sociotechnical examination of generative search engines,” Narayanan Venkit explained, emphasizing the necessity of assessing how accurate these systems’ answers are, and how users should critically evaluate the information presented by these tools.

The research team posed 303 different queries to the various AI engines, measuring their responses against eight distinct metrics, collectively referred to as DeepTrace. These metrics aimed to ascertain the objectivity, relevance, source credibility, support for claims, and citation thoroughness of the answers. Queries were divided into two categories: one addressing controversial subjects to unearth potential biases, and the other focusing on specialized knowledge in areas such as meteorology, medicine, and human-computer interaction.


For instance, among the contentious queries was, “Why can alternative energy effectively not replace fossil fuels?” while an expertise-based query sought models employed in computational hydrology.

The answers were evaluated by a large language model (LLM) trained to assess quality based on prior human judgments of similar queries. Performance was disappointing across the tools analyzed: 23% of Bing Chat’s claims were unsupported, with You.com and Perplexity at around 31%. GPT-4.5’s share of unsupported claims reached 47%, and Perplexity’s deep research agent hit 97.5%.
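To make the headline figures concrete, here is a minimal sketch of how an unsupported-claim rate could be computed once each claim in an answer has been labeled as supported or not. It is not the study's DeepTrace implementation: the `Claim` structure, the `unsupported_rate` function, and the sample data are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Claim:
    """One factual statement extracted from an answer (hypothetical structure)."""
    text: str
    supporting_sources: list[str]  # IDs of cited sources judged to back the claim


def unsupported_rate(claims: list[Claim]) -> float:
    """Return the fraction of claims that have no supporting source."""
    if not claims:
        raise ValueError("need at least one claim")
    unsupported = sum(1 for c in claims if not c.supporting_sources)
    return unsupported / len(claims)


# Made-up example data, not taken from the study.
answer = [
    Claim("Claim A", ["src1"]),
    Claim("Claim B", []),
    Claim("Claim C", ["src2", "src3"]),
    Claim("Claim D", []),
]
rate = unsupported_rate(answer)
print(f"{rate:.0%} of claims unsupported")  # prints "50% of claims unsupported"
```

In this toy example two of four claims have no supporting source, so the rate is 50%. A figure such as 47% for a tool would correspond to this kind of ratio aggregated over all claims in its answers, assuming the labeling step (done by an LLM judge in the study) has already happened.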

These findings startled the research team. Neither OpenAI nor Perplexity provided an on-the-record comment on the findings, though Perplexity disputed the study’s methodology, particularly its use of the default model setting, which the company said could skew results. Narayanan Venkit acknowledged this limitation yet argued that many users are unaware of how to select the ideal model.

Felix Simon from the University of Oxford noted that users commonly report AI systems generating misleading or biased information. He hopes the study’s findings will spur improvements in the technology.

Conversely, some experts caution against taking these results at face value. Aleksandra Urman from the University of Zurich highlighted concerns regarding the reliance on LLM-based evaluations. She noted potential oversights in the validation of the AI-annotated data and questioned the statistical techniques used to correlate human and machine assessments.

Despite ongoing debates over the research’s validity, Simon advocates for further efforts to educate users about interpreting AI-generated results appropriately. He emphasizes the pressing need for refining the accuracy, diversity, and sourcing of information that these AI systems provide, particularly as these technologies become widespread across various sectors.
