In a first-of-its-kind experiment, a scientific conference opened paper submissions to all scientific disciplines, with one unusual stipulation: most of the work had to be done by AI. Dubbed Agents4Science 2025, the virtual event on October 22 showcased the capabilities of artificial intelligence agents, systems that combine large language models with other tools to tackle complex tasks.
AI agents took the lead at every stage, from developing hypotheses to analyzing data and delivering the first round of peer review, while human reviewers evaluated the top submissions. In the end, 48 of 314 papers were accepted, and each had to describe how human researchers and AI collaborated during the research and writing.
“We’re witnessing a fascinating shift in paradigms,” remarked James Zou, a Stanford University computer scientist and co-organizer of the conference. “Researchers are beginning to consider AI as a collaborative scientist.”
Currently, most scientific journals and conferences prohibit AI coauthors and bar peer reviewers from relying on AI tools. These rules are meant to guard against hallucinations and other problems that come with using AI, but they also make it hard to assess how well AI actually performs in scientific research. The goal of Agents4Science, which Zou described as an experiment, was to explore that question, with all submissions available for public examination.
During the virtual conference, human participants presented AI-assisted research in fields including economics, biology, and engineering. Min Min Fong, an economist at UC Berkeley, worked with AI to analyze car-towing data from San Francisco, finding that waiving high towing fees helped low-income people keep their cars.
“AI significantly facilitated our computational tasks,” Fong said, though she cautioned that “it’s essential to approach AI usage with caution.”
For instance, the AI repeatedly cited the wrong date for when the towing fee waiver took effect, and Fong had to check the original source to correct the mistake. “The essential scientific work remains driven by humans,” she said.
Risa Wechsler, a computational astrophysicist at Stanford and a reviewer for the submissions, came away with mixed impressions. The papers she read were technically valid, she said, but “they lacked both interest and significance.” She expressed enthusiasm about AI’s potential in research but was skeptical that current AI agents can “formulate robust scientific inquiries.” She also noted that AI’s technical fluency could sometimes mask poor scientific reasoning.
Nonetheless, there were promising signs for AI’s future in science. Silvia Terragni, a machine learning engineer at Upwork in San Francisco, said she gave ChatGPT context about her company’s challenges and asked it to propose papers. “One of them ended up being a winner,” she noted: the resulting paper, on using AI reasoning in job marketplaces, was recognized among the top three submissions at the conference. “I believe AI can indeed generate innovative ideas,” she said.

