OpenAI Accused of Accidentally Deleting Data Relevant to Lawsuit Filed by The New York Times and Daily News
The New York Times and Daily News are suing OpenAI for allegedly using their content without permission to train its AI models. In a recent development, lawyers for the publishers claim that OpenAI engineers inadvertently deleted data relevant to the case, complicating the ongoing litigation.
Earlier this year, OpenAI agreed to provide virtual machines so that legal teams for The Times and Daily News could search for their copyrighted material within OpenAI’s training datasets. However, a recent letter filed in the U.S. District Court for the Southern District of New York revealed that on November 14, OpenAI erased all of the publishers’ search data stored on one of the virtual machines, setting back the publishers’ search effort.
Efforts to recover the deleted data were partially successful, but the loss of folder structure and file names rendered the recovered data unusable for identifying where the plaintiffs’ content may have been used in OpenAI’s models. As a result, The Times and Daily News must now recreate their work from scratch, at a significant cost in time and resources.
While the plaintiffs’ counsel acknowledges that there is no evidence the deletion was intentional, they argue that OpenAI should take responsibility for searching its own datasets, using its own tools, for potentially infringing content.
In response to the allegations, OpenAI’s attorneys denied any deliberate deletion of evidence and instead blamed a system misconfiguration. They asserted that implementing changes requested by the plaintiffs resulted in the removal of folder structure and some file names on a temporary cache drive, with no actual loss of files.
OpenAI has maintained that using publicly available data to train AI models, including content from publishers like The Times and Daily News, falls under fair use. Even so, OpenAI has entered into licensing agreements with several publishers, such as the Associated Press and Financial Times, which suggests a shift towards a more collaborative approach to accessing copyrighted material for AI training.
While OpenAI has not confirmed or denied training its AI systems on specific copyrighted works without permission, the ongoing legal battle highlights the complexities surrounding AI training using copyrighted content and the need for clearer guidelines in this rapidly evolving field.