Anti-piracy Organization Takes Down AI Training Dataset 'Books3' Used by Meta's Large Models


The New York Times and the Daily News encountered an unexpected twist in their high-profile copyright lawsuit: an OpenAI engineer inadvertently deleted virtual machine search data that could have been key evidence. According to a letter submitted to the U.S. District Court for the Southern District of New York on Wednesday night, lawyers and technical experts for the two media companies had spent more than 150 hours searching OpenAI's AI training dataset before the data stored on the virtual machine was accidentally deleted on November 14.
LAION launched Re-LAION-5B, the world's first AI training dataset to fully remove links to Child Sexual Abuse Material (CSAM). The dataset is a significant improvement over LAION-5B and comes in two versions: Re-LAION-5B Research and Research-Safe. A total of 2,236 CSAM links have been removed, including 1,008 from child protection organizations' lists. The dataset contains 5.5 billion pairs of text and images, designed to help
Recently, the 'Diting' seismic wave large model, jointly developed by the National Supercomputing Center in Chengdu, the Institute of Geophysics of the China Earthquake Administration, and Tsinghua University, was officially released in Chengdu, Sichuan. This model is the first seismic wave large model in the country to reach 100 million parameters, marking a significant breakthrough in the integration of seismology research and artificial intelligence technology in China.
Firecrawl's Branding Format API extracts a website's full brand DNA (colors, logo, design framework) from a URL, helping designers and entrepreneurs quickly understand or emulate visual styles for efficiency...
Sam Altman, CEO of OpenAI, pointed out that the return on a regular university degree will decline rapidly, but not immediately. He predicts that the widespread adoption of artificial intelligence will significantly affect the returns on future education, emphasizing the impact of technological change on the value of traditional degrees.