Gracenote, Nielsen's metadata services giant, has filed a lawsuit in the United States District Court for the Southern District of New York, accusing OpenAI of scraping its proprietary media metadata database at scale, without authorization or payment, and using the data to train commercial AI products such as ChatGPT.

Gracenote claims that OpenAI's actions not only constitute serious copyright infringement but also directly threaten the company's business foundation by "replicating" its core assets. Gracenote points out that its database, manually annotated by hundreds of editors, contains detailed program descriptions, video characteristics, unique identifiers, and complex relationship graphs. The company emphasizes that the infringement covers not only text but also its patented "data correlation framework."

The complaint states that when users ask ChatGPT to describe popular TV series such as "Game of Thrones," the AI's output is almost identical to the descriptions written by Gracenote editors. Gracenote argues that this shows the relevant data was directly copied and embedded into the model.

Gracenote is concerned that if AI companies can freely scrape and provide this data, end customers such as smart TV manufacturers will stop purchasing licensed services and rely instead on AI-generated alternatives, causing the metadata market ecosystem to collapse. Gracenote stated that it had contacted OpenAI multiple times to discuss licensing but was repeatedly rejected or ignored, which eventually forced it to take legal action.

An OpenAI spokesperson responded that the company's model training is based on "publicly available data" and complies with the "fair use" principle under current copyright law.