A working paper released by the Indian government shows that the country wants AI companies to pay for the content they use to train models, but only once those companies start generating revenue. The proposal was written by a committee on generative AI and copyright set up by the Department for Promotion of Industry and Internal Trade (DPIIT), and it aims to balance the interests of copyright holders against the promotion of AI innovation.
Hybrid Model: Three Core Elements
The committee was responding to the global debate over AI model developers that mostly do not pay for the copyrighted content they train on. It proposed a hybrid model with three elements:
Blanket Licensing Mechanism - AI developers can obtain a single blanket license to use all legally obtained content for training, without having to negotiate rights to each work individually.
Post-commercialization Royalties - Royalties become payable only once an AI tool is commercialized. Rates are set by a government-appointed committee and are subject to judicial review.
Centralized Royalty Management - A unified mechanism will be established to collect and distribute royalties, aiming to reduce transaction costs, provide legal certainty, and support both large and small AI developers in fairly accessing resources.
CRCAT: The Proposed Royalty Collection Agency
The report gives the royalty collection agency a specific name, the Copyright Royalties Collective for AI Training (CRCAT), and suggests establishing it as a non-profit organization made up of copyright holder associations. The report also proposes creating an "AI Training Copyright Works Database," in which content creators would register their works to qualify for royalties from CRCAT.

The report states that this model aims to "provide AI developers with convenient access to AI training content, simplify the licensing process, reduce transaction costs, and ensure fair compensation for copyright holders."
The Indian government believes that the "zero-price licensing model" of free content access is inappropriate, as it "would weaken the motivation for human creativity and could lead to a long-term shortage of human-generated content."
Committee members also found that "accessing large volumes of data and high-quality data is crucial for AI development," but they were concerned that licensing negotiations for such content might lead to "prolonged negotiations and high transaction costs, which could hinder innovation, especially for startups and small and medium enterprises."
The proposed arrangement is not without precedent. Some countries already have performing rights organizations that collect royalties from venues playing recorded music and distribute them to artists. Similar mechanisms also exist in areas such as news reprints.
India's Particular Challenges and Prospects
India's situation presents considerable challenges: the country recognizes 22 official languages, eight of which have more than 50 million speakers, and its media and publishing ecosystem is large and fragmented.
Tech giants are still arguing fiercely that they have the right to train models without paying, yet they are also striking licensing deals as a matter of routine. If New Delhi can set royalties at a reasonable rate, large tech companies may welcome the proposal.
The Indian government has announced that the country aims to become a global leader in all aspects of AI. To achieve this goal, the Indian government has taken a relatively friendly approach towards tech giants entering the local market. This proposal could become an important reference model for global AI copyright policies.
