AI Training vs Copyright in India | One Nation, One License Framework

AI Training

Generative artificial intelligence (AI) has triggered one of the sharpest clashes yet between technological innovation and copyright law in India. Powerful models are trained on vast troves of human-created books, music, images, news, and audiovisual works, which raises uncomfortable questions:

a) who gave permission,

b) who gets paid, and

c) what happens to the bargaining power of individual creators when their work is absorbed into training datasets.

At the same time, India aims to become a global AI hub under the IndiaAI Mission.

Copyright holders and AI developers have been at an impasse over how to safeguard the creative rights of individuals while still allowing AI models to be trained at scale. For India, there is considerable economic value on both sides: AI can help boost economic growth, while copyright sustains culture, tradition, and heritage, so reconciling the two promises significant gains. This places India at a fork where both a “do nothing” approach and a pure-exception approach are politically and economically risky.

The new working paper on “One Nation, One License, One Payment” emerges from this backdrop. It tries to answer how AI developers can continue to access large, diverse training corpora while ensuring that authors, artists, journalists, and other rightsholders are not reduced to unpaid raw material for commercial AI systems. Its premise is that neither blanket text-and-data-mining exceptions nor purely voluntary licences can strike the right balance at India’s scale.

The paper, therefore, proposes a hybrid model that treats access to training as a statutory entitlement, but couples it with a mandatory remuneration right and a central collecting body, inviting public debate on whether this structure can genuinely align AI-driven growth with long-term respect for copyright.

The AI training hybrid model in simple terms

The hybrid model is the middle ground between full freedom for AI developers and full control for copyright holders. Instead of granting a broad exception for text and data mining or requiring case-by-case licences, it allows AI developers to train on all lawfully accessed copyrighted content, while simultaneously guaranteeing creators a statutory right to be paid for that use.

In plain terms, training access is treated as a legal entitlement, and remuneration is a statutory right of copyright holders. The transaction is administered by a central collecting body using government-approved royalty rates. AI companies gain legal certainty and low transaction costs, while creators trade control over consent for a predictable, enforceable revenue stream. The model thus tries to meet the needs of both sides without reducing rightsholders to mere inputs for AI model developers.

Mandatory blanket licence

Under the scheme, AI developers would hold a blanket licence to use lawfully accessed copyrighted content for training generative AI systems. This is a clear gain for developers, who would otherwise have to negotiate individual permissions each time they used protected data. With the blanket licence, they get a green light to train on all lawfully accessed copyrighted works without further negotiation. In return, a share of the revenue from commercialised AI systems would be paid as royalties to rightsholders, so access is permission-free but not payment-free.

The statute itself would authorise training on any category of copyrighted work, provided the developer joins the scheme and complies with its payment and reporting obligations. At the same time, this permission is not free: the developer must pay statutory remuneration to creators, so that training use triggers royalties routed through the proposed central collecting entity. For rightsholders, the trade-off is that they cannot refuse the inclusion of their works in training, but they gain a structured claim to compensation backed by government-set rates and judicial review.

Central government-backed collecting body

The central government-backed collecting body is envisioned as a single, nonprofit institution. It would be formed by rightsholders but supervised by the Union government, and it would act as the exclusive interface for AI developers, who would pay all AI-training royalties to it under standardised rules and monitored governance.

The body would pass these payments on to sector-wise copyright societies and collective management organisations (CMOs) covering music, film, publishing, news, and other sectors, which would then distribute AI-training royalties to registered individual creators. To keep the system credible, its mandate would include transparent accounting, inclusive representation of different creative sectors, and accessible grievance channels, so that smaller or informal creators can also meaningfully participate in and benefit from the AI-royalty ecosystem.


Royalty calculation and distribution

Royalty rates would be set by a government-appointed committee using transparent criteria, and affected parties could seek judicial review of those rates. Distribution would rely on work registration in specialised databases for AI-training royalties, with unclaimed amounts held for three years. After that period, any amount for which no eligible CMO is identified would be transferred to a welfare fund for underrepresented sectors.

Royalty calculation would be based on clear rules set in advance, typically as a percentage of the revenue that AI developers earn from systems trained on copyrighted content. A government-appointed committee would frame and periodically revise these rates, and its decisions could be challenged through judicial review, which is intended to ensure the process remains transparent and fair to both developers and rightsholders.

For distribution, the central collecting body would allocate the royalty pool to different sector-specific societies and CMOs according to agreed criteria, such as usage data, repertoire size, or registration records. Those societies would then pay individual creators and rights owners who have registered their works for AI-training royalties, while unclaimed or residual amounts could be channelled into welfare or support schemes for underrepresented creative communities.
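To make the flow concrete, the sketch below traces the pipeline just described in simplified code. It is purely illustrative and not drawn from the working paper itself: the statutory rate, the sector shares, and the fixed allocation percentages are hypothetical assumptions, and in the actual scheme allocation would turn on usage data and registration records rather than fixed splits.

    # Illustrative only: a toy model of the proposed royalty flow.
    # The 3% rate and the sector shares below are hypothetical.

    STATUTORY_RATE = 0.03   # assumed government-set rate on commercialised AI revenue
    SECTOR_SHARES = {"music": 0.30, "film": 0.25, "publishing": 0.25, "news": 0.20}

    def collect_royalty(commercialised_revenue: float) -> float:
        """Central body collects a fixed share of a developer's commercialised AI revenue."""
        return commercialised_revenue * STATUTORY_RATE

    def allocate_pool(pool: float) -> dict:
        """Split the pooled royalties across sector CMOs by agreed shares."""
        return {sector: pool * share for sector, share in SECTOR_SHARES.items()}

    def settle(allocations: dict, eligible_cmos: set) -> tuple:
        """Pay out sectors with an eligible CMO; hold the rest in escrow.

        Amounts still unclaimed after three years would move to the welfare
        fund for underrepresented sectors."""
        paid = {s: amt for s, amt in allocations.items() if s in eligible_cmos}
        escrowed = {s: amt for s, amt in allocations.items() if s not in eligible_cmos}
        return paid, escrowed

    pool = collect_royalty(10_000_000)  # e.g. Rs 1 crore of commercialised revenue
    paid, escrowed = settle(allocate_pool(pool), {"music", "film", "publishing"})
    print(pool, paid, escrowed)  # "news" share sits in escrow pending an eligible CMO

Even in this toy form, the structural features of the proposal are visible: payment is triggered only by commercialised revenue, the central body is the single point of collection, and a sector without a functioning CMO does not forfeit its share immediately but has it held back for later claims or welfare transfer.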

Impact on AI developers

AI developers would gain legal certainty and a single-window mechanism to clear training rights, significantly reducing transaction costs and negotiation delays. Payment obligations would arise only after the commercialisation of AI systems, which particularly benefits startups and MSMEs that lack capital during the training phase.

Impact on creators and copyright holders

Creators retain no right to refuse use of their works for training, but they gain a statutory right to fair remuneration and access to the royalty pool even if they are not members of existing copyright societies. The model explicitly seeks to include informal and under-organised creative sectors, using welfare funds and incentives to help them form CMOs and claim their share.

Role of the government agency

The government’s role is to designate the Copyright Royalties Collective for AI Training (CRCAT), appoint the rate-setting committee, and supervise governance and representation across different classes of works. This state-backed structure is designed to standardise licensing terms across the market, reduce disputes, and provide a stable institutional home for royalty collection and distribution.

Key concerns and open questions

Concerns and open questions centre on how far this model limits creators’ control and how effectively it will deliver real compensation. Because rightsholders cannot refuse AI training, royalty rates, distribution systems, and grievance procedures must be robust enough to make the trade-off feel fair, especially for smaller or informal artists who already struggle to enforce their rights.

Some worry that added compliance requirements could stifle innovation; AI developer groups in particular would prefer a broad text-and-data-mining exception with an opt-out over a paid licensing model. Significant questions also persist: how will training use be quantified in practice, how will mixed or foreign datasets be handled, and can the centralised framework remain transparent and efficient without devolving into an unduly slow or bureaucratic structure?

Balancing innovation and rights

The committee argues that a pure exception would erode copyright incentives and eventually degrade both human creativity and AI quality, while purely voluntary licensing cannot handle the scale and transaction costs of AI training. The hybrid model is presented as a middle path that keeps data broadly available for AI development but structurally bakes in remuneration and sector-wide inclusion for creators.

Conclusion

The proposed model seeks to balance two core objectives: fostering robust growth in the AI sector and ensuring fair treatment of creators. It offers AI developers a clear and relatively low-friction mechanism to access large training datasets, while simultaneously embedding a statutory right for authors, artists, and other rightsholders to receive remuneration through a centralised entity.

However, the attendant trade-offs are significant. Creators relinquish ex ante consent, developers assume new reporting and payment obligations, and there is a real risk that the institutional framework may become slow or opaque if not carefully designed. Ultimately, the effectiveness of this model will turn on implementation details such as royalty setting, governance of the central body, and dispute-resolution mechanisms, rather than on the abstract architecture alone.

Visit My Legal Pal for more such legal insights on various domains.


FAQs

1. What is India’s proposed hybrid copyright model for AI?
It’s a system that lets AI developers train on all lawfully accessed copyrighted content, but requires them to pay statutory royalties. This balances the need for large training datasets with fair compensation for creators.

2. What does “One Nation, One License, One Payment” mean?
It’s a single national licensing mechanism that gives AI companies permission to use copyrighted works for training while routing all royalty payments through a central collecting body.

3. Can creators refuse to let their work be used for AI training?
No. Under the proposed model, creators can’t block the use of their work for training. Their protection comes in the form of guaranteed remuneration through government-approved royalty rates.

4. How will royalties be calculated and distributed?
A government-appointed committee will set transparent royalty rates. Payments from AI companies will go to a central body, which will then distribute funds to sector-wise copyright societies and creators.

5. How does this model help AI developers?
Developers get legal certainty, a blanket licence, and a single-window payment system. They only pay after commercialisation, which helps startups and small companies manage costs.

6. Will this model slow down AI innovation?
Some developers worry about compliance burdens, but the system is designed to reduce negotiation bottlenecks and give companies predictable rules. Its success will depend on efficient implementation.

7. What happens to unclaimed royalties?
If a royalty remains unclaimed for three years, it’s moved to a welfare fund meant to support underrepresented creative communities.

8. Why does India need a hybrid model instead of a full exception or full licensing?
A blanket exception undermines creator rights, while case-by-case licensing is too slow and expensive at India’s scale. The hybrid model tries to give AI developers broad access while ensuring creators get paid.

9. Who will oversee the system?
A central government-supervised collecting body will handle royalty collection, governance, transparency, and coordination with copyright societies.

10. What are the biggest concerns with this model?
Key worries include reduced creator control, potential bureaucratic delays, challenges in measuring training data usage, and complexities involving foreign datasets.

Written by: Sloka Vineetha Chandra, Intern at My Legal Pal
