Quoting straight from Jim Nielsen’s note on training LLMs on copyrighted data:
As a broke teenager, the web was this strange wonderland where you could access all kinds of copyrighted material using tools developed by fringe individuals/communities: Napster, Kazaa, Torrents, Usenet, etc. These tools (at least in the beginning) weren’t really made for profit, just to subvert the gatekeepers (and yeah, steal their profits).
Now — in a strange twist of irony — things seem to have flipped:
- 1999: Individuals use digital tools to steal intellectual property from corporations.
- 2025: Corporations use digital tools to steal intellectual property from individuals.
The empire strikes back.