Quoting straight from Jim Nielsen’s note about LLMs being trained on copyrighted data:

As a broke teenager, the web was this strange wonderland where you could access all kinds of copyrighted material using tools developed by fringe individuals/communities: Napster, Kazaa, Torrents, Usenet, etc. These tools (at least in the beginning) weren’t really made for profit, just to subvert the gatekeepers (and yeah, steal their profits).

Now — in a strange twist of irony — things seem to have flipped:

  • 1999: Individuals use digital tools to steal intellectual property from corporations.
  • 2025: Corporations use digital tools to steal intellectual property from individuals.

The empire strikes back.