• hitmyspot@aussie.zone
    9 days ago

    That’s assuming you own the media in the first place. Often AI is trained with large amounts of data downloaded illegally.

    So, yes, it’s fair use to train on information you have or have rights to. It’s not fair use to illegally obtain new data. What’s more, torrenting that data often means you also distribute it.

    For personal use, I don’t have an issue with it anyway, but legally it’s not allowed.

    • Riskable@programming.dev
      7 days ago

      Incorrect. No court has ruled in favor of any plaintiff bringing a copyright infringement claim against an AI LLM. Here’s a breakdown of the current court cases and their rulings:

      https://www.skadden.com/insights/publications/2025/07/fair-use-and-ai-training

      In both cases, the courts have ruled that training an LLM with copyrighted works is highly transformative and thus, fair use.

      The plaintiffs in one case couldn’t even come up with a single iota of evidence of copyright infringement (from the output of the LLM). This, IMHO, is the single most important takeaway from the case, because the only thing that really mattered was the point where the LLMs generate output. That is, the point of distribution.

      Until an LLM is actually outputting something, copyright doesn’t even come into play. Therefore, the act of training an LLM is just like I said: A “Not Applicable” situation.

      • hitmyspot@aussie.zone
        2 days ago

        Just a heads up that Anthropic have just lost a $1.5B case for downloading and storing copyrighted works. That’s $3,000 per author across 500,000 books.

        The wheels of justice move slowly, but fair use has limits. Commercial use generally doesn’t qualify. Commentary and transformation do, so we’ll see how this progresses with the many other cases.

        Warner Brothers have recently filed another case, I think.

        • Riskable@programming.dev
          1 day ago

          Anthropic didn’t lose their lawsuit. They settled. Also, that was about their admission that they pirated zillions of books.

          From a legal perspective, none of that has anything to do with AI.

          Company pirates books -> gets sued for pirating books. Company settles with the plaintiffs.

          It had no legal impact on training AI with copyrighted works or what happens if the output is somehow considered to be violating someone’s copyright.

          What Anthropic did with this settlement is attack their Western competitor: OpenAI, specifically. Google already settled with the Authors Guild over their book scanning project more than a decade ago.

          Now OpenAI is likely going to have to pay the Authors Guild too, even though they haven’t come out and openly admitted that they pirated books.

          Meta is also being sued for the same reason but they appear to be ready to fight in court about it. That case is only just getting started though so we’ll see.

          The real, long-term impact of this settlement is that it just became a lot more expensive to train an AI in the US (well, the West). Competition in China will never have to pay these fees and will continue to offer their products to the West at a fraction of the cost.

      • hitmyspot@aussie.zone
        7 days ago

        While that’s interesting info and links, I don’t think that’s true.

        https://share.google/opT62A4cIvKp6pwhI In this case with Thomson, a court has ruled in favor of the plaintiff, though the ruling is expected to be overturned.

        Most of the big cases are in the early stages. Let’s see what the Disney one does.

        There is also the question, not just of copyright or fair use, but of whether the data was legally obtained. Facebook torrented terabytes of data and claimed they did not share it. I don’t know that that’s enough to claim innocence. It hasn’t been for individuals.

        The question is whether they are actually transformative. Just being different is not enough. I can’t use Disney IP to make my new movie, for instance.