Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”.

Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.

In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.

  • SuiXi3D@kbin.social
    64 upvotes · 9 months ago

    …then maybe they shouldn’t exist. If you can’t pay the copyright holders what they’re owed for the license to use their materials for commercial use, then you can’t use ‘em that way without repercussions. Ask any YouTuber.

    • Even_Adder@lemmy.dbzer0.com
      English · 12 upvotes · edited · 9 months ago

      You might want to read this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries. YouTube’s one-sided strike-happy system isn’t the real world.

      Headlines like these let people assume that it’s illegal, rather than educate them on their rights.

      • Snot Flickerman@lemmy.blahaj.zone
        English · 23 upvotes · 9 months ago

        When Annas-Archive or Sci-Hub get treated the same as these giant corporations, I’ll start giving a shit about the “fair use” argument.

        When people pirate to better the world by increasing access to information, the whole world gets together to try to kick them off the internet.

        When giant companies with enough money to make Solomon blush pirate to make more oodles of money and not improve access to information, it’s “fAiR uSe.”

        Literally everyone knew from the start that books3 was all pirated and built from ebooks with the DRM circumvented and removed. It was noted when it was created that it was basically the entirety of the private torrent tracker Bibliotik.

        • Even_Adder@lemmy.dbzer0.com
          English · 11 upvotes · edited · 9 months ago

          AI training should not be a privilege of the mega-corporations. We already have the ability to train open source models, and organizations like Mozilla and LAION are working to make AI accessible to everyone. We can’t allow the ultra-wealthy to monopolize a public technology by creating barriers that make it prohibitively expensive for regular people to keep up. Mega corporations already have a leg up with their own datasets and predatory terms of service that exploit our data. Don’t do their dirty work for them.

          By denying regular people access to a competitive, corporate-independent tool for creativity, education, entertainment, and social mobility, we condemn them to a far worse future, with fewer rights than we started with.

          • Snot Flickerman@lemmy.blahaj.zone
            English · 16 upvotes · edited · 9 months ago

            How am I doing their dirty work for them? I literally will stop thinking that they’re getting away with piracy for profit when we stop haranguing people who are committing piracy for the benefit of mankind.

            I’m not saying Meta should be stopped, I’m saying the prosecution of Sci-Hub and Annas-Archive needs to be stopped under the same pretenses.

            If it’s okay to pirate for the purpose of making money (what we put The Pirate Bay admins in jail for), then it’s okay to pirate to benefit mankind.

            There is literally no way in hell someone can convince me what Meta and others are doing is not pirating to use the data contained within to make money. What’s good for the goose is good for the gander, as they say.

            I reiterate, they knew it was pirated and had DRM circumvented when they downloaded it. There was zero question of the source of this data. They knew from the beginning they intended to profit from the use of this data. How is that different than what we accused The Pirate Bay admins of?

            It really feels like “Well, these corporations have money to steal more prolifically than little people, so since their stealing is so big, we have to ignore it.”

            • Rivalarrival@lemmy.today
              3 upvotes · 9 months ago

              “There is literally no way in hell someone can convince me what Meta and others are doing is not pirating”

              Then your argument is non-falsifiable, and therefore, invalid.

              Major corporations and pirates are finally on the same side for once. “Fair Use” finally has financial backing. Meta is certainly not a friend, but our interests currently align.

              The worst possible outcome here is that copyright trolls manage to convince the courts that they are owed licensing fees. Next worst is a settlement that grants rightsholders a share of profits generated by AI, like they got from manufacturers of blank tapes and CDs.

              Best case is that the MPAA, RIAA, and other copyright trolls get reminded that “Fair Use” is not an exception to copyright law, but the fundamental reason it exists: Fair Use is the promotion of science and the useful arts. Fair Use is the rule; Restriction is the exception.

              • Zaktor@sopuli.xyz
                English · 6 upvotes · 9 months ago

                “Then your argument is non-falsifiable, and therefore, invalid.”

                Wow, this is some powerful internet word salad, just shotgunning scientific-sounding words at the wall to try to pretty up a basic internet debate. Falsifiability is about scientific hypotheses, not statements of belief. “Nothing you can say can convince me that murder isn’t wrong” may mean there’s no further use in debate, but it isn’t “non-falsifiable” in any meaningful way, nor does it somehow make the argument for the immorality of murder “invalid”.

        • VoterFrog@kbin.social
          4 upvotes · 9 months ago

          You don’t see the difference between distributing someone else’s content against their will and using their content for statistical analysis? There’s a pretty clear difference between the two, especially as fair use is concerned.

      • Zaktor@sopuli.xyz
        English · 3 upvotes · 9 months ago

        By and large copyright infringement is illegal. That some things aren’t infringement doesn’t change that a general stance of “if I don’t have permission, I can’t copy it” is correct. The first argument in the EFF article is effectively the title: “it can’t be copyright, because otherwise massive AI models would be impossible to build”. That doesn’t make it fair use, they just want it to become so.

        • Rivalarrival@lemmy.today
          3 upvotes · 9 months ago

          The purpose of copyright is to promote the sciences and useful arts. To increase the depth, width, and breadth of the public domain. “Fair Use” is not the exception. “Fair Use” is the fundamental purpose for which copyrights and patents exist. Copyright is not the rule. Copyright is the exception. The temporary exception. The limited exception. The exception we grant to individuals for their contribution to the public.

          “it can’t be copyright, because otherwise massive AI models would be impossible to build”.

          If that is, indeed, true, and if AI is a progression of science or the useful arts, then it is copyright that must yield, not AI.

    • helenslunch@feddit.nl
      5 upvotes · 9 months ago

      I love seeing Lemmy users trip over themselves to declare that copyrights don’t or shouldn’t exist when it comes to pirating, right up until it comes to AI. Then Copyrights are enshrined by The Constitution and all the corporations NEED to pay for them, even when they’re not actually copying anything.

      • zaphod@lemmy.ca
        English · 11 upvotes · 9 months ago

        You do realize that there may in fact be different, distinct groups of Lemmy users with differing, potentially non-overlapping beliefs, yeah?

        • helenslunch@feddit.nl
          4 upvotes · 9 months ago

          Sure but Lemmy also operates as a sort of hivemind. This is the top-voted post in the last 24 hours and piracy content usually makes up at least 25% of content here.

          • zaphod@lemmy.ca
            English · 8 upvotes · 9 months ago

            Oh, well, you’ve clearly done the kind of deep and thoughtful analysis that would allow you to determine the general opinions of all Lemmy users. My mistake. Carry on.

      • SuiXi3D@kbin.social
        7 upvotes · 9 months ago

        Using copyrighted material for something you aren’t gonna make any money off of? Cool, go hog wild. If you’re gonna use some music or art that you didn’t make in something that will make you money, the folks that made whatever you used should get a cut. Not the whole cut, but a cut.

        • Moira_Mayhem@beehaw.org
          3 upvotes · 9 months ago

          If an artist falls in love with drawing and learns to draw from Jack Kirby’s work and at the beginning even imitates his style, does he owe Jack Kirby royalties for every drawing he does as he ‘learned’ on Jack’s copyrighted art?

          • SuiXi3D@kbin.social
            3 upvotes · 9 months ago

            I think in that case, no. ‘Style’ is one thing; directly using someone’s art in your own work is something else entirely. However, we’re talking about a person here, not a program developed by a company for the express purpose of making as much money as possible in the shortest amount of time. Until AI can demonstrate that it is truly thinking and not simply executing the commands it is given, I don’t think the lines are blurred nearly enough to suggest that someone learning to paint and an AI trained on hundreds of thousands of pieces of art for the purpose of making money for the company that built it are remotely the same.

      • Sneezycat@sopuli.xyz
        4 upvotes · 9 months ago

        And corporations want people to pay for it but they don’t want to pay for it themselves. It’s almost as if no one likes copyright, but it benefits some ppl more than others.

      • Pigeon@beehaw.org
        1 upvote · 9 months ago

        You do realize that lemmy contains very many users, many of whom disagree on any number of things. You are randomly assigning the opinions of lemmy’s pirate users to a random commenter without evidence that they actually hold those opinions, because it’d be convenient for you if they’re contradicting themself in any way (though the degree to which that would be a contradiction is also arguable). It’s just a way of constructing a strawman instead of engaging with your interlocutor’s actual words.

        Also, part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim, or they’ll do something that’s the equivalent of a plagiarist who swaps a few words around in a sad attempt to not get caught. It becomes especially likely depending on how specific the search is, like if you look for a niche topic hardly anyone has written extensively on or for the solution to an esoteric problem that maybe just one person on a forum somewhere found an answer to. They also typically do not even give credit or link to their sources.

        Plus, copyright law, if it exists, must apply to everyone, including major corporations. That’s a separate issue from whether or not copyright law needs reform (it obviously does). If you wanna abolish copyright, fine, ok, get it abolished through the government. But while copyright law is still the law, I’m not ok with giving megacorps a pass to break it legally, especially when they’re more than happy to sue random, harmless individuals for violating their own copyrights. They want the law not to apply to them because they’re rich.

        The argument they’re making is just ridiculous on its face when you compare it to other crimes. If AI should be allowed to violate copyright because otherwise it can’t exist as it is, then anyone should be able to violate copyright because otherwise their cool projects won’t be able to exist. And I should be able to rob a bank because otherwise I won’t have all that money. You should be able to commit murder because otherwise your annoying coworker will keep bugging you. She should be able to walk out of a store with an iPhone without paying for it because otherwise she won’t have an iPhone. Etc. It’s an argument that says the criminal’s motivations are legal justification for the crime. “You should let me legally do the thing because otherwise I can’t do the thing” is just not a convincing argument in my book.

        • helenslunch@feddit.nl
          1 upvote · 9 months ago

          “You do realize that lemmy contains very many users”

          Already addressed in another comment.

          “part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim”

          It’s a problem they’ve acknowledged and are actively working on.

          “Plus, copyright law, if it exists, must apply to everyone, including major corporations.”

          Well, many people here would disagree. That was the entire point of my comment.