A bipartisan group of senators introduced a new bill to make it easier to authenticate and detect artificial intelligence-generated content and protect journalists and artists from having their work gobbled up by AI models without their permission.

The Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act) would direct the National Institute of Standards and Technology (NIST) to create standards and guidelines that help prove the origin of content and detect synthetic content, like through watermarking. It also directs the agency to create security measures to prevent tampering and requires AI tools for creative or journalistic content to let users attach information about their origin and prohibit that information from being removed. Under the bill, such content also could not be used to train AI models.

Content owners, including broadcasters, artists, and newspapers, could sue companies they believe used their materials without permission or tampered with authentication markers. State attorneys general and the Federal Trade Commission could also enforce the bill, which its backers say prohibits anyone from “removing, disabling, or tampering with content provenance information” outside of an exception for some security research purposes.

(A copy of the bill is in the article; here is the important part, imo:

Prohibits the use of “covered content” (digital representations of copyrighted works) with content provenance to either train an AI-/algorithm-based system or create synthetic content without the express, informed consent of, and adherence to the terms of use of, such content, including compensation)

  • just another dev@lemmy.my-box.dev
    4 months ago

    I respectfully disagree. I think small-time AI (read: pretty much all the custom models on Hugging Face) will get a giant boost out of this, because they can get away with training on “custom” data sets: they are too small to be held accountable.

    However, those models will become worthless for enterprise-level use, since nobody could vouch for their legality. In other words, once you make big bucks off of AI you’ll have to prove your models were sourced properly. But if you’re just creating a model for small-time use, you can get away with a lot.

    • interdimensionalmeme@lemmy.ml
      4 months ago

      I am skeptical that this is how it will turn out. I don’t really believe there will be a path from $0 to challenging big tech without a roadblock of lawyers shutting you down along the way, with no way out.

      • just another dev@lemmy.my-box.dev
        4 months ago

        I don’t think so either, but to me that is the purpose.

        Somewhere between small-time personal-use ML and commercial exploitation, there should be ethical sourcing of input data, rather than the current method of “scrape all you can find, fuck copyright” that OpenAI & co. are getting away with.

        • interdimensionalmeme@lemmy.ml
          4 months ago

          I mean, this is exactly the kind of regulation that Microsoft/OpenAI are begging for to cement their position. Then it’s going to be just a matter of digesting their surviving competitors until only one competitor remains, similar to the Intel/AMD relationship. Then they can have a 20-year period of stagnation while they progressively screw over customers and suppliers.

          I think that’s the bad ending. By desperately trying to keep the old model of intellectual property going, they’re going to bring about the real AI nightmare: an elite few in control of the technology, with an unconstrained ability to leverage the benefits and further solidify their lead over everyone else.

          The collective knowledge of humanity is not their exclusive property. It also isn’t the property of whoever is the latest person to lay a claim to an idea in effective perpetuity.

          • just another dev@lemmy.my-box.dev
            4 months ago

            Why?

            Once this passes, OpenAI can’t build ChatGPT on the same (“stolen”) dataset. How does that cement their position?

            Taking someone’s creation (without their permission) and turning it into a commercial venture, without giving payment or even attribution, is immoral.

            If a creator (in the widest meaning of the word) is fine with their works being used as such - great, go ahead. But otherwise you’ll just have to wait until the work becomes public domain (which obviously does not mean publicly available).