The AI-focused COPIED Act would make removing digital watermarks illegal (as well as training any kind of AI on copyrighted content)

Grimy@lemmy.world · edit-2 4 months ago

This is essentially regulatory capture. The article is very lax on calling it what it is.

A few things to consider:

Laws can’t be applied retroactively, so this would essentially close the door behind OpenAI, Google and Microsoft. OpenAI with Sora, in conjunction with the big Hollywood companies, will be the only ones able to do proper video generation.
Individuals will not be getting paid, data brokers will.
They can easily pay pennies to a third world artist to build them a dataset copying a style. Styles are not copyrightable.
The open source scene is completely dead in the water and so is fine tuning for individuals.

Edit: This isn’t entirely true, there is more leeway for non commercial models, see comments below.

AI isn’t going away, all this does is force us and the economy into a subscription model.
Companies like Disney, Getty and Adobe reap everything.

In a perfect world, this bill would be aiming to make all models copyleft instead but sadly, no one is lobbying for that in Washington and money talks.

cm0002@lemmy.world · 4 months ago

Yup, I fucking knew it. I knew this is what would happen with everyone bitching about copyright this and that. I knew any legislation that came as a result was going to be bastardized and dressed up to make it look like it’s for everyone when in reality it’s going to mostly benefit big corps that can afford licensing fees and teams of lawyers.

People could not/would not understand how these AI models actually process images/text or the concept of “If you post publicly, expect it to be used publicly” and here we are…

LainTrain@lemmy.dbzer0.com · edit-2 4 months ago

As always, the anprims/luddites/ecofashies (who downvoted me) are like an anvil to left-wing ideas of progress, we’re too busy arguing amongst ourselves to make a stand to protect open source AI from regulation.

Honestly I blame Hbomberguy personally. People were a lot more open-minded before he tacked on that shitty little AI snark at the end of his plagiarism video.

hedgehog@ttrpg.network · 4 months ago

The open source scene is completely dead in the water and so is fine tuning for individuals.

Why do you think that? The existing data sets won’t be going anywhere. Fine tuning doesn’t require nearly the same amount of training images and it’s not infeasible to get them from individual artists.

Not that that actually matters to open source developers, though, as the developer obligations only apply if you’re making the product available for a commercial purpose, so they’re not relevant to developers of gratis solutions - and most libre developers are also gratis developers. If your platform is not commercial and doesn’t have at least 25 Million monthly active users, you don’t need to allow users to add content provenance information in the first place. If it’s not for a commercial purpose, you aren’t prohibited from training on content containing content provenance information, or from removing it and training on it.

Grimy@lemmy.world · edit-2 4 months ago

I’ll be honest, I read it too fast and didn’t see the “for commercial use” part. I still think this is problematic because a lot of fine tuners and some companies putting out models either have a Patreon or offer their model for individual use but not to host on generating services without compensation (a good example of this is Pony for fine tuners or Codestral (I think) for general model providers). It also means anyone building models can’t then commercialize models on their end while still offering them for free to the community, which puts them in a tough position. I don’t know how Meta’s Llama could survive this, or Google’s Gemma. I’m also curious how this affects Hugging Face, since I’m not sure if they are “making it available”, as the bill puts it, by hosting it.

It does put the bill in a better light though and I will edit my comment.

just_another_person@lemmy.world · 4 months ago

Removed by mod

General_Effort@lemmy.world · 4 months ago

This is a brutally dystopian law. Forget the AI angle and turn on your brain.

Any information will get a label saying who owns it and what can be done with it. Tampering with these labels becomes a crime. This is the infrastructure for the complete control of the flow of all information.

msgraves@lemmy.dbzer0.com · 4 months ago

Exactly, this isn’t about any sort of AI, this is the old playbook of trying to digitally track images, just with the current label slapped on. Regardless of your opinion on AI, this is a terrible way to solve this.

Throw_away_migrator@lemmy.world · 4 months ago

Maybe I’m missing something, but my read is that it creates a mechanism/standard for labeling content. If content is labeled under this standard, it is illegal to remove the labeling or use it in a way the labeling prohibits. But I don’t see a requirement to label content with this mechanism.

If that’s the case I don’t see a problem. Now, if all the content is required to be labeled, then yes it’s a privacy nightmare. But my interpretation was that this is a mechanism to prevent AI companies from gobbling up content without consent and saying, “What? There’s nothing saying I couldn’t use it.”

ObliviousEnlightenment@lemmy.world · 4 months ago

Most everyone from corporations to tumblr artists will be opting into that. While it doesn’t guarantee an information dystopia, it does enable it.

I download images from the internet and remove watermarks to edit them into youtube videos as visual aids. I add a credit to the description because I’m not a cunt, I just do it to make the video look better. I don’t monetize content. Utterly and totally harmless, and it would be illegal with such a label.

General_Effort@lemmy.world · 4 months ago

It’s rather more than that. At the very least, it is a DRM system, meant to curtail fair use. We’re not just talking about AI training. The AutoTLDR bot here would also be affected. Manually copy/pasting articles while removing the metadata becomes illegal. Platforms have a legal duty to stop copyright infringement. In practice, they will probably have to use the metadata label to stop reposts and re-uploads of images and articles.

This bill was obviously written by lobbyists for major corpos like Adobe. This wants to make the C2PA standard legally binding. They have been working on this for the last couple years. OpenAI already uses it.

At the very least, this bill will entrench the monopolies of the corporations behind it, at the expense of the rights of ordinary people.

I don’t think it’ll stop there. Look at age verification laws in various red states and around the world. Once you have this system in place, it would be obvious to demand mandatory content warnings in the metadata. We’re not just talking about erotic images but also about articles on LGBTQ matters.

More control over the flow of information is the way we are going anyway. From age-verification to copyright enforcement, it’s all about making sure that only the right people can access certain information. Copyright used to be about what businesses can print a book. Now it’s about what you can do at home with your own computer. We’re moving in this dystopian direction, anyway, and this bill is a big step.

The bill talks about “provenance”. The ambition is literally a system to track where information comes from and how it is processed. If this was merely DRM, that would be bad enough. But this is an intentionally dystopian overreach.

E.g., you have cameras that automatically add the tracking data to all photos, and then Photoshop adds data about all post-processing. Obviously, this can’t be secure. (NB: This is real and not hypothetical.)

The thing is, a door lock isn’t secure either. It takes seconds to break down a door, or to break a window instead. The secret ingredient is surveillance and punishment. Someone hears or sees something and calls the police. To make the ambition work, you need something at the hardware level in any device that can process and store data. You also need a lot of surveillance to crack down on people who deal in illegal hardware.

I’m afraid this is not as crazy as it sounds. You may have heard about the recent “Chat Control” debate in the EU. That is a proposal, with a lot of support, that would let police scan the files on a phone to look for “child porn” (mind that this includes sexy selfies that 17-year-olds exchange with their friends). Mandatory watermarking that lets the government trace a photo to the camera and its owner is mild by comparison.

The bill wants government agencies like DARPA to help in the development of better tracking systems. Nice for the corpos that they get some of that tax money. But it also creates a dynamic in the government that will make it much more likely that we continue on a dystopian path. For agencies, funding will be on the line; plus there are egos. Meanwhile, you still have the content industry lobbying for more control over its intellectual “property”.

ArchRecord@lemm.ee · 4 months ago

It’s like applying DRM law to all media ever. And we know the problems with DRM already, as exemplified 2 decades ago by Cory Doctorow in his talk at Microsoft to convince them not to endorse and use it.

toothbrush@lemmy.blahaj.zone · edit-2 4 months ago

They did it. They’re passing the worst version of the AI law. That’s the end for open source AI! If this passes, all AI will be closed source, and only from giant tech companies. I’m sure they will find a way to steal your stuff “legally”.

LainTrain@lemmy.dbzer0.com · 4 months ago

To the cheers of so-called progressives who never understood the tech and continue to be wilfully ignorant of it, the corporations win again.

2xsaiko@discuss.tchncs.de · 4 months ago

This is exactly what OpenAI etc. wanted to achieve with all the “AI safety” bullshit doomer talk. I really hope this doesn’t pass

trashgirlfriend@lemmy.world · 4 months ago

No open source plagiarism machine :(

LainTrain@lemmy.dbzer0.com · edit-2 4 months ago

Dw artbros and other corporation defenders will get curbstomped by the closed-source ones instead, not only will you be out of employment, but you will be unemployable without a ChatGPT subscription, and Altman/Musk/whoever will be worth trillions as a result. But at least it won’t be “plagiarism” because the lobbyists will ensure that it’s all nice and legal.

And the worst part is you honestly deserve it for not listening to us.

Also, this is you:

ZILtoid1991@lemmy.world · 4 months ago

Okay, then let’s ban art generators, problem solved!

LainTrain@lemmy.dbzer0.com · 4 months ago

That’s also not rational, but at least it’s consistent so it’s an improvement.

Anyway banning them is impossible, even if one country bans it, all the other countries will still have them - the internet is the whole world, remember? And even then, LLMs would still exist too, and arguably those are far more significant.

ZILtoid1991@lemmy.world · 4 months ago

“Why ban bad thing if bad country allows bad thing?”

People are saying the same about raising the minimum wage, implementing labor protections, etc. “Okay, advocate for fair wages, but then that minimum wage job of yours will be outsourced to China/India/Vietnam/etc.!”

LainTrain@lemmy.dbzer0.com · 4 months ago

GenAI isn’t a bad thing though.

girsaysdoom@sh.itjust.works · 4 months ago

Did you read the documents? It’s not as bad as what you’re saying.

It looks like the prohibited acts (section 6) specifically mention for commercial purposes where attribution markers are separated from the content. So, commercial AI software that doesn’t retain these markers or copyright marker removal done to mislead or affect in a commercial way would be against the law in 2 years.

I don’t see how this affects anything open source related. The way I understand it is that this will just force commercial applications to adapt to this and move on.

toothbrush@lemmy.blahaj.zone · edit-2 4 months ago

oh cool, nevermind then. However, most open source AI is done for commercial purposes, so it will still cripple the ecosystem.

e$tGyr#J2pqM8v@feddit.nl · edit-2 4 months ago

I don’t like AI but I hate intellectual property. And the people that want to restrict AI don’t seem to understand the implications that has. I am ok with copying as I think copyright is a load of bollocks. But they aren’t even reproducing the content verbatim, are they? They’re ‘taking inspiration’ if you will, transforming it into something completely different. Seems like fair use to me. It’s just that people hate AI, and hate the companies behind it, and don’t get me wrong, rightfully so, but that shouldn’t get us all to stop thinking critically about intellectual property laws.

just another dev@lemmy.my-box.dev · 4 months ago

I’m the opposite, actually. I like generative AI. But as a creator who shares his work with the public for their (non-commercial) enjoyment, I am not okay with a billionaire industry training their models on my content without my permission, and then using those models as a money machine.

interdimensionalmeme@lemmy.ml · 4 months ago

This law will ensure only giant tech companies have this power. Hobbyists and home players will be locked out.

just another dev@lemmy.my-box.dev · 4 months ago

What are you basing that on?

Content owners, including broadcasters, artists, and newspapers, could sue companies they believe used their materials without permission or tampered with authentication markers.

Doesn’t say anything about the right just applying to giant tech companies, it specifically mentions artists as part of the protected content owners.

interdimensionalmeme@lemmy.ml · 4 months ago

That’s like saying you are just as protected regardless of which side of the moat you stand on.

It’s pretty clear that, the way things are shaping up, only the big tech elite will control AI and they will lord it over us.

The worst thing that could happen with AI, it falling into the hands of the elites, is happening.

just another dev@lemmy.my-box.dev · 4 months ago

I respectfully disagree. I think small time AI (read: pretty much all the custom models on hugging face) will get a giant boost out of this, since they can get away with training on “custom” data sets - since they are too small to be held accountable.

However, those models will become worthless to enterprise level models, since they wouldn’t be able to account for the legality. In other words, once you make big bucks off of AI you’ll have to prove your models were sourced properly. But if you’re just creating a model for small time use, you can get away with a lot.

interdimensionalmeme@lemmy.ml · 4 months ago

I am skeptical that this is how it will turn out. I don’t really believe there will be a path from $0 to challenging big tech without a roadblock of lawyers shutting you down along the way, with no way out.

just another dev@lemmy.my-box.dev · 4 months ago

I don’t think so either, but to me that is the purpose.

Somewhere between small time personal-use ML and commercial exploitation, there should be ethical sourcing of input data, rather than the current method of “scrape all you can find, fuck copyright” that OpenAI & co are getting away with.

rekorse@lemmy.world · 4 months ago

Just because intellectual property laws can currently be exploited doesn’t mean there is no place for them at all.

e$tGyr#J2pqM8v@feddit.nl · 4 months ago

That’s an opinion you can have, but I can just as well hold mine, which is that restricting any form of copying is unnatural and harmful to society.

rekorse@lemmy.world · 4 months ago

Do you believe no one should be able to charge money for their art?

e$tGyr#J2pqM8v@feddit.nl · 4 months ago

That’s right. They can put their art up for sale, but if someone wants to take a free copy nothing should be able to stop them.

rekorse@lemmy.world · 4 months ago

That effectively makes all art free. At best it’s donation based.

e$tGyr#J2pqM8v@feddit.nl · 4 months ago

Yes, that would be best.

rekorse@lemmy.world · 4 months ago

That would lead to most art being produced by people who are wealthy enough to afford to produce it for free, wouldn’t it?

What incentive would a working person have to work on becoming an artist? It’s not like artists are decided at birth or something.

afraid_of_zombies@lemmy.world · 4 months ago

True but you people have had hundreds of years to fix the system and have not.

Adderbox76@lemmy.ca · 4 months ago

They’re ‘taking inspiration’ if you will, transforming it into something completely different.

That is not at all what takes place with A.I.

An A.I. doesn’t “learn” like a human does. It aggregates multiple chunks from multiple sources. It’s just really really tiny chunks so it’s hard to tell sometimes.

That’s why you can ask two AIs to write a story based on the same prompt and some of their lines will be exactly the same. Because it’s not taking inspiration, it’s literally copying bits and pieces of other works, and it happens that they both chose that particular bit.

If you do that when writing a paper in university it’s called plagiarism.

Get the fuck out of here with your “A.I. takes inspiration…” it copies nothing more. It doesn’t add anything new to the sum total of the creative zeitgeist because it’s just remixes of things that already exist.

LainTrain@lemmy.dbzer0.com · edit-2 4 months ago

it copies nothing more

it’s just remixes of things that already exist.

So it does do more than copying? Because as you said - it remixes.

It sounds like the line you’re trying to draw is not only arbitrary, but you yourself can’t even stick with it for more than one sentence.

Everything new is some unique combination of things that already exist, the elements it draws from are called sources and influences, and rules according to which they’re remixed are called techniques/structures e.g. most movies are three acts, and many feature specific techniques like J-cuts.

Heck even re-arranging elements of just one thing is a unique and different thing, or is your favourite song and a remix of it literally the same? Or does the remix not have artistic value, even though someone out there probably likes the remix, but not the original?

I think your confusion stems from the fact you’re a top shelf, grade-A Moron.

You’re an organic, locally sourced and ethically produced idiot, and you need to learn how basic ML works, what “new” is, and glance at some basic epistemology and metaphysics before you lead us to ruin because you don’t even understand what “new” entails, before your reactionary rhetoric leads us all down straight to cyberpunk dystopias.

NιƙƙιDιɱҽʂ@lemmy.world · 4 months ago

Damn, attack the argument, not the person, homie.

LainTrain@lemmy.dbzer0.com · 4 months ago

Yeah, sorry

Richard@lemmy.world · 4 months ago

You just reiterate what other anti-ML extremists have said like a sad little parrot. No, LLMs don’t just copy. They network information and associations and can output entirely new combinations of them. To do this, they make use of neural networks, which are computational concepts analogous to the way your brain works. If, according to you, LLMs just copy, then that’s all that you do as well.

ObliviousEnlightenment@lemmy.world · edit-2 4 months ago

Consider youtube poop, I’m serious. Every clip in them is sourced from preexisting audio and video, and mixed or distorted in a comedic format. You could make an AI to make youtube poops using those same clips and other “poops” as training data. What it outputs might be of lower quality, but in a technical sense it would be made in an identical fashion. And, to the chagrin of Disney, Nintendo, and Viacom, these are considered legally distinct works, because I don’t watch Frying Nemo in place of Finding Nemo. So why would it be any different when an AI makes it?

afraid_of_zombies@lemmy.world · 4 months ago

You can do the same thing with the Hardy Boys. You can find the same page word for word in different books. You can also do that with the Bible. The authors were plagiarizing each other.

It doesn’t add anything new to the sum total of the creative zeitgeist because it’s just remixes of things that already exist.

Do yourself a favor and never ever go into design of infrastructure equipment or eat at a Pizza Hut or get a drink from Starbucks or work for an American car company or be connected to Boeing.

Everyone has this super impressive view of human creativity and I am waiting to see any of it. As far as I can tell, the less creative you are the more success you will have. But let me guess: you ride a Segway, wear those shoes with toes, have gone through every recipe of Julia Child’s, and compose novels that look like Finnegans Wake got into a car crash with EE Cummings and Gravity’s Rainbow.

Now leave me alone while I eat the same burger as everyone else and watch reruns of Family Guy in my house that looks like all the other ones on the street.

_sideffect@lemmy.world · 4 months ago

A bit late now, isn’t it?

All the big corporations have already trained most of their current ai, so all this does is put the up and comers at a disadvantage.

MagicShel@programming.dev · 4 months ago

It could halt the progress of improving their models and stagnate the whole technology.

That being said, it only halts progress for American companies. Other countries will happily ignore this law and grow beyond our capabilities. I’m not sure if that’s better or worse than the current situation.

bionicjoey@lemmy.ca · 4 months ago

Reminds me of Russia before WWI began. They realized they had fallen horribly behind the rest of the world in terms of military technology, so they called an arms limitation treaty conference where they pushed for basically every country in the world to agree to stop inventing any new weapons of any kind.

fuzzzerd@programming.dev · 4 months ago

How’d that work out for them? Answer? Not well. History repeats itself, so here we go!

Kuvwert@lemm.ee · 4 months ago

From what I understand the next rounds of ai are being trained on further refined versions of the same datasets and supplemented with synthetic data.

The damage to existing copyrighted content is already done.

Source: I’m a random internet user

General_Effort@lemmy.world · 4 months ago

It’s all still there. No damage was done.

Kuvwert@lemm.ee · 4 months ago

Well, perceived damage anyway. I can’t speak to how IP owners have been affected by LLMs, and I don’t believe it would be easy to quantify.

just another dev@lemmy.my-box.dev · 4 months ago

Seeing as laws can’t be applied retroactively, what would have been the alternative?

Grimy@lemmy.world · 4 months ago

deleted by creator

linearchaos@lemmy.world · 4 months ago

Ladies and gentlemen of the jury, before you stands 8-year-old Billy Smith. He stands accused of training on copyrighted material. We actually have live video of him looking at and reading books from the library. He trained on the contents of over 100 books this year.

We ask you to enforce the maximum penalty and send his parents to prison.

TheGrandNagus@lemmy.world · edit-2 4 months ago

I get what you’re saying, but there’s something of a difference between someone studying something for months or years then writing about it, and a language model run by one of the tech giants scraping media and immediately generating stuff from it, for commercial use, for the profit of the company that owns it.

It’s kinda like how plagiarising somebody’s book word for word never used to be a crime when it was a painstaking process of manually writing it back out for every copy. When the printing press came out, though? It allowed dodgy businesses to large-scale fuck over authors, and the law had to play catch-up.

I don’t actually think this proposal is that well thought out, but I also don’t think we should think of AI models or corporations as being people - they aren’t people, and they shouldn’t necessarily have the same rights and privileges that we do.

Evotech@lemmy.world · 4 months ago

There are a lot of private people training models (LoRAs, DoRAs, etc.) / fine-tuning checkpoints and what have you.

Training models is not just giant tech corps anymore

TheGrandNagus@lemmy.world · edit-2 4 months ago

I know, I have one running locally on my PC, it’s neat.

I still don’t think that changes my point, though - that a large AI model, particularly one that can scrape the whole web of any content it can find, then immediately be used to generate a practically infinite amount of content in seconds is very different to the idea of a little 8 year old in a library reading books then writing something himself.

And I still maintain that companies aren’t people and shouldn’t necessarily have the same rights as a person.

ObliviousEnlightenment@lemmy.world · 4 months ago

What of the images random people generate from software like DALL-E? Those are made from the same training data, and what this policy does to them is make media creation more inaccessible even though the technology exists. Also, copying a book word for word by hand isn’t/wasn’t plagiarism, it’s unlicensed duplication. Plagiarism would be changing just the proper nouns and pretending it’s a completely separate book.

assassin_aragorn@lemmy.world · 4 months ago

No matter how much you’d like for it to be the case, proprietary algorithms owned by big corporations are not remotely comparable to children.

0laura@lemmy.world · 4 months ago

proprietary algorithms owned by big corporations

tell that to civitai users lol

ZILtoid1991@lemmy.world · 4 months ago

Your machine learning algorithms are not people. No amount of calling it Alex or giving it a voice stolen from a well-known actress will change that fact.
If I traced an artwork or copied GPL licensed code into a non-GPL one, my ass would be beaten by others on the internet.
So far, the main use case of this generative technology is scamming, intentionally creating distrust in the artist community, and an even worse and scummier form of plagiarism, but it doesn’t matter because of some shitpost that goes hard, “what if a content creator needs a stock photo?”, and “what if it could be used to resurrect your favorite artist?”.
Power imbalance. There’s a difference between a young creator not having money to buy training material and a big corporation wanting to destroy their profession.

ArmokGoB@lemmy.dbzer0.com · 4 months ago

If I traced an artwork or copied GPL licensed code into a non-GPL one, my ass would be beaten by others on the internet.

If I gave you an arbitrary image from Midjourney and all of the training data from it, I doubt you could match it to the “source art.” AI images are usually transformative.

ObliviousEnlightenment@lemmy.world · 4 months ago

This, exactly. AI is generating new images. Oh whoop de do, they did it by mixing a bunch of pixels. As though making an image out of tiny photos isn’t literally the same thing and considered transformative. People just have a double standard about a program instead of a person doing it. (Except for that subset of online artists, they’re just berserk about copyright and credit in general.)

ZILtoid1991@lemmy.world · 4 months ago

Which part of “an even worse and scummier form of plagiarism” did you not understand?

ArmokGoB@lemmy.dbzer0.com · 4 months ago

What part of “transformative” did you not understand?

ZILtoid1991@lemmy.world · 4 months ago

Different scale, but just go on and defend your billion dollar industry, because “what if it was open source”, even though the open source community would never have the ability and the resources to train these models.

ClamDrinker@lemmy.world · 4 months ago

What are you talking about? The open source community has trained these kinds of models. They’re out there.

ArmokGoB@lemmy.dbzer0.com · 4 months ago

I honestly could not give less of a shit who’s training the models. I’m not gonna boycott C# because it was developed by Microsoft. There are open source implementations of generative AI that make use of freely-available models.

interdimensionalmeme@lemmy.ml · 4 months ago

Thanks chatgpt

ZILtoid1991@lemmy.world · 4 months ago

Pleased to take part in creating the scarcity free future by letting hustle bros ruin art communities, and letting terminally online people create endless followups to Metropolis Pt. II instead of sending death threats to Dream Theater!

Doomsider@lemmy.world · edit-2 4 months ago

If you put something on the Internet you are giving up ownership of it. This is reality and companies taking advantage of this for AI have already proven this is true.

You are not going to be able to put the cat back in the bag. The whole concept of ownership over art, ideas, and our very culture was always ridiculous.

It is past time to do away with the joke of the legal framework we call IP law. It is merely a tool for monied interests to extract more obscene profit from our culture at this point.

There is only one way forward and that is sweeping privacy protections. No more data collection, no more targeted advertising, no more dark patterns. The problem is corporations are not going to let that happen without a fight.

nasi_goreng@lemmy.zip · 4 months ago

deleted by creator

hedgehog@ttrpg.network · 4 months ago

There are plenty of internet culture outside Western that still respect ownership, people don’t just take random things on internet without permission. Western internet culture =/= entire internet.

Which cultures are you referring to?

afraid_of_zombies@lemmy.world · 4 months ago

Fakelandia. It is on a continent that you probably haven’t heard of.

afraid_of_zombies@lemmy.world · 4 months ago

Yeah in theory but in practice that isn’t happening. In theory the laws could be structured such that creatives are being paid fairly and distributors make some money and that the general public knows the stuff will be public domain in a relatively short period of time.

No one is doing it and they had hundreds of years to figure out how to do it. You are asking us to take it on faith and I personally will not.

LainTrain@lemmy.dbzer0.com · 4 months ago

Incredibly well-put. IP is just land for the wannabe landlords of information and culture.

They are just attempting to squeeze the working class dry, take the last freedoms we have so we have to use their corporate products.

cyd@lemmy.world · edit-2 4 months ago

If this passes, this would have the perverse effect of making China (and maybe to a lesser extent the Middle East) the leading suppliers of open source / open weight AI models…

Melt@lemm.ee · 4 months ago

China would be the world leader in making AI models trained on copyrighted content

catloaf@lemm.ee · 4 months ago

And as the vast majority of content is not licensed for AI model training, they would have an immensely larger dataset to train on.

Petter1@lemm.ee · 4 months ago

Well, there is also Europe ✌🏻

General_Effort@lemmy.world · 4 months ago

No. In the EU, the lobbyists have already won. Major countries, like Germany, have always had very conservative copyright laws. I believe it’s one reason why their cultures are losing so hard.

Surprisingly, Japan has adopted a very sensible law on AI training.

afraid_of_zombies@lemmy.world · 4 months ago

I am just sitting here with my eye twitching thinking of all the code I have had to deal with from German companies over the years.

riodoro1@lemmy.world · 4 months ago

So the rich have already scalped what they could. Now it can be made illegal

just another dev@lemmy.my-box.dev · 4 months ago

Because even when some of the water has gotten out, you still go plug the dam.

The best moment was earlier. The second best moment is now.

Grimy@lemmy.world · edit-2 4 months ago

This is more akin to diverting a public river into private land so the landowner can charge everyone what they were getting for free.

The river cannot be dammed and this bill doesn’t aim to even try.

A better solution would be to make all models copyleft, so even if corporations dip their cup in the water, whatever they produce has to be thrown back in.

trollbearpig@lemmy.world · edit-2 4 months ago

Maybe I’m missing something, but I don’t understand what you guys mean by “the river cannot be dammed”. The LLM models need to be retrained all the time to include new data and in general to get them to change their behavior in any way. Wouldn’t this bill apply to all these companies as soon as they retrain their models?

I mean, I get the point that old models would be exempt from the law since laws can’t be retroactive. But I don’t get how that’s such a big deal. These companies would be stuck with old models if they refuse to train new ones. And as much hype as there is around AI, current models are still shit for the most part.

Also, can you explain why you guys think this would stop open source models? I have always thought that the best solution to stop these fucking plagiarism machines was for the open source community to create an open source training set where people contribute their art/text/whatever. Does this law prevent this? Honestly to me this panic sounds like people without any artistic talent wanted to steal the work of artists and they are now mad they can’t do it.

Grimy@lemmy.world · edit-2 4 months ago

The game right now is about better training methods and curating current datasets, new data is not needed.

Obviously though, eventually they will want new data so their models aren’t stuck in the past but this won’t stop them from getting it. There isn’t a future where individuals negotiate with Google on how much they get paid, all that data is already owned by the platform it’s being posted on. Almost all websites slap on their own copyright or something similar, even for images. DeviantArt and even Cara, the platform that’s supposed to be artist friendly, do this. Anything uploaded to Google Maps gets a copyright on it if I’m not mistaken, Reddit as well. This data will be prohibitively expensive, so as to create a moat and strengthen soft monopolies.

Public datasets are great but aren’t enough in most cases. This is also the equivalent of saying “well they diverted the river, why don’t you build yourself a stream”. It’s also problematic since by its public nature, it means corporations can come over, dip their cup in the water and throw it into their river. It brings down their costs while making sure nothing can actually compete with them.

Also worth noting that there is no worthy public dataset for videos. 98% of the data is owned by YouTube or Hollywood.

trollbearpig@lemmy.world · 4 months ago

My man, I think you are mixing a lot of things. Let’s go by parts.

First, you are right that almost all websites get some copyright rights when you post on their platforms. At best, some license the content as Creative Commons or similar licenses. But that’s not new, that has been this way forever. If people are surprised that they are paying with their data at this point I don’t know what to say hahaha. The change with this law would be that no one, big tech companies or open source, gets to use this content for free to train new models right?

Which brings me back to my previous question, this law applies to old data too right? You say “new data is not needed” (which is not true for chat LLMs that want to include new data for example), but old data is still needed to use the new methods or to curate the datasets. And most of this old data was acquired by ignoring copyright laws. What I get from this law is that no one, including these companies, gets to keep using this “illegally” acquired data now right? I mean, I’m pretty sure this is the case since movie studios and similar are the ones pushing for this law, they will not go like “it’s ok you stole all our previous libraries, just don’t steal the new stuff” hahahaha.

I do get your point that the most likely end result is that movie studios, record labels, social media platforms, etc, will just start selling the rights to train on their data and the only companies who will be able to afford this are the big tech companies. But still, I think this is a net positive (weird times for me to be on the side of these awful companies hahaha).

First of all, it means no one, including big tech companies, gets to steal content that is not theirs or given to them willingly. I’m particularly interested in open source code, but the same applies to indie art and any other form of art outside of the big companies. When we say that we want to stop the plagiarism it’s not a joke. Tech companies are using LLMs to attack the open source community by stealing the code under the excuse of LLMs being transformative (bullshit of course). Any law that stops this is a positive to me.

And second of all, consider the 2 futures we have in front of us. Option one is we get laws like this, forcing AI to comply with copyright law. Which basically means we maintain the current status quo for intellectual property. Not great obviously, but the alternative is so much worse. Option two is we allow people to use LLMs to steal all the intellectual property they want, which puts an end to basically any market incentives to produce art by humans. Again, the current copyright system is awful. But why do you guys want a system where we as individuals have to keep complying with copyright but any company can bypass that with an LLM? Or how do you guys think this is going to pan out if we just don’t regulate AI?

Grimy@lemmy.world · 4 months ago

Google already paid 60 million to Reddit for their dataset (preemptively since I’m guessing they are lobbying for laws like this), I didn’t get a dime. Who do you think this helps here?

The change with this law would be that no one, big tech companies or open source, gets to use this content for free to train new models right?

My point is that this essentially ensures that ONLY big tech companies will get to use the content. Do you think they mind spending a few million if it gives them a monopoly? They actively want this.

If it’s between the platform I used getting paid for my content while I get nothing and then I have to pay Openai to use a tool built with my content or the platform and me getting nothing while I get free AI, I will choose the latter.

There are two scenarios and in both, AI massively brings up productivity and huge layoffs happen. The difference is in one scenario, the tools are priced low enough so it’s economical to replace 5 workers with them but high enough so those same workers can’t afford them and compete with the business that just fired them. A situation where no company can remain competitive without paying Openai or Google 50k a month is a dystopian nightmare.

Open source is the best way to make sure this doesn’t happen and while these laws are the smallest of speed bumps for big tech companies, it is a literal wall for FOSS.

The best solution would be to copyleft all models using public data, the second best would be to leave things as is. This isn’t a solution but regulatory capture.

trollbearpig@lemmy.world · 4 months ago

My man, I think you are delusional hahahaha. You are giving way too much credit to a technology that’s just a glorified autocomplete. But I guess I get your point, if you think that AI (and LLMs in particular hahahaha) is the way of the future and all that, then this is apocalyptic hahahahaha.

But you are delusional my man. The only practical use so far for these stupid LLMs is autocomplete which works great when it works. And bypassing copyright law by pretending it’s producing novel shit. But that’s a whole other discussion, time will show this is just another bubble like crypto hahahaha. For now, I hope they at least force everyone to stop plagiarising other people’s work with AI.

afraid_of_zombies@lemmy.world · 4 months ago

Yeah it is really messed up that Disney made untold tens of billions of dollars on public domain stories, effectively cut us off from our own culture, then extended the duration to indefinite. I wonder why near everyone was silent about this issue for multiple decades until it became cliche to pretend to care about furry porn creators.

Creatives have always been screwed, we are the first civilization to not only screw them but screw the general public. As shit as it was in the past you could just copy a freaken scroll.

Anyway you guys have fun defending some of the worst assholes in human history while acting like you care about people you weren’t even willing to give a buck a month to on patreon.

LordCrom@lemmy.world · 4 months ago

There’s absolutely no way to enforce this.

piecat@lemmy.world · edit-2 4 months ago

No? You don’t think the courts would approve of a fishing expedition for Fourth Amendment-violating access to your computer?

Treczoks@lemmy.world · 4 months ago

As if a law could prevent any of that. They simply demand “Pigs Must Fly”, and don’t waste a thought on how utterly unrealistic this is.

UnderpantsWeevil@lemmy.world · 4 months ago

As if a law could prevent any of that.

Generating legal liability goes a long way towards curbing how businesses behave, particularly when they can be picked on by rival mega-firms.

But because we’ve made class action lawsuits increasingly difficult, particularly after Comcast Corp. v. Behrend, the idea that individual claimants can effectively prosecute a case against an interstate or international entity is increasingly farcical. You’re either going to need big state agencies (the EU seems increasingly invested in cracking down on American tech companies for anti-competitive practices) or rivalrous business interests (MPAA/RIAA going after Big Tech backed AI firms) to leverage this kind of liability. It’s still going to be open season on everyone using DeviantArt or Pinterest or whatever.

catloaf@lemm.ee · 4 months ago

This sounds exactly like existing copyright law and DRM.

Grimy@lemmy.world · 4 months ago

It’s strengthening copyright laws by negating the transformative clause when dealing with AI

just another dev@lemmy.my-box.dev · 4 months ago

Hopefully the next step: force every platform that deals in user generated content to give users the choice to either allow that content to be exploited for a fraction of the profit, or to exclude their content from processing.

It’s amazing how many people don’t realize that they themselves also hold copyright over their content, and that laws like these protect them as well.

ObliviousEnlightenment@lemmy.world · 4 months ago

I posted this in a thread, but I’m gonna make it a parent comment for those who support this bill.

Consider YouTube poop, I’m serious. Every clip in them is sourced from preexisting audio and video, and mixed or distorted in a comedic format. You could make an AI to make YouTube poops using those same clips and other “poops” as training data. What it outputs might be of lower quality (less funny), but in a technical sense it would be made in an identical fashion. And, to the chagrin of Disney, Nintendo, and Viacom, these are considered legally distinct entities; because I don’t watch Frying Nemo in place of Finding Nemo. So why would it be any different when an AI makes it?

MeaanBeaan@lemmy.world · 4 months ago

I see this argument a lot as a defense for AI art and I see a couple major flaws in this line of thinking.

First, it’s treating the AI art as somehow the same as a derivative (or parody) work made by an actual person. These two things are not the same and should not be argued like they are.

AI art isn’t just derivative. It’s a Frankenstein’s Monster of a bunch of different pieces of art stitched together in a procedural way that doesn’t credit and in fact obfuscates the original works. This is problematic at best and flat out dishonest thievery at worst. Whereas a work made by a person that is derivative or parody has actual work and thought put into it by an actual person. And would typically at least credit the original works being riffed on. This involves actual creative thought and human touch. Even if it is derivative it’s unique in some way simply by virtue of being made by a person.

AI art cannot and will not ever be unique, at least not when used to just create a work wholesale. Because it’s not being creative. It’s calculating and nothing more. (at least if we’re talking about current technology. A possible future General AI could flout this argument. But that would get into an AI personhood conversation not really relevant to our current machine learning tech).

Secondly, no one is worried that some hypothetical shitty AI video is going to somehow usurp the work that it’s stealing from. What people are worried about is that AI art is going to be used in place of hiring actual artists for bigger projects. And the fact that this AI art exists solely because it’s scraped the internet of art from those same artists now losing their livelihoods makes the tech incredibly fucked up.

Now don’t get me wrong though. I do believe machine learning has its place in society. And we’ve already been using it for a long time to help with large tasks that would be incredibly difficult if not impossible for people to do on their own in a bunch of different industries. Things like medicine research in the pharmaceutical sector and fraud monitoring in the banking sector come to mind.

Also, there is an argument to be had that machine learning algorithms could be used as tools in creating art. I don’t really have a problem with those use cases. Things that come to mind are a bunch of different tools that exist in music production right now that in my opinion help in allowing artists to fulfill their vision. Watch some There I Ruined It videos on YouTube to see what I mean. Yeah that guy is using AI to make himself sound like other musicians. But that guy also had to be a really solid singer and impressionist in the first place for those songs to be any good at all.

Grimy@lemmy.world · edit-2 4 months ago

It’s a Frankenstein’s Monster of a bunch of different pieces of art stitched together in a procedural way that doesn’t credit and in fact obfuscates the original works

What you described is collage and is completely legal. How image generation works is much more complicated but in any case, both it and collage clearly fall under transformative use.

https://en.m.wikipedia.org/wiki/Transformative_use

Schadrach@lemmy.sdf.org · 4 months ago

This is problematic at best and flat out dishonest thievery at worst.

You could say that about literally all art - no artist can name and attribute every single influence that played even the smallest effect on the work created. Say I commissioned an image of an anime man in a french maid uniform in a 4 panel pop art style. In creating it at some level you are going to draw on every anime image you’ve seen, every picture of a french maid uniform, every 4 panel pop art image and create something that’s a synthesis of all those things. You can’t name and attribute every single example of all of those things you have ever seen, as well as anything else that might have influenced you.

Whereas a work made by a person that is derivative or parody has actual work and thought put into it by an actual person.

…and this is the crux of it - it’s not anything related to the actual content of the image, it’s simple protectionism for a class of worker. Basically creatives are seeing the possibility of some of their jobs being automated away and are freaking out because losing jobs to automation is something that’s only supposed to affect manufacturing workers.

Even if it is derivative it’s unique in some way simply by virtue of being made by a person.

Again, the argument is it’s nothing to do with the actual result, but with it being done by an actual human as opposed to a mere machine. A pixel for pixel identical image created by a human would be “art” by virtue of it being a human that put each pixel there?

MeaanBeaan@lemmy.world · 4 months ago

You could say that about literally all art

Except I couldn’t. Because a person being influenced by an artwork and then either intentionally or subconsciously reinterpreting that artwork into a new work of art is a fundamentally different thing from a power hungry machine learning algorithm digesting the near entirety of modern humanity’s art output to churn out an image manufactured to best satisfy some random person’s text prompt.

They’re just not the same thing at all.

The whole purpose of art is to be an outlet for expressing ourselves as human beings. It exists out of this need for expression; part of what makes a work worth appreciating is the human person(s) behind that said work and the effort and skill they put into making it.

…and this is the crux of it - it’s not anything related to the actual content of the image, it’s simple protectionism for a class of worker. Basically creatives are seeing the possibility of some of their jobs being automated away and are freaking out because losing jobs to automation is something that’s only supposed to affect manufacturing workers.

Yes it has nothing to do with the content of the image. I never claimed otherwise. In fact AI art sometimes being indistinguishable from human made art is part of the problem. But we’re not just talking about automating someone’s job. We’re talking about automating someone’s passion. Automating someone’s dream career. In an ideal world we’d automate all the shitty jobs and pay everyone to play guitar, paint a portrait, write a book, or direct a film. Art being made by AI won’t just take away jobs for creatives, it’ll sap away the drive we have as humans to create. And when we create less our existence will be filled with even more bleakness than it already is.

Again, the argument is it’s nothing to do with the actual result, but with it being done by an actual human as opposed to a mere machine. A pixel for pixel identical image created by a human would be “art” by virtue of it being a human that put each pixel there?

I’m not certain I understand what you’re asking. But if the human is the one making the decision on where to put the pixel then yeah that would be fine. But at no point am I arguing about whether or not AI art is “art”. That would just be a dumb semantic argument that’d go nowhere. I’m merely discussing why I believe AI art to be unethical. And the taking away work from creatives point is only one facet as to why I do.

JamesFire@lemmy.world · 4 months ago

The whole purpose of art is to be an outlet for expressing ourselves as human beings. It exists out of this need for expression; part of what makes a work worth appreciating is the human person(s) behind that said work and the effort and skill they put into making it.

This is completely and utterly your own opinion, not a fact. I know several people who can’t draw for shit, due to various reasons, but now AI allows them to create images they enjoy. One of them has aphantasia (They literally cannot imagine images).

This is basically trying to argue there’s only 1 correct way to make “art”, which is complete and utter bullshit. Imagine trying to say that a sculpture isn’t art because it was 3D printed instead of chiseled. It makes 0 sense for the method of making the art to impact whether or not it is art. “Expression” can take many forms. Why is this form invalid?

MeaanBeaan@lemmy.world · 4 months ago

This is completely and utterly your own opinion, not a fact. I know several people who can’t draw for shit, due to various reasons, but now AI allows them to create images they enjoy. One of them has aphantasia (They literally cannot imagine images).

Never claimed it wasn’t an opinion. And I fully acknowledge that tools can make creating art easier. Hell, I even support the use of machine learning tools when making art. When used as tools and not as a means of creating art wholesale they can enable creativity. But, I’m sorry, writing a text prompt for an AI to produce an image is not making art (for the person writing the prompt). It’s writing a prompt. In the same way that a project manager writing a brief for a contract artist to fulfill is also not creating the art. The AI is producing the art (and by extension the artists who created the works the AI was trained off of). Your friend with aphantasia is not.

This is basically trying to argue there’s only 1 correct way to make “art”, which is complete and utter bullshit. Imagine trying to say that a sculpture isn’t art because it was 3D printed instead of chiseled. It makes 0 sense for the method of making the art to impact whether or not it is art. “Expression” can take many forms. Why is this form invalid?

Again, I never said any form of art was invalid. Not even AI art. Nor do I think AI art isn’t art. AI art is perfectly capable of creating something worthwhile by means of its content. It’s basing its output on worthwhile works of art created by people after all. I’m merely arguing AI art is unethical. If you made a mural out of the blood of children you murdered it’d still be art. But it sure as shit wouldn’t be ethical.

Drewelite@lemmynsfw.com · edit-2 4 months ago

With all respect, your argument has a pretty obvious emotional valence. You don’t care if the result is 1:1, you care that it happened in a way that makes you uncomfortable. Art can be an outlet for self expression and no one is taking that away. What’s it to you if I enjoy asking an AI for art?

The fact of the matter is, capitalism has never been a good place for artists who want to follow their dream. If that’s something you want, then I’d suggest supporting the end of all work for money that automation provides. Then people can truly work on whatever they care about all day and not have to worry about feeding themselves.

Schadrach@lemmy.sdf.org · 4 months ago

Except I couldn’t. Because a person being influenced by an artwork and then either intentionally or subconsciously reinterpreting that artwork into a new work of art is a fundamentally different thing from a power hungry machine learning algorithm digesting the near entirety of modern humanity’s art output

The big differences there are whether it’s a person or a machine and just how much art one can digest as inspiration. Again, reference my example of a commission above - the main difference between a human and an AI making it is whether they look up a couple dozen examples of each element to get a general idea or 100 million examples of each element to mathematically generalize the idea, and the main reason the number of examples and power requirements need to be so different is that humans are extremely efficient pattern developing and matching machines, so efficient that sometimes the brain just fills in the pattern instead of bothering to fully process sensory inputs (which is why a lot of optical illusions work).

to churn out an image manufactured to best satisfy some random person’s text prompt.

At a level, “churning out an image to best satisfy some random person’s” description is essentially what happens when someone commissions a work or when producing things to spec as part of some project. They don’t generally say “just draw whatever you are inspired to” and hope they like the result. This is the thing that AI image generators are specifically good at, and is why I say it’s about protectionism for a class of workers who didn’t think their jobs could be automated away in whole or in part.

But we’re not just talking about automating someone’s job.

Except you are, you are just deeming that job “someone’s dream career” as though that changes whether or not it’s a job that is being automated in whole or part. Yes, it’s going to hurt the market for commissioned art works and the like. Again, upset because those jobs are supposed to be immune to automation and - whoopsie - they aren’t. Join the people in manufacturing, or the makers of buggy whips.

We’re talking about automating someone’s passion.

Literally no one is going to ban or forbid anyone from creating art because AI art exists.

hark@lemmy.world · 4 months ago

My best guess would be intent, which I think is an important component of fair use. The intent of youtube poop creators could be considered parody and while someone could use AI to create parody, the intent of creating the AI model itself is not parody (at least not for these massive AI models that most people use).

ObliviousEnlightenment@lemmy.world · 4 months ago

Transformation is in itself fair use is the thing. YTP doesn’t need to be parody or critique or anything else, because it’s fundamentally no longer the same product as whatever the source was as a direct result of editing

hark@lemmy.world · 4 months ago

Still, the AI model itself is not transformative, it is merely incorporating that data into its training set.

ObliviousEnlightenment@lemmy.world · 4 months ago

But what it outputs IS transformative, which, of course, is the primary use

hark@lemmy.world · 4 months ago

If I include an image of Mickey Mouse (ripped straight from Disney) in my application in a proprietary compression format, then the application decompresses that image and changes the hue (or whatever other kind of modification), then these are technically “transformations” but they’re not transformative.

ObliviousEnlightenment@lemmy.world · 4 months ago

The law being violated there is trademark, not copyright

hark@lemmy.world · 4 months ago

No it isn’t. The image of Mickey Mouse was literally copied (hence copyright, literally right to copy). Regardless, that’s still IP law being violated so I don’t know how that helps your case.

NeoNachtwaechter@lemmy.world · 4 months ago

LOL

So I take your photo, remove your watermark, put my own watermark on it, and then I sue you for removing my watermark.

General_Effort@lemmy.world · 4 months ago

Don’t be a fool. Of course, content corporations like Disney or the NYT are able to prove just when something was made.

NeoNachtwaechter@lemmy.world · edit-2 4 months ago

Don’t be a fool either.

Of course I am going to do this to you, not to Disney etc. because I am way better at creating proof than you are.

And of course Disney etc. are going to do this to you and me, because they are even better at creating proof than you and me are.

That’s how foolish this law is.

Womble@lemmy.world · 4 months ago

So what you’re saying is that this is a law designed to extend corporate control over information and culture even further?

General_Effort@lemmy.world · 4 months ago

This bill reads like it was written by Adobe.

This provenance labelling scheme already exists. Adobe was a major force behind it. (see here: https://en.wikipedia.org/wiki/Content_Authenticity_Initiative ). This bill would make it so that further development will be tax-funded through organizations like DARPA.

Of course, they are also against fair use. They pay license fees for AI training. For them, it means more cash flow.

explodicle@sh.itjust.works · 4 months ago

It’s pretty cheap to just time stamp everything.

afraid_of_zombies@lemmy.world · 4 months ago

Anyone supporting this better be against right to repair and in favor of jail time for anyone discussing a sporting event without written permission