@ClamDrinker

ClamDrinker@lemmy.world · 2 days ago

Definitely not just you, I instantly said “holy shit this is one ancient meme”. I remember it too from the early 2000s.

ClamDrinker@lemmy.world · 3 days ago

If you think that’s depressing, wait until you find out that it’s basically nothing in the grand scheme of things.


Most sources agree that we use about 4 trillion cubic meters of water every year worldwide (although this stat is most likely from 2015, so it will be bigger now). In 2022, using the stats here, Microsoft used 1.7 billion gallons per year and Google 5.56 billion gallons per year. Combined, that’s 7.26 billion gallons, or only about 27.48 million cubic meters. That’s only about 0.00069% of worldwide water usage. Meanwhile, agriculture accounts for about 70% of a country’s daily fresh water use on average.

Even if we just look at the US, since that’s where Google and Microsoft are based, the country uses 322 billion gallons of water every day, or about 445 billion cubic meters per year. Against that, the two companies still account for only about 0.0062%. So you could have roughly 160 more Googles and Microsofts before you even top a single percent.
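
As a rough check of the arithmetic above, here is a minimal Python sketch that converts the quoted gallon figures to cubic meters and computes both shares. It assumes US liquid gallons and a 365-day year, and the variable names are made up for illustration. Small differences from hand-rounded figures are to be expected; the conclusion doesn’t depend on them.

```python
# Back-of-the-envelope check of the water figures quoted in the comment.
# Assumptions: US liquid gallons, a 365-day year, and the 4 trillion m^3
# worldwide estimate taken at face value.

GALLON_M3 = 0.003785411784  # one US liquid gallon in cubic meters


def gallons_to_m3(gallons):
    """Convert US liquid gallons to cubic meters."""
    return gallons * GALLON_M3


microsoft_gal = 1.7e9     # gallons per year (2022, as quoted)
google_gal = 5.56e9       # gallons per year (2022, as quoted)
world_m3 = 4e12           # cubic meters per year, worldwide
us_gal_per_day = 322e9    # gallons per day, United States

combined_m3 = gallons_to_m3(microsoft_gal + google_gal)
us_m3_per_year = gallons_to_m3(us_gal_per_day * 365)

world_share_pct = combined_m3 / world_m3 * 100
us_share_pct = combined_m3 / us_m3_per_year * 100

# How many "Google + Microsoft" units it takes to reach 1% of US usage.
units_for_one_pct_us = 1 / us_share_pct

print(f"combined: {combined_m3 / 1e6:.2f} million m^3")
print(f"share of world usage: {world_share_pct:.5f}%")
print(f"share of US usage: {us_share_pct:.5f}%")
print(f"units needed for 1% of US usage: {units_for_one_pct_us:.0f}")
```

Swapping in newer usage figures only requires changing the four input values at the top.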


And as others have pointed out, the water isn’t gone; there’s some cyclicality in how it is used.

ClamDrinker@lemmy.world · 9 days ago

Yeah, and honestly, this is largely a reasonable standard for anyone running an email server. If you don’t have SPF, DKIM and DMARC, basically anyone can spoof your emails and you’d be none the wiser. It also makes spam much harder to send without, well, sacrificing IP addresses to the many spam lists. I wouldn’t be surprised if some people setting up their own mail server were made aware of these things because they got blocked.
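
For anyone wondering what these records look like in practice, here is a minimal Python sketch that inspects example SPF and DMARC TXT record strings. The domain `example.com`, the IP address, the report mailbox and the helper names are all hypothetical, and the sketch only parses strings you hand it: it does no DNS lookups and skips DKIM entirely, since that involves cryptographic signing of each message.

```python
# Illustrative only: the records below are made-up examples for a
# hypothetical domain, in the general shape SPF and DMARC TXT records take.

SPF_RECORD = "v=spf1 mx ip4:192.0.2.10 -all"
DMARC_RECORD = "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"


def spf_is_strict(record):
    """True if this is an SPF record that hard-fails all unlisted senders."""
    terms = record.split()
    return bool(terms) and terms[0].lower() == "v=spf1" and terms[-1] == "-all"


def parse_tag_list(record):
    """Split a 'tag=value; tag=value' record (DMARC style) into a dict."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_policy(record):
    """Return the requested DMARC policy (none, quarantine or reject), if any."""
    tags = parse_tag_list(record)
    if tags.get("v") != "DMARC1":
        return None
    return tags.get("p")


print(spf_is_strict(SPF_RECORD))
print(dmarc_policy(DMARC_RECORD))
```

A receiving server that sees a strict SPF record and a `reject` policy has clear instructions to drop mail that fails the checks, which is what makes spoofing the domain hard.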

ClamDrinker@lemmy.world · edit-2 12 days ago

There is so much wrong with this…

AI is a range of technologies. So yes, you can build surveillance with it, just like you can with a computer program like a virus. But obviously not all computer programs are viruses, nor do they all exist for surveillance. What a weird generalization. AI is used extensively in medical research, so your life might literally be saved by it one day.

You’re most likely talking about “Chat Control”, which is a controversial EU proposal to scan, either on people’s devices or on the providers’ end, for dangerous and illegal content like CSAM. This is obviously a dystopian way to achieve that, as it sacrifices literally everyone’s privacy to do it, and there is plenty to be said about that without randomly dragging AI into it. You can do this scanning without AI as well, and it doesn’t change anything about how dystopian it would be.

You should be using end-to-end encryption regardless, and a VPN is a good investment for making your traffic harder to discern, but if Chat Control is passed to operate on the device level you are kind of boned without circumventing this software, which would potentially be outlawed or made very difficult. It’s clear on its own that Chat Control is a bad thing; you don’t need some kind of conspiracy theory about ‘the true purpose of AI’ to see that.

ClamDrinker@lemmy.world · edit-2 14 days ago

Yes, but most people don’t have that, or it takes far longer than a simple meme is worth in effort and (lack of) enjoyment. There already exist models to unblur entire images in seconds. AI should take the shitty work lol.

ClamDrinker@lemmy.world · edit-2 17 days ago

You don’t solve a dystopia by adding more dystopian elements. Yes, some companies are scum, and they should rightfully be targeted and taken down. But the way you do that is by targeting those scummy companies specifically, and creatives aren’t the only industry suffering from them. There is broad-spectrum legislation to do so, such as income-based equality (proportional taxing and fining) or further regulation. But you don’t do that by changing fundamental rights that every artist so far has enjoyed to learn their craft, and that also made society what it is today. Your idea would KILL any scientific progress, because all of it depends on both for-profit businesses (not per se the scummy ones) and the freedom to analyze works without a license (something you seem to want to get rid of), and the vast majority of that analysis is computer driven. You are arguing in favor of taking a shot to the foot if it means “owning the ~~libs~~ big companies” when there are clearly better solutions, and guess what, we already have pretty bad luck getting those things passed as is.

And you think most artists and creatives don’t see this? Most of us are honest about how we got to where we are, because we learned how to create and grow our skill set this same way: by consuming (and so, analyzing) a lot of media, and looking a whole lot at other people making things. There’s a reason “good artists copy, great artists steal” is such a well-known line. I’d argue against it, because I feel it frames even something like taking inspiration as theft, but it’s the same argument people are making in reverse for AI.

But this whole conversation shouldn’t be about the big companies, but about the small ones. If you’re not in the industry, you might just not know that AI is everywhere in small companies too. And they’re not relying on the big companies if they can help it. There’s open source AI that’s free to download and use, and that holds true to the idea of open information everyone can benefit from. Pretending it doesn’t exist and proposing an unreasonable ban on the means shuts out those without the capital and ability to build their own (licensed) datasets in the future, while those with the means have no problem and can even leverage their own licenses far more efficiently than any small company or individual could. And if AI does get too good to ignore, there will be the artists who learned how to use AI, forced to work for corporations, and the ones who didn’t and can’t compete. So far it’s only been optional, since using AI well is actually quite hard, and only dumb CEOs would put any trust in it replacing a human. It will speed up your workflow and make certain tasks faster, but it doesn’t replace large parts of it unless you’re really just making the most generic stuff ever for a living, like marketing material.

Never heard of Cara. I don’t doubt it exists somewhere, but I’m wholly uninterested in it or in putting any work I make there. I will fight tooth and nail for what I made to be mine and for my right to profit off it, but I’m not going to argue for or promote taking away from others the freedom that allowed me to become who I am, or the freedom of people to make art in any way they like. Freedom of expression is sacred to me. I will support other, more broadly appealing and far more likely to succeed alternatives that will put these companies in their place, and anything sensible that doesn’t also cause casualties elsewhere. But I’m not going to be in favor of being the “freedom of expression police” against my colleagues, friends, or anyone for that matter, on what tools they can or cannot use to funnel their creativity into. This is a downright insidious mentality in my eyes, and so far most people I’ve had a good talk about AI with have shared that distaste, while agreeing that it is being abused by big companies.

Again, they can use whatever they want, but Nightshade (And Glaze) are not proven to be effective, in case you didn’t know. They rely on misunderstandings, and hypothetically only work under extremely favorable situations, and assume the people collecting the dataset are really, really dumb. That’s why I call it snake oil. It’s not just me saying exactly this.

ClamDrinker@lemmy.world · 20 days ago

If you think I’m being optimistic about UBI, I can only question how optimistic you are about your own position receiving widespread support. So far not even most artists stand behind anti-AI standpoints, just a very vocal minority and their supporters, who even threaten and bully other artists that don’t support their views.

It’s not about “analysis” but about for-profit use. Public domain still falls under Fair Use.

I really don’t know what you’re trying to say here. Public domain is free of any copyright, so you don’t need a fair use exemption to use it at all. And for-profit use is not a factor in whether analysis is allowed or not. If it were, again, it would stifle society’s ability to invent and advance, since most analysis is done for profit. But even if it weren’t, one company can produce the dataset or the model as a non-profit, and another company could then use that for profit. It doesn’t hold up.

As it stands, artists are already forming their own walled off communities to isolate their work from being publicly available

If you want to avoid being trained on by AI, that’s a pretty good way to do it, yes. It can also be combined with payment. So if that helps artists, I’m all for it. But I have yet to hear any of that from the artists I know, nor have I seen a single practical example of it that wasn’t already explicitly private (e.g. commissions or a Patreon). Most artists make their work to be seen, and that has always meant accepting that someone might take your work and be inspired by it. My ideas have been stolen blatantly, and I cannot do a thing about it. That is the compromise we make between creative freedom and ownership, since the alternative would be disastrous. Even if people pay for access, once they’ve done so they can still analyze and learn from it. But yes, if you don’t want your ideas to be copied, never sharing them is a sure way to do that, though it is antithetical to why most people make art to begin with.

creating software to poison LLMs.

These tools are horribly ineffective though. They waste artists’ time and/or degrade the artwork to the point humans don’t enjoy it either. It’s an artist’s right to use them, but it’s essentially snake oil that plays on these artists’ fears of AI. But that’s a whole other discussion.

So either art becomes largely inaccessible to the public, or some form of horrible copyright action is taken because those are the only options available to artists.

I really think you are being unrealistic and hyperbolic here. Neither of these has happened, nor has much of a chance of happening. There are billions of people producing works that could be considered art, and with making art comes the desire to share it. Sure, there might only be millions that make great art, but if they mobilized together that would be world news, given that a workers’ strike in Hollywood can do that with a significantly smaller number of artists.

Ultimately, I’d like a licensing system put in place. Academics have to cite their sources for research. That way, if they’ve used stuff that they legally shouldn’t, it can be proven.

The reason we have sources in research is not for licensing purposes. It is to support legitimacy and to build upon the work of others. I wouldn’t be against sourcing, but it is a moot point because companies that make AI models don’t typically throw their dataset out there. So these datasets might very well be sourced. One well-known public dataset, LAION-5B, does include source URLs. But again, because analysis can be performed freely, this is not a requirement.

Creating a requirement to license data for analysis is what you are arguing for here. I can already hear every large corporation salivating in the back at the idea of that. Every creator in existence would have to pay a license fee to some big company because they interacted with its works at some point in their life and something they made looked somewhat similar. And copyright is already far more of a tool for big corporations than for small creators. This is a dystopian future to desire.

ClamDrinker@lemmy.world · edit-2 20 days ago

I think you are making the mistake of assuming disagreement with your stance means someone would say no to these questions. Simply put - it’s a strawman.

Most (yes, even corporations, albeit much less so for the larger ones) would say “Yes” to this question at face value, because they would want the same for their own “sweat of the brow”. But certain uses after the work is created no longer have a definitive “Yes” as their answer, which is why your ‘simple question’ is not an accurate representation, as it makes no distinction between them. You cannot stop your publicly posted work from being analyzed, by human or computer. This is firmly established. As others have put it in this thread, reducing protections for analysis will be detrimental to artists as well as everyone else. It would quite literally cause society’s ability to advance to slow down, if not halt completely, as most research requires analysis of existing data, and most of that is computer assisted.

Artists have always been undervalued, I will give you that. But to mitigate that, we should provide artists with better protections that don’t rely on breaking down other freedoms. For example, UBI. And I wish people that are against AI would focus on that, since that is actually something you could get agreement on with most of society, and it would actually help artists. Fighting against technology that, besides its negatives, also provides great positives is a losing battle.

ClamDrinker@lemmy.world · edit-2 20 days ago

You’re confusing LLMs with other AI models; LLMs are orders of magnitude more energy demanding than other AI. It’s easy to see why if you’ve ever looked at self-hosting AI: you need a cluster of top-of-the-line business GPUs to run modern LLMs, while an image generator can be run at home on most consumer Nvidia 3000- or 4000-series GPUs. Generating images is about as costly as playing a modern video game, and only while it’s generating.

ClamDrinker@lemmy.world · edit-2 1 month ago

I appreciate the back up, but with the rare sighting of an organism that is able to consume downvotes and irrationality for a living, I don’t think it’s worth your (or anyone’s) time trying to convince them to change to an ‘unnatural’ diet ^^

ClamDrinker@lemmy.world · 1 month ago

Shame you’re not willing to see the unreasonableness in yourself. Now you have to go to the shop to buy some crickets to avoid being seen as a hypocrite.

ClamDrinker@lemmy.world · edit-2 1 month ago

You can eat shit if you want, it won’t kill you. There are tribes in Greenland that eat bird poop as a delicacy. So it must be natural! Or how about snails, grasshoppers, worms, crickets? All edible, even good. All things an omnivore can and does consume at times. You should stop being so unnatural and cutting all of these things out of your diet.

Oh, you won’t? Guess that makes you a religious believer now. C’mon man, you must be trying to be dense on purpose.

ClamDrinker@lemmy.world · edit-2 1 month ago

I’m not a vegan, but we are omnivores, we can eat plants. There is nothing unnatural about it. Let alone if you compare it to our modern ‘normal’ food, which is chock full of extra sugar, extra fat, extra protein, and extra artificial additives like preservatives, sweeteners, and whatnot. It’s also factual that you can get more energy out of directly consuming plant material than out of eating an animal that consumed said plant material. Take the biggest offender for that, cows: you need 8 kg of feed for them to produce a kg of meat. This is known as the feed conversion ratio (source). Other animals (like chicken and fish) are better, but a ratio below 1 is essentially impossible.
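
To put a number on that, here is a minimal Python sketch of the feed conversion ratio using the 8 kg figure from above. It works by mass only (calories and protein would need their own figures), and the function names are made up for illustration.

```python
# Feed conversion ratio (FCR): kg of feed needed per kg of meat produced.
# Only the 8:1 cattle figure comes from the comment above.


def feed_conversion_ratio(feed_kg, meat_kg):
    """Kilograms of feed consumed per kilogram of meat produced."""
    return feed_kg / meat_kg


def share_retained(fcr):
    """Fraction of the feed mass that ends up as meat."""
    return 1 / fcr


cattle_fcr = feed_conversion_ratio(feed_kg=8.0, meat_kg=1.0)

print(f"cattle FCR: {cattle_fcr:.1f}")
print(f"feed mass that ends up as meat: {share_retained(cattle_fcr):.1%}")
```

A ratio of exactly 1 would mean every kilogram of feed turns into a kilogram of meat, which is why lower is better and why going below 1 is out of reach in practice.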

I like the taste of meat as much as the next (average) person, but vegans do have a factual basis for their stance. Non-vegans’ rebuttal to that is realistically just “I don’t want to give up meat because I like it”, not “the facts aren’t on your side.” Let’s be honest about that.

ClamDrinker@lemmy.world · 1 month ago

Ironically voting for parties like AfD is probably the most illusioned vote you could make. The other parties are full of shit too but there it’s the byproduct not the main feature.

ClamDrinker@lemmy.world · edit-2 2 months ago

I feel you in avoiding public transit. That’s where my hate comes from as well. And yes, many people that do these things have excuses. Because they need to, to justify doing their business in a place where their habit unavoidably harms and frustrates other people. I hate the fact we still allow that so readily as a society. At the least we should restrict it further, to the point a normal person doesn’t have to be bothered by people like that in public. It undermines public services to an extent.

But after I no longer needed to use public transit, I did start to see things in a slightly different light. And that’s the only thing I wanted to say. People that are conscientious about enjoying any kind of mind altering substances will choose to do so safely and harmlessly outside of public, or in designated places like clubs specifically for that substance. Harm reduction must be central to substance use. And I know now that many people have that mentality. But that mentality is somewhat threatened exactly because they make sure nobody is bothered by them. It causes the experience to be defined by those people in public places, the loud minority.

ClamDrinker@lemmy.world · 2 months ago

I hate public smokers with a passion. But you must realize that you have effectively zero exposure to people that contain their smoke by doing it at home or using a method without smoke production. And there could be a lot more of those.

The last line is especially golden for me since I live in the Netherlands, so we have plenty of weed being smoked, but the vast, vast majority of public smoke hindrance is from tobacco smokers. If they decide to smoke in public they have absolutely no shame and will literally do it at places like bus stops and just outside restaurants. Weed smokers rarely do that here. So if I were to believe you, it seems to just be correlated with people with shitty attitudes rather than with the substance.

But there’s no denying that if everyone would drop alcohol for weed, it would be better. Not because weed is harmless but because alcohol is pretty terrible health wise.

ClamDrinker@lemmy.world · 2 months ago

I never anthropomorphized the technology; unfortunately, due to how language works, it’s easy to misinterpret it as such. I was indeed trying to explain overfitting. You are forgetting the fact that current AI technology (artificial neural networks) is based on biological neural networks. It exhibits a range of quirks that biological neural networks do as well. But it is not human, nor anything close. That does not mean that there are no similarities that can be rightfully pointed out.

Overfitting isn’t just what you describe though. It also occurs if the prompt guides the AI towards a very specific part of its training data, to the point where the calculations it performs are extremely certain about what words come next. Overfitting here isn’t caused by an abundance of data, but rather a lack of it. The training data isn’t being produced from within the model, but as a statistical inevitability of the mathematical version of your prompt. Which is why it’s tricking the AI, because an AI doesn’t understand copyright; it just performs the calculations. But you do. And so using that as an example is like saying “Ha, stupid gun. I pulled the trigger and you shot this man in front of me, don’t you know murder is illegal buddy?”

Nobody should be expecting a machine to use itself ethically. Ethics is a human thing.

People that use AI have an ethical obligation to avoid overfitting. People that produce AI also have an ethical obligation to reduce overfitting. But a prompt quite literally has infinite combinations (within the token limits) to consider, so overfitting will happen in fringe situations. That’s not because that data is actually present in the model, but because the combination of the prompt with the model pushes the calculation towards a very specific prediction, which can heavily resemble or be verbatim the original text. (Note: I do really dislike companies that try to hide the existence of overfitting from users though, and you can rightfully criticize them for claiming it doesn’t exist.)
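
To make the “very specific prediction” part concrete, here is a toy Python sketch. It is a simple word-pair counter, not a neural network and nothing like how a real LLM is built, and the corpus is made up. It only illustrates the statistical point: when the prompt lands on a word that appeared in a single context, the next-word prediction becomes completely certain, while a common word leaves plenty of options.

```python
# Toy next-word predictor built from word-pair counts.
# An illustration of how sparse data yields near-certain predictions,
# not a model of how LLMs actually work.
from collections import Counter, defaultdict

CORPUS = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
    "quoth the raven nevermore",
]


def train_bigram(sentences):
    """Count which word follows which across all sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def next_word_probs(counts, word):
    """Probability of each possible next word, given the current word."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}


model = train_bigram(CORPUS)

# Common word: many contexts, so no single continuation dominates.
print(next_word_probs(model, "the"))

# Rare word: seen in one context only, so the prediction is certain.
print(next_word_probs(model, "raven"))
```

The rare word is the fringe case: there is too little data around it for the prediction to be anything other than the one continuation that was seen.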

This isn’t akin to anything human, people can’t repeat pages of text verbatim like this and no toddler can be tricked into repeating a random page from a random book as you say.

This is incorrect. A toddler can and will verbatim repeat nursery rhymes that it hears. It’s literally one of their defining features, to the dismay of parents and grandparents around the world. I can also whistle pretty much my entire music collection exactly as it was produced because I’ve listened to each song hundreds if not thousands of times. And I’m quite certain you too have a situation like that. An AI’s mind does not decay or degrade (nor does it change for the better like a human’s), and the data encoded in it is far greater, so it will present more of these situations in its fringes.

but it isn’t crafting its own sentences, it’s using everyone else’s.

How do you think toddlers learn to make their first own sentences? It’s why parents spend so much time saying “Papa” or “Mama” to their toddler. Exactly because they want them to copy them verbatim. Eventually the corpus of their knowledge grows big enough that they start to experiment and eventually develop their own style of talking. But it’s still heavily based on the information they take in. It’s why we have dialects and languages. Take a look at what happens when children don’t learn from others: https://en.wikipedia.org/wiki/Feral_child So yes, the AI is using its training data, nobody’s arguing it doesn’t. But it’s trivial to see how it’s crafting its own sentences from that data in the vast majority of situations. It’s also why you can ask it to talk like a pirate, and then it will suddenly know how to mix the essence of talking like a pirate into its responses. Or how it can remember names and mix those into sentences.

Therefore it is factually wrong to state that it doesn’t keep the training data in a usable format

If your argument is that it can produce something that happens to align with its training data given the right prompt, well yeah, that’s not incorrect. But it is heavily misguided, and borders on bad faith, to suggest that this tiny minority of cases where overfitting occurs is indicative of the rest. LLMs are prediction machines, so if you know how to guide one towards what you want it to predict, and that is in the training data, it’s most likely going to predict that. Under normal circumstances, where the prompt you give it is neutral and unique, you will basically never encounter overfitting. You really have to try for most AI models.

But then again, you might be arguing this based on a specific AI model that is very prone to overfitting, while I am arguing this about the technology as a whole.

This isn’t originality, creativity or anything that it is marketed as. It is storing, encoding and copying information to reproduce in a slightly different format.

It is originality, as these AI can easily produce material never seen before in the vast, vast majority of situations. Which is also what we often refer to as creativity, because it has to be able to mix information and still retain legibility. Humans also constantly reuse phrases, ideas, visions, ideals of other people. It is intellectually dishonest to not look at these similarities in human psychology and then treat AI as having to be perfect all the time, never once saying the same thing as someone else. To convey certain information, there are only finite ways to do so within the English language.

ClamDrinker@lemmy.world · 2 months ago

This is an issue for the AI user though. And I do agree that it needs to be more present in people’s minds. But I think time will change that. Perhaps when the photo camera came out there were some schmucks that took pictures of people’s artworks and claimed them as their own, because the novelty of the technology allowed that for a bit, but eventually those people were properly differentiated from people using it properly.

ClamDrinker@lemmy.world · edit-2 2 months ago

Like if I download a textbook to read for a class instead of buying it - I could be proscecuted for stealing

Ehh, no, almost certainly not (but it does depend on your local laws). That honestly just sounds like some corporate bogeyman to keep you from pirating their books. The person hosting the download, if they did not have the rights to distribute it freely, could possibly be prosecuted though.

To illustrate, there’s this story of John Cena, who sold a special Ford after signing a contract with Ford that explicitly forbade him from doing that. However, the person who bought the car was never prosecuted or sued, because they received the car from Cena with no strings attached. They couldn’t be held responsible for Cena’s breach of contract, but Cena was held personally responsible by Ford.

For physical goods there is ‘theft by proxy’ though (receiving goods that you know are most likely stolen), but that quite certainly doesn’t apply to digital, copyable goods. After all, to even access any kind of information on the internet, you have to download it and thus copy it.

ClamDrinker@lemmy.world · edit-2 2 months ago

That would be true if they used material that was paywalled. But the vast majority of the training information used is publicly available. There are plenty of freely available books and information that you only need an internet connection to access and learn from.