I see talk here and there about how any company or individual can easily use anything we post on Lemmy however they want. This could include AI training, behavior analysis, or user profiling. With the recent news of Reddit data being sold and licensed for AI training, I thought this would be a great time to preemptively discuss how we feel about this topic and brainstorm ways to discourage unwanted use of the content we post.
I’ve seen some users add a license to the end of each of their comments. One idea might be this: add a feature to Lemmy where each user can choose a content license that applies to everything they post. For example, one user might choose to reserve no rights for their content (like CC0) because they don’t care how their data is used. Another user might not want companies profiting off their posts, so they’d choose a more restrictive license.
I’m eager to hear everyone’s thoughts on the whole topic, so to kick things off:
- Do you care how your public data and posted content are used? Why or why not?
- What do you think of choosing a content license for your Lemmy account? Does this contradict the FOSS model?
- Should Lemmy have features to protect user data/content in this way, or should that be left up to the user to figure out on their own?
Data is becoming an increasingly valuable commodity in the digital world. Hopefully these big-picture conversations can help us see what we value as a community and be more prepared for the future.
I think there are two big things:
- No one can sell the data when it’s already freely available for everyone.
- Only the data that needs to be collected is collected. Only the data that needs to be public is public.
Pushing for these points to remain true should help a lot.
On the first point (pretty sure I also posted about possible licenses at one point), the problem is that a license does nothing to stop anyone who doesn’t care about it. Until we can collectively organize to defend our licenses legally, the next best thing might be to remove the profit incentive entirely.
On the second point, this does away with a LOT of other metadata (e.g. location, device orientation, contacts, etc.) that then won’t be available for abuse. Reddit killed off third-party apps because it wanted people to use the official app. Part of that was for ads, but part of that was ALSO to collect more data. If we build the platforms and the clients in a way that the data isn’t collected, then we’ll be better off.
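To make the "don't collect it in the first place" idea a bit more concrete, here's a rough sketch of how a client could allowlist what it sends. This is not Lemmy's actual API or code; `ALLOWED_FIELDS`, `minimize`, and the field names are all made up for illustration.

```python
# Hypothetical allowlist: the only fields this imaginary client ever sends.
# These names are illustrative and not taken from Lemmy's real API.
ALLOWED_FIELDS = frozenset({"community", "title", "body"})


def minimize(payload: dict) -> dict:
    """Return a copy of payload that keeps only allowlisted fields."""
    return {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}


# A draft post plus metadata the device could provide but the server never needs.
draft = {
    "community": "privacy",
    "title": "Content licenses",
    "body": "What do you think?",
    "location": "52.52,13.40",
    "device_orientation": "portrait",
    "contacts": ["alice", "bob"],
}

# Only community, title, and body survive; everything else is dropped client-side.
outgoing = minimize(draft)
```

The point of an allowlist over a blocklist is that any new kind of metadata is dropped by default unless someone deliberately adds it.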