Hayao Miyazaki is the co-founder of Studio Ghibli, a Japanese animation studio known worldwide for its stunning, emotional, beautiful stories and movies. At the core of Studio Ghibli’s work is a deep engagement with questions of humanity: what it means to be human, how to care for one another and the world […]
Using existing data on recordings and books we obtain a point estimate of around 15 years for optimal copyright term with a 99% confidence interval extending up to 38 years
In our current society, little people can get away with it. I can take whatever style I want and train a model on it. There are already many Ghibli resources in the open source scene, and a lot of them date from 2 years ago.
This whole situation is rage bait to manipulate the population into cheering for new copyright laws so politicians get little pushback when they start writing pro-corporate laws regarding AI.
Did you buy the Ghibli movies you trained on or did you pirate them? Because OpenAI has argued that they are allowed to pirate and no one else.
Mostly YouTube, Reddit and image search. I guess I could just record a Netflix stream if I needed the whole movie. I guess recording a Netflix stream is pirating? Probably easier with a torrent.
What does it matter? I don’t think pirating is unethical, especially when it’s not even redistribution but transformative. OpenAI has never stopped me from pirating or even asked me to stop. Not sure what you mean by “no one else”.
You ever ask yourself if the memes made from movie scenes used pirated media?
Yes, recording a Netflix stream is pirating. That you got away with it doesn’t mean you couldn’t be sued for tens of thousands of dollars if someone found out.
You don’t think it’s unethical, but it is illegal in the US and people have been sued for thousands of dollars. This is still going on today: https://arstechnica.com/tech-policy/2025/02/isp-sued-by-record-labels-agrees-to-identify-100-users-accused-of-piracy/
OpenAI has said they need to violate copyright. But they didn’t say that the law should be changed. They want an exemption for themselves.
I’m mostly talking about being able to train on copyrighted content. This is on me though, I got mixed up. That’s what I meant in my first comment.
If you think someone can train a model on legally obtained data (Google Images, YouTube, Internet Archive), then that is fair.
Personally, I think using pirated or at least bought content that is ripped (Netflix, DVDs) should be exempt (for everyone obviously, not just OpenAI.) Some data is already behind huge mega corps like record labels, Hollywood, publishing houses, etc. OpenAI can afford the cost but the little guys will be screwed when it comes to SOTA.
It’s also worth noting that most current lawsuits are aimed at how the data is used and not how it’s sourced if I’m not mistaken. The laws coming from these lawsuits won’t be used to bolster anti-piracy laws but copyright laws instead, targeting fair use and transformative clauses imo.
https://rufuspollock.com/papers/optimal_copyright_term.pdf
Some of us have been waiting for copyright laws to be amended downward for 16 years now.
I’m not promoting that corporations should get a free pass, I just want them to be held to the same standards they held the Pirate Bay to if we’re gonna pretend that current copyright laws are good, since the centerpiece of the court case against the Pirate Bay was that they were making money from what they did. OpenAI is making shitloads of money from what they did.
I’m all for shortening copyright, but not getting rid of it. Reforms don’t have to be pro-corporate slop.
What the Pirate Bay is doing isn’t exactly transformative. I pirate most of my media and can’t say I’m not for better copyright laws and better treatment of the Pirate Bay, I just think the situations are different.
I don’t think saying “if the Pirate Bay is illegal, so should training AI without compensation” is exactly fair. (I wish the actual people contributing could be compensated, but the way it’s set up, we would be giving a few companies a monopoly while compensating mostly data aggregators.)
“Reforms don’t have to be pro-corporate slop.” Sadly, the media and most of the population are practically begging for it. When you couple that with the pressure exerted by record companies, publishing houses, etc., it is clear those are the reforms we’ll get, if any.
If you download a movie from a torrent site, you have committed an illegal act in the US. It doesn’t matter if you watch the movie and then write a fanfiction based on the movie. It’s the copying that’s illegal. It seems clear from OpenAI’s statements that they torrented the data they used to build their models.