OpenAI and copyright infringement

OpenAI and its competitors base all their large language models on freely available content, fully ignoring the attached licenses such as the GPL, Creative Commons and others. The work put into creating that content is meant to make it available to others and to let them contribute to it. This is what open source is about. While many of these licenses do allow marketing products based on the content, many others do not.

OpenAI and its competitors do not distinguish between the two when training their models. The difference between training large language models on this content and downloading copyrighted content for private use is not legality: both are illegal, but the law seems to be enforced only against the latter.

That doesn’t sound fair at all.

