What do we lose by letting AI systems fabricate information based on probability?
To make “generative artificial intelligence” work, companies exploring the capabilities of neural networks and large language models fed massive amounts of material found online into their systems. This process was described as “training”, and the companies that used it have justified their actions in various ways. Surprisingly, they largely admit to having taken copyrighted materials without permission or compensation.
One argument holds that this training was just for the sake of research and was not intended to create a for-profit business that would compete with the creators whose work was used in the training. The implausibility of this claim nearly ripped OpenAI apart, and raises important legal questions:
- Who knew what, when? Did all of OpenAI actually adhere to its original nonprofit mission, genuinely focusing on research for the sake of research? And did the company develop a for-profit model only after “discovering” that there was interest in using the product of its research?
- If this is not true, was anyone defrauded by the misrepresentation of the business plan?
- Does any of that matter? Is it not just as impermissible to decide later on to profit from unauthorized access to copyright-protected material as it would have been to decide to do so ahead of time?
Another argument holds that the AI companies engaged in “fair use”, employing the copyrighted material for a purpose so different from its intended purpose that it could not constitute a competitive act.
- On the one hand, it is true that printed books, even books republished online, with or without authorization, were not created with the aim of developing artificial intelligence technology. That is a different creative act.
- On the other hand, however, it is clear that the purpose of generative AI is to imitate or replace the role of the human author or creator. Given that the purpose is not only to compete with but to replace the copyright owner, the “fair use” standard seems impossible to apply.
- In its May 2025 report on generative AI, the US Copyright Office found that “making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries.”
- The report called for “practical solutions” like “voluntary licensing” to “support ongoing innovation”, and acknowledged that enforcement action might be warranted if “gaps” are not closed.
Tech companies involved in the creation of AI are also arguing that the technology is “too important” to be stifled by something as restrictive as copyright. If AI systems cannot be trained on all the text that is out there in the world, then they cannot “learn” adequately how to imitate human thought and language. If they cannot do that, then they cannot advance optimally, and we will all be denied the professed benefits of this technology.
- This argument also falls somewhat flat. First, generative AI systems are creating serious risks while not performing at the level their backers claim.
- By some estimates, generative AI platforms make false or erroneous claims as much as 80% of the time. They are more prone to error than people, and they often present entirely random “guesses” as if they were proven and absolute fact.
- While tech companies are pushing to have AI infused into all governmental and corporate functions, effectively binding governance and the everyday economy to their own business interests, the risk of catastrophic mistakes remains real.
- Meanwhile, experts at the leading companies themselves warn we may soon lose all ability to monitor AI systems, and by extension to understand or control how they work. Is this a value added that we must all invest in?
We also hear the justification of “economic value”. This argument holds that AI systems should have priority, because they are so much more valuable than the books or other individual creative works that were taken and repurposed without consent or compensation. This argument contains a couple of important logical flaws:
- Financial value is not the same as economic value. That a few companies can pull in hundreds of billions of dollars in capital investment does not mean they are doing something inherently valuable or that the overall effect on the wider economy will be a net positive.
- In some ways, the argument undermines itself. The goal is to replace people, labor costs, and time-consuming creative work with instantaneously generated content; that will reduce incomes and job opportunities in direct, material ways, without necessarily creating any new opportunity.
Beyond that, there is reporting suggesting that generative AI systems are making it harder for people to find meaningful and well-compensated work. According to Futurism:
Though the service industry has historically struggled to find workers—thanks in large part to low wages at corporate chains—that trend is now beginning to reverse, with recent college graduates struggling to find work at places like Starbucks and Costco. That’s a stark benchmark, with especially grim implications for the over 45 million US citizens who never had the privilege of attending college.
Joining them in the blender are entry-level white collar workers, graduate students and specialized tech workers such as coders and analysts. According to the Financial Times, massive firms like Microsoft—once a dependable landing pad for STEM workers—are seeing quarterly profits skyrocket by as much as 25 percent, even as it cuts jobs by the thousands.
The evidence for this trend toward reduced employment and economic opportunity is not just in the jobs numbers. It is evident in the financial sector’s promotion of layoffs in 2025 as an opportunity to enhance profits without reducing productive capacity. In other words, everyone involved in the decision-making seems to be clear that the argument is not for a better economy but for better financial returns for investors, at the expense of the mainstream economy.
This is also a value proposition in itself: What do we lose by removing the human being from the process of discovering, sorting, and generating information distilled by experience? Do we ourselves become less relevant to the process of creation of the information we will then be asked to trust?
Here, it is worth revisiting the attempted takeover of government by these same error-prone AI systems. Is this really a winning business strategy for the companies seeking these contracts and this access to sensitive data about American citizens?
- The companies argue that they need the freedom to experiment, even as they insist their platforms are proven and secure.
- They are asking for unfettered access to information that comes with serious legal restrictions and potential liability.
- What the AI contractors seem not to recognize is that the U.S. Constitution makes it unlawful for Congress to shield them from liability.
There is evidence that AI systems “intentionally” conceal both their “chain of thought” and their mistakes, glossing over acts of deception with lazy apologies or deflections. One writer documented an entire “conversation” with a chatbot which repeatedly lied about its homework and fabricated claims about the writer’s own work, despite “knowing” it was chatting with the author.
The First Amendment says, in part: “Congress shall make no law… abridging… the right of the people… to petition the Government for a redress of grievances.” By embedding itself in government services, the AI industry is taking a fast track to legal liability for harmful mistakes, and no law can shield it from that liability.
Article I, Section 8, of the U.S. Constitution also specifies that Congress should “promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries…” This is more than a promise of copyright; it specifies that such rights are exclusive. That clearly delineates a far broader area of potential liability and suggests that the most viable AI platforms will be those that have never used copyrighted material without authorization.
At this still-early stage, it may be hard to see that future iterations of AI will look very different from those available now. Even when AI systems are embedded inside other services, they need to remain active in order to remain effectively embedded. Should they evolve dramatically, or fail, those dependent services will need to evolve as well.
Among those future AI systems, it may become rare for any to rely on underlying LLMs or standards that depend, or ever depended, on unauthorized use of copyright-protected intellectual property. AI models might also become extremely adept at citing and linking to relevant reporting or expressive creations by human authors.
Those that transcend the piracy traps set up by early LLM research, and that develop traceable systems reviewable and accountable for their precision or failure, will be best positioned to succeed.
Read more from Liberate Systems partner Active Value Trust Ratings.