AI & Knowledge Control: Corporate Capture Risks

by Priyanka Patel

The Ghost of Aaron Swartz and the AI Era’s Knowledge Grab

More than a decade after the tragic death of Aaron Swartz, the United States finds itself grappling with a familiar contradiction – the tension between open access to information and the increasing corporate control of knowledge, now amplified by the rise of artificial intelligence. Swartz, a fervent advocate for the free availability of knowledge, especially knowledge funded by the public, acted on his beliefs by downloading thousands of academic articles from JSTOR. This act led to felony charges and a relentless prosecution that ultimately contributed to his suicide on January 11, 2013. The unresolved questions surrounding his case are now strikingly relevant in the ongoing debates about AI, copyright, and the fundamental right to access information.

At the time of Swartz’s prosecution, a significant portion of research was publicly funded, yet access remained restricted behind expensive paywalls. Individuals were unable to access work that they had already paid for through their taxes. Today, a similar dynamic is unfolding with AI. Companies are training their systems on vast datasets of publicly available information – research papers, books, articles, code, and more – often funded by government grants and taxpayer dollars. These companies then sell their proprietary systems, built on both public and private knowledge, back to the very people who originally funded that knowledge. The government’s response this time, however, is markedly different. Unlike in Swartz’s case, there are no criminal prosecutions or threats of lengthy prison sentences. Lawsuits proceed slowly, enforcement is uncertain, and policymakers are hesitant to intervene, citing the perceived economic and strategic importance of AI. Copyright infringement is increasingly framed as an unavoidable consequence of “innovation.”

Recent legal developments highlight this disparity. In 2025, Anthropic reached a settlement with publishers over allegations that its AI systems were trained on copyrighted books without authorization. The agreement reportedly valued the infringement at approximately $3,000 per book across an estimated 500,000 works, totaling over $1.5 billion. By comparison, plagiarism disputes between individual artists and alleged infringers often end in settlements ranging from hundreds of thousands to millions of dollars for a single prominent work. Scholars estimate that Anthropic alone avoided over $1 trillion in potential liability. For well-capitalized AI firms, such settlements are likely viewed as a predictable cost of doing business.

As AI becomes increasingly integrated into the American economy, the implications are becoming clear. One analyst noted that judges are likely to “twist themselves into knots” to justify a technology built on the appropriation of creative works. The central question, then, is this: if Swartz’s actions were deemed criminal, what standard are we now applying to AI companies?

The debate extends beyond the narrow question of how copyright law applies. It is about why the law appears to be applied so differently depending on the entity doing the extracting and the purpose behind it. The stakes are far-reaching, shaping who controls the infrastructure of knowledge and what that control means for democratic participation, accountability, and public trust.

AI systems, trained on vast amounts of publicly funded research, are rapidly becoming the primary source of information for many on topics ranging from science and law to medicine and public policy. As search, synthesis, and description are increasingly mediated through AI models, control over the training data and infrastructure translates into control over the questions asked, the answers surfaced, and the expertise considered authoritative. If public knowledge is absorbed into proprietary systems inaccessible for public inspection, audit, or challenge, then access to information is no longer governed by democratic principles but by corporate priorities.

Like the early internet, AI is frequently touted as a democratizing force. Yet, as with the internet, AI’s current trajectory suggests a future of consolidation. Control over data, models, and computational infrastructure is concentrated in the hands of a small number of powerful tech companies, who will ultimately decide who has access to knowledge, under what conditions, and at what price.

Swartz’s fight was not merely about access; it was about whether knowledge should be governed by openness or corporate capture, and for whose benefit. He understood that access to knowledge is a fundamental prerequisite for a functioning democracy. A society cannot meaningfully debate policy, science, or justice if information is locked away behind paywalls or controlled by proprietary algorithms. If we allow AI companies to profit from mass appropriation while claiming immunity, we are choosing a future where access to knowledge is dictated by corporate power rather than democratic values.

How we treat knowledge – who can access it, who can profit from it, and who is punished for sharing it – has become a defining test of our democratic commitments. We must be honest about what those choices reveal about our values.
