A collective of activists dedicated to preserving music history has completed a massive backup of Spotify’s music catalog, a project undertaken to safeguard cultural artifacts against potential loss from platform failures or corporate decisions.
Anna’s Archive, a non-profit organization previously focused on archiving books and academic papers, announced that it had scraped Spotify on a large scale, archiving metadata for approximately 256 million tracks and audio files for around 86 million of them, totaling just under 300 terabytes of data.
While most of the archived material predates July 2025, the group claims to have captured 99.6 percent of the music regularly streamed on Spotify, offering a remarkably comprehensive snapshot of the platform’s offerings.
The ambitious project raises questions about digital preservation, copyright, and who truly owns our cultural heritage.
The sheer scale of the archive might seem improbable until you consider the dynamics of music streaming. A small percentage of hugely popular songs generate the vast majority of streams, while tens of millions of tracks remain largely unheard, accumulating digital dust. According to Anna’s Archive, Spotify’s most-played songs significantly outweigh the combined streams of its least-popular tracks.
From a preservation perspective, the group argues that prioritizing what people actually listen to is more practical than striving for complete archival coverage. They frame the effort as “cultural insurance,” noting that existing preservation efforts often favor superstar artists, high-fidelity audio, or private collections. What’s currently lacking, they contend, is a robust, accessible archive designed to withstand events like wars, natural disasters, economic downturns, or corporate collapses.
The data will be released in stages, beginning with metadata, followed by audio files ranked by popularity, and ultimately including album artwork.
Spotify has responded to the archiving effort by shutting down accounts involved in what it deems unlawful scraping and implementing new safeguards against copyright infringement, reaffirming its commitment to protecting artists’ rights.
The situation highlights a fundamental tension: platforms present themselves as custodians of culture, while others argue that content locked behind subscriptions and digital rights management isn’t truly preserved at all. Whether Anna’s Archive is viewed as a digital library or a piracy operation may ultimately hinge on which future people trust more—one controlled by corporations or one maintained by archivists with ample storage capacity.
- Anna’s Archive has backed up metadata for about 256 million Spotify tracks and audio for around 86 million of them, nearly 300 terabytes in total.
- The archive prioritizes preserving music people actually listen to, rather than aiming for complete catalog coverage.
- Spotify has taken action to block the scraping of its platform and protect copyright.
- The project sparks debate about digital preservation, corporate control, and the future of cultural heritage.
What constitutes true preservation in the digital age? The question is at the heart of this debate, as platforms and independent archivists clash over the best way to safeguard our musical legacy.
