Jeff Duntemann's Contrapositive Diary

December 29th, 2009:

10,000 Pirated Ebooks

Ebook-related items have been gathering in my notefile lately, and this is a good time to begin spilling them out where we can all see them. The triggering incident was a note from the Jolly Pirate, telling me that one of my SF stories was present in a zipfile pirate ebook anthology that he had downloaded via BitTorrent. That people are passing around pirated versions of my stories is old news. “Drumlin Boiler” was posted on the P2P networks a few months after it was published in Asimov’s in 2002, and my better-known shorts have popped up regularly since then. No, what induced a double-take was the name of the pirate anthology: “10,000 SciFi and Fantasy Ebooks.”

10,000? You gotta be kidding!

But I’m not. Jolly sent me the 550K TOC text file, which is 9,700 lines long, with one title per line. Not all are book length, and many, in fact, are short stories. Still, the majority of all book-length SF titles I’ve read in the last thirty years are in there, and so is “Borovsky’s Hollow Woman,” albeit not under my byline. (I wrote the story with Nancy Kress, who is listed as sole author.) The only significant authors I looked for but did not find were George O. Smith and Charles Platt. (One howler: Bored of the Rings is said to be by J. R. R. Tolkien. Urrrrp.)

The collection is 4 GB in size. The Jolly Pirate said that he had downloaded it in just under three hours. He attached the file for “Borovsky’s Hollow Woman,” which was a plain but accurate 57K PDF. Intriguingly, the date given under the title is January 28, 2002. The damned thing has evidently been kicking around for at least seven years, if perhaps not in its full 4 GB glory. This suggests that the anthology is not entirely ebook piracy but mostly print book piracy. (“Borovsky” was never released in ebook form.)
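For anyone who wants to sanity-check those figures, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 KB = 1,000 bytes), treats each of the 9,700 TOC lines as one file, and rounds "just under three hours" up to a flat three hours; none of those assumptions come from the torrent itself, so take the results as ballpark numbers only.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the post.
# Assumptions (not from the source): decimal units, one file per TOC line,
# and a download time of exactly three hours.
KB = 1_000
GB = 1_000_000_000

toc_bytes = 550 * KB             # size of the TOC text file
toc_lines = 9_700                # one title per line
collection_bytes = 4 * GB        # size of the whole anthology
download_seconds = 3 * 60 * 60   # "just under three hours", rounded up


def average(total, count):
    """Return total divided evenly across count items."""
    return total / count


bytes_per_toc_line = average(toc_bytes, toc_lines)
avg_title_kb = average(collection_bytes, toc_lines) / KB
throughput_kb_s = average(collection_bytes, download_seconds) / KB

print(f"TOC line length: about {bytes_per_toc_line:.0f} bytes")
print(f"Average size per title: about {avg_title_kb:.0f} KB")
print(f"Download rate: at least {throughput_kb_s:.0f} KB/s")
```

Under those assumptions the average title comes to a few hundred kilobytes, so the 57K “Borovsky” PDF sits well below the mean, which is what you would expect of a shorter work in a collection that also includes full novels.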

Some short comments:

  • I verified the existence of the anthology from the Pirate Bay search engine. It really does exist. (So, evidently, does the Pirate Bay, which surprised me a little, considering recent efforts to take them down.)
  • 10,000 ebooks do not take a great deal of space by today’s standards. (Admittedly, better files with cover scans would be larger.) No one will think twice about a 4 GB download for size reasons, when 750 GB drives are going for $69.95.
  • The PDF is ugly. The lines are far too wide for easy readability and (since this is not a tagged PDF) not reflowable. That said, I did not find a single OCR error.
  • The Windows pathname of the text file from which the PDF was generated is shown at the top of every page. The pathname includes the full name of some clueless Dutch guy, from whose Mijn Documenten folder the file came. Ebook piracy clearly belongs to the common people, not some elite cabal skilfully dodging the **AA.
  • I’ve used a scanner to rip a couple of print books (plus ten years of Carl & Jerry print stories) and it is horrible hard work. However, the anthology demonstrates that if print is a form of inadvertent DRM (which I have long thought) it is not a particularly strong one. After all, as Bruce Schneier has said about DRM systems generally, they only have to be broken once.

This last item is key. A printed book is a worst-case challenge for an ebook pirate. Compared to cutting off the binding and making sure the paper pages all feed straight through the scanner ADF and then fixing the inevitable OCR errors, stripping out an ebook’s DRM is trivial. If ebook piracy is not yet a big deal, it isn’t because it’s difficult. It’s still because reading ebooks is borderline painful. I may not be typical, but if I can buy a used copy of a recent hardcover of interest for $10 or less, I’ll choose the hardcover rather than an ebook at any price. Sooner or later the readers will catch up to paper, and by then, well, we may see a 4 TB file called “10,000,000 EBooks About Everything” on the file-sharing networks, and it won’t even take an objectionable chunk of our 80 TB hard drives.

You think I’m kidding? Let’s compare notes in five or six years.