Book Scanner

2014-01-05

The problem

Law school casebooks are inconvenient to carry around. I discovered this during my first year of law school, lugging a pile of books between classes. At the time, digital options didn’t exist. While this appears to be changing, the textbook printers make too much money to allow students to purchase less costly digital versions.

There were a variety of services that would scan a book by cutting off the binding and running the loose pages through a scanner. This left only a digital version and destroyed a valuable book. I began searching for alternatives.

The solution

Thanks to the great work by the awesome folks over at diybookscanner.com, building a simple, non-destructive book scanner is fairly straightforward. I didn’t know how well the scanner would work for casebooks, so I tried to keep the project costs to a minimum. A few hours in the garage and a book scanner was assembled:

book scanner

The main elements of the scanner are the platen, which consists of two pieces of acrylic (more expensive scanners use glass), the sliding platform, and the cameras. The platen presses down between the pages of the book and flattens them out to provide good scanning quality. The sliding platform shifts slightly as the scan progresses so that the platen stays centered on the open pages of the book.

For the cameras, I purchased an old, inexpensive Canon A560 to match one I already had. I had previous experience using the Canon Hack Development Kit (CHDK) for time-lapse photography, so setting up a script to allow the cameras to be triggered with a momentary 5 V pulse (from an old phone charger) was simple enough.

The results

The workflow was essentially to image each pair of pages, raise the platen, and flip to the next page. While this was a bit tedious, it went faster than I initially expected, and allowed me to scan casebooks of several hundred pages within 30 minutes. Once the images were acquired, I wrote a simple script to rename all of the files in the even and odd folders to allow sequential assembly of the book. Images were processed using Scan Tailor, an excellent open-source image processing program, which transformed the images from photographs into straightened, greyscale pages that look scanned:

Scanned image
Processed image
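
The renaming script itself is not reproduced here. A minimal Python sketch of that kind of step, assuming one folder of images per camera, that sorting by filename matches capture order, and that the first odd image comes before the first even image, could look like the following. The folder names, the page_0001.jpg naming pattern, and the function names are all hypothetical, and the sketch copies files rather than renaming them in place so the originals stay untouched.

```python
from pathlib import Path
import shutil


def interleave(first, second):
    """Merge two capture-ordered lists as first[0], second[0], first[1], ..."""
    merged = []
    for i in range(max(len(first), len(second))):
        if i < len(first):
            merged.append(first[i])
        if i < len(second):
            merged.append(second[i])
    return merged


def list_images(folder, ext):
    """Return the images in a folder, sorted by filename (assumed capture order)."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ext
    )


def assemble_book(odd_dir, even_dir, out_dir, ext=".jpg"):
    """Copy images from both camera folders into out_dir with sequential names."""
    # Assumption: the odd folder holds the image that comes first in the book.
    ordered = interleave(list_images(odd_dir, ext), list_images(even_dir, ext))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for number, src in enumerate(ordered, start=1):
        dst = out / f"page_{number:04d}{ext}"
        shutil.copy2(src, dst)  # copy, so the camera originals are preserved
        written.append(dst)
    return written
```

Calling assemble_book("odd", "even", "book") would then fill a book folder with page_0001.jpg, page_0002.jpg, and so on, ready to be loaded into Scan Tailor in order. If the even-page camera captured the first image instead, swapping the first two arguments gives the right sequence.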

To use the books, I packaged them up as PDFs using Adobe Acrobat. I also used OCR software so that the entire casebook was text-searchable, a dramatic improvement when looking back to find certain content at the end of the semester. I haven’t needed the scanner since 2013, but it remains in the garage in case a need ever arises.