How to Scan.
In my experience, it is best to scan directly into a pdf file with the scanner set for black and white.
For typed material, 300 dpi is adequate but 400 dpi is better; for printed
matter, 400 dpi is adequate but 600 dpi is better.
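To get a feel for what these resolutions mean, here is a minimal sketch that computes the pixel dimensions and the uncompressed size of a black and white (1 bit per pixel) page. The function name and the letter-size page are assumptions chosen for illustration; actual pdf files will be far smaller, because black and white scans compress well.

```python
def bilevel_page_size(width_in, height_in, dpi):
    """Return (width_px, height_px, raw_bytes) for a 1-bit-per-pixel scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded to a whole number of bytes (8 pixels per byte).
    bytes_per_row = (width_px + 7) // 8
    return width_px, height_px, bytes_per_row * height_px


# Assumed example: a US letter page (8.5in x 11in) at the resolutions above.
for dpi in (300, 400, 600):
    w, h, raw = bilevel_page_size(8.5, 11.0, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, about {raw / 1e6:.1f} MB uncompressed")
```

Doubling the resolution quadruples the number of pixels, which is why 600 dpi scans are noticeably slower and larger than 300 dpi ones.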
First method
The best method is to dismantle the book, and feed it into a fast scanner equipped with a
sheetfeeder. These are available for about 450USD and will scan about 15 sheets
(30 sides) a minute. Dismantling Vol II of AA1988 and scanning it took only
about 40 minutes. I used a Fujitsu ScanSnap, which works brilliantly except that
the feeder is not as robust as I would like; there are similar models
available from Kodak and Xerox. Thus:
- Dismantle the book (cut the binding into segments, and trim the binding off the pages in each segment).
- Run it through the scanner.
- Check that all pages are correctly scanned.
- Use Acrobat to adjust the pdf page numbers to agree with those printed on the pages (display the thumbnails and
choose Options > Number Pages).
- Use Acrobat to add bookmarks for at least the chapter headings (Ctrl-B, etc.).
- Exploit the invariance of the laws of physics under t -> -t to reverse the first step (not tested).
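As a rough planning aid, the feeder rate quoted above (about 15 sheets a minute) can be turned into a time estimate. This is only a sketch: the function name and the 600-page volume are hypothetical, and the result is pure feeder time, to which dismantling the book, reloading the feeder, and checking the pages must be added.

```python
import math


def feeder_minutes(pages, sheets_per_minute=15.0):
    """Estimate feeder time in minutes for a book with `pages` printed sides."""
    sheets = math.ceil(pages / 2)  # each sheet carries two sides
    return sheets / sheets_per_minute


# Hypothetical 600-page volume: 300 sheets at 15 sheets a minute.
print(f"{feeder_minutes(600):.0f} minutes of pure scanning time")
```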
Second method
If you don't want to dismantle the book, you can use a flatbed scanner that allows you to
- scan material up to 8.5in x 11.7in = max{A4,letter},
- at 400 dpi using a black and white (text) setting,
- directly into a pdf file.
Most cheap (c. 100 USD) scanners can do the first two, but not all allow the
third. I used a Canon CanoScan 8400F, which works well (except that I wish it
were faster, but this is a problem with most flatbed scanners, even some that are
much more expensive; I think the 8400F is relatively fast).
Scanning Vol I of AA1988 took about 80 minutes by this method.
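The scan area of 8.5in x 11.7in required above is simply the smallest rectangle that holds both an A4 sheet and a US letter sheet. A small check, assuming the standard sizes (A4 is 210 mm x 297 mm, letter is 8.5in x 11in); the names in the code are illustrative only.

```python
MM_PER_INCH = 25.4

# Standard sheet sizes as (width, height) in inches.
A4 = (210 / MM_PER_INCH, 297 / MM_PER_INCH)  # ISO 216: 210 mm x 297 mm
LETTER = (8.5, 11.0)


def min_bed(*sheets):
    """Smallest (width, height) that fits every sheet in portrait orientation."""
    return max(w for w, _ in sheets), max(h for _, h in sheets)


width, height = min_bed(A4, LETTER)
print(f"{width:.1f}in x {height:.1f}in")
```

The width comes from letter paper and the height from A4, so a scanner limited to exactly one of the two formats will clip the other.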
If you scan two pages onto one sheet, you need to separate and collate them.
This you can do using TeX! Either use Acrobat's crop tool to produce files even.pdf
and odd.pdf containing the even and odd pages respectively, and use
evenandodd.tex to collate them, or use two2one.tex directly to both split and
collate them. The first of these two approaches is better because it gives you more control.
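The collation step amounts to interleaving the two page sequences. The sketch below shows only that ordering, using strings as stand-ins for pages; it is not the contents of evenandodd.tex, and the function name is hypothetical. The same order could be applied to real page objects with whatever pdf tool you prefer.

```python
from itertools import zip_longest


def collate(odd_pages, even_pages):
    """Interleave odd and even pages: odd[0], even[0], odd[1], even[1], ..."""
    if not 0 <= len(odd_pages) - len(even_pages) <= 1:
        raise ValueError("expected as many odd pages as even pages, or one more")
    merged = []
    for odd, even in zip_longest(odd_pages, even_pages):
        merged.append(odd)
        if even is not None:  # the last odd page may have no partner
            merged.append(even)
    return merged


# Labels stand in for the pages of odd.pdf and even.pdf.
print(collate(["p1", "p3", "p5"], ["p2", "p4"]))
```

The length check catches the most common mistake, a page missed while cropping, before it silently shifts every later page.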
Third method
Photocopy the material, and then feed the copy into a fast sheetfeed scanner. If the
copying is done well, the quality can equal that of the second method, and if the copier
is fast, it may be as fast as the second method, but I find the second method to be more reliable.
Fourth method
In principle, you should be able to use a digital camera to photograph the
pages, but I haven't tried this.
Comments on Copyright and Fair Use Laws
The original US copyright laws protected material for only fourteen years, except
that the author was allowed to request one fourteen-year extension.
Unfortunately, at the request of a few giant media corporations (for example, the
owners of the rights to Donald Duck and Mickey Mouse), the US Congress has
repeatedly extended the period until now most works printed after 1922 are
protected. This appears to violate the US Constitution, which grants Congress
the power only "to promote the progress of science and useful arts, by securing
for limited times to authors and inventors the exclusive right to their
respective writings and discoveries." Unfortunately, countries signing "free"
trade agreements with the US are often required to adopt the same laws.
Fortunately, there is a tradition that, at the author's request, the publisher will
return the copyright to him once the book has gone out of print. Thus it is the
responsibility of the author to make sure that his work remains available by
putting it on the web. Since this is easy, the only out-of-print works not
available on the web should be those whose authors choose not to put them there.
The situation with collections, for example, conference proceedings, is less clear. After
some effort, I was able to get permission from Elsevier to post scans of the proceedings
of the 1988 Ann Arbor conference on my website, but others have been unsuccessful with Springer.
I would urge all editors to make sure that their contract with the publisher
contains a provision allowing them to post the work on the web when it goes out of
print or after a certain period (whichever occurs first). Clearly, if your
purpose in publishing a proceedings is to make the work available to the
mathematical community, then it makes no sense to give it to a publisher who
will print a few hundred copies at an inflated
price, and then block its further distribution for a hundred years.
Once Google completes scanning everything, putting a book on the web will require
nothing more than the copyright holder giving Google permission to display the full text. Thus, the
only works unavailable on the web will be those whose copyright holder chooses to
withhold permission.
Fair use
Under the copyright law of the US (Title 17, US Code), you are generally allowed to copy
- a chapter from a library book,
- a complete work when the work is not available at a "fair price" (this is generally
the case when the work is out of print and used copies are not available at a
reasonable price),
provided the copy is used only for private study, scholarship, or research.
As far as I know, copying or scanning for private use a book that you legally
own, for example, to have it available when travelling, is also considered "fair
use" under US law (but see protectfairuse).
For an excellent website on copyright and fair use law, see that of the
Stanford University Library.