A Research Revolution in the Making: Google Books and More as Sources for Women's History
Weisbard, Phyllis Holman, Feminist Collections: A Quarterly of Women's Studies Resources
If you follow developments in libraries, publishing, or Googleland, you have probably heard about the Google Books project. Google now has agreements with several major libraries in the U.S. and abroad to digitize vast quantities of their holdings. Harvard, Stanford, Oxford, the University of Michigan, and the New York Public Library are charter "library partners" in the enterprise, and as of this writing they have since been joined by the University of Wisconsin-Madison (including material from the Wisconsin Historical Society), the Midwest universities consortium known as the Committee on Institutional Cooperation ("CIC"), which also includes UW-Madison, (1) the University of California, the University of Virginia, the University of Texas, Princeton University, the Bavarian State Library (Germany), Ghent University Library (Belgium), University of Lausanne (Switzerland) and two libraries in Spain. A current list of partners and links to their involvement in the project is at http://books.google.com/googlebooks/partners.html. The project, formerly known as Google Print, was announced in December 2004 and is well under way; thousands of books are already in the database at http://books.google.com/. (2) Google Books and other mass digitization projects have the potential to revolutionize research methods and results. With what's already available, Google Books can now greatly enhance student and scholarly quests.
Before illustrating ways that Google Books can be used for research in women's history, it is worth considering a bit more background.
The Google Books project has set off a considerable stir among authors and publishers of works still in copyright who consider the digitization of their works without their permission to be a breach of copyright. The Authors Guild and others sued Google in September 2005, and a final ruling awaits court action. Google maintains that because it only displays brief "snippets" from a copyrighted work unless there's an agreement with the publisher to display more, this usage is within the "fair use" guidelines of copyright law. Google also gives a second digital copy of each work to the lending library. The libraries generally plan to keep their digital copies of copyrighted works in "dark archives," (3) although libraries' interpretations of what's in the public domain or falls within fair-use guidelines may differ somewhat from Google's.
Google Books has numerous publisher partners; in fact, many publishers signed up before the library program was announced. These publishers control how much of the content of each title they want displayed, from 20% to 100% of the text. Google Books labels any book with less than 100% displayed as "limited preview." The entry for the book then includes the statement "Pages displayed by permission," and Google displays links to the publisher.
Books that are out of copyright are labeled "full view," and indeed, every page of these can be viewed. In most cases, these books can also be downloaded as PDFs, from which pages can then be printed. Currently, "copy and paste" of text in the PDFs does not appear to be possible (passages can, however, be copied as images); nor can optical character recognition be applied, even when the downloaded PDFs are opened in Adobe Acrobat Professional. This is a feature that library partners might choose to change in the presentation of their digital copies. (The University of Michigan Library catalog is already doing so. For books that have been digitized by Google, there are links to two e-versions: Google Books and "M-Books," for the University of Michigan copy. M-Books can be displayed three ways: as images, as PDFs, or as raw, uncorrected text, labeled "full-text." "Copy and paste" will work from the "full-text" display, although users will want to compare the text to the PDF, because there are errors in character recognition.)
Google's is not the only wholesale digitization effort going on. …