Magazine article Computers in Libraries

Using Web Services and XML Harvesting to Achieve a Dynamic Web Site

Article excerpt

Exploiting and contextualizing free information is a natural part of library culture. For example, our Web page at Herrick Library contains hundreds of links to free and useful resources, and we are always on the lookout for more practical links. But, like most librarians, I'm never satisfied. Yes, providing links on your library's Web page is helpful, but certain sites inspire something I like to call Web site envy: "If only I could have that site's enviable content." Wouldn't it be great if you could actually display Amazon bibliographic information or Google search results on your library's site? Well, now you can. In the last several years, news providers such as The Christian Science Monitor, the BBC, and The New York Times, as well as Web giants Google and Amazon, have started to publish their content as XML. This means that, thanks to liberal usage agreements, librarians can now harvest this information and put it directly on their own home pages.


In our ongoing effort to make Herrick Library's Web site more dynamic and engaging, we set up a page that features international and domestic news. This page refreshes hourly, harvesting and displaying headlines from news services such as Reuters and the Associated Press. As a rural university, we are sensitive to the needs of our urban and suburban students, who may occasionally feel geographically isolated. Our library's news page is just one small effort to help our patrons feel connected to the rest of the world. And we are by no means the only library that sees the value of adding dynamic news headlines to its Web site. (Check out the sidebar to see how other libraries are using RSS feeds.)
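
One way to get an hourly refresh without contacting the news service on every page view is to keep a copy of the harvested headlines on the server and re-fetch only when that copy is more than an hour old. The Python sketch below illustrates the idea. It is an assumption about how such a page could work, not a description of Herrick Library's actual setup; the class name HourlyCache and the fetch callable passed to it are hypothetical.

```python
import time


class HourlyCache:
    """Keep one harvested value and refresh it at most once per max_age_seconds."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch            # hypothetical callable that harvests the headlines
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the timing can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return the cached value, re-harvesting first if it is missing or stale."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

A page script would create one HourlyCache around whatever function harvests the feed and call get() on each request, so visitors within the same hour see the stored headlines rather than triggering a new harvest.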

News headlines aren't the only useful XML content that you can add to a Web site. Recently, librarians have discovered the Amazon E-Commerce Service, or ECS (formerly known as Amazon Web Services, or AWS). By dynamically harvesting this bibliographic content, librarians are exploring ways to merge e-commerce information with library acquisition lists and even catalog search results. Imagine if your patrons could do a title search in your library's catalog and retrieve not only bibliographic information, but also book images, reviews, and other valuable Amazon content. Patrons could view the book cover online without having to go to the stacks. And these are just a few examples of how librarians are creatively exploiting free XML content and bringing a whole new level of currency and sophistication to their sites.

Like all worthwhile projects, implementing this type of system requires a significant investment in knowledge acquisition and technosweat. My goal is to provide you with a general overview of the XML harvesting process. In the next several paragraphs, I will describe this model, skipping over many of the details that can be overwhelming to XML newbies. My hope is that this info will act as a sort of conceptual springboard that will inspire you to learn more about the details of XML harvesting and, perhaps, even implement this system in your library. Let's get started.

The XML Architecture

XML harvesting is all about server-to-server communication, an idea that is foreign to many Web developers, who are more familiar with the browser/server transaction in which the browser requests a Web page and the server returns a simple HTML page that is rendered in the browser. In this model, the transaction takes place between only two computers: the server and the computer running the browser. The XML harvesting model starts out in the same familiar way, with the browser requesting a file from the server. But after that initial request, the transaction becomes somewhat more complex. Instead of returning a simple page to the browser, the server transmits a request to a remote XML repository (such as Amazon, Google, or CNN). This server-to-server transaction is often initiated and facilitated by some flavor of server-side scripting, such as ASP, ColdFusion, PHP, or just good old-fashioned CGI. …
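
As a rough illustration of the server-to-server step, here is a minimal sketch in Python, used here as one more server-side option alongside the scripting languages named above. It assumes the remote repository serves a plain RSS 2.0 feed. The FEED_URL value and the function names fetch_feed, parse_headlines, and render_html are hypothetical, chosen only for this example.

```python
import html
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical feed address; substitute a feed whose usage terms allow harvesting.
FEED_URL = "https://example.org/news/rss.xml"


def fetch_feed(url, timeout=10):
    """Server-to-server step: our server requests XML from the remote repository."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_headlines(xml_bytes, limit=5):
    """Extract up to `limit` (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_bytes)
    headlines = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title and link:  # skip items missing either field
            headlines.append((title, link))
        if len(headlines) == limit:
            break
    return headlines


def render_html(headlines):
    """Turn the harvested pairs into an HTML list to send back to the browser."""
    items = "\n".join(
        f'  <li><a href="{html.escape(link, quote=True)}">{html.escape(title)}</a></li>'
        for title, link in headlines
    )
    return f"<ul>\n{items}\n</ul>"
```

On each page request, the server-side script would call render_html(parse_headlines(fetch_feed(FEED_URL))) and embed the result in the page it returns, so the browser still receives ordinary HTML. Escaping the harvested text before output matters because the content comes from someone else's server, and for feeds you do not fully trust, note that Python's standard XML parser is not hardened against maliciously constructed documents.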
