Magazine article Information Today

Preparing Web Data with XML: Part III

Eli Willner is technical advisor to Data Conversion Laboratory, Fresh Meadows, New York. He can be reached at

Now, let's talk about Web-specific XML issues and benefits, and tagging

As we detailed in the first article in this series (May 1998 Information Today), you should convert your data one last time to SGML/XML. Implementation is key to a successful conversion, so we devoted the second article (June 1998 Information Today) to issues of strategic planning. We explained how, using a stepwise approach, you could architect your conversion so that it produces a quality SGML/XML product, on time and within budget.

Now we're ready for some nitty-gritty. We'll examine some Web-specific issues related to XML conversions, and explain how you should structure your data so that users will be able to reformat it on the fly--and on their local machines, to minimize bandwidth consumption. We'll consider industry-standard namespaces and discuss how they will affect your XML conversion. And we'll explain the benefits of "thorough tagging."

Meet an Olympian Challenge

The key benefits of XML, remember, derive from the fact that data elements are identified not by format, as they are in HTML, but by structure. To illustrate how you might profit from this characteristic on a practical level, let's consider the following actual scenario. You've put together a Web site reporting interactively on Olympic results (it might just as well be stock market prices or current book titles). Thousands of users are accessing your site concurrently, so it's vital that the maximum amount of information be squeezed into minimum bandwidth; otherwise response time will slow to a crawl. For the same reason, it's also very important that the burden on the server be minimized by offloading work to the clients (the users' machines) whenever feasible.

The nature of your data, though, is that users like to look at it in different ways. Someone may want to view a list of contestants in a ski event sorted in order of score, for example, then view the same data arranged by country. In an HTML-only world, the user would be presented with a series of buttons representing choices regarding the desired arrangement of data. If the user selected "by score," the server would compile the required information from its database, tag it in HTML, and ship it down to the user. The user would see something like the following:

Name           Country          Score
Doe, John      United States    91
LaDoe, Jean    France           89
Doesky, Jan    Russia           81
...

Note, though, that the data would be tagged not to identify its fielding in the server database but to identify its desired format on the client side (see Figure 1).

The precise meaning of the HTML tags in the example above isn't important. The point is that the tags describe nothing more than the data's appearance in a table. There is nothing in the data transmitted by the host to identify "Doe, John" as a contestant, "91" as his score, or "United States" as his country, and so on.
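To make the point concrete, here is a minimal Python sketch. Figure 1 is not reproduced in this excerpt, so the HTML string below is an assumed rendering of the first result row, and the names HTML_ROW, TagCollector, and tag_names are hypothetical. The sketch simply lists the tag names a client would receive:

```python
from html.parser import HTMLParser

# Assumed HTML for the header and first result row (not the article's Figure 1).
HTML_ROW = (
    "<table>"
    "<tr><th>Name</th><th>Country</th><th>Score</th></tr>"
    "<tr><td>Doe, John</td><td>United States</td><td>91</td></tr>"
    "</table>"
)


class TagCollector(HTMLParser):
    """Records every distinct start-tag name seen in the markup."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def tag_names(markup):
    """Return the set of tag names used in the given HTML string."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(sorted(tag_names(HTML_ROW)))  # ['table', 'td', 'th', 'tr']
```

Every tag in that set names a piece of table layout. None of them says "contestant," "country," or "score," so a program on the client has nothing to sort by.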

Let's consider what would have to happen when the user requests that the same data be redisplayed in order by country. Although all the data are already at the user's computer, the client is powerless to re-sort them locally because it has no idea which field is the "country" field (or even that the data are fielded, for that matter). As a result, another request is sent to the overburdened server, which has to again query its database for the identical information, reformat it in the new sequence, and retransmit it to the user--a disaster as far as conservation of bandwidth and server resources are concerned.

Using XML, though, you could arrange for your server to dish out the data (along with appropriate information to identify the table structure) as follows:

<name>Doe, John</name>

<country>United States</country>


In this scenario, when the user hits the button requesting that the same data be displayed in country sequence, the re-sort can happen at the client side, since not only are the data at the client, but they are properly fielded at the client as well. …
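The client-side re-sort described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the article's implementation: the name and country elements come from the example, while the results wrapper, the contestant and score elements, and the helper names load_contestants and sort_by are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed document shape: <results>, <contestant>, and <score> are hypothetical;
# only <name> and <country> appear in the article's example.
RESULTS_XML = """
<results>
  <contestant><name>Doe, John</name><country>United States</country><score>91</score></contestant>
  <contestant><name>LaDoe, Jean</name><country>France</country><score>89</score></contestant>
  <contestant><name>Doesky, Jan</name><country>Russia</country><score>81</score></contestant>
</results>
"""


def load_contestants(xml_text):
    """Parse the XML once into a list of plain dictionaries, one per contestant."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "name": c.findtext("name"),
            "country": c.findtext("country"),
            "score": int(c.findtext("score")),
        }
        for c in root.findall("contestant")
    ]


def sort_by(records, field, descending=False):
    """Return a new list ordered by the named field; no server round trip needed."""
    return sorted(records, key=lambda r: r[field], reverse=descending)


contestants = load_contestants(RESULTS_XML)
by_score = sort_by(contestants, "score", descending=True)
by_country = sort_by(contestants, "country")

print([r["name"] for r in by_score])    # ['Doe, John', 'LaDoe, Jean', 'Doesky, Jan']
print([r["name"] for r in by_country])  # ['LaDoe, Jean', 'Doesky, Jan', 'Doe, John']
```

Because each value arrives labeled with its field, the data are fetched and parsed once, and every later reordering is a purely local operation.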
