Where Are the Semantics in the Semantic Web?

The current evolution of the web can be characterized from various perspectives (Jasper and Uschold 2003):

Locating resources: The way people find things on the web is evolving from simple free text and keyword search to more sophisticated semantic techniques both for search and navigation.

Users: Web resources are evolving from being primarily intended for human consumption to being intended for use both by humans and machines.

Web tasks and services: The web is evolving from being primarily a place to find things to being a place to do things as well (Smith 2001). (1)

All these new capabilities for the web depend in a fundamental way on the idea of semantics, giving rise to another perspective from which the web evolution can be viewed:

Semantics: The web is evolving from containing information resources that have little or no explicit semantics to having a rich semantic infrastructure.

Despite the widespread use of the term semantic web, it does not yet exist except in isolated environments, primarily in research labs. In the World Wide Web Consortium (W3C) Semantic Web Activity Statement, we are told that

   the Semantic Web is a vision: the idea of
   having data on the Web defined and
   linked in a way that it can be used by machines
   not just for display purposes, but
   for automation, integration and reuse of
   data across various applications (emphasis
   mine). (2)

As envisioned by Tim Berners-Lee:

   the Semantic Web is an extension of the
   current Web in which information is given
   well-defined meaning, better enabling
   computers and people to work in cooperation
   (Berners-Lee, Hendler, and Lassila
   2001, p. 35) (emphasis mine).

   [S]omething has semantics when it can
   be 'processed and understood by a computer,'
   such as how a bill can be processed
   by a package such as QUICKEN (Trippe
   2001, p. 1).

There is no widespread agreement on exactly what the semantic web is for or exactly what it is. Some good ideas about what the semantic web will be used for have emerged from the W3C effort to define a standard ontology language. (3) The previous descriptions place a clear emphasis on the information content of the web being machine usable and associated with more meaning.

Note that machine refers to computers (or computer programs) that perform tasks on the web. These programs are commonly referred to as software agents, or softbots, and are found in web applications.

Machine-usable content presumes that the machine knows what to do with information on the web. One way for this to happen is for the machine to read and process a machine-sensible specification of the semantics of the information. This approach is robust but very challenging, and it is largely beyond the current state of the art. A much simpler alternative is for the human web application developers to hardwire the knowledge into the software so that when the machine runs the software, it does the correct thing with the information. In this second situation, machines already use information on the web. There are electronic broker agents in routine use that make use of the meaning associated with web content words, such as price, weight, destination, and airport. Armed with a built-in understanding of these terms, these so-called shopping agents automatically peruse the web to find sites with the lowest price for a book or the lowest airfare between two given cities. Thus, we still lack an adequate characterization of what distinguishes the future semantic web from what exists today.
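The contrast between hardwired knowledge and a machine-readable specification can be made concrete with a small sketch. The Python below is illustrative only: the listings, the field names, the `SITE_B_SPEC` mapping, and both functions are hypothetical, and a simple term mapping is a very crude stand-in for a real specification of semantics.

```python
# Hypothetical listings from two sites that label the same concept differently.
SITE_A = [
    {"title": "Example Book", "price": 12.50},
    {"title": "Example Book", "price": 14.00},
]
SITE_B = [
    {"title": "Example Book", "cost": 11.75},
]


def hardwired_lowest_price(listings):
    """Shopping agent with the meaning of 'price' built into the code.

    Any listing that does not use the exact key 'price' is silently ignored.
    """
    priced = [item for item in listings if "price" in item]
    return min(priced, key=lambda item: item["price"]) if priced else None


# Assumed machine-readable declaration published by site B:
# its local term 'cost' denotes the shared concept 'price'.
SITE_B_SPEC = {"cost": "price"}


def normalize(listings, spec):
    """Rename each site's local terms to shared terms using its declared spec."""
    return [{spec.get(key, key): value for key, value in item.items()}
            for item in listings]


def spec_driven_lowest_price(sources):
    """Agent that reads each source's term mapping before comparing prices.

    `sources` is a list of (listings, spec) pairs.
    """
    merged = []
    for listings, spec in sources:
        merged.extend(normalize(listings, spec))
    return hardwired_lowest_price(merged)


if __name__ == "__main__":
    print(hardwired_lowest_price(SITE_A + SITE_B))
    print(spec_driven_lowest_price([(SITE_A, {}), (SITE_B, SITE_B_SPEC)]))
```

In this sketch the hardwired agent overlooks the cheaper offer on site B because it only recognizes the literal word "price", while the second agent finds it by consulting the declared mapping. Even the second agent still has the shared concept of price built in, which illustrates why the line between today's agents and a future semantic web is hard to draw.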

Because the Resource Description Framework (RDF) is hailed by the W3C as a semantic web language, (4) some people seem to have the view that if an application uses RDF, then it is a semantic web application. This is reminiscent of the "if it is programmed in Lisp or Prolog, then it must be AI" sentiment that was sometimes evident in the early days of AI. …