Authorship, Rich Snippets, and the Semantic Web

By Notess, Greg R. | Online, November-December 2012


In the early days of web search engines, the basic idea of building a database was fairly simple. Send a spider out to crawl the web, index the words of each webpage, and then follow all the links to find more pages. The harder part was figuring out a useful ranking for results--and the early relevance ranking often missed its mark.

In those days, many librarians wished for better indexing data beyond just a straight full-text index. We were used to well-structured bibliographic databases, with standard citation information about the document--such as author, title, source, and date. The only fields in early web searching were the URL and the HTML title.

Relevance has improved substantially since the search engines' early days. In an effort to make yet more improvements, Google is now (finally) moving into reading more structured data from webpages. With an authorship initiative and a growing use of rich snippets, Google is taking small steps toward relying on a more semantic web.

THE METATAG DEBACLE

One early failure that illustrates the problems with structured data on the web occurred back in the mid-1990s. The early search engine AltaVista introduced the idea of using metatags in the header of a webpage to describe its content more accurately. In particular, the meta keywords and meta description tags were designed to be used like the author-supplied keywords and the abstract sections of scholarly articles. Such an approach worked well in the scholarly literature for years, so the reasoning was that if webpage creators would describe their own webpages with descriptions and keywords, it could help search engines.

Unfortunately, the web turned out to be a very different and much more commercial environment than the scholarly world. By design, metatags sit in the header of the webpage, so their content is not visible to most human viewers (unless you choose to view the source of the webpage). With companies making money on the web based on how well their webpages ranked in search results, the incentive to misuse the tags proved overwhelming.
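To make the mechanism concrete, here is a minimal Python sketch of how an indexing program might read the meta description and meta keywords tags from a page header. It is an illustration only, not how AltaVista or any other search engine actually processed these tags. The sample page, the class name MetaTagReader, and the helper read_meta are all hypothetical.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta(html_text):
    """Return a dict of meta tag names to their content strings."""
    reader = MetaTagReader()
    reader.feed(html_text)
    reader.close()
    return reader.meta


# Hypothetical page: the meta tags live in the header, so a browser
# shows only the body text unless the reader views the page source.
SAMPLE_PAGE = """<html>
<head>
<title>Cataloging Basics</title>
<meta name="description" content="An introduction to cataloging for new librarians.">
<meta name="keywords" content="cataloging, metadata, libraries">
</head>
<body><p>Visible page text.</p></body>
</html>"""

meta = read_meta(SAMPLE_PAGE)
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
print(meta["description"])
print(keywords)
```

Nothing in this sketch checks whether the keywords match the visible body text. That gap is what made the tags easy to misuse.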

Within a few years, the meta keywords tag was no longer used by most search engines for indexing. The vast majority of webpages using the metatag were just stuffing the field with popular keywords in the hopes of attracting more traffic rather than using the keywords to accurately represent the content. Since its intended use failed, the meta keywords tag story became a lesson in the problem of relying on webpage builders to create honest and accurate descriptions of their own content.

INCREASED META INTEREST

While metatags continue to be used for other functions, the idea of having creators tag their own content was abandoned by search engines for many years. Meanwhile, librarians and others were using metatags successfully with metadata standards, such as Dublin Core, to accurately represent the content, but this use was so small compared to the vast number of webpages misusing metatags that it did not move the major web search engines to index them or make them searchable. [On Sept. 19, 2012, after Greg wrote this column, Rudy Galfi, product manager of Google News, announced the "newly hatched way" for publishers to add metatags to news stories. The news_keywords metatag encourages writers to add descriptive terms that might not actually appear in the story. Searchers will not see these tags, as they are part of the page's HTML code (http://googlenewsblog.blogspot.com/2012/09/a-newly-hatched-way-to-tag-your-news.html).--Ed.]

In 2001, Tim Berners-Lee co-authored an article in Scientific American describing his hopes for the development of the semantic web that could bring "structure to the meaningful content of Webpages" (Tim Berners-Lee, James Hendler, and Ora Lassila, "The Semantic Web," May 2001, pp. 29-37). The idea was to create more structured documents on the web where the structured elements can be read by software to eventually create a "web of data. …
