Academic journal article Library Philosophy and Practice

RDFa and Microdata: Is One Better Than the Other?

Article excerpt

A Research Proposal Submitted in Partial Fulfillment of the Requirements for LIBR 281 - Metadata

San Jose State University 2014

RDFa and Microdata: Is One Better Than The Other?

The Resource Description Framework (RDF) was developed by the World Wide Web Consortium (W3C) as a standard for organizing the semantic web. One of the most commonly cited issues with RDF is that developers often don't think in terms of interoperability; because interoperability is RDF's primary selling point, adoption has been slow (Alam, Khan, & Thuraisingham, 2011; Davis, 2011). This is less of an issue for the web in general, as many sites are designed with search engine optimization in mind and are thus keen to adhere to whatever standards make content accessible to search engines. The complexity of RDF, however, slowed adoption because it was not feasible for the average web developer to implement. To solve this issue, RDFa (RDF in attributes) was introduced to allow RDF expressions within HTML, XHTML, and XML documents. RDFa, like RDF, is a product of the W3C, and was its answer to the need for accessibility and usability for web developers.

RDF and even RDFa (despite being designed for usability) require some knowledge of computer science and/or metadata, as well as study of how these fairly complex systems work, which can be intimidating at best and in many cases a barrier. Microdata was developed by the Web Hypertext Application Technology Working Group (WHATWG) as an alternative model for including metadata in HTML documents. Microdata aimed to be simpler still, in order to facilitate adoption by as many developers as possible. Schema.org has become the home of all things microdata, as it contains a primer on using microdata and the set of microdata terms (itemtypes and properties) supported by the major search engines. A major criticism of microdata, however, is that it isn't sophisticated enough to convey resource metadata adequately and accurately.
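To make the contrast between the two syntaxes concrete, the following is a minimal Python sketch that embeds the same two statements about a book first in RDFa Lite attributes (`vocab`, `typeof`, `property`) and then in microdata attributes (`itemscope`, `itemtype`, `itemprop`). The markup snippets, the `PropertyCollector` class, and the `extract` helper are illustrative assumptions and do not come from the article; the extractor is a toy that reads flat property/text pairs, not a conforming RDFa or microdata parser.

```python
from html.parser import HTMLParser

# Illustrative markup only: the same facts expressed in RDFa Lite and in microdata.
RDFA = """
<div vocab="http://schema.org/" typeof="Book">
  <span property="name">Weaving the Web</span>
  <span property="author">Tim Berners-Lee</span>
</div>
"""

MICRODATA = """
<div itemscope itemtype="http://schema.org/Book">
  <span itemprop="name">Weaving the Web</span>
  <span itemprop="author">Tim Berners-Lee</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs for elements carrying a given attribute."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name  # "property" for RDFa, "itemprop" for microdata
        self.current = None         # property name of the element being read
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.attr_name)
        if value is not None:
            self.current = value

    def handle_data(self, data):
        # Pair the pending property name with the first non-blank text that follows.
        if self.current is not None and data.strip():
            self.pairs.append((self.current, data.strip()))
            self.current = None


def extract(markup, attr_name):
    """Hypothetical helper: return the flat property/text pairs found in markup."""
    collector = PropertyCollector(attr_name)
    collector.feed(markup)
    return collector.pairs


rdfa_pairs = extract(RDFA, "property")
microdata_pairs = extract(MICRODATA, "itemprop")
```

For flat descriptions like this one, the two syntaxes carry the same property/value pairs and differ mainly in attribute names; the criticism of microdata noted above concerns richer structures that this toy example does not exercise.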

This project seeks to explore and observe differences between RDFa and microdata in their ability to retain proper schematization and syntax when converted back to RDF/XML. Online conversion tools were used to convert existing RDF/XML files from online data dumps to RDFa and microdata, and then back to RDF/XML, offering some insight into the capabilities of RDFa and microdata, as well as a taste of what may happen in the future if major search engines decide to move away from microdata and developers need to convert to a different semantic markup language.
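One way such a round trip could be assessed is to compare the statements present before and after conversion. The Python sketch below assumes each file has already been parsed into a set of subject/predicate/object triples; the `round_trip_loss` function and the sample triples are hypothetical and are not data or methods from this study.

```python
def round_trip_loss(original, round_tripped):
    """Return (lost, added): triples dropped and triples introduced by a round trip."""
    original, round_tripped = set(original), set(round_tripped)
    return original - round_tripped, round_tripped - original


# Hypothetical triples standing in for a parsed RDF/XML file (not data from the study).
before = {
    ("ex:book1", "dc:title", "Example Title"),
    ("ex:book1", "dc:creator", "ex:person1"),
    ("ex:person1", "foaf:name", "Example Author"),
}

# Hypothetical result after conversion to an embedded format and back,
# where the nested creator resource has been flattened to a plain string.
after = {
    ("ex:book1", "dc:title", "Example Title"),
    ("ex:book1", "dc:creator", "Example Author"),
}

lost, added = round_trip_loss(before, after)

# Share of the original statements that survived unchanged.
retained = len(before & after) / len(before)
```

A set difference of this kind flags both dropped statements and statements whose structure changed, which is the sort of schematization loss the project sets out to observe.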

Literature Review

Literature on RDFa and microdata found in academic journals comprises a small segment of the available literature on the subject. Likely owing to slow publication cycles relative to the speed at which RDFa and microdata are developing, peer-reviewed literature consists primarily of topical overviews and projections of future technologies that may build on RDFa and microdata. The bulk of information relating to RDFa and microdata can be found on the W3C site (RDFa) and schema.org, the home of microdata. Additionally, the major search engines and companies involved in development of these frameworks publish information on their websites. Individuals who are or were once involved in development teams for these schemas blog prolifically, as do web developers, metadata specialists, and those with specialized interests in the semantic web. This literature review will address a breadth of writings on the development, strengths and weaknesses in various contexts, and future of RDFa and microdata.

Both RDFa and microdata contain embedded semantic data. Adida (2011) put forth four criteria for embedded semantic information: independence and extensibility, data shouldn't be duplicated, users should be able to access structured metadata along with the data within their browser, and data/structured data should be self-contained (to facilitate modern copy/paste culture). Of these four criteria, several deem extensibility and self-containment the most important (Adida, B. …
