Academic journal article Informatica Economica

Saving Large Semantic Data in Cloud: A Survey of the Main DBaaS Solutions

Article excerpt

Introduction

One of the main limitations of the World Wide Web (abbreviated WWW) is that it was designed to be understood by humans, not read by machines. In 1994, at the very first International WWW Conference, five years after he invented the World Wide Web, Tim Berners-Lee introduced the idea of a semantic web that can be understood by machines [1]. The World Wide Web Consortium (abbreviated W3C) is an international organization that aims to develop standards for the WWW. W3C was founded by Tim Berners-Lee at the same conference where he announced the need for a semantic web [1]. Ever since, one of the objectives of W3C has been to improve the WWW by upgrading it from a web of documents to a web of data [2].

Until the end of the 20th century, the work on the semantic web was mostly theoretical; no practical application emerged as an international standard. The first important practical approach consisted of microformats. Instead of using HTML tags only for their usual purpose, microformats use them to carry metadata, attaching information that machines can understand. In the meantime, new standards (e.g. RDF, RDFS, OWL, SPARQL and JSON-LD) and new approaches have been developed [3]. In this paper, the authors describe the current state of the semantic web, present the main approaches used for it and discuss the most important semantic web solutions for cloud computing. The aim of this article is to present and compare the main solutions for saving semantic data in the cloud.

The main semantic web representations are currently based on the semantic triple. A triple is a statement that links two objects and follows the pattern Subject-Predicate-Object. Every part of the triple has a uniform resource identifier (URI) associated with it. Semantic data can be stored as a large graph, where the subject and object are represented as nodes and the predicate as the edge that connects them. Any information can be represented by this simple model. This approach reaches its full potential if all the resources on the WWW have an associated URI and are connected by triples to as many other resources as possible. In this scenario, the WWW becomes a large unified database where information from multiple websites can be automatically extracted and correlated by simple queries. Table 1 shows the format of a URI and describes its components.
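The triple-as-graph model described above can be illustrated with a minimal sketch in plain Python. This is only an illustration under stated assumptions, not how any particular triple store works: the `http://example.org/` URIs, the `Triple` type and the `match` and `neighbours` helpers are all hypothetical names introduced here, and a Python set stands in for the graph.

```python
from typing import Iterator, NamedTuple, Optional, Set


class Triple(NamedTuple):
    """One Subject-Predicate-Object statement; every part is a URI string."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace used only for this example.
EX = "http://example.org/"
# Term from the FOAF vocabulary, used here as the predicate URI.
KNOWS = "http://xmlns.com/foaf/0.1/knows"

# The graph is a set of triples: subjects and objects act as nodes,
# predicates act as labelled, directed edges between them.
graph: Set[Triple] = {
    Triple(EX + "alice", KNOWS, EX + "bob"),
    Triple(EX + "bob", KNOWS, EX + "carol"),
    Triple(EX + "alice", EX + "worksFor", EX + "acme"),
}


def match(
    triples: Set[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


def neighbours(triples: Set[Triple], node: str) -> Set[str]:
    """Return the nodes reachable from `node` by following one outgoing edge."""
    return {t.obj for t in match(triples, subject=node)}
```

Calling `neighbours(graph, EX + "alice")` returns the two resources that the hypothetical `alice` node points to, and `match(graph, predicate=KNOWS)` selects every "knows" statement regardless of who is involved. Pattern matching with wildcards of this kind is the basic idea behind the simple queries mentioned above.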

The standard for representing a semantic triple is the Resource Description Framework (abbreviated RDF). Through RDF, a triple can be represented as a succession of three URIs. There are several ways of representing triples with RDF, called serialization formats. The description language varies from one serialization format to another. The main serialization formats are RDF/XML and JSON-LD. However, Turtle (abbreviation of Terse RDF Triple Language) and N-Triples are worth mentioning, as they can be easily understood by humans. Both the RDF/XML and JSON-LD serializations are based on popular standards used to represent and transfer data between applications. XML (abbreviation of Extensible Markup Language) has the advantage of being easy for applications to interpret, but in general it is considered difficult to write. JSON (abbreviation of JavaScript Object Notation) is used to represent data structures and transmit them between applications. All major programming languages support the JSON format. JSON-LD (abbreviation of JSON for Linked Data) is designed to facilitate the representation of RDF relationships [4].
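To make the difference between serialization formats concrete, the following sketch writes the same triple once as an N-Triples line and once as a JSON-LD node object, using only the Python standard library. It is a simplified illustration: the `http://example.org/` URIs and the helper names `to_ntriples` and `to_jsonld` are assumptions made for this example, all three parts are assumed to be URIs (literal values are not handled), and a real application would normally rely on an RDF library rather than on string formatting.

```python
import json


def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of URIs as a single N-Triples line."""
    # N-Triples wraps each URI in angle brackets and ends the statement with " ."
    return f"<{subject}> <{predicate}> <{obj}> ."


def to_jsonld(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of URIs as a JSON-LD node object."""
    # "@id" identifies the subject; the predicate URI becomes a property key
    # whose value is a reference ("@id") to the object resource.
    node = {"@id": subject, predicate: {"@id": obj}}
    return json.dumps(node, indent=2)


# Hypothetical subject and object URIs; the predicate is a FOAF vocabulary term.
s = "http://example.org/alice"
p = "http://xmlns.com/foaf/0.1/knows"
o = "http://example.org/bob"

print(to_ntriples(s, p, o))
print(to_jsonld(s, p, o))
```

The N-Triples line shows the succession of three URIs directly, while the JSON-LD form carries the same statement inside an ordinary JSON object that any JSON parser can read.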

RDF triples are implemented with associated models named ontologies. Ontologies are designed with sets of rules, terms and vocabularies. These sets provide definitions of the entities found in reality. Ontologies are used to develop a large number of applications in different areas, such as knowledge management, intelligent information integration, information retrieval, natural language processing, database design and integration, e-commerce, bioinformatics and education [5]. …
