Official documents, and particularly legal ones such as law codes, often contain ambiguities and/or inconsistencies that stem from linguistic problems such as polysemy and from ontological problems such as underspecification, disagreements and/or false agreements. Such problems can be identified by formalizing the terminology of a domain as an ontology. We illustrate this in one particular domain, the definition of different classes of vehicles. Accurately defining these vehicle types sheds light on some of the semantic deficiencies present in two Brazilian legal codes that are responsible for defining vehicle categories unambiguously for many purposes, e.g. tax calculation and, more importantly, the interoperation of e-government systems that take laws into account in a Semantic Web scenario. In this work, we define a framework linking linguistic and conceptual problems to semantic deficiencies and show how these deficiencies were identified during the construction of the vehicle ontology.
Keywords: Ontology-based analysis of texts, Semantic deficiencies, Vehicles, E-government, Law, Ontology engineering
It is quite common, even routine, for people to give different accounts of a text they have read or even of a dialogue they have had. Ambiguities, inconsistencies and other semantic problems arise in reading and conversation for a number of reasons, which are studied in depth by many different branches of knowledge, ranging from human sciences like Philosophy and Linguistics to exact sciences like Mathematical Logic and Artificial Intelligence. The main reason for these misinterpretations in human communication, however, seems to lie in the very nature of natural languages: they emerged as the result of a long (indeed, never-ending) constructive communication process. The process per se is a solution rather than a problem; nonetheless, the different interpretations of terms, phrases, intentions and meanings conveyed in written or spoken communication do cause harmful misunderstandings. These arise because natural languages are not the outcome of a formal process. Formal languages are composed of a formal syntax, which regulates the expressions that are valid in the language, paired with an accurate semantics, with which interlocutors and readers can clearly disambiguate among the different interpretations of an expression. In natural languages, by contrast, two interlocutors expect to share the same interpretation of single phrases and entire texts without relying on any sort of formal semantics understood by both of them. Their expectations, unfortunately, are not always fulfilled.
To prevent the same problem from occurring on a larger scale in computer communication, semantic technologies, and particularly those related to the Semantic Web and its ontologies, have proven useful for many government-related applications and prototypes, such as service configuration and automatic service connection, among many others. This is possible because the Semantic Web is based on ontologies, which, in practical terms, are detailed conceptualizations of a domain and its concepts, relations, constraints and axioms. These specifications must be defined unambiguously using formal logic. Official documents, on the other hand, and particularly legal ones like law codes, often contain semantic deficiencies that go unnoticed by their authors. The most common among them are ambiguities and inconsistencies owing to linguistic problems like polysemy (i.e., one word with multiple senses), as well as underspecifications. In this work, we sketch a detailed framework of these deficiencies and show how they were identified during ontology construction. These deficiencies are certainly a source of integration problems and of confusion when the documents are used, since the intended meanings can differ depending on the stakeholder. …