Academic journal article Informatica Economica

Stylometry Metrics Selection for Creating a Model for Evaluating the Writing Style of Authors According to Their Cultural Orientation

Article excerpt

(ProQuest: ... denotes formulae omitted.)


Research on intellectual property rights faces the problem of determining the level of originality of a research paper, in contrast to plagiarism, which is defined as the full or partial appropriation of ideas, expressions, methods or procedures and their presentation as a personal creation. In Anglo-American law, economic considerations and those related to public policy prevail in the drafting and development of property rights laws, whereas in the European view, moral and civil arguments form the basis of the same laws.

The legislative framework does not resolve the problem of identifying plagiarism and the level of originality of a scientific work. The present paper aims to apply property rights legislation in the context of publishing scientific research papers.

In practice, there are different types of plagiarism, the most common being: copy-paste, paraphrase, plagiarism through translation into different languages, artistic plagiarism, plagiarism of ideas, source code plagiarism, and failure to use proper citations. Article [1] analyzes plagiarism through paraphrase, arriving at a classification of the major known types, and tests plagiarism detection software by measuring the percentage of paraphrases it correctly identifies within a text document.

The present paper consists of five chapters. It starts with a short introduction to the major aspects debated, along with the principles underlying the formation of property rights laws in the European community and the Anglo-American one. Regardless of the community involved, plagiarism is a form of using others' research, as is or modified, and presenting it as a personal creation.

Chapter 2 describes the terms creativity and plagiarism in an antithetical analysis, arriving at the concept of originality, defined as the property that a creative research paper has when the ideas presented within it differ from those already published by other authors. A metric is implemented to obtain a measurable value for the level of originality of a paper. The main ways of testing a paper for plagiarism, intrinsic and external analysis, are described in order to choose the proper methodology for determining the originality of scientific papers. The research leads to the stylometric analysis in the third chapter, a field at the crossroads of plagiarism, originality and author identification. This stylometric analysis is carried out within intrinsic plagiarism detection and is based on a number of metrics that uniquely describe the writing style of a specific author.

In the fourth chapter, eight stylometry metrics are extracted from a number of scientific research papers in order to find the combination that best describes the writing style of an author. For that, the Weka tool is used together with analysis based on the WordNet lexical ontology, yielding a set of four metrics that can further describe the writing style of an author according to his or her cultural orientation. Conclusions are highlighted in the fifth chapter, along with directions for future research.
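To make the idea of extracting stylometry metrics from a text more concrete, the following is a minimal sketch in Python. The three metrics shown (average sentence length, average word length and type-token ratio) are common stylometric features chosen here purely for illustration; the excerpt does not list the eight metrics used in the paper, so they should not be read as the authors' selection. The function name `stylometry_metrics` and the sample text are hypothetical, and the sketch does not use Weka or WordNet.

```python
import re


def stylometry_metrics(text):
    """Return a few illustrative style metrics for a text.

    The metric choice is an assumption made for this sketch; it is not
    the set of metrics selected in the paper.
    """
    # Rough sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens made of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct words divided by total words (vocabulary richness).
        "type_token_ratio": len(set(words)) / len(words),
    }


sample = "The cat sat. The cat ran away!"
metrics = stylometry_metrics(sample)
print(metrics)
```

A vector of such values, computed per paper, is the kind of input that a feature selection tool could then reduce to a smaller subset of metrics.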

2 Creativity and Plagiarism Analysis

Creativity, seen as a form of originality, is the characteristic of adding something new, original and appropriate to reality. Therefore, in order to analyze the level of originality of a scientific paper, an antithesis needs to be drawn between this component of creativity and that of plagiarism.

Starting from the objects used in the present research, namely scientific research papers written by Romanian and other European authors, the semantic phase is defined as a compact component within a paper, formed out of one or more adjacent phases, which is significantly different from the semantic phases prior or subsequent to it. …
