Academic journal article Informatica Economica

Identifying Software Complexity Topics with Latent Dirichlet Allocation on Design Patterns


Software complexity is a key property in software engineering and application development. The subject relates to refactoring, reusability, reducing software project management costs, and building large infrastructures at low cost. Although many software metrics exist, researchers still see room for improvement through scientific study. Because modern software engineering relies on object-oriented design, and because software complexity covers a wide area, we decided to approach the subject through the field of design patterns, hoping to identify the main topics of study and future trends in research.

Various topics on object-oriented design have been proposed over the years. Design patterns are a subject of interest because they offer solutions for managing coupling and cohesion between the different layers of an application. The problems caused by a design with high coupling are: changes in related classes force local changes; classes are harder to understand in isolation; and classes are harder to reuse because they require the presence of other classes. The problems caused by a design with low cohesion are: classes are hard to understand, and hard to reuse or maintain. High cohesion means that a class has moderate responsibility in one functional area and collaborates with other classes to fulfill a task. Software complexity can be reduced by designing systems with the weakest possible coupling between modules [1].

Historically, the complexity that arises in programs from the number of conditional and iterative statements has been measured using the cyclomatic complexity metric [2]. Refactoring code with design patterns reduces complexity, although it increases the number of classes [3]. The authors show that design patterns do not always improve the quality of systems. Some patterns are reported to decrease some quality attributes and do not necessarily promote reusability, expandability, and understandability. They also bring further evidence that design patterns should be used with caution during development, because they may actually impede maintenance and evolution. Their study also reveals that object-oriented principles may not be so "good", as they do not necessarily result in systems of good quality.
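To make the cyclomatic complexity metric mentioned above concrete, the following minimal sketch approximates it for a piece of Python source by counting decision points and adding one. The choice of which syntax nodes count as decision points, the function name `cyclomatic_complexity`, and the `SAMPLE` snippet are assumptions made for illustration; they are not taken from [2] or from this article.

```python
import ast

# Syntax nodes treated as decision points (an assumption of this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` introduces two additional branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample: one `if`, one `for`, one nested `if` with an `and`.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SAMPLE))
```

The sketch counts conditional and iterative statements only, which matches the informal description in the text; production tools typically handle more constructs.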

However, we consider that the effect design patterns might have on software complexity is not well represented in existing studies. The scope of this paper is to identify the main topics and trends, and to test whether there is a correlation between the two subjects.

2 Materials and methods

In this section, we present the research goals and the questions to be answered, and we describe the selection criteria for the studies analyzed, as well as the data collection.

The purpose of this article is to get a broad and current overview of the two subjects considered in this paper: design patterns and software complexity. The analysis was carried out on academic journal articles. The search for papers was conducted in 2019 on Thomson Reuters' Web of Science, which has a large interdisciplinary database of academic texts, and was limited to peer-reviewed articles and reviews in English.

We ran two searches on the titles, abstracts, and keywords of papers from ISI Clarivate:

* design patterns, which returned 2045 articles;

* "software complexity", which returned 302 articles.

Data selection is presented in Figure 1.

To identify the topics from the software complexity subject area, we applied LDA to the corpus of articles belonging to the design patterns subject. For the LDA analysis, we used the abstracts of the selected papers. Figure 2 presents the approach of our study.

The abstracts are expected to give a sufficient indication of the subject of each paper and thus provide an overview of the topics discussed in the respective fields [4]. In natural language processing, latent Dirichlet allocation (LDA) is a generative statistical model that allows sets of observations to be explained by unobserved groups, which account for why some parts of the data are similar. …
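As an illustration of how LDA assigns the words of abstracts to unobserved topics, the following is a minimal sketch using collapsed Gibbs sampling in plain Python. The article does not state which inference method or software was used, so the sampler, the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics, and the toy `ABSTRACTS` are all assumptions made for illustration, not the authors' pipeline or data.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (a sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: per-document topic counts, per-topic word counts, totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Hypothetical toy abstracts, invented for this example only.
ABSTRACTS = [
    "design patterns reduce coupling between classes",
    "observer and factory patterns improve class design",
    "cyclomatic complexity counts conditional statements",
    "complexity metrics measure conditional and iterative statements",
]
docs = [a.split() for a in ABSTRACTS]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k, counts in enumerate(topic_word):
    top_words = [w for w, c in counts.most_common(4) if c > 0]
    print(f"topic {k}: {top_words}")
```

Each row of `doc_topic` holds how many tokens of one abstract were assigned to each topic, and `topic_word` holds the word counts per topic. On a corpus this small the topics are not meaningful; the sketch only shows the mechanics.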