Academic journal article Informatica Economica

Mining Product Data Models: A Case Study

Article excerpt

(ProQuest: ... denotes formulae omitted.)

1 Introduction

Processes are everywhere; even the steps of buying a car may be viewed as a process. Most organizations use information systems (e.g., Workflow Management Systems (WfMS), Customer Relationship Management (CRM) systems, Enterprise Resource Planning (ERP) systems) to support their business. An information system that can record the actions its users perform is called a Process-Aware Information System (PAIS) [1]. The recorded user actions are called (event) logs. An event log consists of several events. An event stores information such as the name of the event, the resource that performed the action, the timestamp at which the action was started or completed, and the data elements recorded with the event (e.g., the name of a new client of a hotel). The process mining domain [2] analyzes this kind of event log; its goal is to extract information about the process from event logs. On the one hand, process mining aims to extract process models from event logs (process discovery). On the other hand, a question arises: does the discovered model depict the intended behavior of the process (conformance checking)? Moreover, by comparing the intended behavior of a process with its real behavior, the future behavior may be discovered (enhancement).
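The event structure described above can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the class name `Event`, its field names, the grouping of events by a case identifier, and the sample hotel events are all hypothetical and are not taken from any particular log format or from the tools used in this paper.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class Event:
    """One recorded action (hypothetical field names, for illustration only)."""
    name: str                              # name of the event / activity
    resource: str                          # who performed the action
    started: Optional[datetime] = None     # timestamp when the action started
    completed: Optional[datetime] = None   # timestamp when the action completed
    data: Dict[str, str] = field(default_factory=dict)  # recorded data elements


# Assumption: events are grouped per process instance (case id -> events).
EventLog = Dict[str, List[Event]]


def data_elements_per_activity(log: EventLog) -> Dict[str, Set[str]]:
    """Collect, for each activity name, the data-element names recorded with it."""
    result: Dict[str, Set[str]] = {}
    for events in log.values():
        for event in events:
            result.setdefault(event.name, set()).update(event.data.keys())
    return result


# Invented sample log for a hotel process.
log: EventLog = {
    "case-1": [
        Event("Register client", "Ann",
              completed=datetime(2013, 1, 7, 9, 0),
              data={"client_name": "J. Smith"}),
        Event("Assign room", "Bob",
              completed=datetime(2013, 1, 7, 9, 5),
              data={"room_no": "12"}),
    ],
}

elements = data_elements_per_activity(log)
```

Grouping data elements by activity name is one simple way to start looking at a log from the data perspective rather than the control-flow perspective.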

Generally, discovery algorithms focus on the control-flow perspective [3,4,5] or the resource perspective [6]. But what about the data needed to execute a process's activities? This paper aims to offer a dynamic data-flow perspective of a process using synthetic event logs from two different sources: Navision ( and YAWL system (

Generally, the employees of a company know its internal processes. But after a certain number of activities have been executed, the question arises of what data is needed to continue the process. More precisely, in a given state of a process, employees ask "what data is available and can or must be used in the future activities of the process?" or "what data do we still need in order to execute a particular activity?" The control-flow perspective cannot answer these questions. For a given state of a process execution, it can only show which activities can or must be executed next; it cannot guarantee that they can actually be executed.
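The two questions above can be illustrated with a small Python sketch. It is not the approach introduced later in this paper; it only assumes, for illustration, that each activity is described by the data elements it requires and the data elements it produces. The car-buying activities and data-element names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical activity descriptions: name -> (required inputs, produced outputs).
ACTIVITIES: Dict[str, Tuple[Set[str], Set[str]]] = {
    "Choose model": (set(), {"model"}),
    "Request offer": ({"model"}, {"price"}),
    "Arrange financing": ({"price"}, {"loan_approval"}),
    "Sign contract": ({"price", "loan_approval"}, {"contract"}),
}


def available_data(executed: List[str]) -> Set[str]:
    """Data elements produced by the activities executed so far."""
    available: Set[str] = set()
    for activity in executed:
        available |= ACTIVITIES[activity][1]
    return available


def missing_data(executed: List[str], candidate: str) -> Set[str]:
    """Data elements the candidate activity still needs in the current state."""
    required, _produced = ACTIVITIES[candidate]
    return required - available_data(executed)


# Current state: two activities have already been executed.
state = ["Choose model", "Request offer"]

now_available = available_data(state)
still_needed = missing_data(state, "Sign contract")
ready = missing_data(state, "Arrange financing")
```

Here `now_available` answers "what data is available", while `still_needed` answers "what data do we still need" for a particular activity. An empty result, as for `ready`, means the activity has all the data it requires.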

The second section presents some modeling methods and techniques that provide the data perspective of an information system or a process. We briefly present the shortcomings of each, from basic data modeling techniques such as ERDs to mining techniques that use event logs. The third section introduces our approach to depicting the data-flow perspective of a process. The case studies are presented in the fourth section. For each case study we describe how the data was collected, the data source, the conversion tools used to obtain the desired event log format, and the resulting data models.

2 Related Work

The 1970s and 1980s were dominated by data-driven approaches, while process-driven approaches appeared at the beginning of the 1990s [2]. This latter trend persists today: most information systems are process-centric.

Basic data modeling techniques such as Entity Relationship Diagrams (ERDs) [7] underlie the design of relational databases by offering a static data model. Each entity is defined by its own attributes, whereas an activity in a process needs data related to different entities. Thus, an ERD cannot depict the movement of data through a process.

Another basic approach, which models activities, is the UML Activity Diagram. This approach focuses on the activities, not on the data; thus it can describe the control-flow perspective of a process. …
