Academic journal article Informatica Economica

Data Processing Languages for Business Intelligence. SQL vs. R

Article excerpt

(ProQuest: ... denotes formulae omitted.)

1 Introduction

Often seen as a reincarnation of Decision Support Systems [1] and sometimes referred to as Business Intelligence and Analytics [2], business intelligence (BI) is a broad category of applications, technologies, and processes for gathering, storing, accessing, and analyzing data to help business users make better decisions [3]. Figure 1 displays a classical BI architecture [4].

Common business intelligence related tasks are:

* data storage

* data extraction-transformation-load from various sources in different formats, more or less structured, to the storage layer

* data processing

* information integration

* visualization

* exploratory analysis

* data mining/data science etc.

Although slightly outdated, the schema in figure 1 still conveys the vast array of technologies, processes and tools gathered (or rebranded) under the BI umbrella. Chen et al. [Chen 2012] identified three generations of BI and Analytics (BI&A) systems whose core technologies have been:

* data management and warehousing [5] [6]

* text and web analytics for unstructured web contents [7]

* mobile technologies [8].

Implementing a BI platform requires a vast quantity of organizational resources. Some of the most important current BI solutions are shown in figure 2 [9]. As with Enterprise Resource Planning applications, implementing a BI system requires extensive organizational changes and business expertise, and sometimes full vendor participation.

Apart from their impressive costs, BI platforms have the drawback of keeping the customer captive. Every organizational change, and every new or updated external data source or service, must be negotiated with the BI platform provider, which usually incurs new costs and delays.

In this paper we scrutinize two languages, SQL and R, involved not only in BI application development but especially in the "democratization" of BI, as they allow various types of data professionals and users to access and process vast quantities of data in an interactive, ad hoc way. Using two reliable sources, their role and popularity in the current BI market will be outlined, taking into account job demand and a survey concerning the usage of BI tools and languages. Next, the range of BI activities that can be supported by each of SQL and R will be presented. The main section will compare the features and syntax of SQL and R for the most common data processing/reporting problems, which are particularly important for BI users.

2 Languages and Tools for Business Intelligence

There is a vast array of tools, languages and technologies covering a large share of BI tasks. Some of them target regular users who are unable to write code and scripts in any programming language. Others belong to the BI application developer's toolbox. But there are some technologies that serve both users and developers in data processing, integration, visualization and analysis. Comparing BI tools and languages is also problematic because they can be available as programming languages, development environments, ecosystems or integrated platforms.

To evaluate the popularity of Business Intelligence languages and tools, we gathered information from two reliable sources. The search engine www.indeed.com provides data about job trends. Figure 3 compares job demand over the 2012-2016 interval for some of the most important data processing and analysis languages [10].

SQL and R account for most of the job postings. In 2012 SQL was by far the most demanded data language. Its share decreased slightly and seems to have stabilized since the end of 2014. R grew spectacularly over the 2012-2014 interval, overtook SQL for a brief period in 2014, and then fell back. Since 2014 it has fluctuated around a 2% share. After SQL and R, the next most popular language is Python, followed by SAS, SPSS, Stata and Julia. Currently there is still a visible gap between the SQL-R group and the rest of the languages/tools, although Python seems to be increasing steadily and might catch up with the leading group. …
