Developing a central platform for overcoming integration challenges in big data exploration

The Statistical Office of the Republic of Slovenia (SURS) has developed general software solutions for statistical data processing by means of the SAS solution in order to overcome the challenges of integrating business processes.

In the past, SURS, the main provider and coordinator of Slovenian national statistics, carried out statistical processing on the central computer of the Government Centre for Informatics. With the transition from the central computer to the LAN environment, SURS had to find alternatives to certain standard software packages that could not be used in the LAN environment. “SURS already used SAS on a central computer; however, its use has expanded enormously with the transition to the LAN environment. SAS played a key role in the smooth transition to the LAN environment with effective and statistician-friendly tools,” says Genovefa Ružić, SURS General Director.

SAS as a central development platform for data preparation

With the transition to the LAN environment, SURS started to use a range of SAS tools, which, according to General Director Ružić, made it an advanced user compared to other statistical offices that use SAS. “Most of the other statistical offices use SAS for analytical purposes, as a solution for business intelligence. Few statistical offices use SAS as a platform for data integration, and its use as a major development platform is rarer still,” explains Ružić.

SURS developed general software solutions for statistical data processing using SAS. These solutions enable it to overcome the challenges of business process integration and to introduce service-oriented architecture (SOA) in the statistical production system. “With its open service-oriented development platform, SAS enables rapid and efficient development of generic building blocks of the statistical production system and their integration with other development platforms at SURS,” adds Ružić, noting that SAS is also the central platform for data preparation (which includes cleaning, arrangement, and transformation) and for the implementation of statistical methods and analyses. It also enables advanced “in-memory” statistical analysis, which is especially important when analysing large volumes of data.

For simple and highly complex statistical procedures

The SAS tools support both very simple procedures, such as the calculation of basic descriptive statistics, and highly complex ones based on statistical models.

These tools play a key role in the Process and Communication Division and the Information Infrastructure and Technology Division, which prepare samples for research and general solutions. The tools also allow substantive methodologists to carry out statistical considerations, data aggregation, and the calculation of quality rates.

SAS® Contextual Analysis for the analysis of big data sources

Every statistical office strives to innovate in the field of statistical services. The same applies to SURS, which is also exploring big data sources. “One of the characteristics of big data sources is their non-structured nature, which hinders their classification and, subsequently, the detection of the characteristics of this data,” says the General Director.

The SAS® Contextual Analysis solution is also intended for users who are not experts in programming or statistical models. SURS is successfully using the tool in a pilot programme, primarily for collecting data on job vacancies. In the first phase, it searches the websites of the given companies for URLs that are potentially associated with vacancy advertisements. In the second phase, it uses various machine learning methods to detect ads and the number of vacancies and to classify texts into particular groups. In this respect, SAS® Contextual Analysis is seen as a potentially very useful tool whose graphical interfaces make it relatively easy to use.
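As a rough illustration of the first phase only, the sketch below filters links found on a company's website down to those that stay on the company's own domain and whose path hints at vacancy advertisements. It is written in Python, not SAS, and is not SURS's implementation: the function name, the keyword list, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical keyword fragments (English and Slovenian); a real pilot
# would tune this list, and none of these values come from SURS.
VACANCY_HINTS = ("job", "career", "vacanc", "zaposlitev", "kariera")


def candidate_vacancy_urls(company_url, links):
    """Return in-domain links whose path hints at vacancy advertisements."""
    company_host = urlparse(company_url).netloc.lower()
    candidates = []
    for link in links:
        absolute = urljoin(company_url, link)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.netloc.lower() != company_host:
            continue  # skip links that leave the company's site
        path = parsed.path.lower()
        if any(hint in path for hint in VACANCY_HINTS):
            candidates.append(absolute)
    return candidates
```

The pages behind the surviving URLs would then be passed to the second phase, where text classification decides whether each one actually contains a vacancy advertisement.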

Better efficiency and usability of procedures

“The main SAS advantages are user-friendly tools and an open architecture, which follows the latest trends,” concludes the General Director. Applying these tools results in significantly improved efficiency and usability of procedures and processes.

Challenge

SURS needed a centralized solution for big data research projects.

Solution

SAS Data Management
SAS Business Intelligence
SAS Visual Analytics
SAS Contextual Analysis

Benefits

SAS's user-friendly tools and open architecture enable the exploration of non-structured big data sources, as well as their detection and classification for further use.
