
Program

Session Abstracts


Media and Communications

Forecasting at Telefónica Germany

Verena Braunschober, Dr., Senior Analyst, Telefónica Germany

Patrick Walch, Dr., Head of Advanced Analytics and Models Team, Telefónica Germany

When trying to analyze economic problems, being led by a "gut feeling" or looking into the past through reports, dashboards or even sophisticated statistical analyses might not be enough to make the optimal decision. This holds especially true in the ever-changing telecommunications industry, where making a wrong decision can quickly leave you behind your competitors. Therefore, the Business Analytics department of Telefónica Germany developed a forecasting framework using SAS Forecast Server, which allows it to forecast KPIs for strategic, tactical, financial and operational questions. This presentation will give an overview of the forecasting models used, show possible difficulties in the forecasting process, and give a live demonstration of how Telefónica builds a forecasting model.
Level: Appropriate for all levels of knowledge and experience

Human Mobility, Network Analysis and Link Prediction

Carlos Andre Reis Pinheiro, Visiting Professor, KU Leuven

As a highly competitive industry, telecommunications is continuously evolving. Understanding subscriber behavior is crucial to better compete in such an environment. Social network analysis can play an important role in creating a complete picture about customer behavior in terms of relationships within the network. The analysis of subscriber mobility also reveals relevant patterns in relation to frequent paths and trajectories traveled throughout urban areas. Human mobility analysis can be applied to business and governmental problems, such as understanding how diseases spread, traffic planning, public transportation scheduling and communications network optimization. The combination of subscriber mobility analysis and social network analysis can also help companies to predict links between customers. Link prediction can be used to diffuse product adoption and reduce churn.
Level: Appropriate for all levels of knowledge and experience
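
As a rough illustration of the link prediction idea in the abstract above, the following minimal sketch scores unconnected subscriber pairs by the Jaccard similarity of their contact sets. The scoring rule, the function name `jaccard_link_scores` and the sample call graph are assumptions made for illustration; the abstract does not say which method the speaker uses.

```python
from itertools import combinations


def jaccard_link_scores(edges):
    """Score each unlinked pair by shared contacts / all contacts (Jaccard)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already connected, nothing to predict
        shared = neighbors[u] & neighbors[v]
        if shared:
            scores[(u, v)] = len(shared) / len(neighbors[u] | neighbors[v])
    return scores


# Hypothetical call graph: each pair is two subscribers who called each other.
calls = [
    ("ana", "ben"), ("ana", "cy"), ("ben", "cy"),
    ("ben", "dee"), ("cy", "dee"), ("dee", "eli"),
]

scores = jaccard_link_scores(calls)
best_pair = max(scores, key=scores.get)
print(best_pair, round(scores[best_pair], 3))
```

Pairs with the highest scores share the most contacts relative to their combined neighborhoods, so they are the most plausible candidates for a future link.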

Radical Change in 1-to-1 Sales Through Innovative Next Product to Buy Concept

Ketil Sandvand, Head of Customer Intelligence, Telenor

Telenor Norway has implemented an innovative next product to buy concept in all sales channels. This has resulted in more relevant and attractive offers for our customers, leading to increased sales numbers and sales efficiency through a radical change in the way we sell and interact with our customers. In order to achieve this, we had to invest in new analytical competence and tools, and alter our campaign processes with regard to analysis, planning, execution and monitoring of our sales activities.
Level: Intermediate

The Journey to Multimodel Environment: A Turkcell Story From Creation to Measurement

Tamer Cagatay, Senior Business Analytics Specialist, Turkcell

Gonca Gulser, Data Mining Expert, Turkcell

Establishing an intelligent targeting environment for marketing campaigns for numerous products has been a challenge for all companies for years. At Turkcell, the need for quickly designed but acceptably accurate and sustainable models to generate scores is met by SAS Rapid Predictive Modeler. Selected major products were modelled using the SAS solution, and the results are satisfactory, considering the time to market and the integration with Turkcell's current SAS modelling environment. Next, SAS Model Manager will help Turkcell measure the performance of many models regularly by generating tracking reports and alarms, automating scoring tasks, and providing the ability to retrain the models when accuracy is below acceptable thresholds.
Level: Appropriate for all levels of knowledge and experience

Energy and Utilities

Big Data Analytics in the Energy Market: Mining Customer Profiles From Smart Meter Data

Kai Heinrich, PhD Student, Dresden University of Technology

Andreas Hilbert, Prof. Dr., Full Professor and Chair of Business Informatics, Dresden University of Technology

The introduction of smart meter technology generates enormous amounts of data that need to be stored and processed for further usage. We focus on the processing of smart meter data with data mining techniques to identify types of households based on their individual load patterns. The main goal of this analysis is to generate additional knowledge for energy utilities to suit customers' needs and build new business models. To that end, we evaluate different methods for transforming, clustering and comparing smart meter time series. The evaluation is based on simulated data, in order to assess the different methods in terms of big data performance, interpretability and, most importantly, quality of segmentation. We investigate methods of different natures, such as Dynamic Time Warping (DTW), Fourier and wavelet transformations, and statistical comparison of time series models with the help of spectral analysis. We then apply the suitable methods to a real-life data set that was gathered from smart meters at a household level. Afterwards we use visualization techniques in order to present the results to different stakeholders for decision support.
Level: Intermediate
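
To make the Dynamic Time Warping (DTW) comparison mentioned in the abstract above more concrete, here is a minimal sketch of the textbook DTW distance between two load curves. It is not the authors' implementation; the function name `dtw_distance` and the sample load profiles are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences (absolute cost)."""
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat b[j-1] against the next a value
                cost[i][j - 1],      # repeat a[i-1] against the next b value
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


# Hypothetical household load profiles in kW (illustrative values only).
morning_peak = [0.2, 0.2, 1.5, 0.4, 0.3, 0.3]
shifted_peak = [0.2, 0.2, 0.2, 1.5, 0.4, 0.3]  # same shape, one step later
flat_profile = [0.3, 0.3, 0.3, 0.3, 0.3, 0.3]

pointwise = sum(abs(x - y) for x, y in zip(morning_peak, shifted_peak))
print(dtw_distance(morning_peak, shifted_peak), pointwise)
print(dtw_distance(morning_peak, flat_profile))
```

Because DTW may stretch the time axis, two households with the same peak at slightly different hours come out as close, while a pointwise comparison would treat them as quite different. That property is one reason DTW is a common candidate for clustering load patterns.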

Work Area Optimization at a Major British Utility Company

Jeff Day, Sr. Manager, Advanced Analytics R&D, SAS

A British utility company has several thousand service engineers who provide its customers with services that range from performing routine maintenance to handling emergency breakdowns. Each service engineer is assigned to a work area that consists of a set of postal sectors. The company wants to understand how it should configure its work areas to improve customer satisfaction, minimize travel time for its full-time service engineers, and minimize the costs of overtime and subcontractor hours. This presentation describes the use of SAS/OR optimization procedures to model this problem and configure optimal work areas, and the use of SAS Simulation Studio to simulate how the optimal configurations might satisfy the customer service requirements. The experimental results show that the proposed solution can satisfy customer demand within the desired service-time window, with significantly less travel time for the engineers, and with lower overtime and subcontractor costs.
Level: Intermediate

Financial Services, Banking

How to Handle a Large Number of Models

Jan Friis, Senior Business Analyst, BEC

A SAS predictive analytical factory is used to develop, score and monitor PD-models for 27 small banks in Denmark. Models are developed for application and behaviour scoring for different segments. All models are monitored using SAS Model Manager, and the scoring processes are set up using SAS Data Integration Studio. Reports from SAS Model Manager are automated, performed on each bank individually, and distributed to the banks every month. All first-version models are developed outside SAS, but reconstructed in SAS Enterprise Miner in order to ensure traceability and documentation throughout the process. Second versions of all models are developed in SAS Enterprise Miner.
Level: Intermediate

Random Forests and the Risk of Improving Logistic Regression

Gero Szepannek, Dr., Head of Scoring and Rating Models, Santander Consumer Finance

For many business applications of predictive modelling, like credit scoring, logistic regression has been established as the gold standard. Within recent years, an increasing number of competitive methods has been suggested and investigated in the literature. One of the most popular algorithms is random forests, which has recently been implemented within SAS Enterprise Miner.
Level: Intermediate

Stress Tests Automatic Generator

Nicolas Arnoult, Quantitative Analyst, Natixis

The Global Risk Quantitative Department of Natixis uses SAS and Microsoft Office to calculate the impacts on loss given default (LGD), default probabilities (PD), and credit rating migrations in stress testing scenarios. The input interface is simplified by entering an economic hypothesis in Excel. With only one click calling SAS, all the required results and reports (migration matrix, real estate prices, etc.) are provided to the user. This tool makes it easier for Natixis to calculate the risk-weighted assets and the cost of risk. It is convenient and suitable for non-SAS users and prevents operational risks.
Level: Appropriate for all levels of knowledge and experience

Text Mining Customer Complaints: A Solution for a Generic Problem?

Peter Leijten, Manager for Quantitative Analysis of Operational Risk, ABN AMRO

This presentation shows how text mining solved several issues surrounding customer complaints data. Using real, live examples, it shows how unstructured data is turned into valuable information for risk management and the improvement of core processes, and, subsequently, customer excellence. Although the context is the financial industry, this approach is very likely useful for any customer-facing industry with a large customer base. The approach chosen was successfully introduced and is being operationalized for all core bank processes.
Level: Appropriate for all levels of knowledge and experience

The Loss Distribution Approach to Economic Capital Modeling: Applications to the Insurance and Banking Industries

Mahesh Joshi, Principal Research Statistician Developer, SAS

The uncertain nature of adverse events makes statistical models appealing for risk management in the insurance and banking industries. A key component of the risk management process is the economic capital model (ECM), which estimates the aggregate loss that a company expects to incur in a particular time period. This talk presents a loss distribution approach that estimates the frequency and severity models separately and combines them to estimate the compound distribution model (CDM) of the aggregate loss. All modeling steps are computationally intensive and often need to handle big data. SAS/ETS software provides high-performance procedures to implement each step efficiently. The HPCOUNTREG and HPSEVERITY procedures build frequency and severity models, and the new HPCDM procedure simulates the CDM. PROC HPCDM also supports what-if analysis and assesses the impact of model parameter uncertainty on the CDM. This presentation steps through the entire modeling and simulation process, and concludes with a summary of the dependency modeling method that uses the CDM estimates of different business lines to compute risk measures and capital requirements that help your company not only comply with regulations but also assess its liquidity and financial health.
Level: Intermediate
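
The loss distribution approach described in the abstract above can be sketched with a small Monte Carlo simulation: draw an event count from a frequency model, draw one severity per event, and sum them to get one aggregate loss. The sketch below is plain Python, not the SAS/ETS procedures named in the abstract. The Poisson frequency, the lognormal severity, all parameter values and the function names are assumptions chosen for illustration.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson count with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_aggregate_losses(n_periods, lam, mu, sigma, seed=0):
    """Simulate total loss per period: Poisson frequency, lognormal severity."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_periods):
        n_events = sample_poisson(rng, lam)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return totals


def empirical_quantile(values, q):
    """Return the empirical q-quantile using a simple order statistic."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


# Assumed parameters: 3 loss events per period on average, lognormal(0, 0.5).
losses = simulate_aggregate_losses(n_periods=20000, lam=3.0, mu=0.0, sigma=0.5)
mean_loss = sum(losses) / len(losses)
var_995 = empirical_quantile(losses, 0.995)
print(round(mean_loss, 2), round(var_995, 2))
```

The simulated mean should sit near the analytic value lam * exp(mu + sigma^2 / 2), and a high quantile of the simulated totals gives a simple value-at-risk style measure of the kind an economic capital model reports.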

The Voice of the Customer and Text Analytics

Anneke Deetlefs, Senior Manager for Social Media Analytics, Standard Bank

Insight into the consumer's needs, fears, interests and brand loyalty is among the key drivers for advanced analytics, in particular text analytics. The challenge is to collate the various sources of unstructured data and convert them into structured information, allowing for statement relevance, varying text structures, local languages and terminology. Dictionaries and models are created and the results are combined with structured data. The unstructured data of interest includes social media, customer complaints, channel feedback, etc. An overview of the progress thus far is presented, including case studies on complaint root cause analysis, understanding reasons for offer rejection, measurement of above-the-line marketing success and branch performance.
Level: Appropriate for all levels of knowledge and experience

Government

GOTCHA! Improving Fraud Detection Techniques Using Social Network Analytics

Bart Baesens, PhD, Assistant Professor, KU Leuven

Véronique Van Vlasselaer, PhD Researcher, KU Leuven

Data mining algorithms are focused on finding frequently occurring patterns in historical data. These techniques are useful in many domains, but fraud detection requires exactly the opposite. Rather than a pattern repeatedly popping up in a data set, fraud is an uncommon, well-considered, imperceptibly concealed, time-evolving and often carefully organized crime that appears in many types and forms. As traditional techniques often fail to identify fraudulent behavior, social network analysis offers new insights into the propagation of fraud through a network. Indeed, fraud is often not committed by an individual acting alone, but organized by groups of people loosely connected to one another. The use of networked data in fraud detection becomes increasingly important to uncover fraudulent patterns and to detect in real time when certain processes show characteristics of irregular activities. Although analyses focus in the first place on fraud detection, the emphasis should shift towards prevention – detecting fraud before it is committed. As fraud is always evolving, social network algorithms can help keep ahead of new types of fraud and adapt to a changing environment and surrounding effects.
Level: Intermediate

Improving Fraud Detection by Social Inspection

Gaël Kermarrec, Project Manager, Fraud Analyst, Federal Public Service Social Security

Social fraud causes a significant loss of income, affecting the financing of Belgium's social security. The detection of these offenses is increasingly complicated due to the changing signals shown by offenders and the sophisticated new types of fraud. Data mining tools are proving to be a very powerful way to valorise the experience of inspectors. The Federal Public Service (FPS) Social Security is succeeding – with unchanged staffing levels – in detecting significantly more cases of fraud by applying predictive modelling techniques.
Level: Appropriate for all levels of knowledge and experience

Leveraging Ticketing Efficiency Through SAS Text Analytics: An Example From the Public Sector

Michael Jungbluth, Dr., Project Manager and Analytics Expert, SAS

The Federal Central Tax Office in Germany was facing a significant change in IT infrastructure and data management while migrating its existing paper-based income tax card to a centralized electronic system connecting enterprise accounting systems and federal tax and registry offices. It anticipated a significant number of complaints in the transformation process and introduced a ticketing tool to keep track and help solve upcoming challenges in an efficient way. Facing heterogeneous complaints that needed time-consuming input from tax experts, the Federal Central Tax Office used SAS in two ways: (i) SAS Text Miner accurately predicted the ticket category based on a bag-of-words approach and extracted important metadata via contextual extraction. That metadata made automated rule-based pre-processing of tickets possible and kept the necessary headcount of service agents low, while increasing their efficiency through an indication of probable methods of resolution. (ii) SAS established reporting capabilities that supported the early identification of complaint clusters to prioritize countermeasures. From an analytical standpoint this presentation highlights two aspects: (i) The advantages of formal language to extract and assign complex context – i.e., fiscal facts – with SAS Enterprise Content Categorization; and (ii) The use of custom entities in SAS Text Miner to improve and illustrate predictive fit.
Level: Appropriate for all levels of knowledge and experience

Perpetual Profiling Using SAS

Dan Sten Rexen, Senior Advisor, The Danish National Labour Market Authority

The Danish Labour Market Authority is implementing advanced profiling methods for job seekers, with the goal of building a perpetual beta version of profiling. Concern is growing over welfare cost and the long-term consequences of unemployment. But we can do a lot more to encourage, facilitate and support an individual approach for each job seeker. There are many potential benefits of developing a national profiling tool. Evidence shows that earlier intervention can shorten unemployment time and promote sustainable employment. Furthermore, a more effective intervention can save substantial cost in unemployment benefits paid. However, to create yet another profiling tool that merely estimates the risk of long-term unemployment is as old-fashioned as it is ineffective. Such tools have limited accuracy and cannot differentiate between two people in a category. We have a different approach. One of perpetual refinement and correction, and dynamic changes. One based on self-screening, not through dead data, but through actions. And one that resonates with the person on the receiving end, not a mechanical risk calculation that spits out a non-negotiable answer.
Level: Appropriate for all levels of knowledge and experience

Preventing Tax Evasion and Benefits Fraud Through Predictive Analytics

Ian Pretty, Senior Vice President, Capgemini

Today's tax and welfare agencies are increasingly facing new and sophisticated methods of tax evasion and welfare fraud. Increasing digitization means that fraudsters are becoming faster and new types of fraud, such as ID theft, are growing. However, with more and better data available, agencies now have the ability to sharpen their insights at higher speeds. Capgemini's Trouve solution, powered by SAS, helps tax and welfare agencies harness digital to achieve better, faster and cheaper compliance results.
Level: Appropriate for all levels of knowledge and experience

Insurance

Welcome to the Future - The Analytics World of Insurance

Daniel John, Dr., Thought Leader, Information Management, HUK-COBURG

Analytics is a key to success for modern insurance business at HUK-COBURG, one of the leading German insurance companies. Using some remarkable examples, we will show how we use business analytics along the whole life cycle of insurance to provide value. Our applications of analytics are not limited to generating new insights. Instead, we apply analytics to rethink business processes. The main goal is the steering of operational systems. But the idea of automation based on analytics brings a lot of new requirements concerning data, systems and, last but not least, analytic teams. We consider these challenges and our solutions from a management point of view.
Level: Appropriate for all levels of knowledge and experience

Retail

Consumer Analytics Faster, Better: the Consumer DNA Factory

Raphael Cailloux, Head of Brand CRM Marketing Intelligence, Adidas

While working on consumer analytics applications like predictive modeling, most analysts devote a lot of time to peripheral tasks like data preparation (before the analysis) and insights dissemination (after the analysis). Industry experts often report an 80 percent dent in analytical resources. Not only do these tasks deter data scientists from focusing on the actual value-adding insight creation, they also pose many challenges with respect to the quality and timeliness of those insights, and the management of knowledge in general.

Adidas has tackled this issue within a project called Consumer DNA, whose goal is to provide the right information in the right context and the right format to various analytical applications like predictive modeling, campaign analysis, or, further downstream, targeting processes. The Consumer DNA Factory is now at the very core of Adidas marketing intelligence and dramatically enhances its capabilities.

This presentation reveals the magic behind it: the components, functions and (simple) technologies at work in the CDNA Factory. It will also draw the connection to other parts of the analytical framework and show how it can systematically be optimized as a whole.
Level: Intermediate

Credit Scoring and Fraud Detection in Retail: The Story of 10 Years of Risk Analytics at Unigro

Geert Verstraeten, Dr., Managing Partner, Python Predictions

In this session, we highlight credit scoring and fraud detection for a distance-selling retailer with a strategic focus on offering consumer credit. Unigro (part of Groupe 3SI) is a Belgian distance seller with a large product assortment, including hi-fi and electronics, furniture, kitchen articles and body care. The company has a strong focus on consumer credit, and has used SAS technology for more than 10 years to predict and monitor creditworthiness. Given the strategic importance of accurate credit decisions, the company has favored predictive models that combine good predictive accuracy with great interpretability. Using these techniques, since 2006 Unigro has realized a revenue growth of 25 percent whilst reducing credit risk, despite the financial and economic crises and the increase of (riskier) internet orders.
Level: Appropriate for all levels of knowledge and experience

Promotion Forecasting for a Belgian Food Retailer, Delhaize

Maaike Van den Branden, Analytics Team Leader, Delhaize

"Forecasting future events is often like searching for a black cat in an unlit room, that may not even be there" (Steve Davidson, The Crystal Ball). "It is far better to foresee even without certainty than not to foresee at all" (Henri Poincaré, The Foundations of Science). Delhaize has launched a project to improve the availability of promotional products in the supermarkets. These two quotes emphasize both the aim of the project and its difficulty. Making forecasts about promotional volumes that will be sold in the future will not only make our customers happier, but also help the ordering teams to order the right amount of products from the suppliers, and the suppliers to produce and deliver these ordered quantities on time. On the other hand, forecasting the future is a difficult subject. However, it is better to use the knowledge from the past to forecast the future than to ignore what has happened in the past. This presentation outlines how the Analytics department at Delhaize, with the help of 4C Consulting, developed an econometric approach in SAS to improve availability of promotional products in the supermarket.
Level: Appropriate for all levels of knowledge and experience

Shopping and Entertainment: All the Figures of Media Retail

Giovanni Monopoli, Strategic Analysis and Financial Planning Manager, QVC

QVC (Quality, Value, Convenience) ranks second in revenue production out of all US broadcasting channels. In October 2010, QVC launched a new business model in Italy: the combination of entertainment and retail. With live broadcasts 17 hours a day, QVC needs real-time feedback on offer satisfaction to help it know its customers better and offer the right products. Since there are many different platforms and systems behind the scenes of its TV shows, a business analytics platform that's flexible enough to integrate information from multiple areas is essential. As a result, QVC partnered with SAS to develop reporting and analytics capabilities to gauge the performance of communication channels and product preferences, and perform customer segmentation. QVC uses SAS Forecast Studio to generate forecast models for inbound calls. Now, QVC has highly accurate forecasts that enable it to better serve customers – at a lower cost.
Level: Intermediate

The New Analytical Mindset in Forecasting: Nestlé's Approach in Europe and Case Study of Nestlé Spain

Aleyda Andreo, SAS Demand Analyst, Nestlé

Rathindra Soti, Business Solutions Expert, Demand Planning, Nestlé

Forecasting customer orders is key for a food business like Nestlé. In our make-to-stock environment, with a high promotion rate and short product life cycles, reliable demand plans are essential to optimise customer fill rates and on-shelf availability, whilst keeping our inventory levels under control. In this presentation, we will show how SAS Forecast Server will change our forecasting process to become more analytical. With a focus on Nestlé in Europe, we will outline the changes in the forecasting process and the creation of a new role, the demand analyst. Afterward we will explain our rollout approach and how we want to bring the new approach to our markets. Nestlé's demand analyst for Spain will explain how this works in her market. In order to build trust in forecasts and encourage teamwork with demand planners, we have created new tools that we will show in this presentation. We will also explain our planning, implementation, and how SAS is adding value to Nestlé's S&OP process. The first steps in our gradual implementation have been to automate the nonvolatile products, allowing the demand planners to spend more time managing the rest of the portfolio. We will also share information about the pilot project, the first phase of the rollout and the next steps using causal models. The presentation concludes with a look at our European Analytical Competence Network, the collaboration model among demand analysts in Europe.
Level: Appropriate for all levels of knowledge and experience

Life Science

SAS Launch Revenue Optimization

Bahadir Aral, PhD, Senior Operations Research Specialist, SAS

Andrew Pease, Principal Business Solutions Manager, SAS

Pharmaceutical companies face unprecedented pressure to stay profitable in a difficult global pricing environment that increasingly demands the ability to respond swiftly (and smartly) to changing government requirements. A myriad of price referencing rules complicates the already difficult business and mathematical challenges of achieving optimal launch sequences and prices across all markets. This presentation will follow up on the Analytics 2013 Conference presentation with a deeper dive into the SAS approach, including management of the scheduling and pricing components simultaneously while allowing for extensive what-if analysis to identify the most appropriate course of action. The talk will conclude with a demonstration that illustrates the combined value of SAS/OR, SAS Visual Analytics and SAS pricing expertise.
Level: Appropriate for all levels of knowledge and experience

Cross-industry

Application of Data Analytics for National Security and Crisis Management (Lessons from ePOOLICE and ATHENA Projects)

Babak Akhgar, Professor of Informatics and Director of CENTRIC (Center of excellence in terrorism, resilience, intelligence and organized crime research), Sheffield Hallam University

ePOOLICE is a project involving early pursuit against organized crime, while ATHENA encourages new media users to contribute to security in crisis situations. Both apply text mining and analytical methods to extract insights from a variety of open-source intelligence repositories. Both projects aim to inform decision makers through real-time processing of multilingual textual data. At the most basic level, this involves employing text mining to discern insights from these vast, varied content sources.

In addition, more focused techniques are applied to indicate the source's relevance to a particular event or location, and to extract occurrences of specific concepts such as the content's author, originating location, and relation to specific organized crime (OC) activities. In both projects, techniques are being developed to process platform-specific syntax (such as Twitter hashtags), and taxonomies and ontologies are being modelled to identify the relationships between sources indicating emergent OC threats and to assess their strength and credibility.

The intelligence gained can also be applied to combat emergent threats. Further, sentiment is extracted using both rule- and statistically based models to indicate negative or positive "feeling" in geographical communities towards the emergent threat of new OC activity. ATHENA will also monitor public morale towards crisis response, producing strategic intelligence assets to inform decision makers, the public and first responders.
Level: Appropriate for all levels of knowledge and experience

Applications of Text Analytics and Sentiment Mining

Goutam Chakraborty, Professor of Marketing, Oklahoma State University

The proliferation of textual data in business is overwhelming. Unstructured textual data is being constantly generated via call center logs, emails, documents on the Web, blogs, tweets, customer comments, customer reviews and so on. While the amount of textual data is increasing rapidly, businesses' ability to summarise, understand and make sense of such data for making better business decisions remains challenging. This presentation takes a quick look at how to organise and analyse textual data from a large collection of documents. The data holds valuable customer intelligence that can improve business operations and performance. Multiple case studies that use real data and demonstrate applications of text analytics and sentiment mining using SAS Text Miner and SAS Sentiment Analysis Studio will be presented.
Level: Intermediate

Big Text Analytics Using SAS High-Performance Analytics and SAS Visual Analytics

Saratendu Sethi, Senior Director, Advanced Analytics R&D, SAS

Enterprise data assets continue to grow exponentially, and big data adoption has become critical in all industries. SAS currently offers: 1) SAS High-Performance Analytics, a suite of highly scalable, distributed in-memory processing technologies that allow users to process large volumes of data, including unstructured, big text data; and 2) SAS Visual Analytics, a robust business intelligence platform to conduct ad hoc data analysis, visually explore data, and develop reports and dashboards to share insights through the Web and mobile devices. This paper highlights integration and interplay between SAS High-Performance Text Analytics and SAS Visual Analytics by presenting a case study of analysing large text data. We will demonstrate simplified topic generation and exploration in SAS Visual Analytics; the ability to export and customize the topic model into SAS Contextual Analysis for advanced rule and taxonomy definition; and finally, the ability to deploy the model into SAS High-Performance Analytics environments.
Level: Advanced

Discrete-Event Simulation Modeling and Analysis with SAS Simulation Studio

Manoj Chari, Senior Director in the Advanced Analytics R&D Division, SAS

Ed Hughes, Product Manager for SAS/OR, SAS

Discrete-event simulation is a powerful method for analyzing the behavior and performance of operational systems, especially those featuring complex interactions and random inputs. SAS Simulation Studio, a component of SAS/OR, provides a rich and versatile graphical environment for discrete-event simulation modeling and analysis. Valuable insights arise from modeling and assessing different scenarios, comprising changes in operating conditions, configuration choices and resource assignments.

We'll illustrate the benefits of SAS Simulation Studio by looking at three recent customer case studies – clinical trial portfolio planning in pharmaceuticals, staffing in a hospital's intensive care unit, and personnel allocation for home service operations at a public utility. We'll demonstrate the benefits of key features and recent enhancements to SAS Simulation Studio, and we'll also show how discrete-event simulation can be used to evaluate policy decisions arising from optimization modeling.
Level: Appropriate for all levels of knowledge and experience

Explaining the Past and Modelling the Future: Overview of Econometrics Tools From SAS/ETS

Ken Sanford, Senior Research Statistician, Advanced Analytics R&D, SAS

The importance of econometrics in the analytics toolkit is increasing every day. Econometric modelling helps to uncover structural relationships from observational data. This presentation highlights the many recent changes to the SAS/ETS portfolio which give users more power to explain the past and predict the future. We will show and provide examples of how Bayesian regression tools can be used for price elasticity modelling, how state-space models can be used to gain insight from inconsistent time series, how panel data methods help control for unobserved confounding effects, and much more.
Level: Appropriate for all levels of knowledge and experience

Impact of the 11 V's of Big Data on Questions of Ethics, Trust, Stewardship and Governance of Analytics

Richard Self, Senior Lecturer, School of Computing, The University of Derby

Big data is typically defined by the three V's of volume, variety and velocity. However, a range of sources identify additional V's, which, together with the classic three V's, pose a range of very important questions about the governance and delivery of big data analytics. This presentation will use a range of topical examples to identify some of the critical challenges that can be identified by using the 11 V's. Many of the currently unanswered challenges relate to important aspects of information governance. The questions of ethics, trust and stewardship will be directly addressed. An aspect of the overall governance of big data analytics that must be addressed is the predicted worldwide shortage of analytics expertise and whether it can be fully addressed. The questions posed will be challenging and of interest and value to data scientists and analysts, to those who sponsor and manage analytics projects, to those who use the results for the purpose of decision making, and, finally, to training and educational institutions worldwide. The presentation is partly based on the research carried out by 110 of the author's final year students as part of their final semester capstone modules, which addressed the topics of "Big Data for SMEs: Questions of Ethics, Trust, Governance, Security, Audit and Provenance" and "Big Data for SMEs: Questions of Opportunities, Challenges, Benefits and Operations." The best articles will be published in two e-books in the autumn of 2014.
Level: Appropriate for all levels of knowledge and experience

Machine Learning at Scale

Herbert Bucheli, Senior Director, Aduno Group

Wayne Thompson, Manager of Data Science Technologies, SAS

Machine learning helps develop deep insights from data assets faster and with greater precision, leading to an improved bottom line, reduced risk, better customer understanding and more metrics of success. Machine-learning applications are predominantly focused on clustering, classification and prediction. SAS In-Memory Statistics for Hadoop delivers deep inferential statistics and machine learning. You can use the product to manage data, perform exploratory analysis, build models and score data with unlimited amounts of data in Hadoop, the open-source framework considered by many as the new big data storage platform. The in-memory architecture of SAS In-Memory Statistics offers unprecedented speed – a critical requirement for applying analytics to massive amounts of data stored in the Hadoop Distributed File System. This session provides an example-based overview of select machine-learning methods delivered with SAS In-Memory Statistics for Hadoop:
  • Density-based clustering.
  • Feedforward and recurrent neural networks.
  • Association rule mining.
  • Random bootstrap forests.
  • Recommendation systems based on nearest neighbors, matrix factorization and hybrids.
Level: Advanced

Maximizing a Churn Campaign's Profitability With Cost-Sensitive Predictive Analytics

Alejandro Correa Bahnsen, Luxembourg University

Andres Gonzalez, Decision Analytics Manager for Customer Experience, DIRECTV

Predictive analytics has been applied to solve a wide range of real-world problems. Nevertheless, current state-of-the-art predictive analytics models are not well aligned with business needs since they don't include the real financial costs and benefits during the training and evaluation phases. Churn modeling does not yield the best results when it's measured by investment per subscriber on a loyalty campaign and the financial impact of failing to detect a churner versus wrongly predicting a non-churner. This presentation will show how using a cost-sensitive modeling approach leads to better results in terms of profitability and predictive power – and is applicable to many other business challenges.
Level: Intermediate

Meeting Your Modeling Challenges With Big Data: An Ever-Growing Portfolio of High-Performance Procedures

Robert Cohen, Senior Director, Advanced Analytics R&D, SAS

Exploiting large data to build and deploy accurate predictive models often involves some difficult challenges. Among these are building parsimonious models when your data includes a large number of potential predictors and handling predictors whose influence is nonlinear but where the nature of the nonlinearity is initially unclear. This presentation provides an overview of the ever-growing arsenal of high-performance procedures that you can use to tame these and other challenging modeling problems.
Level: Appropriate for all levels of knowledge and experience

Methods, Models, Motivation and More: Recent Developments in SAS/STAT Software

Robert Rodriguez, Senior Director, R&D, SAS

SAS/STAT software is expanding in response to emerging statistical needs in areas as diverse as business analytics, government statistics and clinical trials. This presentation provides an overview of major enhancements in SAS/STAT 13.1 and SAS/STAT 13.2, emphasizing the practical motivation for novel methods and models, the problems they solve and the benefits they offer. You will gain insights about new procedures and features for predictive model building with generalized linear models, quantile regression and multivariate adaptive regression splines; Bayesian discrete choice modelling; analysis of missing data; survival analysis with interval-censoring and competing risks; item response models; and high-performance statistical computing.
Level: Intermediate

Mobile Marketing: Building New Business Models Based Upon Big Data and Real-Time Analytics

Gery Pollet, Chief Executive Officer, ZapFi

We are living in an always-on world. The rapid proliferation of smart devices not only leads to new consumer behavior but also transforms the way we – consumers – interact with brands. Learn from this presentation how big data in combination with real-time analytics is being used by brands and merchants to build more intimate one-on-one relationships with consumers while respecting privacy.
Level: Appropriate for all levels of knowledge and experience

Non-Industry Specific

A New Face for SAS Analytical Clients

Udo Sglavo, Director, Advanced Analytics R&D, SAS

SAS development teams are completing initial releases of the next generation of SAS analytical clients. Building on customers' favourite features with innovative enhancements, we offer a fresh interface for SAS Enterprise Miner and SAS Forecast Server. We will also be taking access to analytical procedures in SAS to the next level. Come see a preview of the latest build of these clients and a road map for future delivery of the analytical client user interfaces.
Level: Intermediate

Big Data, Data Mining and Machine Learning: Value Creation for Business Leaders and Practitioners

Jared Dean, Director, R&D, SAS

In this era of big data, organisations are hungry to make better use of the information at their disposal. A new book, Big Data, Data Mining and Machine Learning: Value Creation for Business Leaders and Practitioners, will help you gain a solid foundation in these essential areas: hardware platforms and their evolution; predictive modeling methods, both statistical and machine learning; and the power of segmentation. Come and hear case studies from several industries; straightforward, applied examples that use library records to make recommendations; and game show results that illustrate the power of text mining. These concepts and more are addressed in this session, which also offers an opportunity to meet the author.
Level: Appropriate for all levels of knowledge and experience

Forecasting Value Added and the Limits of Forecastability

Steve Morlidge, Independent Consultant and Author

Forecasting practitioners realize that there is no such thing as the perfect, 100 percent accurate forecast. Some level of forecast error is unavoidable. But there is now a way to determine how much forecast error is avoidable, and the maximum level of forecast accuracy you are likely to achieve. In this talk Steve Morlidge will describe a simple way of measuring forecast performance that achieves all these things. He will share the results of the use of this approach across a range of industries and outline the practical implications for professional forecasters. Learn how to:
  • Determine the unavoidable level of forecast error.
  • Make objective judgments about the quality of individual forecasts.
  • Make just comparisons between forecasts made for different products, industries and geographies.
  • Assess the value added by a forecast process.
  • Create a useful metric and communicate your results to non-experts.
  • Identify ways to radically improve forecast performance.
Level: Intermediate

What Drives Advanced Analytics R&D at SAS?

Radhika Kulkarni, Dr., Vice President, Advanced Analytics, SAS

The simplest answer to this question is: We do what's needed to enable our customers to solve their most complex problems. This presentation sets the stage for the talks in the R&D track by describing some of the common themes underlying the research initiatives in the analytics areas. We keep one eye on customers and their business problems and the other on the research frontiers, and bring both into focus to develop new features most likely to get the job done. Not only is data growing, but its complexity along multiple dimensions is growing as well. This talk will consider the new opportunities for insight we provide: the scalability offered by distributed computing, methods now feasible with high-performance computing, ways to make sense of unstructured data, and the power to address increasingly complex problems by combining sophisticated, scalable techniques across multiple disciplines.
Level: Appropriate for all levels of knowledge and experience

This list was built using SAS software.