Big Data Analytics - The Revolution Has Just Begun
Each day we read about "big data." Companies are spending more on technology in an attempt to uncover hidden gold to enhance profits and refine customer knowledge. But big data analytics is more than just servers and software. It is a mindset shift. It's a new organizational alignment. It's a change in who we hire. It's even Big Brother on steroids. Done well, big data and the analytics that come with it will forever alter the landscape across virtually every business imaginable. The barriers to success are immense, but so is the promise. Learn why we're still in the first inning, and how smart businesses are positioning themselves for the revolution ahead.
Using Data Mining in Forecasting Problems
In today's ever-changing economic environment there is ample opportunity to leverage the numerous sources of time series data now readily available to the savvy business decision maker. This time series data can be used for business gain if the data is converted to information and then knowledge. Data mining processes, methods and technology oriented to transactional-type data (data not having a time series framework) have grown immensely in the last quarter century. There is significant value in the interdisciplinary notion of data mining for forecasting when used to solve time series problems. The intention of this talk is to describe how to get the most value out of the myriad of available time series data by utilizing data mining techniques specifically oriented to data collected over time; methodologies and examples will be presented.
Session Abstracts
The following sessions will take place at Analytics 2012. This page is updated often, so please check back frequently for the most up-to-date information.
View a list of talks by industry (.pdf): Communications and Entertainment, Education, Financial Services, Government, Healthcare, Retail and Manufacturing, Utilities, and Talks of General Interest.
Using SAS to Calculate Advanced Variance and Other Statistics Based on Complex Weighted Survey Data
Techniques Developed for Surveys of Scientists and Engineers Performed by the National Center for Science and Engineering Statistics, National Science Foundation
This panel will present SAS techniques to calculate and present variances and medians based on complex sample survey data. These calculations were previously performed using specialized tools such as SUDAAN.
The presentation will describe SAS techniques to calculate variance using replicate weights with multiple jackknife coefficients and medians using the cumulative distribution function. In addition, it will describe the construction of data cubes that enable SAS users to integrate the results of these calculations into complex tables constructed using PROC TABULATE. It will also detail performance tuning techniques that enable SAS users to produce some complex variance calculations an order of magnitude faster than with traditional procs and tools.
The presentation will demonstrate working code; attendees will receive working sample programs (SAS).
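For readers unfamiliar with replicate-weight variance estimation, the following is a minimal Python sketch of the general idea, not the presenters' SAS code. It assumes the common jackknife form in which the variance is the sum over replicates of a coefficient times the squared difference between the replicate estimate and the full-sample estimate. The function names, and the choice of a non-interpolated weighted median read off the cumulative weight distribution, are illustrative assumptions.

```python
from typing import Callable, Sequence, Tuple

Estimator = Callable[[Sequence[float], Sequence[float]], float]


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of the survey responses."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Smallest value whose cumulative weight reaches half the total (no interpolation)."""
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("values must not be empty")


def jackknife_variance(
    values: Sequence[float],
    full_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
    coefficients: Sequence[float],
    estimator: Estimator = weighted_mean,
) -> Tuple[float, float]:
    """Return (full-sample estimate, replicate-based variance estimate).

    Assumed form: variance = sum_r coefficient_r * (theta_r - theta) ** 2,
    with one jackknife coefficient per replicate weight column.
    """
    theta = estimator(values, full_weights)
    variance = 0.0
    for rep_weights, coefficient in zip(replicate_weights, coefficients):
        theta_r = estimator(values, rep_weights)
        variance += coefficient * (theta_r - theta) ** 2
    return theta, variance
```

Passing `estimator=weighted_median` reuses the same replicate loop for medians; production survey software typically interpolates the median, which this sketch does not.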
Advanced Analytics at Manheim Auctions
Manheim Auctions is the world's leading wholesale auto auction by size and sales volume, with more than 118 locations worldwide (69 in North America). A subsidiary of Cox Enterprises, Manheim's used vehicle remarketing services include auction lanes, online channels, specialty and heavy truck, salvage, consulting, financing, and exports.
Using examples from the wholesale auto auction market, this presentation will demonstrate how advanced predictive and prescriptive analytics are used in market and customer segmentation, sales and facility location optimization, and lifetime value analysis. The presentation will also highlight Manheim's use of the SAS procedure PROC OPTMODEL, the role of SAS/STAT procedures in vehicle valuation reports, and a short demo on how much your car is really worth.
Analytical Appetizers: A Sampling of Analytical Consulting Projects
Ongoing trends in computer hardware and software include an expanding capability to process increasingly large quantities of data in minimal time. Businesses and information-driven organizations require implementable policies derived from analyses of operational data. There is significant opportunity for applying analytic methodologies to convert warehouses of data into information, thus revealing useful relationships for business advantage. The job of the analytical practitioner is to sift through the data and algorithmically convert individual observations into useful information for decision making. This presentation includes several case studies that demonstrate the analytic mindset and process applied to a variety of industries.
Growth Projections and Product Assortments Across Multiple Stores: Combining Data Analytics and Optimization to Connect Global Patterns with Local Constraints
Firms with multiple branches frequently set growth targets on an ad hoc basis, and disregard branch-specific constraints or advantages. This leads to skewed management-set incentives and results. We use a nationwide plastics wholesaler with over $500 million in revenue to show how data mining helps to find global knowledge from branch sales patterns. We develop metrics and algorithms to overcome well-known data mining problems, which also helps to compare branch performance and provide differentiated growth projections. Finally, we optimize using the global rules from data mining and local market conditions of each branch. We identify stores that are at the top of their game and those that can improve revenue up to 100 percent. The methodology can also determine locations to open new branches. Our solution offers important insights for supply chain designers and merchandising managers on product portfolio selection, including complements versus substitutes and product bundling.
Using Structured and Unstructured Data for Better Weight of Evidence
Claims behavior modeling provides prescriptive decision making for investigators, prioritizing cases and informing overall outcomes. At Accident Fund, we have found that secondary conditions buried in adjusters' notes are important for accurately identifying characteristics that significantly affect predictive model performance. In this presentation, we will examine logistic regression methods for weight of evidence that include mined factors from unstructured text details, illustrating the benefits of this additional data source for better claim triage.
Combined Forecasts: What to Do When One Model Isn't Good Enough
SAS High-Performance Forecasting 4.1 offers a new, innovative process for automatically combining forecasts. Forecast combination, also called ensemble forecasting, is the subject of many academic papers in statistical and forecasting journals; it is a known technique for improving forecast accuracy and reducing variability of the resulting forecasts. By integrating these methods into a single software system, SAS High-Performance Forecasting 4.1 surpasses the functionality of any existing software system that incorporates this capability. This paper describes this new capability and includes examples that demonstrate the use and benefits of this new forecast combination process.
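As a generic illustration of forecast combination (not a description of how SAS High-Performance Forecasting implements it), the Python sketch below averages the forecasts of several models, either with equal weights or with weights inversely proportional to each model's holdout mean squared error. Both weighting schemes and all function names are assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence


def combine_forecasts(
    forecasts: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of several models' forecasts over the same horizon.

    forecasts[m][t] is model m's forecast for period t.
    With no weights given, every model gets equal weight.
    """
    n_models = len(forecasts)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    horizon = len(forecasts[0])
    return [
        sum(w * model[t] for w, model in zip(weights, forecasts))
        for t in range(horizon)
    ]


def inverse_mse_weights(holdout_errors: Sequence[Sequence[float]]) -> List[float]:
    """Weights proportional to 1 / MSE of each model on a holdout sample."""
    mse = [sum(e * e for e in errors) / len(errors) for errors in holdout_errors]
    inverse = [1.0 / m for m in mse]
    total = sum(inverse)
    return [value / total for value in inverse]
```

A model with smaller holdout errors receives a larger weight, so the combined forecast leans toward the historically more accurate model while still averaging out part of each model's individual error.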
Case Study - ING Belgium: Successes and lessons learned in a five-year growth path in predictive analytics
This session will cover ING Belgium's journey toward becoming a convinced user of predictive analytics. While the first successes were well received by a small group of early adopters, they remained sporadic, and the impact of analytics on ING's communication policy was limited. This presentation will illustrate which major developments, organizational and analytical, influenced the usage of predictive analytics throughout the organization, and eventually the success of this business unit. As a result, predictive models are now the main engine of automated campaigns toward a large variety of target audiences and channels. We present the evolution from initial model developments to industrialization of model development and model scoring, automated monitoring and, finally, optimization. From a non-technical point of view, we emphasize our hits and misses in achieving involvement, buy-in, and usage of a battery of predictive models. We conclude by presenting some challenges that still remain.
Risk Factor Correlation Modeling in Finance Using the COPULA Procedure
Measuring and controlling the risk inherent in financial securities is of vital importance. Two keys to risk management are understanding the volatility of economic factors that affect the value of the portfolio, and understanding how changes in those economic factors are related to each other. Recent progress in the mathematical technique of "copula" functions offers a powerful new approach to modeling dependencies among numerous risk factors. This presentation explains copula modeling using the new COPULA procedure and shows examples of using copula models for risk management problems.
Banking Is Retail: Data Without Use Is Overhead
Data without use is overhead. That is a statement that I strongly believe in. Are you enjoying a strong ROI from your data investment? Retailers are collecting a tremendous amount of data from many sources today: POS, online, real estate, etc. The volume of data being stored is only going to increase; some say as much as 50 percent over the next three years. When you include online data such as social media, this is very likely.
There are global retailers making better use of their data through advanced analytics and strategic planning. These retailers are positioned well to take market share and grow their businesses. To battle these data-driven retailers, you need to make better use of your data, to begin implementing fact-based decisions. You can create your own secret weapons driven by the deep data resources that you have been collecting, and use these weapons to gain a positive ROI from your data. You can bridge the gap between retail and banking analytics.
I will show the similarities between retail stores and banking: how cross-sell, next best offer and attrition modeling are applied to both industries and why they overlap; and I will touch on big data and how it will influence decisions.
Business Analytics in Healthcare: A Strategic Agenda
The role of business analytics in the transformation of the health care industry is strategic. Health care represents a substantial portion of the US economy, and reform in the US is an inevitable reality. While challenges to certain Affordable Care Act provisions persist, health care is on the path to reform, and forward-thinking business leaders are preparing to be ahead of the changes.
The most visible trends are driving financial and operational change. In addition to improving efficiency and controlling costs, current trends also involve: quality of care, consumer satisfaction, safety, data management (security, collection, integration, mining and modeling), performance measurement, predictive analytics, collaboration, and electronic health records/electronic medical records. Success will depend on understanding these trends by analyzing the underlying data to make informed and insightful decisions.
Predictive modeling and forecasting provide businesses with new insights into current trends and future opportunities. Selected examples will illustrate successful initiatives using SAS and underscore the importance of data resources. These cases demonstrate the application of SAS software and services as a strategic tool for risk management and health care analytics. Successful organizations place increasing emphasis and reliance on business analytic tools to develop organizational competencies that can become the source of competitive advantages.
Big Data Analytical Trends, Methods and Opportunities Linked to SAS
Big data is creating big opportunities, enabling enterprises to find new ways to harness data to make decisions. Big data is a disruptive technology, as organizations now have access to the type of analytics horsepower that previously could only be performed by expensive, high-performance machines. Third parties are developing analytical methods and visualization techniques to help companies manage and interpret all data types connected with SAS.
Attendees will learn during this session:
- More about the big data analytical trends, how SAS plays a part and where SAS is deficient.
- Three key macro analytic trends and ties to SAS, including: high-performance computing, usability and visualization, and intelligent systems.
- A framework for big data thinking in their enterprises: DataSet, Toolset, Skillset, and Mindset.
- Big data in flight and at rest, two bifurcated use cases that require unique toolsets and mindsets.
Safety Analytics: The Business of Prevention
Workplace safety is key to the sustainability of our most important resource: our people. Workers' compensation and other injury-related costs also give companies a bottom-line motive for keeping employees safe. Safety Analytics is a powerful advanced analytic tool that can illuminate the human behavioral characteristics of workplace accidents in unprecedented detail. We'll discuss:
- How applying analytics to workplace and workforce data can help decrease incident frequency and severity.
- Why an analytical view into employee injury claims can help reduce costs.
- How to make the leap between analytical results and effective action plans to improve results.
Analytic Infrastructure Design Considerations
Did you know that half of productivity gains realized across all industries can be attributed to IT? Going forward, SAS expects analytics to complement IT innovation in driving additional productivity gains. In this session, we'll make sense of the buzz around cloud, big data, high-performance computing and approachable analytics, and share how you can design the right infrastructure to support a range of analytic disciplines that boost productivity and competitive advantage.
Taming Big Data: Is It Actually Possible?
In today's hype-filled, big data world, there is no easy button for big data. In fact, in many ways, big data is going to be quite difficult to tame. Technologies and tools can help, but the onus is still on organizations to develop and implement the required analytical processes as they have always done in the past. Join analytic experts Bill Franks of Teradata and Jeff Livermore of SAS for an interactive discussion regarding the challenges big data is placing on your organization. Franks and Livermore will frame a topic as a conversation starter and then turn it over to the audience to comment or ask questions before moving on to the next topic. Come learn from your peers. You never know what tips you might pick up!
Multi-Stage Behavior Models
This presentation will dive into two multi-stage behavior models that have been or currently are being implemented at Georgia Power to aid collections efforts.
The first model uses a binary logistic regression to prioritize, in an unbiased and objective way, accounts that are eligible for disconnection. This model uses 29 behavioral predictors to model whether a customer will pay their 60-day arrears in full ("success") or not ("failure"). Predicted probabilities of this event are subsequently combined with each customer's 60-day arrears amount in order to obtain an expected loss that is then transformed into a collections score.
Once a customer is no longer receiving electric service, but still owes an outstanding balance, the account is sent to final collection. The second model prioritizes accounts to be pursued by modeling whether a customer will pay any of their outstanding balance, using a binary logistic regression model with 35 demographic and behavioral inputs. These predicted probabilities are multiplied by the outstanding balance and then transformed into a collections score.
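The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Georgia Power's implementation: the abstract does not say exactly how the probability and the arrears are combined in the first model, so the sketch assumes expected loss is the probability of not paying in full times the arrears, and it assumes a simple rank-based 0 to 100 scale for the final score. All function names are hypothetical.

```python
from typing import List, Sequence


def expected_loss(p_pay_in_full: float, arrears: float) -> float:
    """First model (assumed form): probability of NOT paying in full times arrears."""
    return (1.0 - p_pay_in_full) * arrears


def expected_recovery(p_pay_any: float, balance: float) -> float:
    """Second model: predicted probability of any payment times the outstanding balance."""
    return p_pay_any * balance


def rank_scores(amounts: Sequence[float]) -> List[float]:
    """Map amounts to a 0-100 score by rank (assumed transformation).

    The largest amount receives 100 and the smallest receives 0.
    """
    n = len(amounts)
    if n == 1:
        return [100.0]
    order = sorted(range(n), key=lambda i: amounts[i])
    scores = [0.0] * n
    for rank, index in enumerate(order):
        scores[index] = 100.0 * rank / (n - 1)
    return scores
```

Weighting the probability by the dollar amount means a large balance with a moderate risk can outrank a small balance with a high risk, which is the point of scoring on expected amounts rather than on probabilities alone.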
Integrating Online Behavior into Your Analytics Database
The volume of data that can be obtained from Web traffic can be overwhelming. Getting this data into a usable and comprehensive database for analytics is just the first step toward answering some important business questions. This paper will take you through the steps to integrate Chico's Web traffic data into our SAS analytics database, which works well with our existing customer and transaction databases. I will also walk through some of the business cases that can now be addressed as a result of having this data in our SAS analytics environment, including cart abandonment; comparing online browsing with actual purchasing activity; calculating the true time between online orders and shipping; overall experience; the true effectiveness of new customer acquisition by marketing vein; and shopping cart analysis.
Constructing a Credit Risk Scorecard Using Predictive Clusters
Traditionally, cluster analysis has been used as a descriptive tool in which the algorithm creates groups of observations based on their characteristics. This presentation proposes using cluster analysis as part of a predictive algorithm. The methodology is applied by first determining to which cluster a prospective client belongs, and then calculating a specific credit risk scorecard for each cluster. Results will show that this approach performs better than a single scorecard applied to all prospective clients.
Competing on Customer Analytics Using Data Mining Techniques
Many companies have sought a competitive advantage by gaining deeper insights into their customer base. Unfortunately, the first-mover advantage has largely evaporated in most sectors, forcing companies to look for innovative strategies to make better decisions and maximize the value of customer analytics. What's more, as massive amounts of data about all dimensions of customer behaviors become readily available, the need to rapidly incorporate customer data into traditional business and customer intelligence efforts becomes increasingly important. This presentation will outline how sophisticated data mining techniques should be used to create a competitive advantage in the increasingly fragmented marketplace. If you've ever struggled with your customer intelligence efforts, either because you didn't trust the data, didn't understand the data, couldn't get the data you actually needed, or couldn't analyze the data you had been given, come see this presentation.
Missing Data: What to Do With All Those NULLS?
We all know that missing data can cause issues with several of SAS' statistical procedures, but too often we use a one-size-fits-all, automated strategy for replacing the missing data.
Replacing missing values is an essential part of the data preparation and analysis process of model development, but many times it's overlooked.
There are two important questions to ask about missing data: "How much is missing?" and "Why is it missing?" The answers to those questions can determine a strategy for replacing it.
The presentation will include:
- How the one-size-fits-all strategy can lead to models that are unstable and suboptimal.
- How much missing data is too much?
- Strategies for replacing data, depending on why it is missing.
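To make the "how much is missing?" question concrete, here is a small Python sketch that measures the missing rate of a column before replacing anything, and keeps a missing-value indicator alongside the imputed column so a model can still see which values were filled in. The median fill, the 50 percent cutoff, and the function names are illustrative assumptions, not the presenter's recommendations.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple


def missing_rate(column: Sequence[Optional[float]]) -> float:
    """Fraction of entries that are missing (None)."""
    return sum(value is None for value in column) / len(column)


def impute_with_indicator(
    column: Sequence[Optional[float]],
    max_missing_rate: float = 0.5,  # assumed cutoff; choose per variable and model
) -> Tuple[List[float], List[int]]:
    """Median-impute a numeric column and return (imputed values, missing flags).

    Raises ValueError when too much of the column is missing to impute sensibly.
    """
    rate = missing_rate(column)
    if rate > max_missing_rate:
        raise ValueError(f"{rate:.0%} of values are missing; consider dropping the variable")
    observed = [value for value in column if value is not None]
    fill = median(observed)
    imputed = [fill if value is None else value for value in column]
    indicator = [1 if value is None else 0 for value in column]
    return imputed, indicator
```

The indicator column matters most when values are not missing at random: the fact that a value is absent can itself be predictive, and a single fill value would otherwise hide it.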
Predicting Consumer Behavior: Where Creativity and Objectivity Intersect in the Online Automotive Marketplace
There are many decisions that influence how a website is developed. Some of these decisions are driven by objective observation, while others are driven by feel and industry knowledge. In this session, we will focus on using analyses based on objective observation to inform the decision-making process. We will provide examples of how predictive analytics has helped inform decisions for an industry leader.
When Predictive Analytics Meets Smart Grid: Electric Load Forecasting with Geographic Hierarchy
Electric load forecasting, a well-established problem in the utility industry, has received a lot of attention from the forecasting community due to some of the key features of the electricity demand series, such as hourly or subhourly interval, multiple seasonal patterns and high dependency on explanatory variables. Prior to the massive deployment of smart grid technologies, most scientific efforts on electric load forecasting were devoted to forecasting one series at the corporate level. In today's world, many utilities are collecting hourly data from millions of smart meters, a factor which introduces the geographic hierarchy as an additional dimension to the problem. This presentation will answer a challenging question: How can we take advantage of a geographic hierarchy and weather station data to improve the load forecast accuracy?
Applying Data Mining in Raw Materials Forecasting
Accurately predicting the cost of raw materials in the current dynamic business environment has become a must for high profitability. Finding statistically significant economic drivers is critical for raw materials forecasting. This presentation will demonstrate how data mining methods can be applied for effective variable reduction and selection in a large-scale raw materials forecasting project. Several examples will be given and some important lessons learned will be summarized.
Quality Improvement in Managed Care Plans: Can Analytics Drive Performance or Will Health Plans Continue to Roll the Dice?
Most managed health care plans produce very little reporting about clinical outcomes, quality improvement and illness burden across the enterprise. The reporting is generally confined to small distinct patient populations and often stored in silos, inaccessible to many health plan managers. Clinicians are particularly uninformed about the status of their patients.
Health care reform is changing the analytic landscape inside many health care plans. For many enterprises, millions of dollars in revenue are now at risk and tied to the efficient performance of the health plan. Plans that can improve quality and deliver services efficiently to patients with different levels of illness burden will thrive; other plans are likely to struggle. Population-oriented (person-level) analytics are required, but the source data is often inadequate or expressed at many different granularities that cannot be easily integrated.
This presentation will focus on:
- How analytics in the reforming health care system are essential to improving financial performance of health care plans.
- Types of analytics needed by managed care health plans that serve Medicare, Medicaid or individual working-aged members.
- The role of data cleaning and harmonization required for the meaningful application of predictive modeling.
- Generating quality metrics and estimating illness burden at the personal level.
- Why "out-of-the-box" models rarely work in understanding managed health care.
Using Segmentation and Predictive Analytics to Combat Attrition
David Liebskind will cover how to effectively monitor and measure attrition as well as the development of analytical solutions to reduce customer attrition. The presentation will focus on how your business can implement segmentation and predictive models to construct a customer retention framework. Liebskind will also present several case studies that highlight the impact of these analytical tools as well as the development and deployment of a marketing optimization framework.
Exploring the Role of Talent Analytics in Achieving Competitive Advantage
In their seminal 2007 book, Competing on Analytics: The New Science of Winning, Tom Davenport and Jeanne Harris argued the competitive value of analytics in today's global economy. Their underlying premise was that organizations that most aggressively pursued and achieved analytic competency would attain competitive advantage. As a logical follow-up, Davenport, et al. reasoned that analytic methods should also be extended into how organizations manage their greatest asset and largest expense: employees. This presentation will explore the nature of talent analytics and the challenges of pursuing these methods in a business enterprise. Combining perspectives from business and academia, this presentation will focus on how analytic talent is developed, nurtured and evaluated to achieve superior business performance.
Predicting Risks: Utilizing Subject Matter Knowledge, Factor Analysis, and Clustering to Estimate Environmental Hazard
In the world of energy generation and natural resource extraction, the industry leaders own and maintain thousands of properties, both new and historical, used in the production of oil and gas products. Older retired sites can harbor undetermined risk from ill-documented or unknown past spills or accidents. Their challenge is to identify potential risk from observable characteristics and determine where clean-up funds should be targeted to best mitigate potential damage to the local community and environment.
This presentation describes the creation of a model to evaluate environmental risk, the validation of that model using subject matter expertise, and how factor analysis and clustering were applied in order to predict which sites were likely to conceal risks and to determine appropriate remediation funds.
You Can't Manage What You Can't Measure: The Volatility of Lending and its Mitigation
New Zealand's Kiwibank has embarked on a journey to implement a wide array of tools and systems that will allow it to measure credit risk from origination to portfolio through to macroeconomic scenarios. The ongoing market turmoil has impressed upon banks and their regulators the importance of measuring, understanding and mitigating the volatility of lending.
This presentation will discuss:
- Kiwibank - who, why and its place in the market
- Volatility - why measure credit, operational and market risk
- Bank balance sheets and the NZ capital adequacy environment
- Credit risk models - retail and wholesale PD, LGD, EAD and ratings models, origination scorecards, collective provisioning models, regulatory and economic capital methodology
- Credit risk data and data quality challenges
- Capital and loss calculation engines implemented in SAS
- Bankwide data and "big data" models.
Behavioral Analytics and Politics: Strange Bedfellows or a Match Made in Heaven
The use of data analytics in political campaigns is nothing new. However, the recent widespread use of the Internet and social media technologies by voters as well as political campaigns has brought about significant changes, including the use of behavioral analytics. It has been said that effective use of behavioral analytics was essential to the ability of Barack Obama's 2008 campaign to successfully challenge and defeat a well-funded primary opponent, Hillary Clinton, and ultimately to win the general election. Behavioral analytics is rooted in the traditional view of business analytics, where customer data is used to discover patterns in consumers' behavior in order to develop targeted marketing strategies. However, it goes one step further in that its main objective is to affect behavior and not merely to understand it. Behavioral analytics is defined as the use of business analytics specifically targeted to understand and ultimately to modify behavior of an individual voter. It can be used to determine profiles of different voters who would respond positively to a particular campaign message. This presentation provides an in-depth discussion of behavioral analytics and presents guidelines and strategies for its successful development and use, framed by describing a specific, successful application during the 2012 primary season.
Solving Business Problems with Operations Research Techniques in SAS
Operations research (OR) provides the ability to model and solve a wide array of decision making problems that are often referred to as prescriptive analytics. This talk will give an overview of the key techniques and tools that are currently available in SAS/OR and provide a preview of future plans for this product. The discussion will be accompanied by real examples of the customer business problems that these tools have been used to solve. These use cases, which are derived from work done by SAS Advanced Analytics and Optimization Services, will demonstrate how operations research methods work in conjunction with other analytical techniques to address important business challenges, often providing quantifiable business value. Examples will be chosen from recent engagements that have included application to scheduling for cash replenishment in an ATM network, ordering and replenishment of inventory in retail, and allocation of workloads of delinquent loans to customer relationship managers.
Why Analytics Is Worthless: Making the Numbers Tell the Story
Stories talk to us. They surprise us. They motivate us. They become a guiding light for us. Some of the best stories we hear stay with us for a lifetime. Analytics must tell a story: it's not about the numbers, it's about the story the numbers tell.
The traditional basis for competitive advantage has disappeared, so companies need to separate their performance from the pack. Analytics is the differentiator. Use analytics to make better overall decisions and to extract maximum value from established business processes. Or, in other words, use analytics to tell your story better.
Gain the most opportunity and value from analytics by
- knowing which customer relationship to focus on
- determining when your strategic focus needs to be recalibrated
- understanding how to make and implement a value-based segmentation
- boosting customer acquisition processes through data warehousing, questionnaire data and more
- knowing how to keep your customers - and how to keep them happy!
Human Capital Analytics: How to Harness the Potential of Your Organization's Greatest Asset
In this knowledge economy, where the majority of an organization's value is in its people, human capital is a major driver of business success. Workforce demographics are changing, with boomer retirements, the new digital workforce, globalization, ubiquity of mobile technology, and social learning, all of which require understanding our workforce behaviors like never before. Organizations are responding by investing in programs such as leadership development, social learning, diversity initiatives and employee engagement. But which investments have true business impact and which employee populations get the most benefit? For answers, companies must go beyond surveys and dashboards and embrace human capital analytics, which translates business and talent metrics into useful information. Knowledge gleaned from human capital analytics can drive management action and behavior change, and inform key decision making. Even more exciting, organizations can use predictive analytics to look beyond the past and apply statistical certainty to the success of future programs. This interactive presentation walks participants through the measurement process, using client case studies to explore why it is important to measure human capital investments and how to do so most effectively. Tie your human capital programs directly to business impact and financial results, and open the door to a new competitive advantage.
SAS High-Performance Analytics: Big Data Brought to Life on the EMC Greenplum Data Computing Appliance
UnitedHealthcare has partnered with SAS and EMC Greenplum to pilot SAS High-Performance Analytics on the EMC Greenplum Data Computing Appliance (DCA) using multiterabyte-scale data. This presentation will describe the proof-of-concept project to apply high-performance analytics (HPA) to call center and other data in an effort to quickly identify and act on customer service opportunities. Discussion will include functionality and performance metrics of SAS High-Performance Analytics procedures, the new SAS DS2 language, the fast-loading capability of the Greenplum DCA, and the ability to deploy models built on the DCA to other databases. Since some of the most valuable data is unstructured, such as the free-form text notes entered by call center staff, the presentation will describe how SAS Text Miner is used in conjunction with the HPA DCA to include unstructured data in analyses and modeling. There will also be a discussion on UnitedHealthcare's intent to deploy the HPA DCA to provide data analytics as a service (DAaaS) to data scientists across UnitedHealthcare.
Do You Know or Do You Think You Know? Creating a Testing Culture at State Farm
Experimental design is becoming more common in business settings, both in strategic and tactical arenas. Firms are beginning to recognize the broad applications, including initial product design, marketing, logistics, customer retention, profitability analysis, strategy optimization, and website design, just to name a few. Surprisingly, many still resist fully integrating designed experiments into their strategies.
As with any analytic change, driving an organization to adopt strategic testing can be difficult. The presenter will share what he's been doing at State Farm to move the firm from minimal business experimentation to a more ambitious culture of testing, as well as provide a step-by-step guide for changing or creating a testing culture. He will share the simple example from State Farm's online presence that demonstrated the value of multivariate testing to the firm, and also the ambitious comprehensive optimization test this success launched in late 2011.
The presentation will provide some tangible takeaways for business practitioners in the crowd desiring to improve their own strategies through designed experiments, and is expected to be beneficial to both experienced experimenters and those who want to champion testing in their companies but are unsure how to get started.
Increase Efficiency In Your Organization with SAS and IBM Netezza
Attend this IBM Netezza presentation at Analytics 2012 to learn how to:
- Reduce the time of long-running SAS jobs.
- Take advantage of the Netezza appliance to score your SAS models in parallel for lightning-fast scores.
- Free up time for your analysts and statisticians to create more models and additional business value.
- Shrink the time it takes to cleanse and transform your data.
- Learn how SAS is being redeveloped to execute inside a Netezza massively parallel database.
- Use the same SAS interface and syntax you already know to get faster analysis by reducing run time from hours to minutes.
- Understand how Netezza works with SAS Grid Computing and IBM Infrastructure - Power Virtualization, Storage Backup/Recovery and Platform Computing.
Quick Start for Text Analytics
While text analytics solutions offer the hope of beginning to tame the growing information overload, few organizations have the experience to confidently assess what text analytics is, what it can do for their business, or how to select the right tools and right methodology to get the most out of text analytics. Without this experience, organizations can either miss out on a powerful new tool or, worse, pick the wrong tool and the wrong implementation strategy and waste their time and resources without solving their information overload problems.
This session presents a methodology for getting started with text analytics that ensures that organizations will get the most out of text analytics. It consists of three elements:
- An audit that assesses the organization's content and content structure, the information behaviors and needs, and its people and technology. This produces a text analytics strategy.
- A text analytics software evaluation process that helps the organization navigate the ever-changing vendor landscape.
- An initial pilot project that creates the foundation for future text analytics applications development, including initial taxonomy development or assessment and training of internal resources.
Data, Directions and Development: An Inside Perspective on the Growth of SAS/STAT Software
This presentation begins with a behind-the-scenes look at recent development of SAS/STAT software in response to increasingly complex customer data and new directions in the field of statistics. The presentation then takes a high-level tour of important features in recent releases of SAS/STAT (9.22 and 9.3) and coming attractions in SAS. These features include new procedures and enhancements for Bayesian analysis, design and analysis of survey data, exact Poisson regression, finite mixture models, frailty modeling in survival analysis, generalized linear models for over-dispersion, missing data methods, post-fitting inference, quantile modeling, structural equations modeling, and variable selection.
More Than Meets the Eye: Using SAS Text Miner and SAS Sentiment Analysis to Unveil Net Promoter Score
For financial institutions, knowing the extent to which their customers will recommend them is a key testament to how well they are performing. Surveys are one method to acquire such customer data, but Net Promoter Score alone does not necessarily represent an accurate picture. Only through analysis of text commentaries will a bank truly know if its customers would promote its services. This presentation will describe how BBVA Bancomer uses the discovery methods of text mining in conjunction with sentiment analysis insights to gain an in-depth understanding of its customers.
SAS Revenue Management and Price Optimization Analytics: A New Approach to Revenue Optimization in the Travel and Hospitality Industry
Pricing and revenue management techniques are recognized as important components in the management toolkits of successful companies. They make it relatively easy to adjust strategies to changing business environments, which is effective because changes in prices have a direct impact on the bottom line. And they have to be dynamic, since business models and the associated sales channels and conditions can change quickly.
Our customers use SAS solutions to mine through transaction-level data, forecast demand patterns, calculate price sensitivities, and optimize pricing strategies and availability controls. Successful applications require advanced skills in data mining, forecasting, econometrics, and mathematical optimization. SAS supports customers by offering a comprehensive bundle of analytic tools and developing and sharing best practices.
SAS Revenue Management and Price Optimization Analytics is a set of analytic modules that address travel and hospitality industry-specific revenue management and price optimization problems. These industries share the need for advanced analytics that handle pricing for business with fixed capacity, perishable products and time-variable demand, where prices can vary over the booking horizon. In this presentation we will review the business problems that can be addressed with SAS Revenue Management and Price Optimization Analytics and illustrate the SAS approach for customizing this cutting-edge pricing application.
The Path to Stakeholder Enlightenment
This presentation is a follow-up to the well-attended presentation Your Stakeholders Don't Care About R-Squared from SAS Analytics 2011. Stakeholders typically neither understand nor care to understand typical model fit metrics such as r-squared or AIC. Instead, they want to know about the impacts of the models on their processes. This session will dive into the details on how to implement a framework for converting model fit metrics to stakeholder-friendly model performance metrics and process metrics. Examples of the framework in use, and the resulting metrics, will be provided. The design and fine-tuning of models to optimize process metrics, rather than typical model fit metrics, will also be discussed.
SAS Time Series Studio: Understanding Time Series
"Forecasting" immediately brings to mind the development of complex models and generation of forecast values, but equally important (if not more so) in the forecasting process is the analytic step prior to generating forecasts: understanding the structure of your time series data. SAS Time Series Studio is a new Java client in SAS Forecast Server: a graphical user interface for the interactive exploration and analysis of large volumes of time series data prior to forecasting. With SAS Time Series Studio, the forecast analyst has tools for identifying data problems, including outlier detection and management, missing values, and date ID issues. In addition, basic and advanced time series characterization, segmentation of the data into subsets, and structural manipulation of the collective time series (hierarchy exploration) all contribute to faster forecast implementation and better modeling due to increased understanding of the data.
Maximizing Medicare Revenues with SAS Enterprise Miner
Health insurance companies that participate in the federal Medicare Advantage program are reimbursed based on the chronic conditions in the population they cover, which are measured using diagnosis codes. But what happens when a patient has an illness that a hospital or physician's office doesn't code properly? Or when a member does not seek services for that condition in a particular year? Insurers risk losing thousands of dollars per case. Supplementing our current process with SAS Enterprise Miner, Highmark dramatically reduces that risk. This presentation will explain the process for modeling the 70 chronic conditions to predict members with undiagnosed conditions and the results of utilizing the SAS Enterprise Miner predictions versus another methodology.
Data Quality for Analytics and the Consequences If It Is Not as Good as You Thought
Data quality is getting a lot of attention in the market. However, most of the initiatives, publications and papers on data quality focus on classic topics of data quality: elimination of duplicates, standardization of data, lists of values, value ranges and plausibility checks. Meanwhile, there are many aspects of data quality that are specific to analytics so it is important to determine if data is suitable for analysis or not.
While analytics often puts more stringent requirements on data quality, it also offers more options for measuring and improving data, such as calculating representative imputation values for missing values. SAS' offering is perfectly suited to analyze and improve data quality.
Since not all data quality problems can be solved, this paper examines the consequences of poor data quality. Citing simulation studies, the consequences of poor data quality on model performance are examined, thus demonstrating whether analysis should be performed on specific data. Using a business case calculation, the presentation also addresses how to improve data collection and data quality for better forecast quality.
Concentrating on predictive modeling and time series forecasting, the presentation also analyzes several data quality criteria.
The Luminary System: Blending Concept Extraction, Semantic Wikis, Ontologies, and Semantic Data Fusion
We present the Luminary System, a prototype designed to identify and extract relevant content from unstructured text and present it in a summarized form to users. Guided by an OWL ontology, the system extracts relevant entities and concepts using semantic lexical parsing. The Luminary system can parse news articles, Word documents, and other unstructured text, looking for relevant information. It expands any relevant nuggets of information into an OWL object composed of RDF triples. Luminary also harvests information from structured and semistructured event streams, correlating any semantic information in event streams with knowledge discovered in documents. To represent and store the semantic information, the Luminary system automatically creates or updates the relevant information in a semantic wiki, incorporates information known from other sources, and derives new information. The wiki maintains templates, forms and categories for each class of objects within the OWL ontology, ensuring a consistent data model. Luminary approaches the semantic content extraction from a novel viewpoint, blending several different techniques to discover and derive more information. It provides a unique approach, blending many disciplines within the semantic community into a single application architecture.
Big Data Mining Is a Big Deal
Today's businesses are challenged to analyze BIG data volumes to improve and accelerate information delivery. A new era of SAS High-Performance Data Mining has emerged, enabling you to measure and act on data in greater detail than ever before possible. SAS High-Performance Data Mining will deliver massively parallel end-to-end data mining computing completely integrated into SAS Enterprise Miner. A full spectrum demo of SAS High-Performance Data Mining will be presented, covering data selection, exploratory analysis, dimension reduction, linear and nonlinear modeling, and model selection.
How Walmart is Competing on Analytics: Customer/Associate Style!
Walmart has always competed in analytics, been committed to its customers and recognized its associates as its greatest asset. The company is now on a journey to enhance its analytics platforms to focus on Walmart customers, Sam's Club members and Walmart associates.
For all groups, analytics is enabling Walmart to drive relevancy through data mining, exploratory data analysis and predictive analytics - all done at the speed and scale of retail. The retailer quantifies this through a holistic lens by incorporating and quantifying thoughts, beliefs, and measurement. By doing so, Walmart is developing critical linkages between insights and customer/associate connections, where customer/associate IQ and EQ are being nurtured and developed.
Through examples and illustrations, Thorpe and Ormanidou will discuss how Walmart is addressing this: via organizational alignment, recruiting and engagement, and analytical prowess - all reducing decision risk and maximizing value.
Incorporating Association Rules into Predictive Models
Association rules have been around for a long time and are a powerful tool to find relationships among transactional data. They are typically used to find products often purchased together. This can provide insight into what a customer is likely to purchase next. Association rules analysis is usually used as an exploratory, undirected analysis technique when you do not know what patterns to look for. After an association analysis has been run, the analyst sifts through hundreds or thousands of rules looking for interesting insights, which can lead a company to bundle certain products or place products on store shelves close together. Online recommendation systems have grown in popularity using similar concepts that analyze what groups of people have purchased in order to make additional product recommendations.
This is where association analysis typically ends, but we will demonstrate how to extend the use of the association rules and their output statistics (support, lift and confidence) as inputs to other modeling techniques. This allows us to gain additional insight into customer purchases based on the most appropriate association rule. Association rules, when combined with other demographic and behavioral data, can allow you to build more powerful models.
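As a rough illustration of the three output statistics named above, the following Python sketch computes support, confidence and lift for a single rule and then turns the rule into a numeric model input. It is not the presenters' method: the function names `rule_metrics` and `rule_feature`, the toy baskets, and the choice to use confidence as the feature value are all assumptions made for this example.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    baskets = [frozenset(t) for t in transactions]
    if not baskets:
        raise ValueError("at least one transaction is required")
    n = len(baskets)
    n_a = sum(antecedent <= b for b in baskets)                 # baskets with the antecedent
    n_c = sum(consequent <= b for b in baskets)                 # baskets with the consequent
    n_ac = sum((antecedent | consequent) <= b for b in baskets) # baskets with both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def rule_feature(basket, antecedent, confidence):
    """Hypothetical model input: the rule's confidence if the basket
    contains the antecedent, otherwise 0."""
    return confidence if frozenset(antecedent) <= frozenset(basket) else 0.0


# Toy data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence, lift = rule_metrics(transactions, {"bread"}, {"milk"})
features = [rule_feature(t, {"bread"}, confidence) for t in transactions]
```

The `features` list could then sit alongside demographic and behavioral columns as one more input to a downstream model; lift or support could be substituted for confidence under the same pattern.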
A High-Performance Analytics Infrastructure for Data-Intensive SAS Environments
Big data is appearing everywhere. With the volume and variety of data always increasing, organizations are seeking more cost-efficient strategies to store and process vast amounts of unstructured and semi-structured data, increasingly feeding analytic models developed in SAS. Furthermore, many organizations compete based on the speed of their models, needing to run more sophisticated analytics over larger data sets, in less time - all while keeping a lid on IT infrastructure costs.
As users of SAS Grid Manager know, for appropriate types of workloads, distributed computing is a proven strategy for improving performance, containing costs and improving service levels for business users. Much as distributed compute grids make SAS workloads run faster and more reliably, grid computing techniques are increasingly being applied to data-intensive problems as well.
In this session, we'll discuss practical approaches to dealing with big data challenges in SAS environments by explaining how SAS and SAS Grid Manager users can extend their existing analytic infrastructures to run both SAS and big data workloads on a common, shared infrastructure. By taking advantage of unique resource-sharing and low-latency scheduling capabilities, SAS users can run both compute- and data-intensive analytic models faster and more reliably with a reduced investment in infrastructure.
Panel Discussion: Big Data - Hype or Reality
Panel Participants:
- Chris Twogood, Teradata
- Vince Dell'Anno, Accenture
- Tony Hamilton, Intel
What's Your Retention Number? An Approach to Predicting Employee Retention
Companies all over the world struggle with retaining their top employees. When a top employee leaves, the company takes a significant financial hit trying to recover and replace the lost talent. Employee data captured in UltiPro (Ultimate Software's people management solution that unites all aspects of HR, payroll and talent management) can be used to determine factors that might trigger an employee's termination. By using this data in innovative ways, predicting human capital management events, such as employee retention, is now possible. This talk will focus on how using advanced analytics with this data can improve the likelihood that a company retains its best talent.
Risk Adjustment Methods for Health Care Provider Performance Measurement
Measurement of health care providers' quality and efficiency of care is becoming more prominent with health reform and greater consumer demand for affordable health care. Measuring provider processes and outcomes is critical to improving the delivery of care. However, significant challenges exist in conducting comprehensive, reliable and valid measurement of health care providers. This session will provide insight and lessons learned on measurement issues related to case-mix and risk adjustment when using claims-based measures of quality and efficiency.
Topics to be covered include: disease-specific versus general risk models; rate standardization versus regression for case-mix adjustment; choice of severity markers; defining peer groups for provider benchmarking; and software packages that support risk adjustment (e.g., Prometheus, Symmetry).
Case Study: Using SAS Enterprise BI for Multidimensional Analytics of SAS' Private Cloud Computing Environment
The increasing pressure to justify and explain costs, utilization, and capacity for a complex computing cloud to a CIO, CFO, or IT customer demands a holistic view of the system. This view must convey critical intelligence by explaining who or what is spurring consumption, what that consumption is costing, where it has been, and where it is likely to go. This panoramic view calls for the multidimensional analytic capabilities of SAS Enterprise BI. Information is gathered from multiple sources within SAS' private cloud, formulated into a SAS data mart, correlated, and then analyzed using SAS Enterprise BI.
The presentation will show the vital intelligence that has been derived from the multidimensional analytics and other examples of the analytic results. It will also give an overview of how SAS Enterprise BI has been used to implement an analytic appliance that reaches into dimensions of the cloud (storage, servers, network, HR systems, finance systems, etc.) to form a business-oriented view of resource consumption and to forecast the future.
Optimal Binning of Quantitative Inputs for Business Analytics Modeling Using the SAS System
Quantitative inputs are routinely used in statistical and data mining techniques in business analytics. Often, issues related to the existence of outliers, unusual distributions, and the assumption of linearity implicit in many models suggest that "binning" or categorizing these inputs into binary or multicategory expressions is best for production-level use (Berry & Linoff, 2000). This type of variable transformation is useful in that it simply obviates issues with outliers, unusual distributions, and the linearity assumption. Unfortunately, this binning is most often performed arbitrarily or, at best, based on the analyst's "best guess." Here we introduce a SAS macro program that uses a decision tree-style algorithm to perform this transformation using the data at hand to optimize the relationship between the quantitative input and a single target variable. The OBINNING macro can be used as a preliminary modeling step with either continuous or categorical target variables. In the case of continuous targets, OBINNING uses the ANOVA F-test method for maximizing differences between the bins of the transformed quantitative variable with respect to the target. Furthermore, the OBINNING macro can accept a large number of variables to be transformed at once. Finally, OBINNING can also be used to collapse non-significant levels of ordinal and/or nominal inputs together, thereby simplifying subsequent models.
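To make the idea of F-test-driven binning concrete, here is a minimal Python sketch of one tree-style step: it scans candidate cut points on a quantitative input and keeps the one that maximizes the one-way ANOVA F statistic of a continuous target across the two resulting bins. This is not the OBINNING macro, whose internals are not described here; the names `f_statistic` and `best_split`, the `min_bin_size` parameter and the toy data are assumptions for illustration. A fuller version would presumably apply the same step recursively within each bin.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of target values."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    if ss_within == 0:
        return float("inf")  # bins separate the target perfectly
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def best_split(x, y, min_bin_size=2):
    """Return (cut_point, F) for the two-bin split of x that maximizes F on y."""
    pairs = sorted(zip(x, y))
    best_cut, best_f = None, 0.0
    for i in range(min_bin_size, len(pairs) - min_bin_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two identical input values
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        f = f_statistic([left, right])
        if f > best_f:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_f = f
    return best_cut, best_f


# Toy data, invented for illustration: the target jumps between x=4 and x=10.
x = [1, 2, 3, 4, 10, 11, 12, 13]
y = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
cut, f_value = best_split(x, y)
```

On this toy input the chosen cut falls midway between 4 and 10, which is where the target mean shifts. The `min_bin_size` guard is one simple way to avoid bins too small to estimate a mean.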
Insights to the Value of Consumer Marketing Data
Most organizations measure success by how well they manage customer relationships. And often that comes down to their ability to maximize valuable customer data. In today's omnichannel environment, marketers must rethink how they leverage and analyze customer data to make better, more accurate business decisions. Marketers must move beyond demographics and area-level data to sustain a business.
In this presentation, learn about how key data measures - such as coverage, relevance, redundancy, accuracy, verification, predictive power, and level of specificity - play a role in determining consumer data value. Understand which applications are crucial to help you glean insight from transactional and compiled data and how to characterize potential gains from new data sources.
Explaining Analytics to Others Simplified
The widespread use of analytics throughout business creates a greater need to present and explain the analytical products. For the analytics to be useful in the decision making process, managers and executives must not only understand what is presented to them but develop a comfort level and ownership of the results.
In this presentation, a 30-year veteran of analytical consulting presents a simple set of field-tested rules to achieve these goals. After outlining some traditional expected approaches, some important additions and some unexpected new thoughts will be presented. These techniques work regardless of the audience's previous analytical experience.
For the analyst, learning and applying these techniques can be as helpful as any new analytical process. For the manager or executive, asking your staff to use these techniques will improve overall productivity.
Agenda
The full Analytics 2012 agenda will be available in August. Below is an outline of the agenda.
Monday, October 8
|7:30 a.m.||Registration Open; Breakfast in Exhibit Hall|
|8:30 - 8:45 a.m.||Welcome from Conference Co-Chairs|
|8:45 - 9:45 a.m.||General Session Keynote|
|9:45 - 10:00 a.m.||Break; Exhibit Hall Open|
|10:00 - 11:00 a.m.||General Session Keynote|
|11:00 - 11:30 a.m.||Break; Exhibit Hall Open|
|11:30 a.m. - 12:30 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|12:30 - 1:45 p.m.||Lunch|
|1:45 - 2:45 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|2:45 - 3:00 p.m.||Break; Exhibit Hall Open|
|3:00 - 4:00 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|4:00 - 4:30 p.m.||Break; Exhibit Hall Open|
|4:30 - 5:30 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|5:45 - 7:15 p.m.||Conference Reception in Exhibit Hall|
Tuesday, October 9
|7:30 a.m.||Registration Open; Breakfast in Exhibit Hall|
|8:15 - 8:45 a.m.||Welcome from Conference Co-Chairs|
|8:45 - 9:45 a.m.||General Session Keynote|
|9:45 - 10:00 a.m.||Break; Exhibit Hall Open|
|10:00 - 11:00 a.m.||General Session Keynote|
|11:00 - 11:30 a.m.||Break; Exhibit Hall Open|
|11:30 a.m. - 12:30 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|12:30 - 1:45 p.m.||Lunch; Roundtable Discussions|
|1:45 - 2:45 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|2:45 - 3:00 p.m.||Break; Exhibit Hall Open|
|3:00 - 4:00 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
|4:00 - 4:30 p.m.||Break; Exhibit Hall Open|
|4:30 - 5:30 p.m.||Featured Speaker||Topic-specific Breakout Sessions|
*Agenda is subject to change
The call for posters is now closed.
Here's your chance to be recognized by the analytics community for your work in the field. Poster presentations provide an excellent opportunity for students and analytics practitioners to present their projects in a one-to-one setting and receive professional feedback from leaders in the field of analytics.
The Analytics 2012 Poster session is open to all analytics practitioners from corporate or academic fields.
Posters will be located inside the Exhibit Hall and accessible throughout the conference. We request that authors make themselves available during dedicated Exhibit Hall hours to speak with attendees and answer questions.
View the list of posters presented at Analytics 2011.
Student Poster Contest
Win a scholarship for a free trip to Analytics 2012! The six most impressive poster abstracts submitted by students will be awarded an all-inclusive trip to Las Vegas to attend Analytics 2012. The prize includes airfare, hotel, meals and a free conference registration. A committee will judge the abstracts and decide the winners by September 7, 2012. You must be a full-time student at an accredited university or college to be considered. Students who were full-time students in the 2011-2012 academic year are also eligible. For consideration, abstracts must be received by August 22, 2012 and final posters must be received for judging by August 30, 2012. Read our official contest guidelines for more information.
Poster submission guidelines:
- To participate, attendees must submit a poster abstract
- The abstract must include a description of how you have used analytics to improve your processes and/or analyze your work
- You must define your problem/research goal and show the application of analytics methodology
- You must be able to document the steps and show your results.
- The content of the poster must be either a class assignment (non-research), a research project, or a business application
- Poster abstracts must be 250 words or fewer
- No abstracts will be accepted after August 22, 2012.
- For students participating in the contest, posters will be judged by a committee and applicants will be notified by September 7, 2012. Final posters must be received by August 30, 2012 for judging.
- SAS will provide a display board and a header denoting the poster title and author. SAS will print the header and poster for display at the conference.
- Poster presentations that are accepted from academia (faculty and full-time students) will allow the primary presenter to attend the conference for free. You must be currently enrolled in or employed by an accredited university or college to be eligible for the free conference registration. After being informed of the poster's acceptance, please simply note in the "additional comments" field of your registration your participation as a poster presenter and your college affiliation. (You will be required to fax a letter with your department head's signature as verification of your affiliation.)
- Poster presentations that are accepted from the business world will allow the primary presenter to attend the conference at the early bird rate ($500 off the regular fees). After being informed of the poster's acceptance, please simply note in the "additional comments" field of your registration your participation as a poster presenter.
Roundtables
Numerous roundtable discussions will take place on Tuesday during lunch. A wide variety of topics will be discussed. Take advantage of this opportunity to talk to an expert in a small setting.
Exhibit Hall
Meet representatives from top technology companies to learn about the latest products and services that can move your organization forward in the world of analytics. Enjoy complimentary breakfasts and morning and afternoon breaks, located in the Exhibit Hall. An Internet café will also be available. View a listing of current exhibitors and sponsors or become an exhibitor or sponsor of Analytics 2012.
Poster Session
Be sure to visit the poster session inside the exhibit hall to view the innovative ways others are using analytics to solve real-world problems.
Bookstore
While visiting the Exhibit Hall, stop by the bookstore to browse the latest titles from SAS Press. Book topics include data mining, SAS software and more. Attendees will receive a 20% discount on all purchases!
Monday Evening Reception
Don't miss out on the networking event of the year. Socialize with attendees, exhibitors, conference presenters and staff at our annual reception, located in the Exhibit Hall. A delicious array of appetizers and beverages will be served.
Demo Theater
Located in the Exhibit Hall, the Demo Theater is a great way to learn more about products and solutions offered by SAS and Analytics 2012 sponsors. The Demo Theater agenda will include several short technical presentations. These presentations will take place during conference breaks. View schedule.
Sixth annual Data Mining Shootout, presented by SAS and The Institute for Health and Business Insight
Listen to presentations from the winners of the Data Mining Shootout where students competed to solve a complex predictive modeling case study. Interested in competing in the Data Mining Shootout? Submissions are being accepted until July 9, 2012.
Roundtable Discussions
The following roundtable discussions will take place on Tuesday, Oct. 9 during lunch. Additional topics will be added weekly. No pre-registration is required to participate.
- Information Management in Big Data Analytics Platform
Bheeshma Tumati, Deloitte
- What's "in" to Managing Big Data Analytics: the Impact of In-Database and In-Memory for the SAS User
Tho Nguyen, Teradata
- The Analytic Challenges of Unstructured Data
Chris Twogood, Teradata
- Considerations for Determining Analytic Architecture(s) within your Organization
Mike Rote, Teradata
- High Performance Analytics Infrastructure for Structured and Unstructured Data, Q&A
Rohit Valia, Platform Computing/IBM
- Enabling the future of your enterprise with SAS Grid Manager
Adam Diaz, Platform Computing/IBM
- Engineering Analytics
Frank Payne, PQC International, Inc.
- Capital Measurement and Business Use Under Basel III
Dave Morgan, Kiwibank
- Analytic Training and Development
Gene Grabowski, SAS
- Marketing Analytics
Wouter Buckinx, Python Predictions
- New Risk Modeling
Glenn Bailey, Manheim Auctions
- Analytics in Operations/Supply Chain
Sudip Bhattacharjee, University of Connecticut
- Customer Analytics
Paul Grasso, Chico's FAS, Inc.
- Model Management and Validation
Wayne Thompson, SAS
Ivan Oliveira, SAS
- SAS in Financial Services
Dudley Gwaltney, SunTrust Bank
- Data Quality for Analytics
Gerhard Svolba, SAS
- Social Media Analysis and Beyond
Tom Reamy, KAPS Group, LLC
- Customer Segmentation
Goutam Chakraborty, Oklahoma State University
- Discovering, Characterizing and Predicting Fraudulent Motor Vehicle Insurance Claims Using SAS Analytics
Mark Schneider and Chase Zieman, Louisiana State University
- A Combination of Methodologies in Credit Scoring: Using Logistic Regression and Cluster Analysis to Improve Profitability
Chris Cusumano, Kennesaw State University
- Multistep Sales Forecasting in the Automotive Industry Based on Structural Relationship Identification
Akkarapol Sa-ngasoongsong, Satish T.S. Bukkapatnam, Jaebeom Kim, Parameshwaren S. Iyer and R.P. Suresh, Oklahoma State University
- P&C Underwriting Economic Capital Model
Alan Kessler, Research and Development Center, University of Illinois
- Analytics in Profiling Winning NBA Teams
Michelle Mancenido, Arizona State University
- Should We Audit or Not? Predicting Assessments at the IRS
Matthew Lawrence, University of Alabama
- Effective Analytics
Emmett Cox, BBVA Compass
- The Value of Deep Data
Denis Cremisio, Epsilon
- SAS and Hadoop
Alex Infanzon, Greenplum, A Division of EMC
- The Commoditization of Models
Matthias Kehder, Modern Analytics
- Predicting Risks: Utilizing Subject Matter Knowledge, Factor Analysis and Clustering to Estimate Environmental Hazards
Don Monson, Deloitte
- Human Capital Analytics
Gene Pease, Capital Analytics
- Student Programs at SAS
Julie Petlick, SAS
- Social Media Resources for Education
Mantosh Sarkar, Oklahoma State University
- Monetizing Big Data
Christopher Stephens, Greenplum, A Division of EMC
- Modeling in Property and Casualty Insurance
Frank Travisano, Chubb Insurance
- Business Value of Analytics and Its Delivery
Robert Woodruff, SAS
- Consumer Marketing Data
Peter Zajonc, Epsilon
- Maximizing Efficiency and Performance of SAS with Netezza
Tracy Zerbin, IBM