Featured news from SAS.



Things are upside down here in the States, weatherwise. Eighty-three degrees in the Raleigh area as I am writing this, while the Boston area braces for its third snowfall in a matter of days. Presumably the weather was more appropriate for the Boston Tea Party, which occurred in December 1773, although those original Patriots were a hardy lot and I doubt a little snow would have stopped them. I automatically searched to learn the actual weather for that day, but of course the usual sites only go back to 1920 or so. I did learn that there are troves of weather-related data at the National Archives, so presumably you could dig out information going back to the 1700s. It looks like there are substantial nautical data for ships weathering storms.

We statisticians love our data! If anyone knows the value of accessible data, it’s us.

Did I mention that the Patriots won the Super Bowl?  

Note that early-bird registration ends February 27 for SAS® Global Forum 2017, which will take place April 2–5 in Orlando. So that you can plan your schedule for the rest of this year, regional SAS user conferences will be held in the fall in Long Beach, St. Louis, and Raleigh, and the SCSUG Educational Forum will be held in Dallas.

And note that SAS Statistical R&D is exhibiting at the Conference on Statistical Practice and ENAR in the next few weeks, so please come by and see us.

In the spirit of earlier times, I thought I would highlight some of the most popular technical papers on the SAS website downloaded in the last quarter of 2016, both SAS and customer authored. If you missed these gems the first time around, they still offer plenty of value.

Hope your year is going well so far.

Maura Stokes

Senior R&D Director, Statistical Applications 



Technical Highlights

SAS Global Forum 2017

About SAS Global Forum 2017, Statistically Speaking 

If you are still debating whether to attend this year, consider what you might miss:  

Statistical Tutorials on Sunday include R&D instruction on Bayesian hierarchical modeling, weighted GEE analysis, power and sample size computations, structural equation modeling, penalized regression methods, and advanced ODS graphics examples.

Over a dozen presentations by SAS staff on new capabilities, such as estimating causal effects from observational data, ROC analysis with the PHREG procedure, propensity score analysis, advanced hierarchical Bayesian models, factorization machines, spatial econometric models, stacked ensemble models, and automated hyperparameter tuning for effective machine learning.

Numerous emerging technology presentations for a glimpse into future software from SAS for analytics in areas such as cognitive computing, new offerings on the SAS® Viya™ platform, deep learning, and open SAS®.

Over 40 scheduled “super demos” on the exhibition floor, ranging from what’s new in SAS® University Edition to the RAREEVENTS procedure in SAS/QC® to the new data structures in the SAS/IML® language.

Ready access to R&D staff.

Practical presentations by other users. A few talks that caught my eye concern fitting flexible models for longitudinal data by using the NLMIXED procedure, geospatial analysis, and what statisticians should know about machine learning.

You can browse all the presentations to see what else might interest you. 



The DO Loop

Distinguished Research Statistician Developer Rick Wicklin shows you how to simulate many samples from a linear regression model and how to solve mixed integer linear programming problems by using either the OPTMODEL procedure in SAS/OR® or the MILPSOLVE function in SAS/IML.


Are You a SAS/STAT User Still Moving to the SAS® 9.4 Platform? 

If you are moving up to SAS 9.4 and would like to catch up on the recent SAS/STAT releases on that platform, this handout is for you! Get an overview of our new additions in missing data analysis, modern survival data analysis, statistical modeling, spatial point pattern analysis, Bayesian analysis, item response analysis, classification and regression trees, and performance enhancements. There’s truly something here for everyone. And if you aren’t currently on the move, feel free to use this handout however it helps you get into the passing lane! 


Graphically Speaking

Another popular blog that focuses on SAS ODS Graphics is Graphically Speaking. Distinguished Research Statistician Developer Warren Kuhfeld has written several posts that might be of interest. Check out the ones on editing the template that PROC SGPLOT writes, a title change macro, and basic and advanced axis tables.


Top Technical Papers of 2016


Chi-Square and T Tests Using SAS: Performance and Interpretation

Check out this popular paper from Jennifer Waller and Maribeth Johnson.

Data analysis begins with data cleanup, calculation of descriptive statistics, and examination of variable distributions. Before more rigorous statistical analysis begins, many statisticians perform basic inferential statistical tests such as chi-square and t tests to assess unadjusted associations. These tests help guide the direction of the more rigorous analysis. Using example data, the paper shows how to perform chi-square and t tests, examines how to interpret the output, discusses where to look for the association or difference based on the hypothesis being tested, and proposes next steps for further analysis.


Measures of Fit for Logistic Regression

Author Paul Allison provides great information here. 

One of the most common questions about logistic regression is “How do I know if my model fits the data?” There are many approaches to answering this question, but they generally fall into two categories: measures of predictive power (like R-square) and goodness-of-fit tests (like the Pearson chi-square). This presentation looks first at R-square measures, arguing that the optional R-squares reported by PROC LOGISTIC might not be optimal. Measures proposed by McFadden and Tjur appear to be more attractive. As for goodness of fit, the popular Hosmer and Lemeshow test is shown to have some serious problems. Several alternatives are considered. 
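
For readers who want to see what the McFadden and Tjur measures compute, here is a minimal Python sketch. It is not taken from the paper and is not PROC LOGISTIC output: the responses and predicted probabilities are made-up illustrative values, and the function names are hypothetical. McFadden's measure compares the model log likelihood with that of an intercept-only model, and Tjur's measure is the difference between the mean predicted probability for events and for nonevents.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log likelihood for 0/1 responses y and predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def mcfadden_r2(y, p):
    """1 - lnL(model) / lnL(intercept-only model)."""
    p_null = sum(y) / len(y)  # intercept-only model predicts the overall event rate
    ll_null = log_likelihood(y, [p_null] * len(y))
    return 1.0 - log_likelihood(y, p) / ll_null

def tjur_r2(y, p):
    """Mean predicted probability for events minus that for nonevents."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)

# Illustrative values only (assumed, not from any fitted model)
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]

print("McFadden R-square:", round(mcfadden_r2(y, p), 4))
print("Tjur R-square:    ", round(tjur_r2(y, p), 4))
```

In practice the predicted probabilities would come from your fitted logistic model rather than being typed in by hand.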



Introduction to Predictive Modeling with Examples

David Dickey presents a great introduction to what is meant by predictive modeling today.

Predictive modeling is a name given to a collection of mathematical techniques that have in common the goal of finding a mathematical relationship between a target, response, or “dependent” variable and various predictor or “independent” variables, with the goal of measuring future values of those predictors and inserting them into the mathematical relationship to predict future values of the target variable. Because these relationships are never perfect in practice, it is desirable to give some measure of uncertainty for the predictions, usually a prediction interval that has some assigned level of confidence like 95%. Another task in the process is model building. Typically there are available many potential predictor variables, which you might think of in three groups: those unlikely to affect the response; those almost certain to affect the response and thus destined for inclusion in the predicting equation; and those in the middle, which might or might not have an effect on the response. For this last group of variables, techniques to test whether to include those variables have been developed, and research on this “model building” step continues today. This paper addresses some basic predictive modeling concepts and is meant for people new to the area. Predictive modeling is arguably the most exciting aspect in the emerging and already highly sought-after field of data analytics.



Creating and Customizing the Kaplan-Meier Survival Plot in PROC LIFETEST

This paper is a must for those working in survival analysis. 

If you are a medical, pharmaceutical, or life sciences researcher, you have probably analyzed time-to-event data (survival data). One of several survival analysis procedures that SAS/STAT® provides, the LIFETEST procedure computes Kaplan-Meier estimates of the survivor functions and compares survival curves between groups of patients. You can use the Kaplan-Meier plot to display the number of subjects at risk, confidence limits, equal-precision bands, Hall-Wellner bands, and homogeneity test p-value. You can control the contents of the survival plot by specifying procedure options with PROC LIFETEST. When the procedure options are insufficient, you can modify the graph templates with SAS macros. This paper provides examples of survival plot modifications using procedure options, graph template modifications using macros, and style template modifications.
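
As a reminder of what a Kaplan-Meier estimate is, the following Python sketch computes the product-limit estimator by hand. It is an illustration only and does not reproduce PROC LIFETEST or anything from the paper; the function name and the tiny data set are hypothetical. At each distinct event time, the running survival estimate is multiplied by one minus the number of events divided by the number of subjects still at risk, and subjects censored at a given time are treated as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_estimate), ...] for the product-limit estimator.

    times  : observed follow-up times
    events : 1 if the event occurred at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose observed time equals t (events and censored)
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # these subjects are no longer at risk after t
    return curve

# Hypothetical data: five subjects, two of them censored
example_times = [2, 3, 3, 5, 8]
example_events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(example_times, example_events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

The sketch gives point estimates only; the confidence limits, bands, and tests mentioned above are not computed here.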


Item Response Theory: What It Is and How You Can Use the IRT Procedure to Apply It

This new procedure from SAS/STAT went up the hits chart very quickly! 

Item response theory (IRT) is concerned with accurate test scoring and development of test items. You design test items to measure various kinds of abilities (such as math ability), traits (such as extroversion), or behavioral characteristics (such as purchasing tendency). Responses to test items can be binary (such as correct or incorrect responses in ability tests) or ordinal (such as degree of agreement on Likert scales). Traditionally, IRT models have been used to analyze these types of data in psychological assessments and educational testing. With the use of IRT models, you can not only improve scoring accuracy but also economize test administration by adaptively using only the discriminative items. These features might explain why in recent years IRT models have become increasingly popular in many other fields, such as medical research, health sciences, quality-of-life research, and even marketing research. This paper describes a variety of IRT models and shows how to fit them with the IRT procedure.




CONTRAST and ESTIMATE Statements Made Easy: The LSMESTIMATE Statement

This is a paper to read again and keep around for reference. 

In many SAS/STAT modeling procedures, the CONTRAST and ESTIMATE statements enable a variety of custom hypothesis tests, but using these statements correctly is often challenging. The new LSMESTIMATE statement, available in 10 procedures in SAS/STAT 9.22, greatly simplifies the use of these statements. The LSMESTIMATE statement enables you to sidestep parameterization issues and to specify custom tests in terms of population quantities of direct interest (the LS-means). The LSMESTIMATE statement also implements a new nonpositional syntax for specifying contrasts. This paper discusses these new features and demonstrates them with examples from actual user questions to the Statistical Procedures group in SAS Technical Support.



Tech Support Points Out


Estimating Relative Risks in a Multinomial Response Model

There are two types of relative risks that might be of interest when you are modeling a multinomial response. You might want to compare two populations with respect to an individual response level probability (P(Y=i|X=j)/P(Y=i|X=k)), or you might want to compare response level probabilities in a given population (P(Y=i|X=j)/P(Y=k|X=j)). Both situations are discussed in this note. In the multinomial case, relative risk estimates are nonlinear functions of the parameters in a generalized logit model, which can be fit using PROC LOGISTIC and a macro, the CATMOD procedure, or the NLMIXED procedure.
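
To make the two ratios concrete, here is a small Python sketch, assuming a generalized logit model with one predictor X and three response levels, the last of which is the reference. The parameter values and the function name are hypothetical and are not taken from the note. The sketch turns parameters into response probabilities and then forms both kinds of relative risk, which shows why the estimates are nonlinear functions of the parameters.

```python
import math

def response_probabilities(intercepts, slopes, x):
    """Response level probabilities under a generalized logit model.

    For each nonreference level i: log(P(Y=i|x) / P(Y=ref|x)) = intercepts[i] + slopes[i] * x.
    The reference level is returned last and has linear predictor 0.
    """
    etas = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

# Hypothetical parameter values for the two nonreference levels
intercepts = [0.5, -0.2]
slopes = [1.0, 0.4]

p_x1 = response_probabilities(intercepts, slopes, 1)  # population X=1
p_x0 = response_probabilities(intercepts, slopes, 0)  # population X=0

# Compare two populations on one response level: P(Y=0|X=1) / P(Y=0|X=0)
rr_between = p_x1[0] / p_x0[0]

# Compare two response levels within one population: P(Y=0|X=1) / P(Y=2|X=1)
rr_within = p_x1[0] / p_x1[2]

print("Between-population relative risk:", round(rr_between, 4))
print("Within-population relative risk: ", round(rr_within, 4))
```

This produces point estimates only. Standard errors and confidence limits require additional work that the sketch does not attempt.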


Talks and Tutorials


Upcoming Conferences

2017 American Statistical Association Conference on Statistical Practice 
Feb. 23–25, 2017
Jacksonville, FL

ENAR 2017
Mar. 12–15, 2017
Washington, DC

SAS Global Forum 2017
Apr. 2–5, 2017
Orlando, FL

May 14–17, 2017
Baltimore, MD




SAS® Statistics and Operations Research News


SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © SAS Institute Inc. All rights reserved.