Featured news from SAS.

 


I am writing this on the day of the March for Science. Previous commitments kept me away, but thanks for marching if you did. I saw many great signs held by current and upcoming scientists. Two I especially liked were:

Girls Just Want to Have FUNding for Their Scientific Research

and

Facts Are Stubborn Things (John Adams, 1770)

I think you call the latter historical data…or pre-presidential data, since Adams didn’t serve as president of the United States until 1797. The many marchers wearing their lab coats and carrying equipment like test tubes and pipettes reminded me of why I am not in lab sciences (college chemistry lab partners were nearly driven insane by my cavalier approach to experiments) and why the statistical sciences were a much better fit for me! I know many statisticians who represented our areas at various events.

Hopefully people felt energized, despite the rain in some locations, and they met other great folks as they marched on Earth Day to emphasize the importance of science and scientists to the entire planet. Our home has already seen a difference—this afternoon, the puppy pulled a book on Marie Curie off the shelf and started, ah, reading it.

My next international scientific meeting will be the Joint Statistical Meetings, July 29 through August 3 in Baltimore. For those planning to attend, early-bird registration for hotels begins May 1. With over 6,000 people from 52 countries planning to attend and over 600 sessions (plus a full slate of courses and plenary sessions), registering early is not a bad idea! Colleagues from SAS will be teaching tutorials, and we will be part of the exhibition hall, as usual.

Before JSM, I will be presenting at the Michigan Users Group day on June 8, so I hope to see some of you there.

Hope the spring is going well for you.

Maura Stokes

Senior R&D Director, Statistical Applications 

 

Technical Highlights

 
About SAS® Global Forum 2017

I really enjoyed this past SAS Global Forum. I didn’t present as much as I usually do, so I was able to attend other presentations. I especially enjoyed Amanda Farnsworth talking about how she brought visual journalism to the BBC, including how they use data to tell stories in a visually compelling way. Many talks were recorded, including this one, so check it out! The analytical groups doubled the number of Super Demos they presented, and we hope to convert many of them to videos for your viewing pleasure later in the year. The demo floor was buzzing and we had lots of useful conversations with customers, so think about coming next year in Denver. The call for papers goes out in the summer. 

 

What’s Changed? 

As you know, we try like the dickens to avoid changing defaults, because much of our software ends up in production jobs. Occasionally we do change a default for the better; more often we change the appearance of the results or improve a method. These changes are listed in a separate “What’s Changed” section of the “What’s New” chapter in the analytical software documentation, and it’s useful to review that section for each new release. For example, the “What’s Changed” section for SAS/STAT 14.2 lists the following:

In SAS/STAT 14.1, when cross validation is used to assess cost-complexity pruning and the response is a classification variable, PROC HPSPLIT displays the average square error (ASE) by default on the cost-complexity plot. In SAS/STAT 14.2, PROC HPSPLIT displays the misclassification rate by default in this case. If you want to display the ASE instead, specify the new PLOTS=CVCC(ASE) option in the PROC HPSPLIT statement.

The functionality of the Power and Sample Size application has been replaced by tasks in SAS Studio.

 

The Do Loop

Distinguished Research Statistician Developer Rick Wicklin shows you how to work with quantiles and medians in SAS®, including how to use the QUANTREG procedure to produce confidence intervals for a weighted analysis. 
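
If you want a feel for what a weighted quantile is before reading the post, here is a minimal Python sketch. It is not taken from the post and is not how PROC QUANTREG or any other SAS procedure computes quantiles. It assumes one simple definition (the smallest value whose cumulative weight reaches the fraction q of the total weight), the function name weighted_quantile is made up for illustration, and it does not produce confidence intervals.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value whose cumulative weight is >= q * total weight.

    Illustrative definition only; statistical packages offer several
    quantile definitions that can give different answers.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")

    # Sort observations by value, carrying each weight along.
    pairs = sorted(zip(values, weights))

    threshold = q * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point round-off


def weighted_median(values, weights):
    """Weighted median as the 0.5 weighted quantile."""
    return weighted_quantile(values, weights, 0.5)
```

With equal weights, weighted_median([1, 2, 3], [1, 1, 1]) returns 2. Giving the largest observation most of the weight, as in weighted_median([1, 2, 3], [1, 1, 10]), moves the median to 3.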

Another post discusses how to compute Monte Carlo estimates of joint probabilities. 
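
The general idea of a Monte Carlo estimate of a joint probability can be sketched in a few lines of Python. This is not the code from the post; it is a minimal illustration under stated assumptions: the two variables are standard bivariate normal with correlation rho, the event is {X <= a and Y <= b}, and the function name mc_joint_probability and its defaults are hypothetical.

```python
import math
import random


def mc_joint_probability(rho, a, b, n_draws=100_000, seed=12345):
    """Estimate P(X <= a, Y <= b) for standard bivariate normal (X, Y).

    Returns the estimate and its approximate standard error.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be strictly between -1 and 1")

    rng = random.Random(seed)  # fixed seed so the run is reproducible
    scale = math.sqrt(1.0 - rho * rho)

    hits = 0
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Build a correlated pair from two independent normal draws.
        x = z1
        y = rho * z1 + scale * z2
        if x <= a and y <= b:
            hits += 1

    # The proportion of draws that land in the region estimates the probability.
    p_hat = hits / n_draws
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_draws)
    return p_hat, std_err
```

For a = b = 0 the exact answer is known in closed form, 1/4 + arcsin(rho) / (2π), which equals 1/3 when rho = 0.5, so the estimate from mc_joint_probability(0.5, 0.0, 0.0) can be checked against it.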

 

SAS/STAT User Still Moving to the SAS 9.4 Platform? 

If you are moving up to SAS 9.4 and would like to catch up on the recent SAS/STAT releases on that platform, this handout is for you! Get an overview of our new additions in missing data analysis, modern survival data analysis, statistical modeling, spatial point pattern analysis, Bayesian analysis, item response analysis, classification and regression trees, and performance enhancements. There’s truly something here for everyone. And if you aren’t currently on the move, feel free to use this handout however it helps you get into the passing lane! 

 

Graphically Speaking 

Distinguished Research Statistician Developer Warren Kuhfeld has written several posts that might be of interest. Learn how to add lines, curves, and fit functions to graphs, and find out how to ensure consistency in the ordering of your graph components.

Kuhfeld and Mary Beth Herring of Rho Inc. will present a paper at PharmaSUG, May 14–17 in Baltimore, on producing multipage adverse events reports with the SGPLOT procedure. R&D Director Sanjay Matange will give a presentation there on graphical techniques for patient profile reporting. 

 

Technical Papers

 
Step Up Your Statistical Practice with Today’s SAS/STAT® Software

Has the rapid pace of SAS/STAT releases left you unaware of powerful enhancements that could make a difference in your work? Are you still using PROC REG rather than PROC GLMSELECT to build regression models? Do you understand how the GENMOD procedure compares with the newer GEE and HPGENSELECT procedures? When should you turn to PROC ICPHREG rather than PROC PHREG for survival modeling? This paper will increase your awareness of modern tools in SAS/STAT by providing high-level comparisons with well-established tools and explaining the benefits of enhancements and new procedures. The paper focuses on new tools in the areas of regression model building, generalized linear models, survival analysis, and mixed models. When you see the advantages of these tools, you will want to put them into practice. The paper also points out resources that will guide you to new tools in other important areas, such as Bayesian analysis, causal inference, item response theory, methods for missing data, and survey data analysis.

 

Factorization Machines: A New Tool for Sparse Data

Factorization machines are a new type of model that is well suited to very high-cardinality, sparsely observed transactional data. This paper presents the new FACTMAC procedure, which implements factorization machines in SAS® Visual Data Mining and Machine Learning. This powerful and flexible model can be thought of as a low-rank approximation of a matrix or a tensor, and it can be efficiently estimated when most of the elements of that matrix or tensor are unknown. Thanks to a highly parallel stochastic gradient descent optimization solver, PROC FACTMAC can quickly handle data sets that contain tens of millions of rows. The paper includes examples that show you how to use PROC FACTMAC to recommend movies to users based on tens of millions of past ratings, predict whether fine food will be highly rated by connoisseurs, restore heavily damaged high-resolution images, and discover shot styles that best fit individual basketball players.
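
To make the low-rank idea concrete, here is a minimal Python sketch of how a second-order factorization machine scores one observation, using the widely published form of the model: a global bias, a linear weight for each feature, and a pairwise term given by the dot product of per-feature factor vectors. It is not PROC FACTMAC code, and whether the procedure parameterizes the model exactly this way is an assumption. The function name fm_predict and all parameter values below are invented for illustration, and the sketch covers only prediction, not the stochastic gradient descent training step.

```python
def fm_predict(x, w0, w, v):
    """Score one sparse observation with a second-order factorization machine.

    x  : dict mapping feature index -> nonzero feature value
    w0 : global bias
    w  : list of linear weights, one per feature
    v  : list of factor vectors (lists of equal length), one per feature
    """
    linear = sum(w[i] * xi for i, xi in x.items())

    # Pairwise term: sum over i < j of <v_i, v_j> * x_i * x_j, computed per
    # factor as 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2), so the cost
    # grows with the number of nonzero features, not with all feature pairs.
    n_factors = len(v[0])
    pairwise = 0.0
    for f in range(n_factors):
        total = sum(v[i][f] * xi for i, xi in x.items())
        total_of_squares = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (total * total - total_of_squares)

    return w0 + linear + pairwise


# Toy recommender layout: features 0-1 indicate the user, 2-3 the item.
example_x = {0: 1.0, 2: 1.0}          # user 0 rating item 0
example_w0 = 3.0
example_w = [0.1, -0.2, 0.3, 0.0]
example_v = [[1.0, 2.0], [0.0, 1.0], [0.5, -1.0], [2.0, 0.0]]
example_score = fm_predict(example_x, example_w0, example_w, example_v)
```

In this toy layout the score reduces to bias + user weight + item weight + the dot product of the user and item factor vectors, which is approximately 1.9 here. A user-item pair that never appeared together in the data can still be scored, because each factor vector is learned from all of that user's or item's other ratings.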

 

Using a Dynamic Panel Estimator to Model Change in Panel Data

Panel data, which are collected on a set (panel) of individuals over several time points, are ubiquitous in economics and other analytic fields because their structure allows for individuals to act as their own control groups. The PANEL procedure in SAS/ETS® software models panel data that have a continuous response, and it provides many options for estimating regression coefficients and their standard errors. Some of the available estimation methods enable you to estimate a dynamic model by using a lagged dependent variable as a regressor, thus capturing the autoregressive nature of the underlying process. Including lagged dependent variables introduces correlation between the regressors and the residual error, which necessitates using instrumental variables. This paper guides you through the process of using the typical estimation method for this situation—the generalized method of moments (GMM)—and the process of selecting the optimal set of instrumental variables for your model. Your goal is to achieve unbiased, consistent, and efficient parameter estimates that best represent the dynamic nature of the model.

 

Multivariate Time Series: Recent Additions to the VARMAX Procedure

Recent advances in computing technology, monitoring systems, and data collection mechanisms have prompted renewed interest in multivariate time series analysis. In contrast to univariate time series models, which focus on temporal dependencies of individual variables, multivariate time series models also exploit the interrelationships between different series, thus often yielding improved forecasts. This paper focuses on cointegration and long memory, two phenomena that require careful consideration and are observed in time series data sets from several application areas, such as finance, economics, and computer networks. Cointegration of time series implies a long-run equilibrium between the underlying variables, and long memory is a special type of dependence in which the impact of a series’ past values on its future values dies out slowly with the increasing lag. Two examples illustrate how you can use the new features of the VARMAX procedure in SAS/ETS 14.1 and 14.2 to glean important insights and obtain improved forecasts for multivariate time series. One example examines cointegration by using the Granger causality tests and the vector error correction models, which are the techniques frequently applied in the Federal Reserve Board’s Comprehensive Capital Analysis and Review (CCAR), and the other example analyzes the long-memory behavior of US inflation rates.

 

Technical Support Points Out

 
R² and Partial R² for Generalized Linear Models Based on the Variance Function

R² is a popular measure of fit used for ordinary regression models. The RsquareV macro provides the R²_V statistic proposed by Zhang (2016) for use with any model based on a distribution with a well-defined variance function. This includes the class of generalized linear models and generalized additive models based on distributions such as the binomial for logistic models, Poisson, gamma, and others. It also includes models based on quasi-likelihood functions for which only the mean and variance functions are defined. A partial R² is provided when a full model is compared to a nested, reduced model. Partial R can be obtained from this when the difference between the full and reduced model is a single parameter. A penalized R² is also available, adjusted for the additional parameters in the full model.

 

Talks and Tutorials

 

Upcoming Conferences

PharmaSUG
May 14–17, 2017
Baltimore, MD

Michigan Users Group
June 8, 2017
Livonia, MI

Joint Statistical Meetings
July 29–August 3, 2017
Baltimore, MD
 



SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © SAS Institute Inc. All rights reserved.