Customers / Customer Stories

SAS Canada
280 King Street East
Suite 500
Toronto, Ontario, M5A 1K7
Media Contact: John Quinn, Marketing Communications Manager
E-mail: john.quinn@sas.com
Tel: (416) 307-4537

Customers

Printer-FriendlyPrinter-Friendly

Customer Stories

 

Fill up on SAS® with the Macro Function Sandwich

Function-style macros are macros that can be used within a SAS statement. Although such macros are extremely flexible, they can only use macro language code.

So what do you do if you need a macro that can be embedded within a SAS statement but the underlying task requires the execution of one or more DATA or PROC steps?

Mike Rhoads, a Vice President at Westat, one of the foremost research and statistical survey organizations in the country, asked himself that question – and found the answer.

Rhoads has devised a method that allows you to leverage the capabilities of user-written SAS functions to circumvent this limitation – meaning that your macros can now have the flexibility of a function-style interface while still being able to execute one or more SAS steps in the background.

No more restriction 
Rhoads' approach was inspired by his search for an effective way of taking a macro parameter value containing a SAS variable list in any form (e.g., A1-A3 FIRST - LAST VARA-NUMERIC-VARZ) into a simple list of all of the referenced variables (e.g., A1 A2 A3 FIRST MIDDLE LAST VARA VARM VARN VARZ).

A number of possible solutions returned the variable names in the same order in which they appeared in the original data set, while others maintained the order of the specified variable list. Of the latter, Rhoads' favorite approach, due to its simplicity, was one that utilized a PROC TRANSPOSE with OBS=0 as an option on the input data set:

proc transpose data=sashelp.class(obs=0) out=varlist;
     var name-numeric-weight name-character-weight; /* or some other   
        specification */
     run;

This can then easily be followed by a PROC SQL invocation to produce a macro variable containing the expanded list of variables in the desired order:

proc sql noprint;
     select _name_ into :varlist separated by ' '
        from varlist;
     quit;

"Why write code to manually parse the string and handle all of the possible types of variable lists," Rhoads asks, "when you can get a SAS PROC to do the work for you?"

But Rhoads found this approach to have one major drawback.

Say, for example, you're developing a macro that runs a report on each of a specified set of variables from a data set. You invoke the macro as follows:

%RunReports (data=mydata, var=A1-A3 FIRST—LAST VARA-NUMERIC-VARZ)

Within the %RunReports macro, you want to call a standard utility macro to expand the user's specified list. Ideally, this would be invoked from within %RunReport as follows:

%let varlist = %ExpandVarList(data=&DATA, var=&VAR);

This style of invocation requires that %ExpandVarList be a function-style macro, meaning that it's used by embedding it within some other SAS statement or SAS macro statement – in the example above, a macro %LET statement. The name comes from the resemblance of such macros to SAS functions; they return text that becomes part of the statement within which they're called, just as SAS functions are called as the right-hand side of an assignment statement or as part of an expression.

"Since calls to function-style macros are located within SAS statements and return text that will become part of that statement when it's parsed by SAS," Rhoads says, "their big limitation has always been that they couldn't submit any code themselves. All of their processing had to be accomplished using the SAS macro language."

Although the macro language has functions that allow some limited processing of SAS data sets and external files, they're much less powerful than the integrated set of capabilities available with the full panoply of SAS PROCs and DATA step functionality.
But now the problem's been solved.
The 'sandwich'
SAS has long had a rich set of built-in functions for manipulating one or more numeric or character arguments and returning a result. These functions provide a reliable, tested mechanism for implementing simple and complex algorithms.

"Even though the SAS system has several hundred of these built-in functions, the SAS development team couldn't anticipate every possible function that someone might want.

"So," Rhoads asked, "what can we SAS programmers do when we need a function that doesn't exist in our licensed SAS components?"

SAS provided a solution. SAS programmers can now create functions, and CALL routines, in PROC FCMP that can be invoked from DATA steps in a very similar manner to using built-in SAS functions. These functions are written using DATA step syntax, support parameter passing and allow the use of variables that are local to the function. Since they can be called recursively, they allow problems to be solved that were difficult to handle with traditional DATA step programming.

"But since these new user-written functions are primarily designed to be called within DATA steps," Rhoads says, "they don't seem to directly address our interest in expanding the capabilities of function-style macros."

But Rhoads found that two little-noticed features of this enhancement provide the key:

1. Functions created in PROC FCMP can be called from macros using %SYSFUNC.

2. PROC FCMP functions can use a special function called RUN_MACRO to invoke a SAS macro.

"Taken together," Rhoads says, "these two features allow you to apply a simple three-step technique for implementing function-style macros that can use the full power of SAS DATA steps and procedures."

The first step involves the outer macro. This is the "visible" macro that users will reference in their programs. It doesn't perform any actual processing, except perhaps for error checking. Its purpose is to invoke a user-written function with %SYSFUNC, passing the user's arguments to the user-written function and providing the user with the returned value.

Next comes the user-written function. Like the outer macro, it doesn't perform any actual processing. Its main purpose is to accept parameters from the outer macro and pass them to an inner macro, which it invokes using RUN_MACRO. It also accepts the text generated by the inner macro and passes it back to the outer macro.

Then comes the inner macro.

"This is where the actual work gets done," Rhoads says.

Using the parameter values specified by the user in the outer macro and passed along by the user-written function, the inner macro may contain whatever sequence of DATA or PROC steps is necessary to generate the desired results. Once these steps have been executed and the desired text has been obtained, the inner macro sends the text back to the user-written function.

"It seemed to me that this technique was useful enough to deserve a catchy name," Rhoads says, "but I was singularly unsuccessful in coming up with one. Fortunately, my longtime colleague Michael Raithel came to the rescue.

"Since we wind up with two macros with a user-written function in the middle, he suggested we call it the 'Macro Function Sandwich.'"

A good example
Rhoads offers an example of how you can use the Macro Function Sandwich to implement some useful macro-style functions. It involves expanding a user-specified variable list.

With this example, you want to implement a general utility macro, %ExpandVarList, that takes as arguments the name of a SAS data set and a list containing the names of some of the variables in the data set.

The macro is to return an expanded version of the user-specified variable list, which lists the name of each variable individually, with a blank space between each one.

"We want this to be a function-style macro," Rhoads, "but since the desired implementation strategy uses PROC TRANSPOSE and PROC SQL, we will have to utilize our technique, with a user-written function serving as the intermediary between the outer macro that the user calls and an inner macro that actually does the work."

Outer macro 
The code for the outer macro shown below is trivial. The macro accepts two keyword-style parameters: DATA= to indicate the data set, and VAR= to specify the variable list. _LAST_ (referring to the most recently created SAS data set) and _ALL_ (which specifies all of the variables in a data set) serve as default values if the user omits the corresponding argument in the macro invocation.

If the DATA= parameter has the value of _LAST_, either by default or explicitly, the macro resolves this value by assigning the value of the automatic macro variable &SYSLAST to &DATA.

"I found that simply passing _LAST_ through didn't work reliably," Rhoads says.

The only other action the macro has to perform is to invoke the user-written function. This is done using the %SYSFUNC function, just as if you were calling a built-in SAS function. (Rhoads notes that character arguments to functions aren't quoted when the function is called in macro via %SYSFUNC.) The call is nested within the %TRIM function to make sure that any trailing blanks are removed.

Since the call to the user-written function stands alone rather than being embedded within a macro-language statement, the text returned by the user-written function will also be returned as the result of the macro call.

Rhoads points out that the invocation of the user-written function shouldn't have a semicolon at the end, since if it does the semicolon will be included in the returned value.

"This is a relatively subtle error, since the macro will still work correctly if it's called at the end of a statement, which is frequently the case. It won't work, however, if it's followed by other text that's intended to be part of the original statement, since the statement will be ended by the returned semicolon."

"It is, of course, possible to make the macro more involved," Rhoads says. "In a production environment, it might be desirable to do a certain amount of error-checking, such as making sure that the specified data set actually exists."

Additional parameters could also be specified and processed, such as the desired delimiter to separate the variable names when the list is returned. (For use in a PROC SQL SELECT statement, for example, the user would probably prefer the result to have comma-delimited names.)

%MACRO ExpandVarList (
      data=_LAST_,
      var=_ALL_
);
   %if %upcase(%superq(data)) = _LAST_
       %then %let data = &SYSLAST;
   %trim(%sysfunc(ExpandVarList(&DATA,&VAR)))
%MEND ExpandVarList;

User-written function
The code for the user-written function that will serve as a go-between for the outer and inner macros is as follows:

proc fcmp outlib=sasuser.mysubs.utility;
   function ExpandVarList(data $,var $) $32000;
       length data $ 48 var $ 2048 varnames $32000;
       rc = run_macro('ExpandVarList_Inner',data,var,varnames);
       if rc = 0 then return(trim(varnames));
       else return('*** ERROR ***');
  endsub;
run;

User-written functions and subroutines are stored in packages within SAS data sets, where a package is defined as a set of routines that each have unique names. On the PROC FCMP statement in this example, the OUTLIB= argument indicates that the function will be created in the data set SASUSER.MYSUBS, within a package called UTILITY.
 
The function definition begins with a FUNCTION statement and ends with an ENDSUB statement. The parenthetical on the statement specifies that the function accepts two arguments, DATA and VAR. Each of these is a character string, specified with the $ sign. DATA and VAR don't have to be the same names as were used for the outer macro's parameters, although they are in this case. Similarly, the user-written function does not have to have the same name as the macro that invokes it.

The $32000 at the end of the FUNCTION statement indicates that the function will return a character string that can have a length of up to 32,000 characters.

The LENGTH statement assigns lengths for the character variables. First specify lengths for the two arguments listed in the FUNCTION statement: DATA and VAR.

"You want to use the variable VARNAMES to hold the expanded variable list that will be returned when calling the inner macro, so give it a length of 32,000," Rhoads says. "All of these variables are local in scope to the user-written function; they'll cease to exist once the function finishes executing."

In the following statement, the RUN_MACRO function is used to invoke the inner macro, which is called ExpandVarList_Inner. The three character variables listed in the LENGTH statement are passed to the macro. DATA and VAR pass along the values specified when calling the outer macro. VARNAMES is the variable into which the macro will place the expanded list of variables.

The RETURN statement returns the result of the user-written function back to the calling program (in this example, the outer macro). If RUN_MACRO was able to successfully invoke the %ExpandVarList_Inner macro, the returned value (RC) will be 0, and you return the trimmed and expanded variable list as supplied by the macro. If RUN_MACRO didn't succeed, you return a predefined error string.

"A zero value for RC only indicates that RUN_MACRO was able to successfully invoke the macro," Rhoads says. "It might still be the case that the macro itself identified some sort of error."

A more robust implementation would pass an additional variable from the user-written function to the inner macro, which the latter could use to flag any errors it may have found.

Inner macro 
The following is the code for the macro that actually does the work:

%macro ExpandVarList_Inner;
      %let data=%sysfunc(dequote(&data));
      %let var =%sysfunc(dequote(&var));
      %local temp_varnames;
      proc transpose data=&DATA (obs=0) out=ExpandVarList_temp;
         var &VAR;
      run;
      proc sql noprint;
         select _name_ into :temp_varnames separated by ' '
             from ExpandVarList_temp
          ;
         drop table ExpandVarList_temp;
      quit;
      %let varnames = %trim(&temp_varnames);
      %mend ExpandVarList_Inner;

The first three statements of the macro reflect the way the RUN_MACRO function actually works. When the RUN_MACRO call includes variables that are to be passed to the macro, RUN_MACRO actually creates a corresponding macro variable for each variable in the RUN_MACRO call and copies the value from the variable in the user-defined function into the macro variable.

Once the macro completes, RUN_MACRO copies the values back from the macro variables into the user-written function's variables.

"This explains why the %MACRO statement doesn't include a parameter list, as you would otherwise expect," Rhoads says.

Another characteristic of RUN_MACRO is that it puts quotation marks around the values of character variables as it copies them into the corresponding macro variables. This explains the need to DEQUOTE the values of DATA and VAR before using them.

Most of the remaining code in the inner macro actually implements the algorithm to expand the variable list. At the end, it trims the text in the local macro variable TEMP_VARNAMES and places it back into the variable VARNAMES, from which RUN_MACRO will copy the value back into the variable declared in the user-written function.

Putting it all together
Once you've defined the outer and inner macros and the user-written function, you can use code such as the following to invoke the outer macro:

options cmplib=sasuser.mysubs;
data d1;
    retain CHAR ' ' X1 X2 X3 X4 X5 Y Z JUNK1 JUNK2 AGE 0 NAME ' '
       SEX ' ' HEIGHT WEIGHT JUNK3 0;
run;
%PUT Final result is %ExpandVarList(data=d1, var=junk2 x3-x5 age-numeric-weight);

The only new concept in this block of code is the CMPLIB= system option, which tells SAS where to look to find any user-written functions that are referenced in your program. In this case, SAS will look in the SASUSER.MYSUBS data set.

If more than one package within the data set contained a function called ExpandVarList, the call to it in the other macro would have to be qualified with the package name (e.g., %trim(%sysfunc(Utility.ExpandVarList(&DATA,&VAR)));).

The important thing, Rhoads stresses, is that the macro that you called within a %PUT statement has caused a PROC TRANSPOSE and a PROC SQL to be executed behind the scenes.

"This enables you to use the full power of SAS to implement the variable-list expansion."

Branching out
"The addition of user-written functions to the SAS programmer's toolkit has expanded our options for creating powerful and flexible SAS programs that are clearly written and easy to maintain," Rhoads says.

"In particular, the RUN_MACRO function now allows us to branch out from a SAS DATA step and execute other SAS data and PROC steps that are contained in a SAS macro."

Rhoads' Macro Function Sandwich technique eliminates the previous limitations of function-style SAS macros.

"There are a lot of situations in which this technique can effectively simulate features that are missing from the SAS language," Rhoads affirms.
"If you've ever complained about some missing capability in SAS, consider whether you may be able to effectively implement it yourself using a Macro Function Sandwich."

Copyright © SAS Institute Inc. All Rights Reserved.

Westat

Business Issue:
What do you do if you need a macro that can be embedded within a SAS statement but the underlying task requires the execution of one or more DATA or PROC steps?
Solution:
Mike Rhoads, a Vice President at Westat, one of the foremost research and statistical survey organizations in the country, has devised a method that allows you to leverage the capabilities of user-written SAS functions to circumvent this limitation.
Benefits:
Your macros can now have the flexibility of a function-style interface while still being able to execute one or more SAS steps in the background.

What can we SAS programmers do when we need a function that doesn't exist in our licensed SAS components?

Mike Rhoads

Vice President, Westat

Read more: