How Can I Use SAS to Compare Demographic Profiles

(written by Eli Y. Kling, GeoBusiness Solutions Ltd)

Comparing the profile of a subset with the population profile is standard practice for demographers: for instance, comparing the car ownership profile of the inhabitants of a certain city with the national car ownership distribution. Another example is comparing the ethnicity profile of customers who purchased a certain product with the ethnicity profile of the full customer base.

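It is common practice to compare the profiles using a statistic generally referred to as the 'Index': the subset percentage divided by the base percentage, multiplied by 100, so that a value of 100 means the subset matches the base. The following tip describes how to create a meaningful plot of this statistic. Rather than plotting the values of the Index directly, the Index is shifted down by 100 so that it represents the difference relative to the base. A format is automatically created so that the correct Index values are still presented on the plot.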
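As a minimal sketch of the arithmetic, the step below works through a single level by hand. The values 50 and 20 are taken from the first row of the demonstration data; the variable name Shifted is used here only for illustration and does not appear in the tip's own code.

```sas
* Illustrative sketch only - one level, values typed in by hand;
data _null_;
 Subset_Percent=50;
 Base_Percent=20;
 Index=Subset_Percent/Base_Percent*100;  * 250, the subset share is 2.5 times the base share;
 Shifted=Index-100;                      * 150, the distance above the base of 100;
 put Index= Shifted=;
run;
```

A level whose subset share equals its base share has an Index of 100 and a shifted value of 0, so positive shifted values indicate over-representation and negative values under-representation.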

Let's start with a representative demonstration data set:

*Create Demo data;

data demo;
 input level $ 1-10 Subset_Percent Base_Percent;
 datalines;
0 - no car 50 20
1 car      10 40
2 cars     15 10
4+ cars    25 30
;

run;

*Now calculate the index and plot the profile and Index ; 
*Calculate the index;

data _Plot;
 set demo;
 Index=Subset_Percent/Base_Percent*100;
run;

Proc sort data=_Plot;
 by Level;
 run;

*set up for the plots;
PATTERN1 color=CX0F3E93;
axis1 label=(f='Verdana/bold' h=12 pt "Percent") minor=None value=(f='Verdana' h=10 pt);
axis2 label=(f='Verdana/bold' h=12 pt "Index") minor=None value=(f='Verdana' h=10 pt);
axis3 label=(f='Verdana/bold' h=12 pt "Car Ownership") minor=None value=(f='Verdana' h=10 pt);

*Plots 1 & 2;
proc gchart data=_Plot;
 title 'Plot1: Profile';
 hbar level/sumvar=Subset_Percent type=sum DISCRETE raxis=axis1 maxis=axis3 Description="Profile" nostats ;
 run;
 title 'Plot2: Index {not formatted}';
 hbar level/sumvar=Index type=sum DISCRETE raxis=axis2 maxis=axis3 Description="Index" nostats ;
 run;quit;
 
 
The resulting plots are:

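Plot 2 does not emphasise the relation to the base (100%): every bar starts at zero, so it is hard to see at a glance which levels lie above or below the base.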

The following creates a format, a shifted index and the desired plot:


*Create the format that labels each shifted value with the original index value;
data Index_Fmt;
 fmtname="Index";
 do start=-400 to 400;
  end=start;
  label=compress(start+100)||"%";
  output;
 end;
run;
proc format cntlin=Index_Fmt;run;
* Calculate the index for plotting;
data _Plot;
 set demo;
 Index=Subset_Percent/Base_Percent*100-100;
 format Index Index.;
run;
*Plot 3;
proc gchart data=_Plot;
 title 'Plot3: Index {formatted}';
 hbar level/sumvar=Index type=sum DISCRETE raxis=axis2 maxis=axis3 Description="Index" nostats ;
 run;quit;

The above produces the desired plot:
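To check that the format maps shifted values back to the original index scale, a quick test along the following lines may help. It is a sketch that assumes the Index format has already been created by the code above; the variable names Shifted and Label are illustrative only.

```sas
* Sketch only - assumes the Index format above has already been created;
data _null_;
 do Shifted=-75, 0, 150;
  Label=put(Shifted, Index.);  * looks up the label stored for this shifted value;
  put Shifted= Label=;
 end;
run;
```

By the definition of the format, a shifted value of -75 should be labelled 25%, 0 should be labelled 100% and 150 should be labelled 250%. Note that the format only contains whole numbers from -400 to 400, so a value that is not a whole number (such as the shifted index of roughly -16.67 for the '4+ cars' level) has no entry and falls back to ordinary numeric display if it is printed directly.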