GEOCODE Procedure

Understanding Range Geocoding

Overview of Range Geocoding

Range geocoding matches individual address values to lookup data containing a range of values. This method requires specifying a second data set with the RANGEDATA= option, in addition to the LOOKUP= option. There are no default variable names with this method. You must specify all data sets and variable names.
IP address data is a form of range data. Unlike street addresses, IP addresses were not designed to carry geographic meaning. For this reason, the process of adding coordinates for IP addresses is usually called geolocating rather than geocoding.
Generally, IP addresses are collected from visitors to Web sites and indicate the connection the visitor used. IP address lookup data contains information that matches ranges of IP addresses to particular geographic locations. A range of IP addresses usually belongs to a company or an Internet provider. The location found will not be at the street or even ZIP code level, but might indicate the city, state, or country where the IP address is registered.

About Range Lookup Data

Range geocoding requires a lookup data set and an additional range data set.
  • The lookup data set contains geographic coordinates (latitude and longitude).
  • The range data set identifies the ranges (of IP addresses or of other items).
A key variable links the two data sets. Both data sets must contain this variable so that a location can be identified for each IP range. Internally, the range that contains the input address is found first, and then the key value of that range is used to access the lookup data set and retrieve the latitude and longitude for that key.
The lookup data set must contain the following variables:
  • an X variable that contains the longitude value of the center coordinate. The default variable name is X.
  • a Y variable that contains the latitude value of the center coordinate. The default variable name is Y.
  • a key variable that corresponds to a key variable in the range data set.
The range data set must contain the following variables:
  • a variable that specifies the beginning value of a range of IP addresses
  • a variable that specifies the ending value of a range of IP addresses
  • a key variable that corresponds to a key variable in the lookup data set
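The two-step lookup described above (find the range, then follow the key to the coordinates) can be sketched with ordinary DATA steps and PROC SQL. This is only an illustration of the logic, not how PROC GEOCODE is implemented. The data set names, the variable names (begin_ip, end_ip, loc_key, ip_num), and the sample values are all hypothetical; the addresses come from ranges reserved for documentation, and the coordinates are made up.

```sas
/* Hypothetical range data: numeric begin and end of each IP range, plus a key */
data work.ranges;
   input begin_ip end_ip loc_key;
   datalines;
3221225984 3221226239 101
3325256704 3325256959 102
;
run;

/* Hypothetical lookup data: one made-up coordinate pair per key */
data work.locations;
   input loc_key x y;
   datalines;
101 -78.6 35.8
102 -0.1 51.5
;
run;

/* Input addresses, converted from dotted form to a single number:
   a.b.c.d becomes a*256**3 + b*256**2 + c*256 + d */
data work.visitors;
   length ip $15;
   input ip $;
   ip_num = input(scan(ip, 1, '.'), 3.) * 16777216
          + input(scan(ip, 2, '.'), 3.) * 65536
          + input(scan(ip, 3, '.'), 3.) * 256
          + input(scan(ip, 4, '.'), 3.);
   datalines;
192.0.2.17
198.51.100.200
;
run;

/* Step 1: find the range that contains each address.
   Step 2: use that range's key to fetch the coordinates. */
proc sql;
   create table work.located as
   select v.ip, r.loc_key, l.x, l.y
   from work.visitors as v
        left join work.ranges as r
           on v.ip_num between r.begin_ip and r.end_ip
        left join work.locations as l
           on r.loc_key = l.loc_key;
quit;
```

In this sketch, 192.0.2.17 falls in the first range (key 101) and 198.51.100.200 falls in the second (key 102). Because both joins are left joins, an address outside every range would still appear in WORK.LOCATED, with missing key and coordinate values.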
You can obtain lookup and range data from third-party vendors. One vendor is MaxMind, Inc. at www.maxmind.com. You can use the %MAXMIND autocall macro to convert comma-separated value (CSV) files from MaxMind into SAS data sets.
You can specify that non-geocoding variables from the lookup data set be added to the output data set by using the ATTRIBUTEVAR= option in the PROC GEOCODE statement.

%MAXMIND Autocall Macro

Overview of the %MAXMIND Autocall Macro

The %MAXMIND autocall macro enables you to convert IP geocoding data from MaxMind, Inc. into SAS data sets. The %MAXMIND autocall macro supports MaxMind's IP data in comma-separated value (CSV) format.
The %MAXMIND macro uses the following macro variables:
CSVPATH
specifies the path where the MaxMind CSV files are located. You must extract the files from the ZIP archive before using the %MAXMIND autocall macro.
IPDATAPATH
specifies the path where the output SAS data sets are created. You must have write-permission for this path.
CSVBLOCKSFILE
specifies the filename for the CSV file that contains IP address range values. The file that you specify must contain the startIpNum and endIpNum variables.
CSVLOCATIONFILE
specifies the filename for the CSV file that contains longitude and latitude values.
CSVCOUNTRYFILE (Optional)
specifies the name of the optional MaxMind CSV file that contains country names.
WORKPATH (Optional)
specifies the path where temporary files are written. The default path is the path for the WORK library.
The %MAXMIND macro creates the CITYBLOCKS and CITYLOCATION data sets in the path that you specified for the IPDATAPATH variable. The libref IPDATA is created automatically for this path.

Usage Example for the %MAXMIND Autocall Macro

In this example, data from MaxMind was extracted to C:\Mydata. The output SAS data sets are created in the directory C:\Geocode.
The following code imports the data:
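/* Folder that holds the extracted MaxMind CSV files */
%let CSVPATH=C:\Mydata;
/* Folder where the output SAS data sets are written */
%let IPDATAPATH=C:\Geocode;
/* CSV files with the IP ranges and with the coordinates */
%let CSVBLOCKSFILE=GeoLiteCity-Blocks.csv;
%let CSVLOCATIONFILE=GeoLiteCity-Location.csv;
/* Optional CSV file with country names */
%let CSVCOUNTRYFILE=GeoIPCountryWhois.csv;
/* Run the import */
%MAXMIND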
The imported data sets are IPDATA.CITYBLOCKS and IPDATA.CITYLOCATION.
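With the data sets imported, a range geocoding step might look like the following sketch. Only METHOD, RANGEDATA=, LOOKUP=, and ATTRIBUTEVAR= are taken from this section; every other option name is an assumption that should be checked against the PROC GEOCODE statement syntax for your SAS release. The sketch also assumes a hypothetical input data set WORK.VISITORS with a character variable IP, and it assumes that the imported data sets keep the MaxMind column names (startIpNum, endIpNum, locId, longitude, latitude, city), which is why it lists their contents first.

```sas
/* Confirm the variable names that %MAXMIND actually created */
proc contents data=IPDATA.CITYBLOCKS; run;
proc contents data=IPDATA.CITYLOCATION; run;

proc geocode
   method=range                  /* range geocoding                            */
   data=work.visitors            /* hypothetical input data set                */
   out=work.geocoded             /* hypothetical output data set               */
   addressvar=ip                 /* assumed: input variable with IP addresses  */
   rangedecimal                  /* assumed: input IPs are in dotted form      */
   rangedata=IPDATA.CITYBLOCKS   /* range data set                             */
   beginrangevar=startIpNum      /* assumed option and variable name           */
   endrangevar=endIpNum          /* assumed option and variable name           */
   rangekeyvar=locId             /* assumed option and variable name           */
   lookup=IPDATA.CITYLOCATION    /* lookup data set with coordinates           */
   lookupkeyvar=locId            /* assumed option and variable name           */
   lookupxvar=longitude          /* assumed option and variable name           */
   lookupyvar=latitude           /* assumed option and variable name           */
   attributevar=(city);          /* assumed variable to copy to the output     */
run;
```

If PROC CONTENTS shows different variable names, substitute them before running the step.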

Tips for Range Geocoding

The following table contains suggestions and comments for the RANGE geocoding method.
| Category | Suggestions and Comments |
| --- | --- |
| Most recent lookup data | Obtain the most recent lookup data from the MaxMind site and import it with the %MAXMIND autocall macro. The Web site is located at http://www.maxmind.com/. |
| Correct data | If an input IP address fails to match, the IP address might contain transposed digits. |
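As a companion to the "Correct data" tip, malformed addresses can be screened out before geocoding. The following DATA step is a minimal sketch with hypothetical data set and variable names (WORK.IP_CHECK, ip, valid). It flags values that are not four dot-separated numbers between 0 and 255. It cannot detect transposed digits when the result is still a well-formed address, so unmatched observations still need a manual review.

```sas
/* Hypothetical input: flag values that are not four numbers in 0-255 */
data work.ip_check;
   length ip $20;
   input ip $;
   /* exactly three dots, and nothing but digits and dots */
   valid = (countc(ip, '.') = 3 and verify(trim(ip), '0123456789.') = 0);
   if valid then do i = 1 to 4;
      part = scan(ip, i, '.');
      /* an empty part, more than three digits, or a value above 255 is invalid */
      if part = ' ' or length(part) > 3 or input(part, ?? 3.) > 255 then valid = 0;
   end;
   drop i part;
   datalines;
192.0.2.17
192.0.2.300
192.0.217
;
run;
```

Here the first address is flagged as valid (valid=1), the second fails because 300 exceeds 255, and the third fails because it has only three parts.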