Recommended Data Source

The recommended data source is the SAS page tag log.
Although the SAS Data Surveyor for Clickstream Data processes standard Web server log files, these files are limited in the following ways:
  • They provide a limited set of data.
  • The data is captured only from the perspective of the Web server.
  • The data includes every request to the Web server, even for files that are typically not of interest (such as image requests and spider or robot requests). As a result, data volumes are larger and the files need extensive filtering.
  • Some user actions are not captured. For example, browsers commonly cache pages. When a page is cached, using the forward and back buttons in the browser does not send a new request to the Web server, so that user activity is missing from the Web log.
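To illustrate the kind of filtering that standard Web logs require, the following Python sketch drops image requests and robot requests from a list of log lines. It is not part of the SAS product. It assumes the NCSA combined log format, and the file extensions, the robot markers, and the function names are all hypothetical choices made for this example.

```python
import re

# Assumption: log lines use the NCSA combined log format.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical filter lists; a real site would tune these.
IGNORED_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico", ".css")
ROBOT_MARKERS = ("bot", "spider", "crawler")


def is_of_interest(line):
    """Return True if the log line is a page request from a non-robot client."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return False  # skip lines that do not parse
    # Ignore any query string when checking the file extension.
    path = match.group("path").split("?", 1)[0].lower()
    if path.endswith(IGNORED_EXTENSIONS):
        return False
    agent = match.group("agent").lower()
    return not any(marker in agent for marker in ROBOT_MARKERS)


def filter_log(lines):
    """Keep only the log lines that are of interest."""
    return [line for line in lines if is_of_interest(line)]
```

Even with filtering of this kind, cached page views never reach the Web server, so they cannot be recovered from a standard Web log.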
These limitations of standard Web logs can be overcome with page tagging, a method of client-side (browser) data collection. The page tagging method does not rely solely on the information that a Web server can gather. Instead, it uses the Web browser to gather data that the Web server does not normally log. The browser can gather this data because a small piece of code, known as a page tag, has been inserted into each page for which data is desired. SAS provides a page tag solution with this product, which is referred to as the SAS page tag.
The SAS page tag runs inside the user’s Web browser when the user accesses a tagged page. The SAS page tag code has access to additional information from within the browser that is not normally available in a standard Web log. After the page tag gathers this data in the browser, it sends the data in a request to a Web server. The Web server then stores in its Web log file only the requests for those pages that were tagged. When a Web server is used in this way (to collect clickstream data from tagged pages), it is referred to as a clickstream collection server. For a list of the data collected by the clickstream collection server, see Inserting SAS Page Tag Code.
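The general flow described above, in which browser-side data travels to a collection server as part of a request and is later read back from the logged request, can be sketched in Python as follows. This is only an illustration of the idea, not the SAS page tag's actual request format. The collection URL, the function names, and the parameter names are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of a clickstream collection server.
COLLECTION_URL = "http://collect.example.com/tag.gif"


def build_tag_request(page_data):
    """Encode browser-side data as the query string of a collection request."""
    return COLLECTION_URL + "?" + urlencode(page_data)


def read_tag_request(url):
    """Recover the collected data from a request URL found in the Web log."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key appears once in this sketch.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical data that a page tag might gather in the browser.
sample = {"page": "/products/index.html", "screen": "1280x1024", "event": "load"}
request_url = build_tag_request(sample)
collected = read_tag_request(request_url)
```

Because only tagged pages issue such requests, the collection server's log contains entries for tagged pages only, which avoids the image and robot traffic found in a standard Web log.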
Working together, the SAS page tag code and one or more Web servers configured as clickstream collection servers provide a framework for client-side data collection. The SAS page tag code that you insert controls which data is tracked. For more information, see the SAS Data Surveyor for Clickstream Data 2.2 Page Tagging JavaScript Reference at http://support.sas.com/clk22.