Best Practices for Preparing Standard Web Log Data

Review Best Practices for Clickstream Jobs

In addition to the following best practice, review the practices covered in Best Practices for Clickstream Jobs.

Use the Same Character Set on Web Pages

When you process standard Web log data, the character set defined on each Web page determines how the data sent to the server is encoded. A standard Web log record does not include character set information, so processing must assume that all records come from Web pages that use the same character set. The recommended approach is to ensure that all Web pages whose accesses are logged to the same Web log use the UTF-8 character set. If the pages do not use UTF-8, they should still all use the same character set, and that character set should be selected in the Encoding of incoming data option on the Clickstream Log transformation Options tab. If the Web pages do not use the same encoding, set the Decode incoming data? option to No.
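
The effect of a character set mismatch can be illustrated with a short Python sketch. It is not part of the Clickstream Log transformation and does not show how that transformation decodes data; the function names encode_form_value and decode_log_value are hypothetical, and the sketch assumes the logged value is a percent-encoded query string value. It shows the same text being logged from a UTF-8 page and from a Latin-1 page, then decoded under a single assumed character set.

```python
from urllib.parse import quote, unquote


def encode_form_value(value: str, page_charset: str) -> str:
    """Hypothetical helper: percent-encode a value the way a browser
    would for a page that declares page_charset."""
    return quote(value, encoding=page_charset)


def decode_log_value(raw: str, assumed_charset: str) -> str:
    """Hypothetical helper: decode a logged value using one assumed
    character set, because the log record does not say which was used."""
    return unquote(raw, encoding=assumed_charset, errors="replace")


text = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# The same text produces different bytes in the log, depending on the page.
utf8_logged = encode_form_value(text, "utf-8")
latin1_logged = encode_form_value(text, "latin-1")
print(utf8_logged, latin1_logged)

# Decoding with the matching character set recovers the original text.
print(decode_log_value(utf8_logged, "utf-8") == text)

# Decoding with the wrong character set silently corrupts the value.
print(decode_log_value(utf8_logged, "latin-1") == text)
print(decode_log_value(latin1_logged, "utf-8") == text)
```

Only the matching decode returns the original text. The two mismatched decodes raise no error and simply yield different, corrupted strings, which is why a log that mixes character sets cannot be decoded reliably with a single setting.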