There are many different options at our disposal when it comes to collecting web clickstream data: web logs, web beacons, JavaScript tags and packet sniffers. Each methodology comes with its own unique set of benefits and challenges.

[Read this entry in Wikipedia for the pros and cons of web logs and JavaScript tags; Dr. Stephen Turner has done a great job there. Aurélie's post and Juan's post have great insights into packet sniffing as a source of clickstream data.]

But if one takes a quick pulse of practitioner conversations around data capture, it becomes clear very quickly that the largest number of current implementations (by sheer volume) use either web logs (usually due to history) or JavaScript tags (usually because most vendors have recently abandoned all other methods in favor of this one).

The secondary-level pulse is people debating which of these two methodologies is “better” and hence which one they should be using. There are lots of conversations that outline the benefits of one methodology or the other. There are even more technically nuanced, geeky conversations in which one party bashes the other.

What is missing is someone risking their neck and going out on a limb to make one recommendation when it comes to choosing between web logs and JavaScript tags (assuming that you have ruled out the others). Never one to miss the opportunity to take an unnecessary risk, I’ll go out and make a recommendation:

    You should use JavaScript tags as your weapon of choice when it comes to collecting data from your website.

The only assumption is that your website is not so amazingly unique that no other website on the planet runs on a web serving platform like yours.

Here are four important reasons for picking a side (doing so has not hurt Fox News, and I am hoping it won’t come back to bite me either; their slogan is: We Report. You Decide):

Separating Data Serving & Data Capture (gaining efficiency and speed):

    With web logs, data serving (web pages going out from your web servers upon user requests) is tied completely to data capture (as the web pages go out, the server logs information about that in web log files). Every time you want a new piece of data, you are tied to your IT organization and their ability and structure to respond to you. In most companies this is not a rapid response process.

    With JavaScript tags, data capture is separate from data serving. Web pages can go out from anywhere (from the company web server, from the visitor’s local cache, or from an Akamai-type, or ISP, cache farm) and you will still collect data (the page loads, the JavaScript tag executes, and the data goes to a server – ASP or in-house).

    The beauty of this is that the company IT department and website developers can do what they are supposed to do, serve pages, and the “analytics department” can do what it is supposed to do, capture data. It also means that both parties gain flexibility in their own jobs; speaking selfishly, it means the analytics gals/guys can independently enhance the code (which does not always have to be updated in the tags on the page) to collect more data faster.

    The reliance on IT will not go down to 0%; it will end up around 25%. But it is not 100%, and that in and of itself opens up so many options when it comes to data capture and processing.
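To make the mechanics concrete, here is a minimal sketch of how a tag works: when the page loads, a bit of JavaScript packages up data points and sends them to the collection server as an image “beacon” request, regardless of where the page itself was served from. The function name and the collection endpoint below are hypothetical, not any vendor’s actual API.

```javascript
// Minimal sketch of a JavaScript tracking tag (hypothetical, not any
// vendor's real API). The data travels as query parameters on an
// image request to the analytics collection server.

// Build the beacon URL by encoding each data point as a query parameter.
function buildBeaconUrl(endpoint, data) {
  const params = Object.entries(data)
    .map(([key, value]) =>
      encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return endpoint + "?" + params;
}

// In the browser, the tag would fire on page load roughly like this
// (the endpoint "collect.example.com" is made up for illustration):
// new Image().src = buildBeaconUrl("https://collect.example.com/b.gif", {
//   page: document.location.pathname,
//   title: document.title
// });
```

Because the tag executes in the visitor’s browser, the beacon fires even when the page came from a local or ISP cache and never touched your web server.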

Type and Size of Data:

    Web logs were built for, and exist to collect, server activity, not business data. Over time we have enhanced them to collect more and more data and store it with some semblance of sanity to meet the needs of business decision makers. They still collect all the technical data as well as the business data (often from multiple web servers supporting a single website, each of which has a log file that then needs to be “stitched back” together to give the complete view of each user).

    JavaScript tags were developed to collect clickstream data for business analysis. As such, they are much more focused about what they do and only collect the data they need (though admittedly not all the JavaScript tags running around are smart, and some do collect unnecessary data). What this means is that with JavaScript tags you have a much smaller amount of data to capture, store and process each night (or hour, or minute, or day) and it can be a much saner existence (logically, operationally and strategically).
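The technical-versus-business mix in a web log is easy to see by pulling one entry apart. Here is a sketch that parses a single line in the common Apache “combined” format; the regular expression covers that standard format only, and real-world logs vary.

```javascript
// Sketch: parsing one Apache "combined" format log line, to show how much
// of each web log entry is server/technical data rather than business data.

function parseCombinedLogLine(line) {
  const re = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;
  const m = line.match(re);
  if (!m) return null;
  return {
    ip: m[1],             // technical: client address
    timestamp: m[2],      // technical: server time of request
    method: m[3],         // technical: HTTP verb
    path: m[4],           // business-relevant: the page requested
    status: Number(m[5]), // technical: HTTP status code
    bytes: m[6],          // technical: response size
    referrer: m[7],       // business-relevant: where the visitor came from
    userAgent: m[8],      // technical: browser / robot identification
  };
}
```

Six of the eight fields are server plumbing; the analytics tool has to store and process all of it, for every image, stylesheet and robot hit, before any business question gets answered.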

Innovation:

    For better or for worse, most vendors are moving away from supporting versions of their products that use web logs as a source of data. Many only offer JavaScript tag (or packet sniffer) versions of their products. History will decide if this was a good thing, but the practical implication is that most of the innovation happening today (sophistication of data capture, new ways of reporting or analyzing data, meeting the needs of Web 2.0 experiences, and so on) is happening in the JavaScript data capture environments.

    This presents us with a stark choice: build and own our own company-only customized means of capturing this new data and keep pace with other innovations ourselves, or rely on the expertise that is out there (regardless of which vendor you prefer) and keep pace with all the innovation.

    Often this is an easy choice for any company that considers its core competency to be its business and not developing web analytics solutions (though admittedly if you are Wal-Mart you can absolutely do it yourself; for example, they invented their own database solution since nothing in the world could meet their size and scale).

Integration:

    Increasingly we are heading towards doing a lot more measurement and customer experience analysis beyond just clickstream. Two great examples of this are experimentation and testing (especially multivariate testing) and personalization / behavior targeting. In both cases “add-on” solutions are tacked onto the website and testing / targeting happens. Often these solutions come with their own methods of collecting and analyzing data and measuring success.

    But as we head for an integrated, end-to-end view of customer behavior, for optimal analysis we have to find ways of integrating data from these add-ons into the standard clickstream data (else you are optimizing just for each add-on, which is not a great thing).

    Integrating with these add-on solutions (which often also use JavaScript tags, cookies and URL identifiers) is significantly easier if you use JavaScript tags. It is possible to read cookies in web logs and so forth, but the pace and ease at which you can integrate are greater if you are using JavaScript tags.
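One common integration path is sharing a cookie: the testing or targeting add-on writes an identifier into a cookie, and the main clickstream tag reads it and sends it along with every beacon. Here is a sketch of the reading side; the cookie name `mvtVariant` is made up for illustration, and in the browser you would pass `document.cookie` as the `cookieString` argument.

```javascript
// Sketch: reading a cookie set by a testing/targeting add-on so the main
// clickstream tag can stitch the two data sets together. The cookie name
// "mvtVariant" is hypothetical.

function getCookieValue(cookieString, name) {
  // document.cookie serializes cookies as "name1=value1; name2=value2".
  for (const pair of cookieString.split("; ")) {
    const eq = pair.indexOf("=");
    if (eq > -1 && pair.slice(0, eq) === name) {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return null; // cookie not present
}

// In the browser: const variant = getCookieValue(document.cookie, "mvtVariant");
// The tag would then append the variant to its beacon, tying test exposure
// to the visitor's clickstream.
```

Because both the add-on and the clickstream tag run in the same page, this join happens at collection time; with web logs the same stitching is possible but only after the fact, in log processing.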

It is important to point out that you should consider your choice in the context of your own unique needs. Please read carefully the detailed pros and cons of each data capture methodology (because JavaScript tagging does have important cons that need to be considered carefully, and web logs also have their benefits, including obvious ones like being the only place you find search robot data).

In the end, though, if you have to make a choice between web logs and JavaScript tags: 1) if you have some “advanced non-standard” considerations to think through, they are above; 2) if you want someone else to make the choice for you, it is above.

If you love web logs, what's wrong with the recommendation above? If you swear by packet sniffing, is it superior to tags in the four ways outlined above? If you can’t live without JavaScript tags, what else is missing above? If you are confused and this helped, please share that via comments!!

[Like this post? For more posts like this please click here.]
