This week history is on everyone's mind, especially in our little world of web analytics.

Yes, history when it comes to the economy (no one likes a recession!), the war (ok, no one likes this for sure!), the elections (raise taxes! no, no, cut taxes!) or the Super Bowl (go Giants!!).

But with all that I am positive this week the thing top of mind for many web analytics practitioners is what to do with their historical web analytics data.

Do we switch or do we not switch? What happens to my tags? What about my contract? Where do I log it? Is it time to panic? I have seven years of history, help!!

The first few questions I am afraid you'll have to answer for yourself.

[Although: While old habits / tools are hard to give up, it "takes about three weeks for a new habit to be hard wired in your brain" (source). That should give you some solace!]

This post tries to take a position on your last worry, historical data and its value.

We have all been brought up to cherish data. To love it, to adore it, to propose marriage, to stick together for better or for worse, to build increasingly vast and complex systems to keep it around and to tap into cloud computing along with a small army of people in your company to keep that data happy.

While most of the time this sounds like a good mindset (/marriage), especially in the traditional world of ERP and CRM systems, on the web unfortunately this can be a deeply sub optimal mindset (/marriage).

My proposal for you is to divest yourself of this mindset of keeping Web Analytics data around forever. If you, or your HiPPOs, have this mindset then a quickie divorce might be greatly helpful.

The über thought to keep in mind is that from the moment you collect it your web analytics data starts to decay and lose value. Oh it is useful on the first day, and the first month, but less so in six months, and click level data is nearly valueless in a year or so.

Why you ask?

  • Your Visitors "change" too much.

    Remember that at the end of the day almost all of us collect anonymous non-personal data from our Visitors. As they swap browsers and machines and upgrade (if not outright blow your cookies away every day!), the data becomes less useful in identifying any usage trends and patterns tied to people.

    This is less a problem in traditional Data Warehouse environments.

  • Your computations change too much.

    We were all on third party cookies and then you moved to first party cookies (say yes to this one!!). Most of your visitor stats just stopped being comparable with what came before.

    You shift from logs to tags to tags of a new vendor to tags of your newest vendors and now you are going tagless! You are now comparing a bowl of chopped apples to fruit salad.

    Vendors and practitioners have changed the basic formulas for measuring the core stats every so often. They rarely reprocess history (too hard!), which makes it hard to provide continuity.

  • Your systems change too much.

    At the end of the day three things are captured by your analytics tool. The Referrer. The page URL. The Cookie.

    As you evolve your web site platform (from Interwoven to ATG), or move it around (hosts / servers), or add / remove functionality like internal search or recommendation engines or multivariate testing or behavioral targeting or other such things, it usually impacts all three of those critical things that make up your data.

    Resulting impact can make your data disjointed.

    And I am not even touching changes from static html to dynamic html to personalized content to flash to flex to ajax to RIAs (Rich Internet Applications) etc etc (all of which again impact the three pieces you collect).

  • Your website changes too much.

    This is perhaps the biggest thing that most of us don't reconcile with. Most websites are on the Yahoo! "paradigm" and not the Google "paradigm". . .

[Image: evolution of the Yahoo! and Google home pages]

      . . . and that is quite ok (though the image above might suggest otherwise).

      Your home page three months ago is not your home page now (is it? hopefully not!). You killed your product line pages last year, opting for product detail pages for SEO reasons. There was no PayPal last week. Maybe 2007 was your first year with the support and ecommerce sites merged into one.

      In the last six months you have learned so much about your business, about your data, about your visitors, about how fast you are being left behind (or are ahead of everyone else!). . . and your web presence has changed accordingly.

      Every change above changes the data you have and what value you can get from it three months from now (when you will have changed even more). It is important not to forget this.

  • Your people change too much.

    Sad as it might be the hardest people to find now are web people. Not just great web analysts, which we know are scarce (!), but web people in general. Front end, back end, middle tier, thin, overweight, rich, poor, newbies, experienced, all kinds are hard to find.

    As people come and people go their actions have a subtle but important impact on all aspects of your data ecosystem.

You have a few years of historical web analytics data. Give me the benefit of the doubt, for just five minutes, and think of the above five items with a kind of sort of open mind. Would you still keep terabytes of data from two years ago around?

The pace of change on the web is tremendous (pages, sites, business rules, experiences, applications, data capture, what's right and what's wrong, what's doable). In this hyper fast environment all the detailed data, perhaps you'll agree, is not very useful because it has decayed too much.

It's the decay that is the root cause.

But at the same time it is also an opportunity. Because it means that you are not tied to the past in an egregious manner. It means you can think smart and move fast. If what you have now will be of less value soon then you will cherish the now more and try to get something out of it.

It also gives you the freedom not to be tied to legacy systems or legacy tools or legacy data. You can move forward to the next and better much faster than our Sisters and Brothers in the traditional world have been able to.

It means a lot more fun because you get to learn and adapt and get value and move on. It is damn exciting and damn liberating!

Yes, yes, yes, you knew this was coming . . . . .

Keep some history around. Aggregated data. Historical markers.

Weekly trend (counts) of Visits and Unique Visitors. Top ten referrers to your website by month. Monthly Bounce Rate. Weekly + monthly trends for Revenue and Products Sold. Perhaps Conversion Rate trends for your site Overall and important campaign categories. Top groupings of content consumed on the site.

Aggregated data, for your critical few metrics (that won't become less important with time!). And some revenue stats just to prove that you're worth it!


Keep that around as long as you have it. It will all fit on one tab in an Excel spreadsheet. That is all you'll need. Your metrics might be slightly different from the ones above, but I assure you they will fit in a spreadsheet.
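If it helps to see what "roll it up and keep only the summary" might look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the shape of the click level rows (day, visitor id, session id, page URL), the sample data, and the function names `weekly_trend`, `monthly_bounce_rate` and `to_csv`. Your tool will define Visits, Unique Visitors and Bounce Rate in its own way; here a bounce is simply a session with exactly one page view, and a session that straddles a week boundary would be counted in both weeks.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical click level rows: (day, visitor_id, session_id, page_url).
CLICKS = [
    (date(2008, 1, 7), "v1", "s1", "/home"),
    (date(2008, 1, 7), "v1", "s1", "/products"),
    (date(2008, 1, 8), "v2", "s2", "/home"),
    (date(2008, 1, 15), "v1", "s3", "/home"),
    (date(2008, 2, 4), "v3", "s4", "/support"),
]


def weekly_trend(clicks):
    """Return {(iso_year, iso_week): (visits, unique_visitors)}."""
    sessions = defaultdict(set)
    visitors = defaultdict(set)
    for day, visitor_id, session_id, _url in clicks:
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        sessions[week].add(session_id)
        visitors[week].add(visitor_id)
    return {w: (len(sessions[w]), len(visitors[w])) for w in sorted(sessions)}


def monthly_bounce_rate(clicks):
    """Return {(year, month): share of sessions with exactly one page view}."""
    pages_per_session = defaultdict(int)
    session_month = {}
    for day, _visitor_id, session_id, _url in clicks:
        pages_per_session[session_id] += 1
        # A session is attributed to the month of its first recorded click.
        session_month.setdefault(session_id, (day.year, day.month))
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for session_id, month in session_month.items():
        totals[month] += 1
        if pages_per_session[session_id] == 1:
            bounces[month] += 1
    return {m: bounces[m] / totals[m] for m in sorted(totals)}


def to_csv(trend):
    """Render the weekly trend as CSV text, ready to paste into a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["iso_year", "iso_week", "visits", "unique_visitors"])
    for (year, week), (visits, uniques) in trend.items():
        writer.writerow([year, week, visits, uniques])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(weekly_trend(CLICKS)))
    print(monthly_bounce_rate(CLICKS))
```

The point of the sketch is the size of the output, not the code: one row per week or per month, regardless of how many clicks went in.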

Or if you prefer here is another suggestion. . . . .

Keep your "click level" (detailed) data around for a year (assuming seasonality!) and your "session level" (aggregated) data for as long as you want to / have to (and it will fit in a spreadsheet).
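As a rough illustration of that retention rule, here is a small Python sketch. It assumes click level rows are dictionaries with a `day` field, reads "a year" as 365 days, and uses a made up helper name, `split_by_retention`. None of this comes from any particular analytics tool; adapt the window to your own seasonality.

```python
from datetime import date, timedelta

# Assumption: "a year" of click level data is read as 365 days.
CLICK_RETENTION = timedelta(days=365)


def split_by_retention(click_rows, today, retention=CLICK_RETENTION):
    """Partition click level rows into (keep, purge) by age.

    Rows dated on or after the cutoff are kept; older rows are candidates
    for purging once their aggregates have been saved.
    """
    cutoff = today - retention
    keep = [row for row in click_rows if row["day"] >= cutoff]
    purge = [row for row in click_rows if row["day"] < cutoff]
    return keep, purge


if __name__ == "__main__":
    rows = [
        {"day": date(2007, 1, 15), "url": "/home"},
        {"day": date(2007, 2, 1), "url": "/products"},
        {"day": date(2008, 1, 20), "url": "/support"},
    ]
    keep, purge = split_by_retention(rows, today=date(2008, 2, 1))
    print(len(keep), "kept;", len(purge), "ready to purge")
```

Session level aggregates are deliberately left out of the purge: they are the part you keep for as long as you want to.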

In closing:

It is extremely difficult to get anything out of your web analytics that you can action right now. I humbly recommend that in the drive to conquer history you don't forget the present, or ignore the price that you'll pay, every day that will come, for every day that has gone by.

History is important in other contexts, but in web analytics tools, for now, change on the web reduces the value of old data. This might cease to be the case at some point, but that point is some ways away.

Before you cut a big check for your consultant, consider the above, consider what you are actually buying.

Oh and those of you worrying about switching tools & losing data, worry not (too much): Go forth and prosper!

As always it is now your turn. . . .

Agree? Disagree? What am I missing? Am I in Antarctica all by myself on this one? Am I wrong to think this is our own version of "an inconvenient truth"? What's your experience?

Please share your perspectives, critique, bouquets and brickbats via comments.

[Like this post? For more posts like this please click here, if it might be of interest please check out my book: Web Analytics: An Hour A Day.]
