May 2009

18 May 2009 01:33 am

It seems absolutely dumb to argue that while the quality of data used to make decisions is important, it is actually not that important to have the highest data quality.

Generations of Analysts, Data "People", Decision Makers have grown up with the principle of GIGO. Garbage in, garbage out.

It made a lot of sense for a very long time. Especially because we used to collect so little data that even a small lapse in its quality crapified the decision a lot.

GIGO also fueled our ever expanding quest for data perfection and data quality. There are entire companies built around helping you "clean up" your data. Especially if you look at the offline traditional business intelligence, ERP, CRM, data warehouse worlds.

The web unfortunately threw a big spanner into the works.

Couple important reasons.

First, it is important to realize that we collect a lot of data on the web (type of data, elements of data, what not).

Second, our beloved world wide web, remember still a little baby, is imperfect at every turn. We use data collection methodologies that reflect our efforts to do the best we can, but they are inherently flawed. Just take JavaScript as an example. It is good at what it does. But not everyone has JavaScript turned on (typically around 2-3% don't). Zing: imperfection.

A lot of data. Imperfect data collection system.

Here is the most common result of this challenge: The "Director of Analytics" spends her meager resources in the futile quest for clean data.

Money is spent on consultants (especially the "scaredy cats" who deftly stir this issue to favor their personal businesses). Everyone tries to reconcile everything across systems and logs. Omniture gets kicked out and WebTrends gets put in, supposedly for its "far superior" data quality (!!).

Makes me sad.

In the debate for perfect data it is important to realize that the reality is a lot more nuanced.


No Possible Complete Data on Le Web.

I humbly believe that the world of data perfection ("clean auditable data") does not exist any more. It did for a long time because life was cleaner, mistakes were human made, sources were fewer and there wasn't enough data to begin with (sure terabytes of it, but of what 300 fields? 600?).

On the web we now have too many sources of data. Quantitative, qualitative, hearsay (sorry, surveys :), competitive intelligence, and so much more. [Web Analytics 2.0] But these sources are "fragile".

Sometimes because of technology (tags / cookies / panels / ISP logs). Sometimes because of privacy reasons. Sometimes because we can't sample enough (surveys, usability tests). Sometimes because it is all so new, we don't even know what the heck we are doing and the world is changing too fast around us.

Killing the Holy Cows.

The old people who did BI (me for sure, maybe you?) and moved to the web have had to come to the realization that the old rules of making decisions are out the door. Not just because the mental model of what counts for "data" has changed, but also because what counts for "decisions" has changed, and the pace at which those decisions need to be made has changed. It took companies a long time to die in the past. That process happens at "web speed" now.

Given all that if I don't change, I'll become a hurdle to progress. If I don't change, I can't help my company make the kind of progress it should.


You need to fundamentally rewire your brain, like I have had to rewire mine (it was painful): The data is not complete and clean, yet it is more data of more types and it contains immense actionable insights.

If you would only get over yourself a little bit.

So how to do this if you really do want to be God's gift to web analysis?

The Six Step Soul Cleansing Process.

Based on my own personal evolution in this space I recommend going through the following six step cleansing process to ensure that you are doing this right, and that you move beyond the deeply counter productive data obsession.

1) Follow best practices to collect data, don't do stupid stuff.

2) Audit your data periodically to ensure you are collecting as complete a data set as possible (and as accurately as possible, #1).

3) Only collect as much data as you need: There is no upper limit to the amount of data you can collect and store on the web.

4) Ditch the old mental model of Accuracy, go for Precision (more here: Accuracy, Precision & Predictive Analytics). It might seem astonishing but your analysis will actually get more accurate if you go for precision.

5) Be comfortable, I mean really freaking comfortable, with incompleteness and learn to make decisions.

6) [In context of decision making] It used to be Think Smart, Move Fast. I think the next generation of true Analysis Ninjas will: Move Fast, Think Smart. Remember there is an opportunity cost associated with the quest for perfection.


An example of #1: if you are using third party cookies in your web analytics tool (Omniture, CoreMetrics, WebTrends etc.) then you deserve the crappy data you are getting. For #2 use various website scanning tools to ensure complete implementation; each vendor has their own, just ask. #3 is the reason most attempts to data warehouse web analytics data end up as massive expensive failures, or why you then get trapped constantly "mowing the grass".

You are not going to believe me but in #4 if you actually go for precision your analysis will actually get more accurate over time (whoa!).
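To make the precision point concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the tag misses a constant 3% of visits every week; the weekly visit numbers are invented. Under that assumption the measured totals are never "accurate", yet the week over week trend is identical to the true one.

```python
# Hypothetical "true" weekly visits; these numbers are invented for illustration.
true_visits = [10000, 11000, 12100, 11500]

# Assumption: the tag consistently misses 3% of visits (e.g. JavaScript turned off).
MISS_RATE = 0.03
measured_visits = [v * (1 - MISS_RATE) for v in true_visits]


def week_over_week_change(series):
    """Return the percent change between consecutive weeks, rounded to 2 places."""
    return [
        round((curr - prev) / prev * 100, 2)
        for prev, curr in zip(series, series[1:])
    ]


true_trend = week_over_week_change(true_visits)
measured_trend = week_over_week_change(measured_visits)

print(true_trend)
print(measured_trend)
```

The two printed lists match, which is the whole argument: a collection method that is consistently a little off still tells you reliably which way the business is moving. The sketch says nothing about cases where the miss rate itself changes over time, which is exactly what the periodic audit in #2 is for.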

#5 is the hardest thing for Analysts (and for many Marketers) to accept. Especially those that have been doing data analysis in other fields. They are simply not comfortable with 90% complete data. Or even 95%. They work really really hard to get the other 5% because without that they are unable to accept that they could make business recommendations. Sometimes this is because of their mental model. Sometimes it is because the company is risk averse (not the Analyst's fault). Sometimes it is out of a genuine, if misplaced, desire to give the perfect answer.

Of course the net result is that lots of data collection, processing and perfection exercises happen. The business is starved for any insights to make even the most mundane decisions. I have had to lay off Analysts who simply could not accept incompleteness and had to have data that was clean and complete. Very hard for me to do.

#6 is a huge challenge because it requires an experience that most of us don't possess. It comes from having been there, and from working in companies that plug us into the tribal knowledge and context. It is harder still because we work in massively multi layered bureaucracies in large companies. In my heart of hearts I believe, sadly, that it will take a new generation of Analysts and a new generation of leaders in companies. Still we must try, even as I accept the criticism that the 10/90 rule is not followed and that we don't have enough Smart Analysts.

So: Follow best practices to collect as complete a data set as possible, go for precision, and learn to look beyond the incompleteness, so that you end up moving fast while thinking smart.

Before You Jump All Over Me and Yell: Heretic!

Notice what I am not saying.

I am not saying make wrong decisions.

I am not saying accept bad data.

I am not saying don't do your damnedest to make sure your data is as clean as it can be.

What I am saying is that your job does not depend on data with 100% integrity on the web. Your job depends on helping your company Move Fast and Think Smart.

I am also not saying it is easy.

Reality Check:

We live in the most data rich channel in the universe. We should be using data to find insights, even if they are a little bit off the perfect number.

Just consider this.

How do you measure the effectiveness of your magazine ad? Now compare that to the data you have from DoubleClick. How about measuring the ability of your TV ad to reach the right audience? Compare that with measuring reach through Paid Search (or Affiliate Marketing or …..). Do you think you get better data from Nielsen's TV panel of between 15k – 30k US residents to represent the diversity of TV content consumption of 200 million TV watching Americans?


There is simply no comparison. So why waste our life trying to get perfect data from our web sites and online marketing campaigns? Why does unsound, incomplete, and faith based data from TV, Magazines, Radio get a pass? Why be so harsh to your web channel? Just because you can collect data here means you won't do anything because it is imperfect?

Parting Words of Wisdom:

Stuart Gold is a VP at Omniture. Here's a quote from him:

"An educated mistake is better than no action at all."


The web allows you to make educated mistakes. Fast. With each mistake you become smarter. With each mistake your next step becomes more intelligent.

Make educated mistakes.


Ok now it's your turn.

What do you think of the web data quality issue? What are the flawed assumptions I have made in making my recommendation above? How do you ensure your data is as complete and as precise as it can be? Got tools or horror stories to share? What is the next data collection mechanism on the horizon that will be our salvation on the web?

I look forward to your comments and feedback. Thanks.


04 May 2009 01:07 am

A few weeks back I had asked this question on Twitter: Inspire me: If there is one web analytics question you want answered what would it be? What's your juiciest / mundane, daily, challenge?

The result was this post: Top Web Analytics Questions, Twitter Edition.

Those 16 questions (!) were just one part of the story.

My Twitter account is linked to my Facebook account, so my tweets get posted as my status updates.

That means I got a bunch of questions on the facebook account as well. . . .


Here is a summary of the 9 questions / topics that are addressed in this blog post:

  1. Twitter's impact on bounce rates.

  2. Does complete information translate into absolute action?

  3. The most important business questions addressed by Web Analytics.

  4. How to judge someone's talent/ability in being a Web Analyst?

  5. The mystery of "Returning Visitors" having 1 Visit to Purchase! <- Important.

  6. Reliability, and effectiveness, of Predictive Web Analytics.

  7. How to measure impact of Branding activities?

  8. Metrics / Key Performance Indicators to check Daily (!), for any site (!!).

  9. Tips and best practices for Filters and Expressions in Google Analytics.

So here we go, replies to my facebook friends, things that keep them up at night. . . .

#1: Dror Zaifman:

How do you think Twitter will affect bounce rates on web sites? Meaning do you think that someone reading a Twitter post will get more excited due to the heightened hype on Twitter and therefore might be disappointed with the end result, increasing the bounce rates?

Twitter will no more increase your bounce rates than say a digg or a stumbleupon or pick your favorite "hot right now" web 2.0 thingy.

In the sense that each of these channels tends to bring new traffic to your site, perhaps a higher percent of them might not be totally relevant for you. But I am not sure that the traffic from Twitter has any higher levels of ADD. :)

As to whether they should be disappointed or not, that's your call. If you/others just use Twitter to hype what you do or push sub optimal content then you lose credibility, followers and more. So the system is "self correcting".

#2: John Quarto-vonTivadar:

True or False? If you had 100% metaphysical certitude analytics coverage and could know anything you wanted to know, would some companies still be unable to increase their conversion rate? I depressingly suspect the answer is True. Remember I am not a rocket scientist. You need to dumb this down for me! :)

Let me rephrase John's question (he is a rocket scientist!): Even if we had all the data in the world would some companies still stink beyond belief in their ability to improve conversion rate?

I am afraid the answer, as John predicted, is a depressing True.

This is not a data problem. It is a people problem. Or perhaps better put it is an Organization Behavior problem.

I think most of the time we underestimate two things:

1] Data is just data and you need to invest in analysis (and hence people) and most companies just want tools (or as you put it "acquire the solution"). At some point tools will move from simply puking data to giving insights with no human requirement. That day is not today. Or tomorrow. Or 2010.

2] It takes a lot to get over oneself (in this case the HiPPO's), we can present data and win arguments yet people have deeply entrenched opinions that they are unwilling to set aside to actually implement what the data says. And of course I am not even going to touch on politics and solving for vested personal interests.

For example I am dealing with someone now who is doing the worst possible thing for the long term simply because he/she can get a promotion in the short term. And that's not even the worst of the problems that "data" has to deal with every day.

Result: Lower Conversions.


#3: Eric Werner:

What are the most common important business questions addressed by web analytics? – I find that a lot of marketing managers who are newly introduced to analytics say this is great – so what should I measure? I tell them it depends on the business questions they want answered and then they ask what questions should I want answered?

The single greatest root cause of failure with web analytics is the unwillingness or inability to understand what the site is trying to do, and hence defining goals.

While the real answer to your question is: it depends, let me try to see if I can help.

First tell them that Web Analytics can help measure three specific Outcomes from a website (more in the twitter analytics post):

1] Increased Revenue.

2] Reduced Costs.

3] Improved Customer Satisfaction/Loyalty.

Your question to them is: "Which of these are you working on? I can help you measure each or all of these if you tell me what you are doing."


That should help focus them a bit and get you off to the best possible start in Web Analytics: Tying numbers to Outcomes (leads, conversions, loyalty, phone calls, downloads, whatever).

If they refuse to tell you which of the above they are solving for. . . first submit your resume at and start looking for a job, the company you are working for is going down. . . then tell them that web analytics can help answer these questions:

Q1: What is the intent that is driving people to our websites?
[Use the search keywords report, and internal site search.]

Q2: How do people find our websites?
[Use your referring url's reports.]

Q3: How many people land on our site, puke on it, and leave right away?
[Use your bounce rate data, for site, keywords and ref urls.]

Q4: What content do people consume on our website?
[Use your content reports, top content, plot a head and tail curve, that will get you a big hug!]

Q5: What calls to action, navigational elements do people engage with on our pages?
[Use the site overlay report, for your top 10 most viewed pages.]

Q6: Where are you spending money inefficiently?
[Use the campaigns reports, focus on where your company / Marketers are spending money right now: Search, Email, Affiliates....]

Q7: Are we making money? Reducing cost? Increasing Customer loyalty?
[Sorry could not resist, I had to hammer this in again, it is so important you measure this.]
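As a rough illustration of Q3, the sketch below computes bounce rate per traffic source. The record layout (source, pages viewed) and the sample data are assumptions made up for this example, and a bounce is taken to mean a visit with exactly one page view; your tool may define it differently.

```python
from collections import defaultdict

# Hypothetical visit records: (traffic source, pages viewed in the visit).
visits = [
    ("google / organic", 1),
    ("google / organic", 4),
    ("newsletter", 1),
    ("newsletter", 1),
    ("twitter.com", 3),
    ("twitter.com", 1),
    ("google / organic", 2),
]


def bounce_rate_by_source(records):
    """Return {source: percent of visits that viewed exactly one page}."""
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for source, pages in records:
        totals[source] += 1
        if pages == 1:  # assumption: a bounce is a one-page visit
            bounces[source] += 1
    return {s: round(bounces[s] / totals[s] * 100, 1) for s in totals}


rates = bounce_rate_by_source(visits)

# Worst offenders first, so the sources that need attention are on top.
for source, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source}: {rate}% bounce rate")
```

Sorting by the rate rather than by volume is a deliberate choice here: it surfaces the sources where people "puke and leave", which is the question being asked.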

Hope this helps Eric. More on this post if you are interested: Tips for Web Analytics Success for Businesses.

#4: Tal Galili:

How can you (and where can't you) quantify a person's talent/ability in being a web analyst? How could I judge my own performance as a web analyst?

I look for:

1] Critical Thinking
(Interviewing Tip: Stress Test Critical Thinking. Please.)

2] Business Experience
(If all they have is button pressing / report publishing experience they might be very young in their career and that is ok, but if not I am looking for people who have business / marketing / finance experience, if they are a Marketer they get bonus points from me.)

3] High EQ (emotional quotient)
(Wikipedia: The ability, capacity, a self-perceived ability, to identify, assess, and manage the emotions of one's self, of others, and of groups. Me: One person, no matter how high on the IQ scale, can rarely change organizations alone.)

4] Flexibility in thinking, an openness to new information.
(You might think this is obvious, who is dumb enough to stay dug in when faced with new information? You'll be surprised. I also look for people whose core thinking is not rigid: they realize the world is not perfect, they realize data is incomplete, they realize web Analytics, as in clickstream, is not a panacea.)

5] Knowledge Seekers
(I mentioned in a recent interview that I spend three to four hours a week learning something new about our field. Trying new tools. New types of analysis. Reading non-pompous-only-theory-gossip blogs that share new methods of thinking. Attending free webinars in broad fields. I feel I am not doing enough. Find people that invest at least that much a week.)

If you are doing these things I think you are on the right path. Certainly learn to use more tools and what not. But enrich your mind, keep it open and flexible, think like a marketer.

Good luck Tal.

#5: Robert Patterson:

Can you please explain how the Visits to Purchase Google Analytics report segmenting new and returning visitors can show returning visitors making a 1 visit purchase? Wouldn't that make them a new visitor purchase and not a returning visitor purchase?

Robert's question is one of those that you dig into and discover something deeply sub optimal. I am especially sad because this is one of my favorite pan-session reports.

Here's what Robert is asking:

[Image: Google Analytics Visits to Purchase report]

See that red arrow? If someone is a "Returning Visitor" why would the report say they purchased after one visit?

Why do you think that is?

Think about it….

Take a guess….

Aw come on…..

I'll give you one more try…. come on Justin you know this one….

Got the answer?

It's not what you thought.

This report is wrong in Google Analytics. Well that's not entirely right. Technically the label on top of the report is wrong. And what it actually measures then makes it useless.

I am getting ahead of myself.

What this report actually measures is: Visits to Purchase from the last Campaign.

Only it does not say that either in the label or in the in page help.

Take three scenarios:

Angie: Visit from paid search campaign. Direct Visit. Visit from email campaign -> Purchase.

What the report shows: Visits to Purchase: 1. (See how Angie is a returning visitor? Yes.)

What a correct report would show: Visits to Purchase: 3.

Jennifer (Angie's bff): Paid search visit. Direct visit. Organic visit. Bookmark visit. Direct visit. Direct visit -> Purchase.

What the report shows: Visits to Purchase: 4.

What a correct report would show: Visits to Purchase: 6.

Judith (Angie's part of the time bff): Affiliate visit. Direct visit. Bookmark visit. Direct visit -> Purchase.

What the report shows: Visits to Purchase: 4.

What a correct report would show: Visits to Purchase: 4.

In summary, Google Analytics will only count the number of visits after a campaign (and campaigns in GA are email, affiliate, paid search, organic search,.. literally everything except direct/bookmark) and show that on this report.
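The counting rule can be sketched in a few lines of Python. This is only a model of the behavior described in this post, not Google's implementation; the function name and the source labels are made up. It counts the last campaign visit itself as visit one, which is how the Angie and Judith scenarios are counted.

```python
# Visit sources that are not treated as a campaign, per the description
# above (direct and bookmark visits). Everything else counts as a campaign.
NON_CAMPAIGN = {"direct", "bookmark"}


def visits_to_purchase(path):
    """Return (reported, actual) visit counts for one converting visitor.

    `path` lists the source of each visit in order; the last visit ends
    in the purchase. "reported" counts from the last campaign visit
    (inclusive) through the purchase visit; "actual" counts every visit.
    """
    actual = len(path)
    # Index of the most recent visit that came from a campaign.
    last_campaign = max(
        (i for i, source in enumerate(path) if source not in NON_CAMPAIGN),
        default=0,  # no campaign at all: fall back to counting every visit
    )
    reported = actual - last_campaign
    return reported, actual


angie = ["paid search", "direct", "email"]
jennifer = ["paid search", "direct", "organic", "bookmark", "direct", "direct"]
judith = ["affiliate", "direct", "bookmark", "direct"]

print(visits_to_purchase(angie))   # (1, 3)
print(visits_to_purchase(judith))  # (4, 4)
print(visits_to_purchase(jennifer))
```

Every new campaign touch resets the reported count, so the two numbers only agree for someone like Judith whose one and only campaign visit was her first visit.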

That means this report, another one I love, is also wrongly labeled:

[Image: Google Analytics Days to Purchase report]

The correct label for this is Days to Purchase from the last Campaign.

I am sure the team at Google will fix the label.

The challenge of course is that while the name change will mean the report will have the right description, it will essentially be useless.

Just look at the above three scenarios. If they are all in the Days/Visits to Purchase from the Last Campaign what actionable insight do you get?

You are still ten million miles away from understanding how long it takes for someone to convert.

The fix is not a change in the label, the fix is scrapping the report and actually creating real Days to Purchase and Visits to Purchase reports. If I want to know how many days/visits it takes someone to convert from an organic or paid or email campaign I can always segment that data and view the clean report. Here I don't even know what the "last campaign" was.

Sorry Robert. And to all of you as well.

#6: John Stansbury:

Based on performance through yesterday, how reliably can I predict where we'll end up EOD today? (Initial promising results using Holt-Winters adaptive forecasting, but time- and effort-intensive.) Additionally, how granular is too granular for actionable analysis? Is that determined by the agility of your site to adapt?

I wish there was an easy answer to this; sadly, no.

Both your questions can be very specific to the business, the goals of the website, seasonal factors unique to you, the overall business strategy (and sub components of that applied on the web), yada, yada, yada.

But regardless of your business you'll face these six challenges in your attempt to do "predictive analytics" on your web data:

[Image: Data mining and predictive analytics challenges]

All the details are in this post: Data Mining And Predictive Analytics On Web Data Works? Nyet!

As to your second question, how granular is too granular for actionable analysis, you'll typically work with a portfolio. As you execute your analysis train yourself to recognize when you are reaching the point of diminishing marginal returns. Then you stop, and move on to the next thing.

More in this post, see #3: Bounces, Abandonment, Visitor Ratios & Data Drops!

One last tip: always seek to balance what you can do (analysis/insights) with what your company/site/HiPPO can actually action. What they can action might not be the top nine powerful, high impact things; perhaps they can only do ten through fifteen. Then forget the top nine.

Sucks. I know.

#7: Martin Leblanc:

How do you measure the effect of branding activities?

This is a complicated issue and I might not do it complete justice in a short reply, but let me outline some broad brush strokes.

I believe that branding is a worthy Marketing goal. It gets people to associate, hopefully, positive attributes with your products and services. Here's one of the masters at branding:

[Image: Abercrombie & Fitch email campaign]

Abercrombie & Fitch. The image above is not their website, it is their complete email campaign. The minor text at the bottom is the opt-out and their address. No call to action (!).

It certainly evokes an emotional reaction, perhaps a brand attribute they would want associated with them.


I firmly believe that every marketing activity has to drive outcomes. It can drive it now, it can drive it in 30 days, it can drive it in six months.

I believe that if you do "branding" you need to define an outcome, increased store sales, more people to the site, more leads for a future concert, newspaper stories from your out of the world campaign, something else.

If there is an outcome you can measure it. My favorites for measuring impact from branding campaigns, for the web:

  • Increased Visitor Loyalty and Recency measures post campaign.

  • If related to a product, increased sales (even if latent conversions).

  • Improved "likelihood to recommend" scores, during / post campaign, as measured by exit surveys.

My favorite way to measure impact of branding campaigns is to do rigorous controlled experiments. They can prove anything, trust me.

For more on this check out #6 here: Multi Channel Analytics.

#8: Claire Devereux Thompson:

I have to check many client sites every day to make sure that things are going smoothly – what's the one thing that I need to look at if I only have a minute for each?

I was stumped so asked Claire for a bit more.

Claire clarified (say that five times :) that her clients include non-profits that use the web to raise money, a small art school, a large regional furniture site, and an online-only gift store.

I am still stumped!

The real answer of course is: It Depends.

Fat lot of good that does Ms. Thompson. So let me try to pull a rabbit out of the hat.

My first stab at this would be to look at Outcomes (Goals).

[Image: Google Analytics goal conversion report]

The above data is for a non-ecommerce, not for profit website. It has four goals, and a quantified goal value for each.

It is easy to see daily progress (if that's important), certainly weekly, by looking both at the sparklines next to each goal and also the two sweet numbers at the bottom.

Bottom line: Bottom line is important. Ask each biz to do this, add it to your dashboard. [Ideas for goals for different sites here: Measure Macro AND Micro Conversions.]

My second bunny, sorry tip, would be to focus on acquisition, and the idea of the What's Changed Report.

After you install the Enhanced Google Analytics plugin from our friends at Juice Analytics you'll be able to see something like this:

[Image: Google Analytics What's Changed report]

In your referring sites report you can click on the Who sent me unusual traffic? button and it will show you sites that have increased by 50% in traffic, or dropped 50%.

In the Keywords report you'll see the same thing but with search keywords.

Both help you get away from the top 10 reports, that rarely change, and help you identify big shifts in keywords and referrers which should in turn help you know if something needs your attention.
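If you don't have the plugin, the idea behind a "what's changed" view is simple enough to sketch yourself. The Python below is not how the Juice Analytics plugin works internally (I am only assuming the 50% rule described above); the function name, the two period dictionaries and their numbers are invented for illustration.

```python
def unusual_traffic(previous, current, threshold=0.5):
    """Return {source: fractional change} for sources whose visits rose or
    fell by at least `threshold` (0.5 = 50%) between two periods."""
    flagged = {}
    for source in set(previous) | set(current):
        before = previous.get(source, 0)
        after = current.get(source, 0)
        if before == 0:
            # Brand-new source: no baseline, so flag it as an infinite increase.
            if after > 0:
                flagged[source] = float("inf")
            continue
        change = (after - before) / before
        if abs(change) >= threshold:
            flagged[source] = round(change, 2)
    return flagged


# Hypothetical visits by referring site for two consecutive weeks.
last_week = {"google.com": 1200, "twitter.com": 80, "digg.com": 400, "example-blog.com": 50}
this_week = {"google.com": 1250, "twitter.com": 200, "digg.com": 100, "example-blog.com": 55}

changes = unusual_traffic(last_week, this_week)
for source, change in sorted(changes.items()):
    print(f"{source}: {change:+.0%}")
```

The steady top referrer (google.com here) drops out of the result entirely, which is the point: you only look at what moved. The same function works unchanged on a keywords report if you pass keyword counts instead of referrer counts.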

I hope the above two sets of ideas help, but what I want you to focus more on is the philosophy I am advocating: 1) Start with Outcomes, always. 2) Focus only on what changes, that mining will help find gems.

Oh and it is a bit of work, even every day. No insight worth monetizing is ever free. Hmm… that's pretty profound. No? :)

#9: Robert Kennedy:

How to get the most out of your filters/ expressions in Google Analytics. I am always pushing that angle of analytics, sometimes I mix up some wild concoctions :). Seems you are only limited by knowledge and imagination with no floor or ceiling. What is the best resource for filters and expressions?

Can I fess up that I never use them, mostly because perhaps I am not doing the same kinds of analysis.

There is one other reason. I have this constant hyper filter on: what's the marginal value of me digging a bit more, doing this fancy filters/expression magic?

For me Advanced Segments suffices most of the time.

All that said three resources for you:

  1. Robbin "I am the queen of GA expressions" Steif: Regular Expressions Part XII: Bad Greed. Yes that's part 12!

  2. EpikOne's Regular Expression Filter Tester. It's really good for QA'ing things before you put them in the wild.

  3. Not owning Justin Cutroni's Google Analytics Short Cuts book is now considered a felony in 49 states (all except Utah for some reason). So get it!
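In the spirit of QA'ing an expression before it goes into the wild, here is a small sketch using Python's `re` module. Google Analytics has its own regular expression flavor, so treat this as an approximation and not a substitute for a proper tester; the URIs and both patterns are hypothetical.

```python
import re

# Hypothetical request URIs to test a filter pattern against.
sample_uris = [
    "/products/shoes/",
    "/products/shoes/red-sneaker.html",
    "/blog/products-we-love/",
    "/products/",
]

# Too greedy: ".*" on both sides lets "products" match anywhere in the URI.
greedy = re.compile(r".*products.*")

# Tighter: anchored to the start, and requires a sub-directory under /products/.
anchored = re.compile(r"^/products/[^/]+/")


def matching(pattern, uris):
    """Return the URIs in which the pattern is found anywhere (search semantics)."""
    return [uri for uri in uris if pattern.search(uri)]


print(matching(greedy, sample_uris))
print(matching(anchored, sample_uris))
```

The greedy pattern happily swallows the blog post and the bare /products/ page, while the anchored one keeps only the two shoe pages. Running a handful of known-good and known-bad URIs through a pattern like this takes a minute and saves you from polluting a profile with a filter that matches too much.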

I think that should get you going Robert, perhaps it is time for you to start tweeting your favorite expressions and filters? :)

There you go, nine questions that were top of mind for people who ran into my request for inspiration on Facebook.

These are very broad and complex questions, very difficult to answer in a short Q&A, but I hope you all find the answers to be of some value.

Ok now your turn.

How would you have answered any of these questions differently? Did I miss something in one of the answers? Agree, disagree, shout with joy, cry with pleasure. . . do please share your thoughts.

Thank you.