Shatterproof Analytics Part 2: Eliminate Self-Referrals

by Stephen on August 16, 2010

This series of posts is about making sure the visitor data you capture with Google Analytics is as clean and as accurate as possible. In part 1 of this series I wrote about using canonical URLs to help ensure that any given page on your site is tracked properly, even when your site visitors might visit it under different URLs (a situation which would normally cause Google Analytics to treat each URL as a separate page).

In this part we’re going to clean up another common source of Analytics errors: self-referrals. Websites are frequently either accessible under several domain names or use subdomains to accommodate special features, such as a shopping cart system or discussion forum.

Even simple websites are often visited both with and without their leading “www,” which leads Google Analytics to believe you’re using your tracking code on two completely different websites. That may not be terribly bright of Google Analytics for the vast majority of its users, but it provides flexibility to its power users.

The problem is that when Google Analytics thinks we’re tracking different sites, it allows the different domain variations (www.example.com, example.com, secure.example.com) to show up as referrers for each other. You want to be able to look at your referrer reports and see who *else* is sending you traffic, not what your visitors are doing as they move around inside your own website.
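To make the idea of a self-referral concrete, here is a minimal sketch of a check that decides whether a referrer is really just another hostname of your own site. The function name `isSelfReferral` and its parameters are hypothetical and are not part of Google Analytics. The sketch assumes an environment with the standard `URL` constructor.

```javascript
// Hypothetical helper, not part of Google Analytics.
// Returns true when referrerUrl points at siteDomain itself or at one of
// its subdomains (siteDomain is the bare domain, e.g. 'example.com').
function isSelfReferral(referrerUrl, siteDomain) {
  var host;
  try {
    host = new URL(referrerUrl).hostname.toLowerCase();
  } catch (e) {
    // Empty or malformed referrer: treat it as not internal.
    return false;
  }
  var domain = siteDomain.toLowerCase();
  // Match the bare domain exactly, or any host ending in '.<domain>'.
  // The leading dot keeps 'notexample.com' from matching 'example.com'.
  return host === domain || host.endsWith('.' + domain);
}
```

With this check, `http://secure.example.com/cart` counts as a self-referral for `example.com`, while a visit arriving from a search engine does not.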

Fortunately, there is a simple way to unify your site visitor data and thereby eliminate these self-referrals. Adding two lines to your Google Analytics tracking snippet will do the trick:

_gaq.push(['_setDomainName', '.cluepad.com']);
_gaq.push(['_setAllowHash', false]);

The first line specifies that Google Analytics is supposed to treat all subdomains of cluepad.com (note the leading dot) as one website. The second line turns off domain hashing, which ensures that Google will use the same tracking cookie for your visitors across all of your subdomains.

I actually didn’t have much luck trying to implement this change with Google Analytics’ “traditional” tracking code, but had good results with the asynchronous snippet, which Google now promotes as its preferred snippet. We’ll talk about what makes the asynchronous snippet so much better in part 3.

Here is our code from part 1 that includes the canonical URL fix, with these two new lines added (lines 4 and 5):

<script>
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
_gaq.push(['_setDomainName', '.cluepad.com']);
_gaq.push(['_setAllowHash', false]);
_gaq.push(['_trackPageview', canonical_url() || window.location ]);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>

Placing this snippet on all of your subdomains will eliminate your self-referrals going forward, though it won’t do anything for data already captured.

The one caveat that Google makes about this technique is that by default Analytics only records the part of the URL *after* the domain name (the /index.html in http://example.com/index.html). This means you won’t know whether a hit on /index.html came from www.example.com or secure.example.com, where the two may be genuinely different content. To get around that problem, Google recommends setting up a filter that instructs Analytics to record the full URL, including the domain name.
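The effect of that filter can be modelled in a few lines. This is only a sketch of the idea, not Google Analytics code: the `tallyPageviews` function and the shape of the `hits` objects are assumptions made up for illustration. As far as I know, the real filter is set up in the Analytics admin interface as an advanced filter that combines the Hostname and Request URI fields.

```javascript
// Illustrative model only; these names are not part of Google Analytics.
// Counts pageviews per page, keyed either by path alone (the default
// behaviour) or by hostname + path (what the full-URL filter gives you).
function tallyPageviews(hits, includeHostname) {
  var counts = {};
  hits.forEach(function (hit) {
    var key = includeHostname ? hit.hostname + hit.path : hit.path;
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

var hits = [
  { hostname: 'www.example.com', path: '/index.html' },
  { hostname: 'secure.example.com', path: '/index.html' }
];

// Without the hostname, both hits collapse into one '/index.html' row.
var byPath = tallyPageviews(hits, false);

// With the hostname, each subdomain's page gets its own row.
var byFullUrl = tallyPageviews(hits, true);
```

Here `byPath` contains a single entry with a count of 2, while `byFullUrl` contains one entry for each hostname, which is the distinction the filter is meant to preserve.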
