Google Analytics vs. AWstats - Why are the numbers different?

Google Analytics and AWstats were both reporting data on the same website. The numbers should have looked the same, but they couldn’t have been more different.
Not too long ago I made the switch from AWstats to Google Analytics. I was stunned to see a vast difference in the data reported by the two statistical packages. AWstats was reporting nearly three times the number of pageviews that Analytics was reporting. What was going on?
…
Before we get into the nitty gritty, a quick step backwards for those of you who don’t know: AWstats and Google Analytics are statistical packages that tell website owners an enormous amount about their website traffic. How many unique visitors were there last month? What country were the visitors coming from? How many pages (on average) did these visitors look at? What browsers were they using? And so on.
The big technical difference between the two packages is that AWstats is a “log based” statistical package that lives on a webserver, and is usually accessible through a server’s web-based control panel. Many hosting companies provide AWstats installed by default. By contrast, Google Analytics is activated by embedding a short piece of JavaScript code on every page in a site. (One can, of course, just put it in the header or footer file rather than actually pasting it into every page on a site.) The data is then accessible by visiting analytics.google.com.
When I first fired up Google Analytics after using AWstats for many years, I was instantly impressed by the graphical quality of the reporting. The “map overlay” in Analytics is an amazing tool for pinpointing the geographical locations of site visitors. The graphing tools are also excellent and provide an instant reading of traffic dynamics over time. Just when I was about to fall in love with this sleek new interface, I noticed that my “pageviews” statistic (which is arguably one of the most important statistics for website owners) had been slashed. The number was just a little more than one third of what AWstats was reporting!
How could this be? After all — these weren’t primitive “site counters” from a decade ago — these were statistical packages, whose sole purpose in life was… well, statistics.

Frankly I would have found it somewhat disturbing if there had been a margin of error of 1% or 2% — but this was like looking at data for a whole different website. Google Analytics was reporting a little more than one third of the pageviews that AWstats was reporting. (Naturally, I hoped that AWstats was correct, otherwise my site wasn’t nearly as popular as I thought it was.) But any way you sliced it: One of these packages was reporting some very incorrect data.
What was going on?
The problem stems from the rise of AJAX and similar techniques, and from the semantics of what we call a “page”. When we see a single page in a browser we are frequently looking at a conglomerate of several different files, assembled invisibly by PHP, AJAX and any number of other web technologies. One caveat: files pulled in by a server-side PHP include are merged before the response is sent, so they never show up as separate requests in the server log. Files that the browser fetches separately, such as AJAX fragments or framed documents, do. From the server’s perspective (which is where AWstats gets its data) each of those requested files is a “page”. The server has no way of knowing that a client-side AJAX script is actually assembling the requested files into a single viewable ‘page’ in the visitor’s browser. So AWstats reports every HTML (or PHP) file requested as one ‘page’.
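To make the server's point of view concrete, here is a minimal Python sketch of how a log-based counter might tally "pages". It is not AWstats' actual code: the extension list, the `count_log_pages` function and the sample log lines are all assumptions made up for illustration. The rule is simply that any successful request whose file extension is not on an asset list counts as a page.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rule: requests for these extensions are assets, not pages.
NOT_PAGE_EXTENSIONS = {"css", "js", "gif", "jpg", "jpeg", "png", "ico"}

# Pulls the request target and status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" (?P<status>\d{3}) ')

def count_log_pages(lines):
    """Count successful requests that look like pages, as a log analyzer might."""
    pages = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or match.group("status") != "200":
            continue
        # Drop the query string, then look at the extension of the last path segment.
        path = urlsplit(match.group("target")).path
        last_segment = path.rsplit("/", 1)[-1]
        extension = last_segment.rsplit(".", 1)[-1].lower() if "." in last_segment else ""
        if extension not in NOT_PAGE_EXTENSIONS:
            pages += 1
    return pages

# Made-up log lines for ONE visitor looking at ONE article. The two
# "fragments" requests stand in for AJAX calls made by the page's script.
SAMPLE_LOG = [
    '203.0.113.5 - - [08/Jan/2008:07:34:00 +0000] "GET /article.php HTTP/1.1" 200 5120',
    '203.0.113.5 - - [08/Jan/2008:07:34:01 +0000] "GET /style.css HTTP/1.1" 200 900',
    '203.0.113.5 - - [08/Jan/2008:07:34:01 +0000] "GET /fragments/sidebar.php HTTP/1.1" 200 800',
    '203.0.113.5 - - [08/Jan/2008:07:34:01 +0000] "GET /fragments/comments.php?id=7 HTTP/1.1" 200 2300',
]

print(count_log_pages(SAMPLE_LOG))  # 3
```

The visitor saw one page, but this counter reports three, because nothing in the log says the two fragment requests belong to the article.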

By contrast, the JavaScript which reports statistical data to Google Analytics is embedded once per viewable page. (In this case Google’s idea of a “page” is likely the same as a viewer’s idea of a “page”.) Neither Google Analytics nor the user has any idea how many individual files were assembled to create the “page” in the visitor’s web browser. Nor should they care, since ultimately the visitor perceives it as a single page.
The end result is a vastly inflated number of “pageviews” in AWstats.
Cache issues and browser settings
To some extent, the over-reporting problem in AWstats is mitigated by client-side caching: a page which is drawn from a visitor’s cache may not result in an HTTP request to the server, which somewhat reduces the number of pageviews reported by AWstats.
Further complicating the situation is the possibility that site visitors will have JavaScript and/or cookies disabled for security reasons. Google Analytics requires both features to be enabled in order to report accurately.
So in the battle of AWstats vs. Google Analytics, which one is more reliable? Without starting a war, I’m going to go out on a limb and say Google Analytics provides the more reliable set of metrics. The number of visitors to any given site who have cookies and/or JavaScript disabled is relatively low. By contrast, the number of websites which assemble a single viewable ‘page’ from multiple separately requested files is relatively high.
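The trade-off can be put into rough numbers with a toy model. Everything here is an assumption for illustration only: the `estimate_counts` function, its parameters and the input values are invented, not measurements from my site or figures published by either package.

```python
def estimate_counts(viewed_pages, extra_requests_per_page, cache_hit_share, no_js_share):
    """Return (log_based, tag_based) pageview estimates under a toy model.

    viewed_pages            -- pages as the visitors perceived them
    extra_requests_per_page -- separately requested files per viewed page
    cache_hit_share         -- share of requests served from the visitor's cache
    no_js_share             -- share of views with JavaScript or cookies disabled
    """
    requests_per_page = 1 + extra_requests_per_page
    # The log-based tool sees every request that actually reaches the server.
    log_based = viewed_pages * requests_per_page * (1 - cache_hit_share)
    # The tag-based tool sees one hit per viewed page where the script ran.
    tag_based = viewed_pages * (1 - no_js_share)
    return round(log_based), round(tag_based)

# Illustrative inputs only.
log_based, tag_based = estimate_counts(
    viewed_pages=10_000,
    extra_requests_per_page=2,
    cache_hit_share=0.05,
    no_js_share=0.02,
)
print(log_based, tag_based)  # 28500 9800
```

With these made-up inputs the log-based figure overshoots the real 10,000 views by 185%, while the tag-based figure undershoots by only 2%. The ratio between the two lands just under three, which is the kind of gap I was seeing.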
In my opinion, as page complexity increases — and in particular, as AJAX and client-side scripting increase in popularity — AWstats and log-based statistical packages will be decreasingly reliable in their ability to report pageviews.
- J. Roven