Analysing website traffic

Researchers and market analysts usually employ specialist tools for website analysis, with increasing levels of statistical and algorithmic sophistication, but it is important to understand where and how the data is collected and what can be done with it. By now it is common knowledge that every time someone visits a website they leave fingerprints that can be used to analyse web-site traffic.

For small, specialist business-to-business (B2B) sites the number of hits to your site may be small enough to analyse by hand, but in practice businesses will use online tools to monitor websites, usually starting with Google Analytics as the simplest to implement (there are also commercial packages, eg WebTrends, and some free resources too).


Web-traffic basics

Each web-site is held (hosted) on a computer called a web-server (probably at your ISP) that responds to the requests for pages from the visitors to your site.

As each visitor moves around your site, the web-server records who asked for what and when in a logfile - not just the html page itself, but also requests for images, scripts and other content that makes up the page. You should be able to easily get hold of this logfile from the web-server and some ISPs will also provide you with web-site statistics based on this logfile. Below is an example of what you will see in a log-file.

On the first line of the example you can see the IP address of a computer (216.72.94.70) that accessed the site on 6th September 2000 at 20:24. The visitor looked at the segmentation page and came to the site from the Market Research Information Forum using Netscape (which identifies itself as Mozilla).

The remaining lines are the stylesheet and graphics images supporting the page. In total the logfile shows these took 10 seconds to download. If the visitor had gone on to another page, you would have been able to see how long they spent on this particular page before they switched somewhere else. Consequently, you can see who visited, when, what they looked at and for how long.

Analysis can start with simple landing or entry pages, move on to journey tracking that follows customers through the purchase funnel and monitors where they abandon the process, or track how cross-linkages between pages are working. These can then be combined with techniques such as A/B testing to define and refine pages to maximise sales or conversion rates.

The logfiles will also show referral data - where the customer came from - and this can contain information about search terms and keywords, or identify the originating site. Similarly the log-files will show the spiders and bots that visit the site to collect data for search engines, as well as scrapers and hackers testing the site.
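As a rough illustration of pulling referral data out of a log entry, the sketch below splits a referrer string into the originating site and, where present, a search phrase. The function name `referrer_info`, the example search URL and the list of query parameter names are assumptions for illustration; search engines differ in how (and whether) they pass keywords.

```python
from urllib.parse import urlparse, parse_qs

# Query parameter names that might carry a search phrase (an assumption;
# check the referrers that actually appear in your own logs).
SEARCH_PARAMS = ("q", "query", "p")

def referrer_info(referrer):
    """Return (originating_host, search_terms) for a referrer field from the log."""
    if not referrer or referrer == "-":
        return (None, None)          # "-" means no referrer was sent
    parts = urlparse(referrer)
    query = parse_qs(parts.query)    # decodes "+" and %xx escapes
    for key in SEARCH_PARAMS:
        if key in query:
            return (parts.netloc, query[key][0])
    return (parts.netloc, None)

# Hypothetical search-engine referrer, not taken from the sample log
print(referrer_info("http://search.example.com/search?q=market+segmentation"))
```

A referrer from an ordinary linking page, such as the forum in the sample log, gives the host name with no search terms.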

You can translate the IP address to a named organisation via the RIPE database for computers in Europe, ARIN in the US or APNIC for Asia Pacific regions (if you can't find it on one of these sites, try another) or via grouped services like whois.com or domaintools.com. The IP 216.72.94.70 refers to a US company called Global One.

Note that although you cannot identify an individual this way, if you are selling business-to-business it may be sufficient to know that someone from an organisation has visited your site. If you have just sent out a mailing, it may be enough to put two and two together to identify the likely visitor, and some companies offer this as a lead-tracking service.

For more sophisticated tracking of customers (and sessions), to track which visitors come to your site and to see if they return, you can use "cookies" to distinguish the different customers on the site at any one time. Unfortunately this needs a bit of HTML and Javascript experience. Combining cookies with a required user registration means specific individuals' journeys can be tracked. Third-party cookie services allow tracking across websites, but need subscription and coding. These forms of tracking can be used for remarketing, where a visit to a service like Amazon then leads to related adverts on other sites.

Cookies are also tied to machines, so if a customer uses a different machine at home from the one at work the information can be lost, and companies may also infer connections between individuals by using IP addresses. Some companies have tried to use this to target pricing, but Amazon came unstuck when it tried "price testing" with different prices for different users on this basis, and the practice can still be seen on some ticketing sites. The problem is that when customers discover these approaches they start to distrust the sites.

216.72.94.70 - - [06/Sep/2000:20:24:03 +0100] "GET /segmentation.htm HTTP/1.0" 200 7557 "http://www.marketresearchinfo.com/forum/index.cfm/fuseaction/thread/CFB/1/Tid/1150/DoOnePage/Yes.cfm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:05 +0100] "GET /_themes/water/wate1011.css HTTP/1.0" 200 14026 "-" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /compass3.gif HTTP/1.0" 200 10401 "http://www.dobney.com/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /_borders/_derived/left.htm_txt_Water_side.gif HTTP/1.0" 200 10874 "http://www.dobney.com/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /logo.gif HTTP/1.0" 200 1387 "http://www.dobney.com/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:13 +0100] "GET /images/Water_rule.gif HTTP/1.0" 200 1747 "http://www.dobney.com/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
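The sample entries above follow the common "combined" log layout, so they can be split into fields with a short script. The following is a minimal sketch in Python rather than a complete log analyser; the names `LOG_PATTERN` and `parse_line` are our own, and only the first and last sample lines are repeated to keep it short.

```python
import re
from datetime import datetime

# One regular expression per log entry: IP, timestamp, request,
# status code, bytes sent, referrer and browser (user agent).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

sample = [
    '216.72.94.70 - - [06/Sep/2000:20:24:03 +0100] "GET /segmentation.htm HTTP/1.0" 200 7557 "http://www.marketresearchinfo.com/forum/index.cfm/fuseaction/thread/CFB/1/Tid/1150/DoOnePage/Yes.cfm" "Mozilla/4.7 [en] (Win95; I)"',
    '216.72.94.70 - - [06/Sep/2000:20:24:13 +0100] "GET /images/Water_rule.gif HTTP/1.0" 200 1747 "http://www.dobney.com/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"',
]

records = [parse_line(line) for line in sample]
# Time between the first and last request for the page and its supporting files
elapsed = (records[-1]["time"] - records[0]["time"]).total_seconds()
print(records[0]["ip"], records[0]["path"], elapsed)  # 216.72.94.70 /segmentation.htm 10.0
```

Once each line is a dictionary of fields, counting pages, grouping by IP address or filtering out image requests becomes straightforward.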

As mentioned, the alternative to tracking just via your own website is to use Google Analytics or one of the targeted ad networks. If you sign up to some of the ad networks, a cookie is placed on a page on your site, and on the site of the advertiser. It is then possible for the ad network to monitor who visits each site and in which order.

Another alternative is to include Google ads on your site. From these details, Google can provide information about the number of page views and click behaviour. Similarly Google ads (or any other ad network) can be used to bring traffic to the site. By using things like reference IDs and landing pages, it is possible to target different customers with different campaigns and to understand which adverts are generating the best return on investment.
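One simple way to use reference IDs is to add one to the landing-page link in each advert and then count the requests carrying each ID. The sketch below assumes a query parameter called `ref` and made-up landing pages and campaign names; real ad networks use their own parameter names.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

def hits_by_campaign(paths, param="ref"):
    """Count landing-page requests per reference ID found in the query string."""
    counts = Counter()
    for path in paths:
        query = parse_qs(urlparse(path).query)
        # Requests with no reference ID are grouped under "(none)"
        reference = query.get(param, ["(none)"])[0]
        counts[reference] += 1
    return counts

# Hypothetical requested paths, as they would appear in the logfile
requests = [
    "/offer.htm?ref=advert_a",
    "/offer.htm?ref=advert_b",
    "/offer.htm?ref=advert_a",
    "/segmentation.htm",
]
print(hits_by_campaign(requests))
```

Setting these counts against the cost of each advert and the sales that followed gives a first view of return on investment per campaign.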


Forms of web-site analysis

The basic forms of analysis for websites start with the simple: how many people visited each page? We then look in more detail. Where did they land, that is, which page did they get to first? And what were the referring sites or links that brought the visitor to the page - was it advertising, a particular keyword, social network links or another type of connection?

We can then follow their journey through the site - where did the visitor go next? For specialist landing pages, where the page is deliberately focused on channelling traffic to a purchase or a sign-up, we would follow the conversion process step by step. How many customers made it all the way through? And where were visitors lost? Experimenting with different forms of landing page or process can greatly increase the end conversion rate.
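The step-by-step conversion calculation can be sketched as follows. The step names and visitor counts are invented for illustration, and `funnel_report` is our own helper name.

```python
def funnel_report(step_counts):
    """step_counts: ordered list of (step_name, visitors_reaching_step)."""
    first = step_counts[0][1]
    previous = first
    report = []
    for name, count in step_counts:
        report.append({
            "step": name,
            "visitors": count,
            # Share of the previous step's visitors who reached this step
            "step_conversion": count / previous if previous else 0.0,
            # Share of everyone who entered the funnel who reached this step
            "overall_conversion": count / first if first else 0.0,
        })
        previous = count
    return report

# Hypothetical funnel: landing page through to completed purchase
funnel = [("landing", 1000), ("basket", 200), ("checkout", 80), ("purchase", 40)]
for row in funnel_report(funnel):
    print(row)
```

The step with the lowest step conversion is where most visitors are being lost, and is usually the first candidate for testing alternatives.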

The customer journey then has several metrics: the number of pages looked at (and, where only the first page was viewed, the bounce rate - how many people left after the first page?), the time spent on site and, where customers come back, the return rate. For each type of referral it is then possible to look at the points in the journey with a view to improving the experience or better channelling behaviour towards a sale or sign-up activity (not that everyone wants to do this - we deliberately do not track on this site).
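These journey metrics are simple to compute once page views have been grouped into visits. The sketch below assumes each visit is a list of (seconds since the visit started, page) pairs; the visit data and most page names are invented. Time on site is measured from the first to the last request, so single-page visits contribute no duration, as noted earlier for the logfile.

```python
def journey_metrics(sessions):
    """sessions: dict mapping a visit id to an ordered list of (seconds, page)."""
    total = len(sessions)
    bounces = sum(1 for views in sessions.values() if len(views) == 1)
    page_views = sum(len(views) for views in sessions.values())
    # Duration is only known when there is a later request to compare against
    durations = [views[-1][0] - views[0][0]
                 for views in sessions.values() if len(views) > 1]
    return {
        "bounce_rate": bounces / total if total else 0.0,
        "pages_per_visit": page_views / total if total else 0.0,
        "mean_time_on_site": sum(durations) / len(durations) if durations else 0.0,
    }

# Hypothetical visits
visits = {
    "a": [(0, "/segmentation.htm")],
    "b": [(0, "/index.htm"), (30, "/segmentation.htm"), (90, "/contact.htm")],
    "c": [(0, "/index.htm"), (60, "/pricing.htm")],
    "d": [(0, "/index.htm")],
}
print(journey_metrics(visits))
```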

Similarly, other useful data includes the type of browser used (eg mobile or desktop), the country and possibly the location inferred from the IP address. If you run a targeted campaign in Denver, for instance, then you would hope to have more hits from Denver during the campaign.

The landing and visitor rates can also be tracked by time and day for changes in volume. These changes may be driven by factors off the website, including changes to competitor sites or changes to search engine algorithms, and so help show which pages need to be reviewed and updated.

For those selling B2B with lead generation activities, one useful measurement is to track IP address against company. Many larger companies run their own servers and so visitors from those companies can be found in the records. This isn't foolproof, as small businesses and consumers usually connect via an ISP and so cannot be identified.
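If you have already looked up the address ranges belonging to companies of interest, matching visitors against them can be automated. The following sketch uses Python's standard `ipaddress` module; the company names are invented and the ranges are the blocks reserved for documentation, so they stand in for whatever a registry lookup gives you.

```python
import ipaddress

# Hypothetical lookup table: address range -> organisation.
# These ranges are reserved for documentation and belong to no real company.
KNOWN_NETWORKS = {
    "192.0.2.0/24": "Example Corp",
    "198.51.100.0/24": "Sample Industries",
}

def organisation_for(ip, networks=KNOWN_NETWORKS):
    """Return the organisation whose range contains ip, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for cidr, name in networks.items():
        if address in ipaddress.ip_network(cidr):
            return name
    return None  # eg a visitor connecting through an ISP

print(organisation_for("192.0.2.17"))   # Example Corp
print(organisation_for("203.0.113.5"))  # None
```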

With Google Analytics and Facebook links on the site, you potentially also get information about sites that your visitors are also visiting and more information about visitor demographic background. These then enable content to be tuned to different audiences.

To improve website performance, the main option is to try different approaches and then to monitor which works best. For instance, does landing page A produce more conversions than landing page B? This A/B testing approach can be applied throughout the customer journey, but requires technical set-up so as to serve the right page to the right customer and to track the delivery of the experiment (particularly if you use a richer experimental design). Incremental changes to pages can then increase the overall effectiveness of the website and its conversion rate.
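One common way to judge whether page B really outperforms page A, rather than differing by chance, is a two-proportion z-test. The sketch below is one option among several and is not tied to any particular testing tool; the visitor and conversion numbers are invented.

```python
import math

def ab_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test. Returns (rate_a, rate_b, z, two_sided_p_value)."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled conversion rate under the assumption that A and B perform the same
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical result: 40 of 1000 visitors convert on A, 60 of 1000 on B
rate_a, rate_b, z, p_value = ab_test(40, 1000, 60, 1000)
print(f"A {rate_a:.1%}  B {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be chance alone, though with low traffic a test may need to run for some time before any difference can be detected.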

Finally, for more information about visitors you can use pop-up sign-up invites and offers, and so potentially gather names and email addresses, or pop-up surveys to find out more about visitor profiles. These usually have low completion rates, but can provide valuable clues as to visitors' backgrounds and interests.