Analysing website and internet traffic
Website analysis is now a specialist task, with increasing levels of statistical and algorithmic sophistication for tracing, tracking and modelling customer journeys, often using dedicated external providers. With web-apps large amounts of data can be gathered, although obtaining consent and understanding GDPR or other privacy regimes are essential.
In developing market intelligence systems, it is important to understand where and how the data is collected, what can be done with it, and how to integrate website analysis with other market information to develop market experiments and evaluate marketing effectiveness.
Web-traffic basics - visitor logs
Each web-site is held (hosted) on a computer called a web-server that responds to the requests for pages from the visitors to the website. The web-server is normally in a data centre and may be part of a larger cluster of servers, or part of a giant cloud-based computing service such as AWS or Azure.
As each visitor moves around a website, the web-server records who asked for what and when in a logfile - not just the html page itself, but also requests for images, scripts and other content that makes up the page. Below is an example of some of the types of content in a log-file.
On the first line of the example log you can see the IP address of a computer (216.72.94.70) that accessed the site on 6th September at 20:24. The visitor looked at the segmentation page and came to the site from the Market Research Information Forum using Netscape (Mozilla).
The remaining lines are the stylesheet and graphics images supporting the page. In total the logfile shows the page took 10 seconds to download. If the visitor had gone on to another page, the logfile would also have shown how long they spent on this page before they moved elsewhere. Consequently, with just a log file, the business can see who visited, when, what they looked at and for how long.
You can translate the IP address to a named organisation via the RIPE database for computers in Europe, ARIN in the US or APNIC for Asia Pacific regions (if you can't find it on one of these sites, try another) or via grouped services like whois.com or domaintools.com. The IP 216.72.94.70 refers to a US company called Global One.
Note, though, that an individual cannot be identified directly in this way. However, if you are selling business-to-business it may be sufficient to identify that someone from an organisation visited the site. For B2B mailings, it may be sufficient to put 2 and 2 together to identify the likely visitor, and some companies offer this as a lead-tracking service.
The IP address can be combined with other information services to obtain an approximate location of the user. Combined with usage data, this may be sufficient to identify an individual, and is then considered personal information subject to privacy laws and, in Europe, a need for consent to collect it.
216.72.94.70 - - [06/Sep/2000:20:24:03 +0100] "GET /segmentation.htm HTTP/1.0" 200 7557 "http://www.marketresearchinfo.com/forum/index.cfm/fuseaction/thread/CFB/1/Tid/1150/DoOnePage/Yes.cfm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:05 +0100] "GET /_themes/water/wate1011.css HTTP/1.0" 200 14026 "-" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /compass3.gif HTTP/1.0" 200 10401 "/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /_borders/_derived/left.htm_txt_Water_side.gif HTTP/1.0" 200 10874 "/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:12 +0100] "GET /logo.gif HTTP/1.0" 200 1387 "/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
216.72.94.70 - - [06/Sep/2000:20:24:13 +0100] "GET /images/Water_rule.gif HTTP/1.0" 200 1747 "/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"
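As an illustration, the example log lines can be split into their fields with a short script. This is a minimal sketch in Python, assuming the logs follow the common "combined" layout seen in the example; the names LOG_PATTERN and parse_log_line are made up for this illustration, and the month abbreviation is assumed to be in English.

```python
import re
from datetime import datetime

# Pattern for the "combined" layout seen in the example lines:
# ip ident user [timestamp] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the text fields that are really dates or numbers.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# The first and last lines of the example log above.
first = parse_log_line(
    '216.72.94.70 - - [06/Sep/2000:20:24:03 +0100] "GET /segmentation.htm HTTP/1.0" '
    '200 7557 "http://www.marketresearchinfo.com/forum/index.cfm/fuseaction/thread/'
    'CFB/1/Tid/1150/DoOnePage/Yes.cfm" "Mozilla/4.7 [en] (Win95; I)"'
)
last = parse_log_line(
    '216.72.94.70 - - [06/Sep/2000:20:24:13 +0100] "GET /images/Water_rule.gif HTTP/1.0" '
    '200 1747 "/segmentation.htm" "Mozilla/4.7 [en] (Win95; I)"'
)

# Time between the first and last request for the page and its supporting files.
elapsed = (last["time"] - first["time"]).total_seconds()
print(first["ip"], first["path"], first["referrer"])
print("Seconds from first to last request:", elapsed)
```

Once each line is a record like this, counting pages, referrers or time between requests becomes a matter of grouping and summing the records.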
Cookies and trackers
For more sophisticated tracking, to follow visitors around a site and to see if they return, you can use "cookies" to distinguish the different visitors on the site at any one time. Combining cookies with a required user registration or a pre-sent ID field means specific individuals' journeys can be tracked. Newsletter tracking uses images labelled with personal identifiers to track who viewed an email, and then cookies to track the journey after clicking the link. (For privacy, turn off automatic downloading of images in emails.)
Third-party cookie services allow tracking across websites. These forms of tracking can be used for remarketing, where a visit to a service like Amazon then leads to related adverts on other sites. Facebook cookies can be used to track Facebook users across sites that include a 'like' link. Similarly, Google Analytics has the potential to track across sites. A range of advertising support companies and other third parties also offer tracking services, with supporting analytics for measuring the performance and effectiveness of content on the site.
Note that concerns over privacy mean that consent is required, especially in Europe, and many more sophisticated users turn third-party trackers off through the browser and ad-blockers.
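To show the basic mechanism, the sketch below uses Python's standard http.cookies module to recognise a returning visitor or issue a new identifier. It is an illustration only: the cookie name visitor_id and the function identify_visitor are hypothetical, the one-year lifetime is an arbitrary choice, and a real site would need to obtain consent before setting a tracking cookie.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name for this sketch


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value).

    cookie_header is the text of the Cookie header sent by the browser
    (or None). set_cookie_value is None for a returning visitor, otherwise
    the text to send back in a Set-Cookie header.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    if COOKIE_NAME in cookies:
        # Returning visitor: the browser sent back the ID issued earlier.
        return cookies[COOKIE_NAME].value, None

    # New visitor: issue a random identifier that says nothing about the person.
    visitor_id = uuid.uuid4().hex
    new_cookie = SimpleCookie()
    new_cookie[COOKIE_NAME] = visitor_id
    new_cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # assumed: one year
    new_cookie[COOKIE_NAME]["path"] = "/"
    return visitor_id, new_cookie[COOKIE_NAME].OutputString()


# First visit: no cookie yet, so an ID is issued.
visitor_id, set_cookie = identify_visitor(None)
# Later visit: the browser returns the cookie and the same ID is recognised.
returning_id, nothing_to_set = identify_visitor(COOKIE_NAME + "=" + visitor_id)
```

Logging the identifier alongside each page request is what allows the separate requests to be stitched into one visitor's journey.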
Apps and data
With mobile phone apps and web-apps, the business sees much more data than just which pages were visited. The usage of the app itself collects and stores data as part of the service to the customer. For instance, a game will capture time spent playing, connections to other players, levels achieved and so on, all of which are essential to the social side of the game itself. With permission, the app may also collect information about the user's location, images, audio and other phone-based data. This allows companies like Facebook to use face recognition to suggest tags in photos, but it can also be used for targeting advertising, reaching out to specific groups or individuals, or linking individuals to preferences based on likes or comments.
Web-site analysis
The basic forms of analysis for websites start with the simple: how many people visited each page? We then look in more detail - where did they land? So which page did they get to first? And what were the referring sites or links that brought the visitor to the page - was it advertising, a particular keyword, social network links or another type of connection?
We can then follow their journey through the site - where did the visitor go next? For specialist landing pages, where the page is deliberately focused on channelling traffic to a purchase or a sign-up, we would follow the conversion rate at each step in the process. How many customers made it all the way through? And where were visitors lost? Experimenting with different forms of landing page or process can greatly increase the end conversion rate.
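The step-by-step conversion calculation can be sketched as follows. This is a minimal Python illustration; the function funnel_report, the step names and the visitor counts are all invented for the example and are not real data.

```python
def funnel_report(steps):
    """Summarise a conversion funnel.

    steps is an ordered list of (step_name, visitor_count) pairs.
    For each step the report gives the share of visitors kept from the
    previous step and the share kept from the start of the funnel.
    """
    report = []
    first_count = steps[0][1]
    previous = first_count
    for name, count in steps:
        report.append({
            "step": name,
            "visitors": count,
            "step_rate": count / previous if previous else 0.0,
            "overall_rate": count / first_count if first_count else 0.0,
        })
        previous = count
    return report


# Invented counts for a four-step purchase process.
example_funnel = [
    ("landing page", 1000),
    ("basket", 200),
    ("checkout", 100),
    ("purchase", 50),
]
report = funnel_report(example_funnel)
for row in report:
    print(f'{row["step"]:<14}{row["visitors"]:>6}'
          f'{row["step_rate"]:>8.0%}{row["overall_rate"]:>8.0%}')
```

The step with the lowest step rate is where most visitors are being lost, and so is usually the first place to test an alternative page or process.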
The customer journey then has several metrics - number of pages looked at (or if just the first page, the bounce rate - how many people left after the first page?) - the time spent on site and, where customers come back, the return rate. For each type of referral it is possible then to look at the points in the journey with a view to improving the experience or better channelling behaviour towards a sale or sign-up activity (not that everyone wants to do this - we deliberately do not track on this site).
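As a rough illustration of these metrics, the Python sketch below calculates the bounce rate and pages per visit from a list of page views. It makes a simplifying assumption that each visitor identifier represents a single visit; the function name journey_metrics and the sample data are invented for the example.

```python
from collections import defaultdict


def journey_metrics(page_views):
    """Calculate simple journey metrics.

    page_views is a list of (visitor_id, page) pairs in time order.
    Simplifying assumption: each visitor_id counts as one visit.
    """
    pages_by_visitor = defaultdict(list)
    for visitor_id, page in page_views:
        pages_by_visitor[visitor_id].append(page)

    visits = len(pages_by_visitor)
    if visits == 0:
        return {"visits": 0, "bounce_rate": 0.0, "pages_per_visit": 0.0}

    # A bounce is a visit that saw only one page.
    bounces = sum(1 for pages in pages_by_visitor.values() if len(pages) == 1)
    total_pages = sum(len(pages) for pages in pages_by_visitor.values())
    return {
        "visits": visits,
        "bounce_rate": bounces / visits,
        "pages_per_visit": total_pages / visits,
    }


# Invented page views for three visitors.
sample_views = [
    ("a", "/"), ("a", "/pricing"),
    ("b", "/"),
    ("c", "/blog"), ("c", "/"), ("c", "/contact"),
]
metrics = journey_metrics(sample_views)
print(metrics)
```

Running the same calculation separately for each referral source shows which sources bring visitors who stay and which bring visitors who leave straight away.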
Similarly useful is data such as the type of browser used (eg mobile or web-based), the country and possibly the location from the IP address. If you run a targeted campaign in Denver, for instance, then you would hope to see more hits from Denver during the campaign.
The landing and visitor rates can also be tracked by time and day for changes in volume. These changes may be driven by factors off the website, including changes to competitor sites or changes to search engine algorithms, and so help show which pages need to be reviewed and updated.
For those selling B2B with lead generation activities, one measurement is the ability to track IP address against company. Many larger companies run their own servers and so visitors from those companies can be found in the records. This isn't foolproof as small businesses and consumers usually connect via an ISP and so cannot be identified.
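The matching of a visitor's IP address against known company address ranges can be sketched as below. This is a hedged Python illustration using the standard ipaddress module: the company names are made up, the address ranges are ones reserved for documentation rather than real company networks, and in practice the table would come from a registry lookup or a commercial lead-tracking service.

```python
import ipaddress

# Hypothetical lookup table. The names are invented and the ranges are
# reserved documentation ranges, not real company networks.
COMPANY_NETWORKS = {
    "Example Manufacturing Ltd": ipaddress.ip_network("192.0.2.0/24"),
    "Sample Bank plc": ipaddress.ip_network("198.51.100.0/24"),
}


def company_for_ip(ip_string):
    """Return the company whose address range contains the IP, or None."""
    address = ipaddress.ip_address(ip_string)
    for company, network in COMPANY_NETWORKS.items():
        if address in network:
            return company
    # Typical for visitors who connect through an ISP.
    return None


known = company_for_ip("192.0.2.44")
unknown = company_for_ip("203.0.113.9")
print(known, unknown)
```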
With Google Analytics and Facebook links on the site, you potentially also get information about sites that your visitors are also visiting and more information about visitor demographic background. These then enable content to be tuned to different audiences.
To improve website conversions and impact, A/B testing is a common approach. For instance, does landing page A produce more conversions than landing page B? A/B testing can be applied throughout the customer journey but requires technical set-up so as to serve the right page to the right customer and to track the delivery of the experiment (particularly if you use a richer experimental design). Incremental changes to pages can then increase the overall effectiveness of the website and its conversion rate.
The richness of app-based data and the volume of information obtainable make hand-based analysis techniques redundant. Increasingly, high-volume web-based data has to be handled automatically, often with data science tools using machine learning or artificial intelligence.
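One common way to judge whether page B really converts better than page A is a two-proportion z-test. The Python sketch below is a minimal illustration of that one technique, not the only way to analyse an A/B test; the function name ab_test and the visitor and conversion counts are invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test comparing the conversion rates of pages A and B.

    Returns (rate_a, rate_b, z, p_value) where p_value is two-sided.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the assumption that A and B convert equally well.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_error == 0:
        raise ValueError("Cannot test: no variation in conversions")
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Invented results: 50 of 1000 visitors converted on A, 80 of 1000 on B.
rate_a, rate_b, z, p_value = ab_test(50, 1000, 80, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the difference between the two pages is unlikely to be down to chance alone, though the test assumes visitors were allocated to A and B at random.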
The data itself is behavioural, and more and more companies are looking to blend this behavioural data with direct feedback from customers. Research panel providers have customers who are willing both to be tracked and then to follow up with survey questions, allowing for the why's to be combined with the what's.
Web analytics is a specialised area. However, it does need to be built into the broader market intelligence plan to give a holistic picture for market experimentation and measuring marketing effectiveness.
For help and advice on using web analytics in a blended marketing framework, contact info@dobney.com