Provenance Missing from Web Analytics

Posted by: Matt Shanahan

Provenance is one of the big holes in web analytics today. The tools don’t expose a way to report on or manipulate visit data by the source that generated it. Some vendors offer expensive data mart add-ons to patch the hole, but these create more headaches. Consequently, understanding individual visitor behavior isn’t possible with web analytics products such as Omniture SiteCatalyst, Google Analytics, Webtrends Analytics, and others.

Surprisingly, very few publishers are aware of this limitation and its impact on the revenue model. For publishers, the lack of provenance puts real operational limits on monetizing subscribers or audience members. I’ll cover these limitations in my next post. This post looks mainly at the technical side.

So what is provenance? Provenance is simply the ability to trace visit data back to the source. In this case, a page view should be traceable back to the subscriber or audience member, device, network location, time and date that generated the page view.
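
To make the definition concrete, here is a minimal sketch in Python of a page view record that carries its provenance. The class name, field names, ID formats, and sample values are all hypothetical, chosen only to mirror the items listed above; no analytics product is being described.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PageView:
    """One page view plus the provenance fields named in the text."""
    url: str
    visitor_id: str        # subscriber or audience member (hypothetical ID scheme)
    device_id: str         # device that generated the page view
    network_location: str  # for example, an IP address
    timestamp: datetime    # time and date of the page view


def views_for_visitor(views, visitor_id):
    """Trace recorded page views back to a single visitor."""
    return [v for v in views if v.visitor_id == visitor_id]


# Illustrative data only.
views = [
    PageView("/news/a", "sub-001", "dev-9", "203.0.113.7",
             datetime(2010, 5, 3, 9, 15, tzinfo=timezone.utc)),
    PageView("/news/b", "sub-001", "dev-9", "203.0.113.7",
             datetime(2010, 5, 3, 9, 18, tzinfo=timezone.utc)),
    PageView("/news/a", "anon-77", "dev-2", "198.51.100.4",
             datetime(2010, 5, 3, 10, 2, tzinfo=timezone.utc)),
]

subscriber_views = views_for_visitor(views, "sub-001")
```

Because every record keeps its source fields, a question like "what did this subscriber read?" becomes a simple filter rather than something that needs a separate data mart.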

What does provenance enable? Visit data enrichment and visitor analysis are two important capabilities that come from provenance. Data enrichment is the ability to annotate visit data with data from external systems. An amazing amount of data regarding visitors and organizations is available in circulation databases, CRM systems, databases of other properties (e.g., vertical network), and public databases (e.g., Hoovers, Facebook). A page view can be enriched long after it is recorded if the provenance is in place to make the links to these other data sources. With enriched data, a publisher has more ways to segment, analyze, and target subscribers and audience members. Additionally, data enrichment on the back-end reduces cumbersome front-end tagging of a site.
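
Back-end enrichment can be sketched as a lookup keyed on the visitor. This is only an illustration under stated assumptions: the `enrich` function, the `visitor_id` key, and the CRM fields are hypothetical, and a real system would read from the circulation or CRM database rather than an in-memory dictionary.

```python
def enrich(page_views, crm_records):
    """Annotate each page view with external fields for the same visitor.

    page_views:  list of dicts, each with a "visitor_id" key (assumed)
    crm_records: dict mapping visitor_id -> dict of external fields (assumed)
    """
    enriched = []
    for view in page_views:
        extra = crm_records.get(view["visitor_id"], {})
        # Recorded fields win if a key appears in both sources.
        enriched.append({**extra, **view})
    return enriched


# Page views recorded earlier, with provenance intact (illustrative data).
page_views = [
    {"url": "/pricing", "visitor_id": "sub-001"},
    {"url": "/news", "visitor_id": "anon-77"},
]

# External data that arrives later (illustrative data).
crm_records = {
    "sub-001": {"company": "Example Corp", "industry": "Publishing"},
}

enriched = enrich(page_views, crm_records)
```

The point of the sketch is the timing: the page views were stored without any company or industry tags, and the annotation happens afterward because the visitor link was kept.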

Provenance also enables visitor analysis. Visitor analysis is the ability to look for behavioral patterns at an individual visitor level. Visitor analysis creates new insights such as scoring visitor loyalty or understanding intent. Visitor analysis essentially extends the segmentation and targeting schemes to become predictive.
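
As a rough illustration of visitor-level analysis, the sketch below counts the distinct days on which each visitor generated page views. Treating that count as a loyalty signal is purely an assumption made for this example, as are the function name and the sample data; a real scoring model would be considerably richer.

```python
from collections import defaultdict
from datetime import date


def active_days_per_visitor(page_views):
    """Count distinct active days per visitor.

    page_views: iterable of (visitor_id, day) pairs (assumed shape).
    """
    days = defaultdict(set)
    for visitor_id, day in page_views:
        days[visitor_id].add(day)
    return {visitor: len(seen) for visitor, seen in days.items()}


# Illustrative data only.
page_views = [
    ("sub-001", date(2010, 5, 3)),
    ("sub-001", date(2010, 5, 3)),
    ("sub-001", date(2010, 5, 4)),
    ("anon-77", date(2010, 5, 4)),
]

scores = active_days_per_visitor(page_views)
```

Without a visitor identifier on each page view, this grouping step has nothing to group on, which is why the analysis depends on provenance.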

Visit data enrichment and visitor analysis are capabilities that directly tie to monetization. In the next post, I’ll explain the impact of provenance on targeting, prediction, and revenue.