When it comes to analytics, one thing is abundantly clear: Data can be messy. (Or as our in-house analytics expert likes to call it, dirty.)
Anomalies pop up even in the most well-configured Google Analytics accounts and, when left unfixed, can cause metrics to misfire for months or even years. Often, spammers have added ghost traffic (visits that never actually reach the website), or new programmers have not removed old analytics tags, resulting in multiple tags stacking on top of each other. (Just this year, we encountered a site that still had an ancient urchin.js legacy tag, which belongs to Urchin, the tracking platform that Google used as a basis for building Google Analytics.)
In any case, disorganized data is a common occurrence in every industry, and for digital marketers, correcting anomalies is paramount for obtaining accurate comparative analytics. So with that in mind, here are four tips from our in-house analyst, Allison Glaser, for working through dirty data:
1. Add Filters or Custom Segments
The best thing you can do to clean out a messy analytics account is to simply remove any data that is inaccurate, misleading, spammy, etc. There are two ways of doing this: either creating a filter or creating a custom segment.
Creating a filter is the more permanent solution: it eliminates data at the source, so “bad” data never even enters your analytics database. The benefit is that you’ll never again have to deal with that misleading data, which makes filters perfect for eliminating distorting factors like spam traffic, bot traffic, and test bookings. The drawback, of course, is that filters cannot be applied to historical data, and once something is filtered out of a view, you won’t be able to recover the original unfiltered data in that view. For this reason, Allison suggests leaving one view as unfiltered, or “raw,” data: “You don’t want to find yourself a few years down the road asking yourself when you began filtering.”
Your other option, and the more flexible one, is to create a custom segment, meaning you’re configuring a new report (inside an account) using the same templates you’re used to in Google Analytics, albeit with customized filters. Because segments work with both future and historical data, this is a great bet if you want to hide certain data but not eliminate it altogether. For example, we often use custom segments to remove traffic from certain IP addresses, referrals from certain unqualified sources or paid campaigns, or test bookings from certain transaction IDs.
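As a rough illustration of what a custom segment does, the sketch below applies the same kinds of exclusions to a handful of exported session rows. It assumes a hypothetical export format: the field names (ip, source, transaction_id), the exclusion rules, and all sample values are made up for illustration, and this is not how Google Analytics implements segments internally. The point is that the raw rows stay untouched, so the exclusion can be changed or dropped at any time.

```python
# Hypothetical exported sessions; field names and values are illustrative only.
sessions = [
    {"ip": "203.0.113.7", "source": "google", "transaction_id": "T-1001"},
    {"ip": "198.51.100.4", "source": "spam-referrer.example", "transaction_id": None},
    {"ip": "192.0.2.10", "source": "newsletter", "transaction_id": None},
    {"ip": "203.0.113.20", "source": "newsletter", "transaction_id": "TEST-0001"},
    {"ip": "203.0.113.9", "source": "bing", "transaction_id": None},
]

# Assumed exclusion rules, standing in for the conditions of a custom segment.
EXCLUDED_IPS = {"192.0.2.10"}  # e.g. an internal office address
EXCLUDED_SOURCES = {"spam-referrer.example"}
TEST_TRANSACTION_PREFIX = "TEST-"


def is_clean(session):
    """Return True if the session passes every exclusion rule."""
    if session["ip"] in EXCLUDED_IPS:
        return False
    if session["source"] in EXCLUDED_SOURCES:
        return False
    transaction_id = session["transaction_id"] or ""
    if transaction_id.startswith(TEST_TRANSACTION_PREFIX):
        return False
    return True


def apply_segment(rows):
    """Return a filtered copy; the raw rows are left untouched."""
    return [row for row in rows if is_clean(row)]


clean_sessions = apply_segment(sessions)
```

Here clean_sessions holds only the sessions that pass all three rules, while sessions still contains every original row, which mirrors the difference between a segment and a permanent filter.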
2. Check Old Tags With Internet Archive Wayback Machine
If you’re new to an account and find problems that are difficult to diagnose, checking a website’s historical source code might give you some valuable clues. Allison uses a tool called the Wayback Machine, which gives her snapshots of a website’s code over time and helps identify tracking errors caused by wrongly configured tags. “If you don’t have programmers who are up to date on the issue, looking at the source code is the best way to get the history of an analytics account,” says Allison. “It helps to diagnose double tracking and how data was tracking previously.”
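To make this concrete, here is a minimal sketch of scanning a page's source for Google tracking snippets. The Wayback Machine serves snapshots at URLs of the form https://web.archive.org/web/<timestamp>/<page URL>. The helper names, the pattern list, and the sample HTML are assumptions for illustration, and the patterns cover only the common Google library file names rather than every possible tag.

```python
import re

# Script file names for successive generations of Google tracking libraries.
# This list is an assumption for illustration and is not exhaustive.
TAG_PATTERNS = {
    "urchin.js (legacy Urchin)": r"urchin\.js",
    "ga.js (classic Google Analytics)": r"google-analytics\.com/ga\.js",
    "analytics.js (Universal Analytics)": r"google-analytics\.com/analytics\.js",
    "gtag.js (global site tag)": r"googletagmanager\.com/gtag/js",
}


def wayback_url(page_url, timestamp):
    """Build a Wayback Machine snapshot URL; timestamp is YYYYMMDD or longer."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def count_tracking_tags(html):
    """Count how many times each known tracking library appears in the HTML."""
    counts = {}
    for label, pattern in TAG_PATTERNS.items():
        hits = len(re.findall(pattern, html))
        if hits:
            counts[label] = hits
    return counts


# Made-up snapshot source with a legacy tag left beside a newer one.
sample_html = """
<script src="http://www.google-analytics.com/urchin.js"></script>
<script src="https://www.google-analytics.com/analytics.js"></script>
"""

found = count_tracking_tags(sample_html)
```

More than one label in the result, or a count above one for the same label, is a hint that tags may be stacked. It still needs confirming by hand against the page itself.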
3. Use Browser Extensions to Test Tags in Real Time
While the Wayback Machine shows historical tags, browser extensions like Google Tag Assistant or Facebook Pixel Helper test whether there’s anything wrong with a website’s current tags. Once the extension is activated, simply visit the website, act like a real user, and, by recording in real time, identify any pages that are thrown off, double tracking, or not tracking at all. If you find any, there’s your diagnosis.
4. Make Annotations
Even with an exceptional memory, chances are you won’t remember the exact date a new website launched three years ago, when you ran a paid advertising campaign two years ago, or the few days earlier this year when a booking engine wasn’t tracking correctly. That is, unless you make detailed annotations noting exact historical events that you can look back on for future reports. If a client asks you why bookings were up or down—and the numbers themselves don’t seem to explain it—an annotation might save you from embarrassment.
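The same habit can be kept outside the analytics interface as well. The sketch below assumes a hypothetical annotation log (the dates and notes are invented for illustration) and looks up which recorded events overlap a given reporting period.

```python
from datetime import date

# Hypothetical annotation log: (start, end, note). All entries are invented.
annotations = [
    (date(2021, 3, 1), date(2021, 3, 1), "New website launched"),
    (date(2022, 6, 10), date(2022, 7, 15), "Paid advertising campaign"),
    (date(2023, 2, 4), date(2023, 2, 7), "Booking engine not tracking"),
]


def annotations_for(period_start, period_end, log):
    """Return notes for events that overlap the given reporting period."""
    return [
        note
        for start, end, note in log
        if start <= period_end and end >= period_start
    ]


# Which recorded events might explain the numbers for February 2023?
notes = annotations_for(date(2023, 2, 1), date(2023, 2, 28), annotations)
```

An event is included whenever any part of it falls inside the period, so a campaign that started the month before still shows up in the following month's report.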