Welcome to this week’s Tip of the Hat! We covered the basics of web cookies in Part One, including tracking and what users can do to protect their online privacy and not be tracked by these not-so-delicious cookies. Part Two focuses on the site owners who use tracking products to serve up those cookies to their users.
A Necessary Evil (Cookie)?
Many site owners use web analytics products to assess the effectiveness of an online service or site. These products can measure not only site visits but also how visitors get to the site, including the search terms they used in popular search engines. Other products can track and visualize the “flow” or “path” through the site: where users enter the site (landing page), how users navigate between pages on the site, and on which page users end their visit (exit page).
Web analytics products provide site metrics that can help assess the current site and determine the next steps in investigating potential site issues, such as developing usability testing for a particular area of the site where the visitor flow seems to drop off dramatically. Yet products such as Google Analytics collect personal information by default, creating user profiles that are accessible to you, the company, and whoever the company decides to share the data with. Libraries try to limit data collection in other systems such as the ILS to prevent such a setup, so it shouldn’t be any different for web analytics products.
Protecting Site User Privacy
There are a few ways libraries can protect patron privacy while collecting site data. Most libraries can adopt at least one or two of these strategies, while other strategies might require negotiation with vendors or external IT departments.
User consent and opt-out
Use a product other than Google Analytics
Chances are, your server is already keeping track of site visits. Install AWStats and you’ll find site visit counts, IP addresses, dates and times of visits, search engine keyword results, and more just from your server logs.
(BTW, do you know what logs your server keeps and what data they are collecting?)
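If you’re not sure what your server logs hold, a quick look at a single log line can help. Here is a minimal Python sketch that splits one access-log line into its fields. It assumes the Combined Log Format, a common layout for Apache and nginx; your server’s format may differ, so check its configuration. The sample line, the `parse_log_line` helper, and the URLs are made up for illustration.

```python
import re

# Assumption: the log uses the Combined Log Format.
# Adjust the pattern if your server is configured differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample line (documentation IP range, hypothetical URLs).
sample = (
    '203.0.113.42 - - [10/Oct/2019:13:55:36 -0700] '
    '"GET /catalog/search?q=divorce+law HTTP/1.1" 200 5120 '
    '"https://www.example.com/search?q=library+hours" "Mozilla/5.0"'
)

fields = parse_log_line(sample)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Even this one line ties an IP address and a timestamp to a catalog search and to the search that brought the visitor to the site, which is why it’s worth knowing how long these logs are kept and who can read them.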
Several web analytics products provide site data without compromising user privacy by default. One of the more popular products is Matomo, formerly Piwik, which is used by several libraries. Cornell University Library wrote about their decision to move to Piwik and the installation process, and other libraries are already running Matomo or are starting to make the migration. You can find more information about privacy-focused analytics products in the Action Handbook from the National Forum of Web Privacy and Web Analytics. Many of these products let you control what data is collected and let you install and host the product on a local server.
If you must use Google Analytics
There are times when you can’t avoid GA. Your vendor or organization might use GA by default. You might not have the resources to use another analytics product. While this is not the optimal setup, there are a couple of ways to protect user privacy, including telling GA to anonymize IP addresses and turning off data sharing options. Again, you can find a list of recommended actions in the Action Handbook. You might also want to read Eric Hellman’s posts about GA and privacy in libraries, as well as how library catalogs leak searches to Amazon via cookies.
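To make “anonymize IP addresses” a little more concrete: Google’s documentation for this setting describes zeroing out the last octet of an IPv4 address and the last 80 bits of an IPv6 address before the address is stored. The Python sketch below only illustrates that masking; it is not GA’s code, and the `mask_ip` function name is made up for this example.

```python
import ipaddress


def mask_ip(address):
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 (32 - 8) or the first 48 bits of IPv6 (128 - 80).
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Addresses from the reserved documentation ranges, not real visitors.
print(mask_ip("203.0.113.42"))
print(mask_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

A masked address still points to a general network, so this reduces what is stored about a visitor rather than making them fully anonymous. That’s one reason to pair it with turning off the data sharing options.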
Protecting patron privacy while they use your library’s online services doesn’t necessarily mean prohibiting any data collection, or cookies for that matter. Controlling what data is collected by the web analytics product and giving your patrons meaningful information about your site’s cookie use are two ways in which you can protect patron privacy and still have data for assessing online services.
Hat tip to Shana McDanold for the Stanford link!