Beyond Web Cookies: The Ways Google Tracks Your Users

Welcome to this week’s Tip of the Hat!

Earlier we discussed the basics of web cookies, including the cookies used in tracking applications such as Google Analytics. However, there are many ways Google can track your online behavior even when you block Google Analytics cookies and avoid using Google Chrome. Because Google provides applications and infrastructure for many web developers to use on their sites, it’s extremely hard to avoid Google when you are browsing the Web.

An example of this is Google Fonts. The LDH website uses a font provided by the service. To use the font, the following code is inserted into the web page HTML code:

<link href="https://fonts.googleapis.com/css?family=Open+Sans&display=swap" rel="stylesheet">

For those who are not familiar with HTML code, the above line is instructing the web page to pull in the font style from the external fonts.googleapis.com site. The FAQ question about user privacy describes the data exchanged between our site and the Google Font API service. The exact data mentioned in the FAQ is limited to the number of requests for the specific font family and the font file itself. On the surface, the answer seems reasonable, though there is always the possibility of omission of detail in the answer.

This isn’t to say that other Google services provide the same type of assurance, though. Vanderbilt University Professor Douglas C. Schmidt’s research study about how Google tracks users found that many other Google services collect data that can be tied back to individuals. Schmidt’s study leans heavily toward tracking through mobile devices, but the study does cover how users can be tracked even through the exclusive use of non-Google products thanks to the pervasiveness of third-party tracking and services that feed data back to Google.

We covered some ways that you can avoid being tracked by Google as a web user in our earlier newsletter, including browser add-ons that block cookies and other trackers. Some of the same add-ons and browsers block other ways that Google tracks web users. Still, there is the same question that we brought up in the earlier newsletters – what can web developers and web site owners do to protect the privacy of their users?

First, audit the Google products and API services you’re currently using in your web sites and applications. The audit is easy when you’re using widgets or integrating Google products such as Calendar and Docs into your site or application. Nonetheless, several Google services can fly under the radar if you don’t know where to look. You can make quick work of finding these services by using a browser plugin such as NoScript or Privacy Badger to spot any of the domains listed under the Cookies section in Google’s Privacy and Terms site. Any of the domains listed there has the potential to collect user data.
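If you would rather script a first pass of the audit, here is a minimal sketch that scans a page’s HTML for references to Google-operated hosts. The domain list is an assumption for illustration only (it is not Google’s official list), and the names ExternalHostCollector and find_google_hosts are hypothetical helpers, not part of any product mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: a short, illustrative list of Google-operated domains.
# Replace it with the domains listed on Google's Privacy and Terms site.
GOOGLE_DOMAINS = ("googleapis.com", "gstatic.com", "google-analytics.com",
                  "googletagmanager.com", "doubleclick.net", "google.com")


class ExternalHostCollector(HTMLParser):
    """Collect the host names referenced by src and href attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host.lower())


def find_google_hosts(html):
    """Return the sorted hosts in the page that fall under a listed domain."""
    collector = ExternalHostCollector()
    collector.feed(html)
    return sorted(
        host for host in collector.hosts
        if any(host == d or host.endswith("." + d) for d in GOOGLE_DOMAINS)
    )


page = """
<link href="https://fonts.googleapis.com/css?family=Open+Sans&display=swap" rel="stylesheet">
<script src="https://www.googletagmanager.com/gtag/js?id=EXAMPLE"></script>
<a href="https://example.org/about">About</a>
"""
print(find_google_hosts(page))
```

A static scan like this only sees what is written in the HTML source. Services loaded later by scripts will not show up, which is why the browser plugins mentioned above are still worth running.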

Next, determine what user data is collected and how it is processed. If you are integrating Google Products into your application or website, examine the privacy and security policies on the Google Product Privacy Guide. APIs are another matter. Some services are good at documenting what they do with user data – for example, Google Fonts has documentation that states that they do not collect personal data. Other times, Google doesn’t explicitly state what it is collecting or processing for some of its API services. Your best bet is to start at the Google APIs Terms of Service page if you cannot find a separate policy or terms of service page for a specific API service. There are two sections, in particular, to pay attention to:

  • In Section 3: Your API Clients, Google states that they may monitor API use to ensure quality, improve services, and verify that you comply with the terms of use.
  • In Section 5: Content, use of the API grants Google the “perpetual, irrevocable, worldwide, sublicensable, royalty-free, and non-exclusive license to Use content submitted, posted, or displayed to or from the APIs”. While not exclusively a privacy concern, it is worth knowing if you are passing personal information through the API.

All of that sounds like using any Google service means that user tracking is going to happen no matter what you do. For the most part, that is a possibility. You can find alternatives to Google Products such as Calendar and Maps, but what about APIs and other services? Some of the APIs hosted by Google can be hosted on your server. Take a look at the Hosted Libraries page. Is your site or application using any libraries on the list? You can install those libraries on your server from the various home sites listed on the page. Your site or application might be a smidge slower, but that slight slowness is worth it when protecting user privacy.

Thank you to subscriber Bobbi Fox for the topic suggestion!

Shining a Light on Dark Data

[Author’s note – this post uses the term “Dark Data” which is an outdated term. Learn more about the problem with the term’s use of “dark” at Intuit’s content design manual.]

Welcome to this week’s Tip of the Hat!

Also, welcome to the first week of Daylight Saving Time in most of the US! To celebrate the extra hour of daylight in the morning (we at LDH are early birds), we will shed light on a potential privacy risk at your organization – dark data.

The phrase “dark data” might conjure up images of undiscovered data lurking in the dark back corner of a system. It could also bring to mind a similar image of the deep web where the vast amount of data your organization has is hidden from the majority of your staff, with only a handful of staff having the skills and knowledge to find this data.

The actual meaning of the phrase is much less dramatic. Dark data refers to collected data that is not used for analysis or other organizational purposes. This data can appear in many places in an organization: log files, survey results, application databases, email, and so on. The business world views dark data as an untapped organizational asset that will eventually serve a purpose, but for now, it just takes up space in the organization.

While the reality of dark data is less exciting than the deep web, the potential privacy issues of dark data should be taken seriously. The harm isn’t that the organization doesn’t know what it’s collecting – dark data is not unknown data. One factor that leads to dark data in an organization is the “just in case” rationale used to justify data collection. For example, a project team might set up a new web application to collect patron demographic information such as birth date, gender, and race/ethnicity not because they need the data right now, but because that data might be needed for a potential report or analysis in the future. Not having the data when the need arises means that you could miss out on important insights and measures that could sway decision-makers and the future of operations. It is that fear of not having that data, or data FOMO, that drives this collection of dark data.

When you have dark data that is also patron or other sensitive data, you put your organization and patrons at risk. Data sitting in servers, applications, files, and other places in your organization is subject to leaks, breaches, and other unauthorized access. This data is also subject to disclosure by judicial subpoenas or warrants. If you choose to collect dark data, you choose to collect a toxic asset that will only become more toxic over time, as the risk of a breach, leak, or disclosure increases. It’s a matter of when, not if, the dark data is compromised.

Dark data is a reality at many organizations in part because it’s very easy to collect without much thought. Strategies for minimizing the harms that come with dark data require some forethought and planning; however, once operationalized, these strategies can be effective in reducing the dark data footprint in your organization:

  • Tying data collection to demonstrated business needs – When you are deciding what data to collect, be it through a survey, a web application, or even your system logs, what data can be tied back to a demonstrated business need? Orienting your data collection decisions to what is needed now for operational purposes and analysis shifts the mindset away from “just in case” collection to what data is absolutely needed.
  • Data inventories – Sometimes dark data is collected and stored and falls off the radar of your organization. Conducting regular data inventories of your organization will help identify any potential dark data sets for review and action.
  • Retention and deletion policies – Even if dark data continues to persist after the above strategies, you have one more strategy to mitigate privacy risks. Retention policies and proper deletion and disposal of electronic and physical items can limit the amount of dark data sitting in your organization.
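As one way to picture the retention strategy, here is a minimal sketch that splits records into those still inside a retention window and those due for deletion. The 90-day period, the record layout, and the split_by_retention name are all assumptions for illustration; your retention policy sets the real values.

```python
from datetime import date, timedelta

# Assumption: a hypothetical 90-day retention period; the real figure
# comes from your organization's retention policy.
RETENTION_DAYS = 90


def split_by_retention(records, today, retention_days=RETENTION_DAYS):
    """Split records into (keep, delete) based on their collection date."""
    cutoff = today - timedelta(days=retention_days)
    keep = [r for r in records if r["collected"] >= cutoff]
    delete = [r for r in records if r["collected"] < cutoff]
    return keep, delete


# Hypothetical records: one collected long ago, one collected recently.
records = [
    {"id": 1, "collected": date(2019, 6, 1)},
    {"id": 2, "collected": date(2020, 3, 1)},
]
keep, delete = split_by_retention(records, today=date(2020, 3, 10))
print("keep:", [r["id"] for r in keep])
print("delete:", [r["id"] for r in delete])
```

Running a check like this on a schedule is what turns a written policy into practice. The actual deletion step depends on where the data lives, including backups and exports.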

The best strategies to minimize dark data in your organization happen *before* you collect the data. Asking yourself why you need to collect this data in the first place and looking at the system or web application to see what data is collected by default will allow you to identify potential dark data and prevent its collection.

Privacy Tech Toolkit: VPNs

Welcome to this week’s Tip of the Hat!

Data breach and website hacking stories are (sadly) commonplace in the news. But what happens when the hack in question did not involve a single site, but your entire browsing history, complete with sensitive data, while you were logged into what was supposed to be a secure and private connection? With the recent breaches at three VPN services – NordVPN, TorGuard, and Viking VPN – customers might be looking at that reality.

Some of you might be scratching your heads while reading the reports, though. Not everyone is familiar with VPNs, how they work, why they matter, and when you should use one. In this newsletter, we’ll cover the basics of VPNs, including how you can use them to protect your online privacy.

VPN Basics

A virtual private network (VPN) is a network of computers that provide access to the internet from a private network. Let’s use your work’s VPN service as an example. You are traveling with your work computer and you need to log into a work application. The problem is that the application can’t be accessed by computers outside the office. That’s where the work VPN comes in. You open your VPN client and log into the VPN service, creating a connection between your computer and the office server running the VPN service. This connection allows you to use the internet from that office server, making it appear that you are back in the office. Your computer can then access the work application now that the application thinks that your computer’s location is at the office and not in a hotel room.

Typically, the VPN connection is secure and encrypted, which makes VPN use essential for when you are connecting to public WIFI connections. Being able to change your location by using a server in another part of the world can also help protect privacy by placing you in a location other than the one you’re currently at. This comes in handy when trying to access sites that are geo-locked (sites that you cannot access outside of a certain geographical area, such as a country). Then there is the privacy component. A VPN can provide privacy protection for browsing history, current location, and web activity. Overall, VPNs can provide a secure and private space for you to browse the web away from those who want to track your every online move, be it some random person running Wireshark on a public network, your internet service provider looking for data for targeted advertising purposes, or possibly even the government (depending on your location).

VPN Considerations

A private and secure connection to the internet can protect online privacy, but as we found out last week, VPNs themselves are susceptible to breaches. This might cause some to wonder if VPNs are still a good choice in protecting online privacy. While VPNs are still an essential tool in the privacy toolkit, you still have to evaluate them like any other tool. There are some things to look for when choosing a VPN for work or personal use:

  • Encryption, protocols, and overall security – is the connection between your computer and the VPN server encrypted? You also have to consider the processes used in the actual creation of the tunnel between you and the VPN server. You might run across a lot of protocol terminology that is unfamiliar. NordVPN has a good post explaining various security protocols to help you wrap your head around VPN protocols.
  • Activity logs – is the VPN service keeping a log of activity on its servers? You might not know if your work VPN keeps a log of user activity, so it’s safer to use a separate VPN service than your work VPN for any personal use. No logs mean no record of your activity and your privacy remains intact.
  • Location – What server locations are available so you can access geo-blocked sites? Do you need your computer’s location to be at a specific IP address or location for work?
  • Price (for personal VPN use) – Never use a free VPN service. They are the most likely to log your activity as well as sell your data to third parties.

VPNs @ Your Library

Most likely you have access to a VPN service at work. While the technical aspects of work VPN are relegated to the IT and Systems departments, there is the question of who can use a VPN. Some libraries do not restrict VPN use to certain types of staff while other libraries only allow those who travel for work or do remote work to use VPN. A potential risk with work VPNs is when staff change roles or leave the organization. Auditing the list of users who have VPN access to the system will help mitigate the risk of unauthorized access to work systems by those who no longer should have access.

Your library provides internet access to patrons, so how do VPNs fit into all of this? First, we have WIFI access. Your library’s WIFI is a public network, and patrons who want to protect their privacy might use a VPN while connected to it. Can your patrons use their VPN service while connected to the WIFI? Your desktop computers are another place where patrons are using a public network, but many public computers don’t allow patrons to install software, including VPN clients. There are ways to configure the public network to break the ties between one IP address and one computer, so web activity cannot be traced back to a single computer user based on IP alone.

VPNs And Other Tools In The Privacy Tech Toolkit

VPNs are just one way to protect your privacy online. There are many other ways you can protect privacy, including Tor and other types of proxy servers. Sometimes folks use multiple tools to protect their privacy; for example, some folks use both a VPN service and the Tor browser. Each tool has its strengths and weaknesses in protecting your privacy, and choosing which one to use depends on your situation. We’ll be covering other tools in the Privacy Tech Toolkit soon, so stay tuned!

Cookies, Tracking, and You: Part 2

Welcome to this week’s Tip of the Hat! We covered the basics of web cookies in Part One, including tracking and what users can do to protect their online privacy and not be tracked by these not-so-delicious cookies. Part Two focuses on the site owners who use tracking products to serve up those cookies to their users.

A Necessary Evil (Cookie)?

Many site owners use web analytics products to assess the effectiveness of an online service or site. These products can measure not only site visits but also how visitors get to the site, including search terms in popular search engines. Other products can track and visualize the “flow” or “path” through the site: where users enter the site (landing page), how users navigate between pages on the site, and on what page users end their site visit (exit page).

Web analytics products provide site metrics that can help assess the current site and determine the next steps in investigating potential site issues, such as developing usability testing for a particular area of the site where the visitor flow seems to drop off dramatically. Yet, products such as Google Analytics collect personal information by default, creating user profiles that are accessible to you, the company, and whomever the company decides to share the data with. Libraries try to limit data collection in other systems such as the ILS to prevent such a setup, so it shouldn’t be any different for web analytics products.

Protecting Site User Privacy

There are a few ways libraries can protect patron privacy while collecting site data. Most libraries can do at least one or two of these strategies, while other strategies might require negotiation with vendors or external IT departments.

User consent and opt-out

Many sites nowadays have banners and popups notifying visitors that the site uses cookies. You can thank the EU for all of these popups, including the ePrivacy Directive (the initial law that prompted all those popups) and GDPR. US libraries such as Santa Cruz Public Library and Stanford University Libraries [1] have either adopted the popup or otherwise provided information to opt-out of being tracked while on their site. The major drawback to this approach, as one study points out, is that these popups and pages can be meaningless to users, or even confuse them. If you decide to go this route, user notification needs to be clear and concise and user consent needs to be explicit.

Use a product other than Google Analytics

Chances are, your server is already keeping track of site visits. Install AWStats and you’ll find site visit counts, IP addresses, dates and times of visits, search engine keyword results, and more just from your server logs.

(Which, BTW, do you know what logs are kept by your server and what data they are collecting?)
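To get a feel for what a server log holds, here is a minimal sketch that reads access log lines and counts page requests per day while throwing away the IP address. It assumes the server writes the widely used Common Log Format; the sample lines, the LOG_LINE pattern, and the daily_page_counts name are made up for illustration, and your server’s configuration may use a different layout.

```python
import re
from collections import Counter

# Assumption: the server writes the Common Log Format;
# check your own server's configuration for the actual layout.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def daily_page_counts(lines):
    """Count successful requests per (day, path), discarding IP addresses."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "200":
            counts[(match.group("day"), match.group("path"))] += 1
    return counts


# Hypothetical log lines using documentation-only IP addresses.
sample = [
    '203.0.113.7 - - [10/Oct/2019:13:55:36 -0700] "GET /hours HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2019:14:02:11 -0700] "GET /hours HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2019:14:05:40 -0700] "GET /missing HTTP/1.1" 404 512',
]
print(daily_page_counts(sample))
```

The point of the sketch is that useful counts can be produced without carrying the IP address forward. The raw log still contains it, so log retention matters too.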

Several web analytics products provide site data without compromising user privacy by default. One of the more popular products is Matomo, formerly Piwik, which is used by several libraries. Cornell University Library wrote about their decision to move to Piwik and the installation process, and other libraries are already running Matomo or are starting to make the migration. You can find more information about privacy-focused analytics products in the Action Handbook from the National Forum of Web Privacy and Web Analytics. Many of these products allow you to control what data is being collected, as well as allow you to install and host the product on a local server.

If you must use Google Analytics

There are times when you can’t avoid GA. Your vendor or organization might use GA by default. You might not have the resources to use another analytics product. While this is not the optimal setup, there are a couple of ways to protect user privacy, including telling GA to anonymize IP addresses and turning off data sharing options. Again, you can find a list of recommended actions in the Action Handbook. You might also want to read Eric Hellman’s posts about GA and privacy in libraries, as well as how library catalogs leak searches to Amazon via cookies.
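To show what anonymizing an IP address can mean in practice, here is a minimal sketch that zeroes the last octet of an IPv4 address and the last 80 bits of an IPv6 address, which is the masking Google’s documentation describes for its IP anonymization feature. The anonymize_ip function is a hypothetical helper for your own logs or reports, not GA’s code, and the sample addresses come from the ranges reserved for documentation.

```python
import ipaddress


def anonymize_ip(address):
    """Zero the last octet of IPv4 or the last 80 bits of IPv6."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 or the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))
```

Masking like this reduces precision rather than removing the identifier entirely, so it works best alongside the other steps above, such as turning off data sharing.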

Protecting patron privacy while they use your library’s online services doesn’t necessarily mean prohibiting any data collection, or cookies for that matter. Controlling what data is collected by the web analytics product and giving your patrons meaningful information about your site’s cookie use are two ways in which you can protect patron privacy and still have data for assessing online services.

[1] Hat tip to Shana McDanold for the Stanford link!

Cookies, Tracking, and You: Part 1

Welcome to this week’s Tip of the Hat!

LDH would like to let our readers know that in the eternal feud between Team Cookie and Team Brownie, we are firmly on Team Brookie.
[Image: A pan of brookies cut into bars, with two bars missing. One bar sits on top of the other bars.]
But that doesn’t mean we don’t appreciate a good cookie!
[Image: A plate of honey nut cookies.]
Unfortunately, not all cookies are as tasty as the ones above, and some we actively want to avoid if we want to keep what we do online private. One such cookie is the web cookie.

Web Cookie 101

You have probably encountered the terms browser cookie, HTTP cookie, and web cookie when you read articles about cookies and tracking, and they all refer to the same thing. A web cookie is data sent from a website and stored in the user’s browser, such as Edge, Chrome, or Firefox. Web cookies come in many different flavors, including cookies that keep you signed into a website, remember your site preferences, and remember what you put in your shopping cart when you were doing some online shopping at 2 am. Some cookies only last until you close your browser (session cookies) and some will stick around after you close and reopen your browser (persistent cookies). A website can have cookies from the site owner (first-party cookies) and cookies from other sites (third-party cookies). Yep, you read that right – the site that you’re visiting may have other sites tracking you, even if you don’t visit those other sites.
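For the curious, here is a minimal sketch of the difference between a session cookie and a persistent cookie, using Python’s standard http.cookies module. The cookie names and values are made up for illustration.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no Expires or Max-Age, so the browser drops it on close.
cookies["cart_id"] = "abc123"  # hypothetical cookie name and value

# Persistent cookie: Max-Age keeps it for 30 days across browser restarts.
cookies["language"] = "en"  # hypothetical cookie name and value
cookies["language"]["max-age"] = 60 * 60 * 24 * 30

for morsel in cookies.values():
    kind = "persistent" if morsel["max-age"] else "session"
    print(kind, "->", morsel.OutputString())
```

Whether a cookie counts as first-party or third-party is not set in the cookie itself. It depends on whether the domain that set it matches the site you are visiting.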

However, you don’t need a third-party cookie for a site to track you. Chances are that you’ve been tracked when you are browsing the Web by web analytics products such as Google Analytics. What does that all entail, and how does it affect your privacy online?

Tracking Cookies and Privacy

Many web analytics products use cookies to collect data from site visitors. Google Analytics, for example, collects user IP addresses, user device information (such as browser and OS), network information, geolocation, if the user is a returning or new site visitor, and user behavior on the site itself. A site owner can build a user profile of your activity on their website based on this information alone, but Google Analytics doesn’t stop there. Google Analytics also generates demographic reports for site owners. Where do they get this demographic data from? Cookies, for the most part. This is a feature that site owners have to turn on, but the option is there if the owner wants to build a more complete user profile.

(Let’s not think about how many libraries might have this feature turned on, lest you want to stress-eat a batch of cookies in one sitting.)

This is one example of how cookies can compromise user privacy. There are other examples out there, including social media sites and advertising companies using cookies to collect user information. Facebook is notorious for tracking users on other sites and even tracking users who do not have a Facebook account. If there’s a way to track and collect user data, there’s a web site that’s doing it.

Using Protection While Browsing The Web

Web users have several options in blocking tracking cookies. The following guides and resources can help you set up a more private online browsing experience:

You can also test out your current browser setup with Panopticlick from the EFF to find out if your browser tracker blocker settings are set up correctly.

Stay Tuned…

But why do users have to do all the work? Where do site owners come into protecting their users’ privacy? Next week, we’ll switch to the site owners’ side and talk about cookies: what can you do to collect data responsibly, regulations around web cookies, and resources and examples from the library world. For now, go get a real-world cookie while you wait!

Ethics Breach As Privacy Breach

Welcome to this week’s Tip of the Hat! We’re still sorting through the big pile of notes and handouts from our trip to #PSR19 last month. This week’s newsletter will cover another session from the conference. Escaping the clutches of CCPA, we focus on another important topic – particularly for libraries – for reasons that will become clear below.


Data breaches are a common occurrence in life. We get email notifications from Have I Been Pwned, credit monitoring referrals, and the inevitable “we value your privacy” statement from the breached company. Breaches also happen at libraries and library vendors; there’s no escaping from the impact from a data breach.

What you might not know, though, is that breaches come in different forms. In their presentation “The Data Breach vs. The Ethics Breach: How to Prepare For Both,” James Casey and Mark Surber broke down the three types of data breaches: security, data, and ethics. Security and data breaches take many forms: improper staff access levels to a database, a stolen unencrypted laptop, or sending an email with sensitive data to the wrong email address.

While security and data breaches focus primarily on failures to secure data on a technical, procedural, or compliance level, ethics breaches focus on the failure to handle the data consistent with organizational or professional values. A key point is that you can still have an ethics breach even if you follow rules and regulations. Ethics breaches involving privacy can include using consumer data for purposes that, while not violating any legal regulations, the consumer would not expect. Another example is doing the absolute minimum for data security and privacy based on regulations and industry standards, even when the reality is that these minimum requirements will not adequately protect data from a breach.

Ethics breaches damage an organization’s reputation and public trust in that organization. Given how difficult it is to cultivate reputation and trust with the public, both are hard to restore to pre-breach levels. Monetary fines and settlements make data and security breaches costly, but the lost reputation and trust from ethics breaches could very well be the more expensive type of loss even before you factor in the harm to the persons whose data was caught in the breach.

Casey and Surber’s talk proposed an Ethics by Design approach to aligning data practices in all stages of development and processes to ethical standards and practices. Ethics by Design might look something like this in libraries:

  • Adherence to professional ethics codes and standards, including:
  • Auditing vendors for potential ethics breaches – this audit can be done at the same time as your regularly scheduled privacy and security audits.
  • Considering patron expectations – patrons expect libraries to respect and protect their privacy. That privacy extends to the library’s data practices around collection, use, and sharing with third parties. They do not expect to be subject to the same level of surveillance and tracking as practiced by the commercial sector. The ethics breach litmus test from Casey and Surber’s talk can help identify an unethical data practice – upon learning of a particular practice, would a consumer (or in this case, patron) respond by saying “you did WHAT with my data?!”? If so, that practice might lead to an ethics breach and needs to be re-evaluated.

Ethics by Design asks us to “do the right thing”. Ethical practices need money, time, and resources – all of which many libraries are short of at one time or another. It is easy to bypass ethical standards and practices, or to do the absolute minimum to follow regulations, particularly when “everyone else is doing it.” The nature of library work at its core is to uphold our patrons’ human rights to access information. Ethics guides libraries in creating practices that uphold and protect those rights, including the right to privacy. Protecting patron privacy should not only focus on preventing a security or data breach but also preventing an ethics breach.

Filtering and Privacy: What Would You Do?

Welcome to this week’s Tip of the Hat!

You’re working the information desk at the local college library. A student comes up to you, personal laptop in tow. They say that they can’t access many of the library databases they need for a class assignment. You ask them to show you what errors they are getting on their laptop when trying to visit one of the databases. The student opens their laptop and shows you the browser window. You see what appears to be a company logo and a message – “Covenant Eyes has blocked http://search.ebscohost.com. This page was blocked due to your current filter configuration.”

What’s going on?

Online filtering is not an unfamiliar topic to libraries. Some libraries filter library computers to receive funds from the E-rate program under the Children’s Internet Protection Act [CIPA]. Other libraries do not filter for many reasons, including that filters deny the right to privacy for teens and young adults. The American Library Association published a report about CIPA and libraries, noting that overfiltering blocks access to legitimate educational resources, among many other resources used for educational and research purposes.

We’re not dealing with a library computer in the scenario, though. An increasing number of libraries encounter filtering software on adult patrons’ personal computers. Sometimes these are college students using a laptop gifted by their parents. These computers come with online monitoring and filtering software, such as Covenant Eyes, for the parents to track and/or control the use of the computer by the student. Parents can set the filter to block certain sites as well as track what topics and sites the student is researching at the library. This monitoring of computer activity, including online activity, is in direct conflict with the patron’s right to privacy while using library resources, as well as the patron’s right to access library resources.

Going back to the opening scenario, what can the library do to help the patron maintain their privacy and access library resources? There are a few technical workarounds that the library and patron can explore. The EFF’s Surveillance Self-Defense Guide lists several ways to circumvent internet filtering or monitoring software. Depending on the comfort level of both library staff and patron, one workaround to explore is running the Tor browser from a USB drive, using the pluggable transports or bridges built into Tor as needed. This method allows the patron to use Tor without having to install the browser on the computer, which then would keep the monitoring software from keeping track of what sites the person is visiting. The other major workaround is to use a library computer or another computer, which while inconvenient for the patron, would be another way to protect the privacy of the patron while using library resources.

The above scenario is only one of many scenarios that libraries might face in working with patrons whose personal computers have tracking or filtering software. Tracking and filtering software on patron personal computers is a risk to patron privacy when patrons use those devices to access library resources. Nonetheless, it is a risk that the library can help mitigate through education and possible technical workarounds.

Now it’s your turn – how would your library handle the college student patron scenario described in the newsletter? Reply to this newsletter to share your library’s experiences with similar scenarios as well. LDH will de-identify the responses and share them in a future newsletter to help other libraries start formulating their procedures. You might also pick up a new procedure or two!

[Many thanks to our friends at the Library Freedom Project for the Tor information in today’s post!]

Silent Librarian and Tracking Report Cards

Welcome to this week’s Tip of the Hat! We at LDH survived the full moon on Friday the 13th, though our Executive Assistant failed to bring donuts into the office to ward off bad luck. Unfortunately, several universities need more than luck against a widespread cyberattack that has a connection to libraries.

This attack, called Cobalt Dickens or Silent Librarian, relies on phishing to gain access to university systems. The potential victims receive a spoofed email from the library stating that their library account is expired, followed by instructions to click on a link to reactivate the account by entering their account information on a spoofed library website. With this attack happening at the beginning of many universities’ semesters, incoming students and faculty might click through without giving a second thought to the email.

We are used to seeing banking and other commercial sites spoofed by attackers to obtain user credentials. Nonetheless, Silent Librarian reminds us that libraries are not exempt from being spoofed. Silent Librarian is also a good prompt to review incident response policies and procedures surrounding patron data leaks or breaches with your staff. Periodic reviews will help ensure that policies and procedures reflect the changing threats and risks that come with a changing technology environment. Reviews can also be a good time to revisit incident response materials and training for library staff, as well as cybersecurity basics. If a patron calls the library about an email regarding their expired account, a trained staff member has a better chance of preventing that patron from falling for the phishing email, which in turn better protects library systems from being accessed by attackers.
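For staff who want to see why a spoofed link can look legitimate at a glance, here is a minimal Python sketch of the check a trained eye performs: compare the link's actual hostname against the hostnames the library really uses. The hostnames in TRUSTED_HOSTS and the function name is_trusted_link are hypothetical placeholders, not part of any library system, and passing this check alone does not prove an email is legitimate.

```python
from urllib.parse import urlparse

# Hypothetical placeholder: the hostnames your library actually uses
# for patron account pages.
TRUSTED_HOSTS = {"library.example.edu", "catalog.library.example.edu"}


def is_trusted_link(url: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only if the link uses HTTPS and its hostname is an
    exact match for one of the library's known hostnames."""
    parts = urlparse(url)
    # urlparse separates the real hostname from any "user@" prefix,
    # so look-alike tricks in front of the @ sign are ignored.
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in trusted_hosts


if __name__ == "__main__":
    for link in (
        "https://library.example.edu/account/renew",
        "https://library.example.edu.login-verify.example.net/renew",
        "https://library.example.edu@login-verify.example.net/renew",
    ):
        print(link, "->", is_trusted_link(link))
```

The second and third links both begin with the library's name, but the hostname that the browser would actually contact is login-verify.example.net, which is the kind of detail worth covering in staff training.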

We move from phishing to tracking with the release of a new public tool to assess privacy on library websites. The library directory on Marshall Breeding’s Library Technology Guides site is a valuable resource, listing thousands of libraries in the world. Each listing has basic library information, including the types of systems used by the library and specific products such as the integrated library system, digital repository, and discovery layer. Each listing now includes a Privacy and Security Report Card that grades the main library website on the following factors:

  • HTTPS use
  • Redirection to an encrypted version of the web page
  • Use of Google Analytics, including whether the site is instructing GA to anonymize data from the site
  • Use of Google Tag Manager, DoubleClick, and other trackers from Google
  • Use of Facebook trackers
  • Use of other third-party services and trackers, such as Crazy Egg and NewRelic
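If you would like a rough idea of how factors like these can be checked, here is a minimal Python sketch that scans a page's HTML for a few well-known tracker hostnames. It is not how the Library Technology Guides report card works; the pattern list, the function name report_card, and the sample page are illustrative assumptions, and a real page may load trackers in ways a simple text search will miss. The final_url argument is assumed to be the address you end up at after following any redirects.

```python
import re
from urllib.parse import urlparse

# Illustrative assumption: a short, incomplete list of hostnames that
# commonly appear in tracker script tags.
TRACKER_PATTERNS = {
    "google_analytics": re.compile(
        r"google-analytics\.com|googletagmanager\.com/gtag/js", re.I),
    "google_tag_manager": re.compile(r"googletagmanager\.com/gtm\.js", re.I),
    "doubleclick": re.compile(r"doubleclick\.net", re.I),
    "facebook": re.compile(r"connect\.facebook\.net", re.I),
    "crazy_egg": re.compile(r"crazyegg\.com", re.I),
    "new_relic": re.compile(r"newrelic\.com|nr-data\.net", re.I),
}

# Matches both spellings of the GA setting: anonymizeIp and anonymize_ip.
ANONYMIZE_IP = re.compile(r"anonymize_?ip", re.I)


def report_card(final_url: str, html: str) -> dict:
    """Summarize HTTPS use and tracker hints for one page.

    final_url is the URL after redirects; html is the page source.
    """
    found = {name: bool(p.search(html)) for name, p in TRACKER_PATTERNS.items()}
    return {
        "https": urlparse(final_url).scheme == "https",
        "trackers": sorted(name for name, hit in found.items() if hit),
        "ga_anonymizes_ip": found["google_analytics"]
        and bool(ANONYMIZE_IP.search(html)),
    }


# Hypothetical page source for demonstration only.
SAMPLE_HTML = """
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-XXXXX-Y"></script>
<script>gtag('config', 'UA-XXXXX-Y', {'anonymize_ip': true});</script>
<script src="https://connect.facebook.net/en_US/fbevents.js"></script>
"""

if __name__ == "__main__":
    print(report_card("https://library.example.org/", SAMPLE_HTML))
```

A text search like this only tells you that a tracker is referenced on the page, not what data it collects, so treat the result as a starting point for a closer look.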

You can check what your library’s card looks like by clicking on the Privacy and Security Report button on the individual library page listing. In addition to individual statistics, you can view the aggregated statistics at https://bit.ly/ltg-https-report. The majority of public library websites use HTTPS, which is good news! The number of public libraries using Google Analytics to collect non-anonymized data, however, is not such good news. If you are one of those libraries, here are a couple of resources to help you get started in addressing this potential privacy risk for your patrons:

Leaving Platforms and Patrons Behind

Welcome to this week’s Tip of the Hat!

Remember when the online library catalog was just a telnet client? Some of you might even remember the process of moving from the card catalog to an online catalog. The library catalog has seen many different forms in recent decades.

The most recent wave of transitions is the migration from an old web catalog, in most cases an OPAC that came standard with an ILS, to a newer discovery layer. This discovery layer is typically hosted by the vendor and offers the ability to search a wider array of collections and materials. Another main draw of the discovery layers on the market is the enhanced user experience. Many discovery layers allow users to add content to the site, including ratings and comments, and to share their reading lists with others on the site.

While being able to provide newer services to patrons is important, this also brings up a dilemma for libraries. Many discovery layers are hosted by vendors, and many have separate Terms of Service and Privacy Policies attached to their products outside of the library’s policies. The majority of library catalogs that the discovery layers are meant to replace are locally hosted by the library, and fall under the library’s privacy policies. Libraries that made the transition to the discovery layer more often than not left their older catalog up and running, marketed as the “classic” catalog. However, the work necessary to keep up two catalogs can be substantial, and some libraries have retired their classic catalogs, leaving only the discovery layer for patrons to use.

The dilemma – How will the library provide a core library service to patrons objecting to the vendor’s TOS or privacy policy when the library only offers one way to access that core service?

We can use the Library Bill of Rights [LBR] interpretations from ALA to help guide us through this dilemma. The digital access interpretation of the LBR provides some guidance:

Users have the right to be free of unreasonable limitations or conditions set by libraries, librarians, system administrators, vendors, network service providers, or others. Contracts, agreements, and licenses entered into by libraries on behalf of their users should not violate this right… As libraries increasingly provide access to digital resources through third-party vendors, libraries have a responsibility to hold vendors accountable for protecting patrons’ privacy. [Access to Digital Resources and Services: An Interpretation of the Library Bill of Rights]

Moving core services to third-party vendors can create a barrier between patrons and the library, particularly when that barrier is the vendor’s TOS or privacy policy. The library then needs to decide what next steps to take. One step is to negotiate with the vendor regarding changes to the TOS and privacy policy to address patron concerns. Another step, one that several libraries have opted for, is keeping the classic catalog available to patrons alongside the discovery layer. Each step has its advantages and disadvantages in terms of resources and cost.

The classic catalog/discovery layer dilemma is a good example of how offering newer third-party platforms to provide core library services can create privacy dilemmas for your patrons and potentially lock them out from using core services. If your library finds itself making such a transition (be it the library catalog or another core service platform), the ALA Privacy Checklists and the interpretations of the LBR can help guide you through the planning process. Regardless of the actions taken by the library, ensuring that all patrons have access to core library services should be a priority, and that includes taking privacy concerns into account when replacing core service platforms.

Caring Who Is Sharing Your Patron Data

Welcome to this week’s Tip of the Hat! Last week Tom Boone stated his intent to boycott two vendors, Thomson Reuters and RELX Group, at the American Association of Law Libraries annual conference based on the current business relationships that both companies have with U.S. Immigration and Customs Enforcement [ICE]. While the objections are based on the relationships themselves, the boycott post brings us back to a question posed by Jason Griffey about LexisNexis’s interest in assisting ICE in building an “extreme vetting” system for immigrants to the US: what role would data collected from libraries that subscribe to those vendors’ products play in building such a system? For this week’s letter, we’ll broaden the question: what do vendors do with library patron data, and what say do libraries have in the matter?

Patron data is as valuable to vendors as it is to libraries. To vendors, patron data can be used to refine existing services while building newer services based on patron needs and behaviors. The various recommendation systems in several library products are powered partially by patron borrowing activity, for example. Beyond using patron data for their own products and services, many vendors share patron data with other service providers and third-party businesses for a variety of reasons. For example, some vendors run their applications on commercial cloud servers, which could mean storing patron data on these servers or transferring it to and from them. Depending on the agreement between the vendor and the commercial cloud service, the service might also have access to the data for performance tracking and analysis purposes.

How do you find out what vendors are doing with your patron data? One of the first places to look is their privacy policy. Like libraries, vendors too should inform patrons how they are handling patron data. The library should have a separate privacy policy that indicates how library data is shared with vendors, but vendors also need a privacy policy that clearly communicates to patrons using the vendor service how the data is handled by the vendor, including any sharing of data with service providers or other third parties. LexisNexis’ privacy policy provides some of this information in its How We Use Your Information and Sharing of Your Information sections (which, BTW, you should read if you do use LexisNexis!).

If you can’t find the information you need in the privacy policy, the vendor contract might have some information regarding the collection, use, and sharing of patron data by the vendor. The vendor contract can also serve another purpose, particularly when you are at the contract negotiation or contract renewal stages. The contract can be a good place to lay out expectations to the vendor as to what level of data collection and sharing is permissible. Some data sharing is unavoidable or necessary, such as using aggregated patron data for analyzing infrastructure performance, so if you come to the negotiation table with a hardline “no reuse or sharing with third parties” position, you will most likely be making some compromises. This is also a good place to bring up the question about “selling” vs “sharing” data with service providers – while some vendors state in their privacy policy that they do not sell patron data, they might not mention anything about sharing it with others. Setting expectations and requirements at the point of negotiations or renewal can mitigate any surprises surrounding data use and sharing down the road for all parties involved.

Having the discussion about patron data use and sharing by the vendor will not only allow you to find out what exactly happens to your patrons’ data when they use vendor products, but it also opens up the opportunity for your library to introduce language in the contract that will protect your patrons’ data. You can do this through line edits, or through a contract addendum that has been vetted by your local legal team. Before going to the negotiation table with your proposed changes and requests, you will need to determine which points you are willing to compromise on, and which points are dealbreakers. Ideally negotiations provide a workable outcome for all, but in reality, sometimes the best outcome for your patrons and staff is to leave the negotiations. Not giving a vendor your library’s business is a valid option, one that could signal to the vendor that some of their practices need to change if enough libraries choose to follow suit.