Just Published – Data Privacy and Cybersecurity Best Practices Train-the-Trainer Handbook

Cover of the "Data Privacy and Cybersecurity Best Practices Train-the-Trainer Handbook".

Happy October! Depending on who you ask at LDH, October is either:

  1. Cybersecurity Awareness Month
  2. An excuse for the Executive Assistant to be extra while we try to work
  3. The time to wear flannel and drink coffee (never mind, this is every month in Seattle)

Since the Executive Assistant lacks decent typing skills (as far as we know), we declare October as Cybersecurity Awareness Month at LDH. Like last year, this month will focus on privacy’s popular sibling, security. We also want to hear from you! If there is an information security topic you would like us to cover this month (or the next), email us at newsletter@ldhconsultingservices.com.

We start the month with a publication announcement! The Data Privacy and Cybersecurity Training for Libraries, an LSTA-funded collaborative project between the Pacific Library Partnership, LDH, and Lyrasis, just published two library data privacy and cybersecurity resources for library workers wanting to create privacy and security training for their libraries:

  • PLP Data Privacy and Cybersecurity Best Practices Train-the-Trainer Handbook – The handbook is a guide for library trainers wanting to develop data privacy and cybersecurity training for library staff. The handbook walks through the process of planning and developing a training program at the library and provides ideas for training topics and activities. This handbook is a companion to the Data Privacy Best Practices Toolkit for Libraries published last year.
  • PLP Data Privacy and Cybersecurity Best Practices Train-the-Trainer Workshops (under the 2021 tab) – If you’re looking for train-the-trainer workshop materials, we have you covered! You can now access the materials used in the two train-the-trainer workshops for data privacy and cybersecurity conducted earlier this year. Topics include:
    • Data privacy – data privacy fundamentals and awareness; training development basics; vendor relations; patron programming; building a library privacy program
    • Cybersecurity – cybersecurity basics; information security threats and vulnerabilities; how to protect the library against common threats such as ransomware and phishing; building cybersecurity training for libraries

Both publications include extensive resource lists for additional training materials and to keep current with the rapid changes in cybersecurity and data privacy in the library world and beyond. Feel free to share your training stories and materials with us – we would love to hear what you all come up with while using project resources! We hope that these publications, along with the rest of the project’s publications, will make privacy and cybersecurity training easier to create and to give at your library.

Is Library Scholarship a Privacy Information Hazard?

A white hazard sign with an image of a human stick figure being zapped by an electric blob. Image is sandwiched between red and black text - "Warning, this area is dangerous"
Image source: https://www.flickr.com/photos/andymag/9349743409/ (CC BY 2.0)

Library ethics, privacy, and technology collided again last week, this time with the publication of issue 52 of the Code4Lib Journal. In this issue, the editorial committee published an article describing an assessment process with serious data privacy and ethical issues and then explained their rationale for publishing the article in the issue editorial. The specifics of these data privacy and ethical issues will not be covered in-depth in this week’s newsletter – you can read about said issues in the comment section of the Code4Lib Journal article in question.

You might have noticed that we said “again” in the last paragraph. This isn’t the first time library technology publications and patron privacy collided. The Code4Lib Journal published a similarly problematic article last year, but the journal is only one of many library scholarship venues that have published scholarly and practical literature that is ethically problematic with regard to patron privacy. Technology and assessment are the usual offenders, ranging from case studies of implementing privacy-invasive technologies to research extolling the benefits of surveilling students in the name of learning analytics without discussing the implications of violating student patron privacy. These publications are not set up as a point-counterpoint exploration of these technologies and assessment methods in terms of privacy and ethics. Instead, these publications are entered into the scholarly record as is, with an occasional contextual note or superficial sentence or two about privacy. Retraction is almost unheard of in library scholarship, and retraction is not very effective in addressing problematic research.

Library scholarship is not consistently aligned with the profession’s ethical standards to uphold patron privacy and confidentiality. Whether or not an article is judged on its potential impact on library privacy is currently up to the individual peer reviewer (or in the case of editor-reviewed journals such as Code4Lib, the editor). In addition, library scholarship is not set up to assess the potential privacy risks and harms of the publication in question to specific patron groups, particularly patrons from minoritized populations. Currently, there is no suitable mechanism to do such an assessment that can be included in the original publication so that it would be both meaningful and informative to the reader. We are left with publications in the library scholarship record that promote the uncritical adoption of high-risk practices that go against professional ethics and harm patrons. This becomes more perilous when these publications come across those in the field who do not have the knowledge or experience in assessing these publications with patron privacy and ethics in mind.

What we end up with, therefore, is a scholarly record full of information hazards. An information hazard is a particular piece of information that can potentially cause harm to the knower or create the potential to harm others. This differs from misinformation: misinformation spreads false information, whereas an information hazard is true and is risky precisely because it is. Nick Bostrom’s seminal work on information hazards breaks down the specific risks and harms of different types of hazards. Library scholarship has (at least) two information hazards in particular when it comes to library privacy and ethics:

Idea hazard – Ideas hold power. They also come with risks. Even if the dissemination of an idea is kept at a high level without specific details, it can become an idea hazard. The idea that a library can use a particular system or process to assess library use can risk patron privacy. There are ways to mitigate an idea risk of this nature, including evaluating the assessment idea through the Five Whys method or other methods to determine the root need for such an assessment.

Development hazard – A development hazard is when advancement in a field of knowledge leads to technological or organizational capabilities that create negative consequences. Like other fields of technology, library technology falls into this hazard category, particularly when combined with the evolution of library assessment practices and norms. Sharing code and processes (which is a data hazard) can lead to community or commercial development of more privacy-invasive library practices if no care is taken to mitigate patron privacy risks.

How, then, can library scholarship become less of a privacy information hazard? First and foremost, the responsibility falls on the publishers, editors, peer reviewers, and conference program organizers who control what is and is not added to the library scholarly record. This includes creating a code of ethics for submission authors to follow and guidelines for reviewers and editors to follow to assess the privacy and ethical implications of the submission. However, these codes and guidelines are not effective if they are not acted upon. As Dorothea Salo says, “Research on library patrons that contravenes library-specific ethics is unethical; it should not be published in the LIS literature, and when published there, should be retracted.” Regardless of the novelty or other technical merits of the submission, if the submission violates or goes against library ethics or privacy standards, the editors, reviewers, and publishers have the responsibility as shapers of the scholarly record to not publish the submission lest they add yet another information hazard to the record.

Library privacy and ethics must also be a part of every stage of the submission and publication process. This takes a page from Privacy by Design, taking a proactive approach to privacy instead of rushing to include privacy at the last minute, which makes any privacy effort ineffective at best. Ethical codes and guidelines are one way to embed privacy into a process; another is to include checkpoints in the process to bring in external subject matter experts to review submissions well in advance to identify or comment on specific privacy or ethical risks. If done early in the submission process, the information received can then be used to revise the submission to address these issues or to change the focus of the submission to one that is more appropriate to address the privacy and ethical implications of the topic at hand. The submission itself doesn’t have to be abandoned, but it must be constructed so that the privacy and ethical risks are front and center, describing why this method, idea, process, or code goes against library ethics and privacy. This option doesn’t eliminate the idea/data hazard, but shifting the focus to privacy and ethical repercussions can mitigate the risks that come with such hazards.

Whether intentional (as in the case of the latest Code4Lib Journal issue) or unintentional, library scholarship places patron privacy at risk through the unrestricted flow of information hazards. Many in the profession face pressure to create a constant stream of scholarship, but at what cost to our patrons’ privacy and professional ethics? A scholarly record full of privacy information hazards has and will continue to have long-lasting implications for the profession’s ability to protect patron privacy as well as how well we can serve everyone in the community (and not just those who have a higher tolerance for privacy risks or won’t be as negatively impacted by poor privacy practices). As the discussion about the Code4Lib Journal’s decision to publish the latest information hazard into the scholarly record continues, perhaps the community can use this time to push for submission and review processes in library scholarship that are better aligned with privacy and ethics.

Mid-September Readings, Viewings, and Doings

A light brown rabbit sits on top of a keyboard looking up at two computer screens, reading email.
Image source: https://www.flickr.com/photos/toms/127809435/ (CC BY 2.0)

September has proven itself to be a busy month for all of us! This week we’re taking a breather from our usual (longer) posts by highlighting a few resources that you might find of interest, and some homework, to boot.

What to Read

For years there has been a concerted effort in getting libraries to secure their websites through HTTPS, but have those efforts paid off? A recently published article by librarian Gabriel Gardner describes how much further we have to go with HTTPS on library websites, but it doesn’t stop there. The article also describes how libraries are complicit in third-party tracking with various web trackers found on library websites, including (unsurprisingly) Google Analytics. Give this article a read, then hop on over to your library website. How is your library website contributing to surveillance by allowing third parties to vacuum up all the data exhaust your patrons are leaving behind while using the library website? We’ve written about alternatives to Google Analytics and other forms of tracking if you need a place to start in reducing the third-party tracker footprint at your library.
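If you want a quick first pass before reaching for a full scanner, the sketch below shows one way to list the outside hosts a page loads resources from, along with any resources still served over plain HTTP. It is a minimal illustration, not the method used in the article: the names `ThirdPartyFinder` and `audit_page`, the host `library.example.org`, and the sample HTML are all made up for this example. It only reads static HTML, so trackers injected later by JavaScript will not show up, and it treats any host other than the exact site host (including your own subdomains) as third party.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyFinder(HTMLParser):
    """Collect hosts of embedded resources that differ from the site's own host."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.third_party_hosts = set()
        self.insecure_urls = []

    def handle_starttag(self, tag, attrs):
        # Only look at tags that commonly load external resources.
        if tag not in ("script", "img", "iframe"):
            return
        src = dict(attrs).get("src")
        if not src:
            return
        parsed = urlparse(src)
        if parsed.scheme == "http":
            self.insecure_urls.append(src)
        # Relative URLs have no hostname and belong to the site itself.
        if parsed.hostname and parsed.hostname != self.site_host:
            self.third_party_hosts.add(parsed.hostname)


def audit_page(html, site_host):
    """Return (sorted third-party hosts, list of plain-HTTP resource URLs)."""
    finder = ThirdPartyFinder(site_host)
    finder.feed(html)
    return sorted(finder.third_party_hosts), finder.insecure_urls


# Hypothetical page source, used only to demonstrate the functions above.
SAMPLE_HTML = """
<html><head>
<script src="https://www.google-analytics.com/analytics.js"></script>
<script src="/js/catalog.js"></script>
</head><body>
<img src="http://library.example.org/logo.png">
</body></html>
"""

hosts, insecure = audit_page(SAMPLE_HTML, "library.example.org")
print("Third-party hosts:", hosts)
print("Loaded over plain HTTP:", insecure)
```

To try it on your own site, save the page source from your browser and pass it to `audit_page` along with your library's hostname.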

What to Watch/Read

At LDH, we talk a lot about ethics and technology. You might be wondering where you can learn more about the ethics of technology without diving headfirst into a full-time college course. If you have some time to watch a few TikTok videos and read a couple of articles during the week, you’re in luck – Professor Casey Fiesler’s Tech Ethics and Policy class is in session! You can follow along by watching Dr. Fiesler’s TikTok videos and doing the readings posted on Google Docs. But you can do much more than following along – join the office hours or the discussions in the videos!

What to Do

Perhaps you’re looking for something else to do other than website or ethics classwork. We won’t hold that against you (though we really, really recommend reviewing what trackers your library website has). So, here’s a suggestion for your consideration. It’s been a while since we did our #DataSpringCleaning. Do you dread cleaning because there’s always so much stuff to deal with by the time we get around to doing it? Taking five to ten minutes now to dispose of patron data securely can go a long way to reducing the amount of data you have to deal with during the annual #DataSpringCleaning. It’s also an excellent privacy and security hygiene habit to adopt. Spending a few minutes to secure sensitive data can fill in the gaps in your schedule between meetings or projects, or it can be part of your routine for starting or ending your workday. And it does give you some feeling of accomplishment on particularly frustrating days where nothing seems to have gotten done.

If you come across any library privacy-related resources that you would like highlighted in the newsletter, let us know by emailing newsletter@ldhconsultingservices.com. In the meantime, best of luck with the workweek, and we’ll catch you next week.

The Lasting Impact of The Patriot Act on Libraries

A man wearing sunglasses holds a white sign as he walks through a street protest. The sign has two human eyes looking up and to the right. The sign message - 'The "Patriot" Act is watching you'
Image source – https://flickr.com/photos/crazbabe21/2303197115/ (CC BY 2.0)

This weekend marked the 20th anniversary of 9/11 in the US. Life changed in the US after the attacks. One of the many aspects of our lives that changed was the sudden erosion of privacy for everyone living in the States. One of the earliest visible examples of this rapid erosion of privacy was the Patriot Act. Let’s take a moment and revisit this turning point in library privacy history and what has happened since.

A Quick Refresher

The Patriot Act was signed in October 2001 after the attacks of September 11th. The law introduced or vastly expanded government surveillance programs and rights. US libraries are most likely familiar with Section 215. While in the past the government was limited in what information they could obtain through secret FISA orders, Section 215’s “tangible things” expanded the use of these secret orders to “books, records, papers, documents, and other items.” Given the examples included in the Section’s text, it wasn’t too much of a stretch to assume that “tangible things” included library records.

The good news – for now – is that Section 215 is not here to mark the 20th anniversary of the passage of the Patriot Act. The Section was sunsetted in 2020 after years of renewal and a second life through the USA Freedom Act. The Section did not die quietly, though – while support for renewal spanned across both parties in the Senate and the House, different versions of the renewal bill stalled the renewal process. The possibility of a renewal of Section 215 or a similar version of the Section is still present. However, it is unclear as to when talks of renewal will restart.

The Act’s Impact on Libraries

Libraries acted quickly after the passage of the Act. Right after the passage of the Patriot Act, those of us in the library profession might remember taking stacks of borrowing histories and other physical records containing patron data and sending them through the shredder. Other libraries adjusted privacy settings in their ILSes and other systems to not collect borrowing history by default. ALA promptly sent out guidance for libraries around updating privacy and law enforcement request policies and procedures. And it would be safe to assume that several people got into librarianship because of the profession’s efforts in protecting privacy and pushing back against the Patriot Act.

Even with the flurry of activity in the profession early on, questions about the use of Section 215 to obtain patron data persist today. Even though the Justice Department testified in 2011 that Section 215 was not used to obtain circulation records, the secrecy imposed on searches in Section 215 makes it difficult to determine precisely the extent of the Section’s library record collection activities.

While we cannot say for sure if Section 215 was used to obtain patron data, we know that other parts of the Act were used in an attempt to get patron data. Most notable was the use of National Security Letters (NSLs) and gag orders by the government to obtain patron data. The Connecticut Four successfully challenged the gag order on an NSL served to the Connecticut library consortium Library Connection. While the Connecticut Four took their fight to court, other libraries proactively tried to work around the gag order by posting warrant canaries in the building to notify patrons if they had been served an NSL.

Lessons Learned or Business as Usual?

The Patriot Act reminded libraries of the threat governments pose to patron privacy. Libraries responded with considerable energy and focus to these threats, and these responses defined library privacy work in the 21st-century library. Still, the lessons learned from the early days of the Act didn’t entirely transfer to other sources of risk that threaten patron privacy as much as governments and law enforcement do. While libraries could quickly dispose of risky patron data on paper after the Act’s passage, a substantial amount of today’s patron data lives on third-party databases and systems. The removal of control over patron data in third-party systems limits the ability to adjust to new privacy threats quickly. Technology has evolved to provide some possible protections, including encryption and other ways to restrict access to data. Legal regulations around privacy give both libraries and patrons some level of control over data privacy in third-party systems. Despite these advances in technology and law, data privacy in the age of surveillance capitalism in the library brings new challenges that many libraries struggle to manage.

Some could argue that libraries sub-optimized data privacy protections in response to the Act’s threats, hyper-focusing on government and law enforcement at the expense of addressing other patron privacy risks. At the same time, the standards and practices developed to mitigate governmental threats to patron privacy can be (and to certain extents have been) adapted to minimize these other risks, particularly with third parties. One of the first lessons learned in the initial days of the Act came from the massive efforts of shredding and disposing of patron data in bulk in libraries throughout the country. Libraries realized at that moment that data collected is data at risk of being seized by the government. Data can’t be seized if it doesn’t exist in the first place. As libraries continue to minimize risks around law enforcement requests, we must remember to extend those privacy protections to the third parties that make up critical library operations and services.

A Short Reflection on Uncertainty and Risk

A white woman standing with her back to the beach in front of waves coming to shore. A yellow sign in the foreground has an illustration of a shark and states "Shark sighted today - enter water at own risk".
Photo by Lubo Minar on Unsplash

We made it! We’re coming to you from our new server home. We’re still settling in, so please let us know if you come across something that isn’t quite working on the website. If you are one of our email subscribers and find this post in your spam box, you can add newsletter@ldhconsultingservices.com to your contacts list to help prevent future emails from being banished to the spam folder.

Now that the dust has settled, we regret to inform you that summer is almost over. Schools are back in session, summer reading programs are wrapping up for the season, and a new batch of LIS students are starting their first semester of library school. We also regret to inform you that the pandemic is still hanging in there, adding its own layer of stress and uncertainty on top of everything else.

Uncertainty is hard to plan for, even in non-pandemic times. Libraries with plans for phasing back in-building services find themselves changing those plans daily to keep up with changes in health ordinances, legal regulations, and parent organizational mandates. We find ourselves back in the first few months of the pandemic, scrambling to figure out what to do. Then again, we haven’t stopped scrambling throughout the pandemic to find ways to provide patrons services that won’t put both patrons and library workers at risk.

Risk assessment and management are exercises in dealing with uncertainty. We like to have neat solutions to neat problems; risk management tells us that problems are much messier and are less likely to be solved with neat solutions. Take, for example, four common responses used in determining how to manage risk:

  • Accept – Choosing to accept the risk, usually done in cases where the cost of the realized risk is less than the cost of addressing the risk
  • Transfer – Shifting the risk to another party (another person, group, or tool) who is better situated to manage the risk
  • Mitigate – Adding checks or controls to limit risk in a particular situation
  • Eliminate – Changing something to remove or avoid the risk

Some of you might be surprised that the last response, eliminate, is not the primary goal in risk management. This is partly due to the level of control we have in the situation that presents the risk. We cannot eliminate some risks due to, well, pandemic, while others are unavoidable due to the nature of our work – where we work, operational needs, external needs/pressures, and so on. In those instances where we cannot entirely eliminate the risk, we can still have some control over our response to the risk, particularly with mitigating or transferring the risk.
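For those who like to see ideas written down as something closer to a checklist, here is a small Python sketch of a risk register built around the four responses. Everything in it is an assumption made for illustration: the `Risk` fields, the 1 to 5 scales, the order in which `suggest_response` checks the options, and the example risks and cost figures. A real assessment involves judgment that a few lines of code cannot capture, so treat the output as a conversation starter and not a decision.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int            # 1 (rare) to 5 (almost certain); hypothetical scale
    severity: int              # 1 (minor harm) to 5 (severe harm); hypothetical scale
    cost_if_realized: float    # rough estimate, in whatever unit you budget in
    cost_to_address: float     # rough estimate of the cost of doing something about it
    can_remove_source: bool    # e.g. the library can stop collecting the data entirely
    better_placed_party: bool  # another party is better situated to manage the risk

    @property
    def score(self):
        # A common shorthand for ranking risks: likelihood times severity.
        return self.likelihood * self.severity


def suggest_response(risk):
    """Suggest one of the four responses; the order of checks is an assumption."""
    if risk.cost_if_realized < risk.cost_to_address:
        return "accept"
    if risk.can_remove_source:
        return "eliminate"
    if risk.better_placed_party:
        return "transfer"
    return "mitigate"


# Made-up example risks and figures, for illustration only.
register = [
    Risk("Borrowing history kept indefinitely", 3, 4, 50_000, 2_000, True, False),
    Risk("Patron data held in a vendor system", 3, 5, 80_000, 10_000, False, False),
    Risk("Payment card handling at the front desk", 2, 5, 60_000, 5_000, False, True),
    Risk("Typo in a printed event flyer", 2, 1, 50, 500, False, False),
]

# Print the highest-scoring risks first.
for risk in sorted(register, key=lambda r: r.score, reverse=True):
    print(f"{risk.score:>2}  {suggest_response(risk):<9}  {risk.name}")
```

Note that the same risk can score differently for different patrons, so a single likelihood and severity number per risk is itself a simplification.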

While we cannot eliminate all risks in our libraries around the pandemic’s uncertainty, we can still work toward identifying and managing risks that we have more control over, including those risks around patron privacy. Here are a few resources to get you started on managing patron data privacy risks:

By focusing on risks that we are better situated to address through transference, mitigation, and elimination, we can avoid the inertia that comes with being overwhelmed by risks we have less control over. It might seem like arranging the deckchairs on the Titanic, but living with so much uncertainty in such a short time can short-circuit our ability to identify and manage risk, particularly when we are not trained to manage risk during long periods of heightened uncertainty. If you find yourself at that point, you can use the start of the fall season to hit the reset button on privacy risk management: make one list of privacy risks outside your control and another of risks that you or your library are better able to manage. You might not be able to identify all the risks in one sitting, and that’s okay. If you are struggling to identify risks that you or your library can manage, revisit the earlier resources to help you through the process.

Managing risk requires accommodating uncertainty and variations of the same risk. Risk likelihoods and severity can change without notice. Risks also have different severity, harms, and likelihoods for different people – what might be a low harm risk for one person might be a risk that has more significant harms for another. Risk management strategies help wrangle this uncertainty by providing some structure in responding to the uncertain nature of risk. While we can’t eliminate uncertainty, we can be better prepared to manage uncertainty in parts of our lives, such as our work that affects patron privacy.

Watching You Watching Me

Imagine this – you visit your local art museum for the first time in over a year. You’re excited to be back in the physical building! You get to be in the same physical space as the art! You make your way to one of your favorite art pieces in the museum, but when you finally arrive, you find something odd. Next to your favorite art piece is a small camera pointing at you and everyone else viewing your favorite art piece.

A ShareArt camera next to a painting in the Istituzione Bologna Musei.
Image source: Istituzione Bologna Musei

Is this to make sure people are wearing masks? Social distancing? Or is it something more?

Museum-goers in Italy are already facing this reality with the inclusion of the ShareArt system in several Italian museums. The system aims to track how long museum visitors spend at the museum piece, creating data to inform exhibition layout and scheduling decisions. In addition, there is interest in having the system capture and analyze facial expressions as mask mandates fall to the wayside. While this project aims to guide museums in making their collections more visible and accessible for museum visitors, it also brings up new and perennial concerns around privacy.

Tracking Bodies, Tracking Data

Libraries and museums are no strangers to counting the number of people who come into a building or attend an event. Door counters installed on entrance/exit gates are a common sight in many places, as well as the occasional staff member with a clicker manually counting heads in one space at a specific time. The data produced by a door counter or a manual clicker is usually limited to the count and the time of collection. This data can get very granular – for instance, a door counter can measure how many people enter the building in the span of an hour, or a staff person can count how many people are in a space at regular intervals in a day. This type of data collection, if nothing else is collected alongside the count and time collected, is considered a lower risk in terms of data privacy. Aggregating count data can also protect privacy if the door or event count data is combined with other data sets that share data points such as time or location.
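As a rough illustration of what count-and-time data looks like once aggregated, the Python sketch below rolls individual door counter events up into hourly totals. The function name `hourly_counts` and the sample timestamps are invented for this example; an actual door counter will have its own export format.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(entry_times):
    """Roll individual entry timestamps up into a count per hour."""
    counts = Counter()
    for ts in entry_times:
        # Drop minutes and seconds so only the hour of entry is kept.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return dict(sorted(counts.items()))


# Made-up door counter events for one morning.
entries = [
    datetime(2021, 8, 2, 9, 5),
    datetime(2021, 8, 2, 9, 40),
    datetime(2021, 8, 2, 10, 15),
]

for hour, count in hourly_counts(entries).items():
    print(hour.strftime("%Y-%m-%d %H:00"), count)
```

Storing only the hourly totals and discarding the individual timestamps keeps the data set to the count and the time, with nothing that ties an entry to a specific person.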

Patron privacy risk increases sharply when you introduce cameras or other methods of collecting personal data in door or space counts. Door or space counters with webcams or other cameras capture a person’s distinct physical traits, such as body shape and face. This updated door counter mechanism is a little different than a security camera – it captures an individual patron’s movements in the library space. With this capture comes the legal gray area of whether audio/visual recordings of patron use of the library are protected data under individual state library privacy laws, which then creates additional privacy risks to patrons.

Performing for an Audience

One good point raised on Twitter about the ShareArt implementation is that people change their behavior when they know they are being watched. This isn’t a new observation – various fields grapple with how the act of being observed changes behavior, from panopticon metaphors to the Hawthorne effect. If a project is supposed to collect data on user behavior in a specific space, the visible act of measurement can influence the behavioral data being collected. And if the act of measurement affects the collected data, how effective will the data be in meeting the business case of using behavioral data to improve physical spaces?

Libraries know that the act of surveilling patron use of library resources can impact the use of resources, including curtailing intellectual activities in the library. Privacy lowers the risk of consequences that might result from people knowing a patron’s intellectual pursuits at the library, such as checking out materials around specific topics around health, sexuality, politics, or beliefs. Suppose patrons know or suspect that their library use is tracked and shared with others. In that case, patrons will most likely start self-censoring their intellectual pursuits at the library.

The desire to optimize the layout of the physical library space for patron use is not new. There are several less privacy-invasive ways already in use by the average library to count how many people move through or are in a particular space, such as the humble handheld tally clicker or the infrared beam door counter sensors. Advancements in people counting and tracking technology, such as ShareArt, boast a more accurate count than their less invasive counterparts but underplay potential privacy risks with the increased collection of personal data. We come back to the first stage of the data lifecycle – why are we collecting the data we are collecting? What is the actual, demonstrated business need to track smartphone wifi signals, record and store camera footage, or even use thermal imaging to count how many people enter or use a physical space at a particular time? We might find that the privacy costs outweigh the potentially flawed personal data being collected using these more invasive physical tracking methods in the name of serving the patron.

An Audacity Postmortem for the Library World

A black silhouette of a condenser microphone against a white background with a blue audio wave track spanning across the middle of the background.
Image source: https://www.flickr.com/photos/187045112@N03/50135316221/ (CC BY 2.0)

It seemed so long ago – last week at this time, LDH was logging back into the online world only to find yelling. Lots of yelling. Why were so many people yelling in our timeline? What did someone in the library world do this time to set people off?

It turns out that the source of the outrage wasn’t located in the library world but instead in the open source community. Users of the popular audio editor Audacity loudly objected to the recently updated privacy policy, claiming that the new language in the policy violates the existing license of the software and turns Audacity into spyware. Even after clarification about the new language from Audacity, several users took the current Audacity code to start their own Audacity-like software projects that wouldn’t be subject to the new policy language. This created its own issues – one project maintainer became the target of a harassment campaign after they objected to the offensive name of another project.

The Audacity debacle continues; nevertheless, there are a couple of lessons that libraries can take away from this mess.

Privacy Notices and Your Patrons

We will start with the privacy notice. In the privacy world, a privacy notice differs from a privacy policy. The latter is an internal document, while the former is a document published to the public. In part, a privacy notice informs the public about your privacy policies/practices and what rights the public has regarding their privacy. The language changes to the privacy notice carry several possible points of failure, which we encountered with the Audacity example. A comment thread in the language clarification post identifies some of the significant issues with how Audacity went about the changes:

“I think what a lot of people are also taking issue with… is that these major, scary-sounding changes are popping up seemingly out of nowhere without any sense of community consultation. Right now, I think people feel caught off-guard yet again and are frustrated that the maintainers aren’t demonstrating that they care about what the broader community thinks of their decisions.”

What can libraries take away from this?

  • Write for your audience – Privacy notices are notoriously riddled with legal language that many in the general public are not equipped to navigate or interpret. Your privacy notice can’t skip the vetting process by your legal staff, but you can avoid confusion by using language that is appropriate for your audience. This includes limiting library and legal jargon or adding summaries or explanations that help readers understand the more detailed or longer sections of the notice. Twitter’s use of summaries and lists in their privacy notice is one example of writing for the audience. In addition, don’t forget to write the notice in the major languages of your audience. Everyone in your community deserves to know what’s going on with their privacy at the library.
  • Involve your audience – The earlier quote from an Audacity community member demonstrates what can happen when key stakeholders are left out of critical decision-making processes. How is your library working with patrons in the creation and review of the privacy notice? Asking patrons to review notices is one way to involve patrons, but involving patrons throughout the entire process of creating and reviewing a privacy notice can reveal hidden or overlooked privacy issues and considerations at the library.
  • Communicate to your audience – What do you do when you publish a change in the privacy notice? Your patrons should not be caught off guard by a significant change to the notice. Luckily, your library already has many of the tools needed to tell your patrons about important updates, from your library’s news blog or newsletter to in-library physical signage and flyers. Website alerts are also an option if used judiciously and designed well – a website popup, while tempting, can be easily clicked away without the patron reading the popup text.

Open Source Software and Privacy Expectations

We’ll go ahead and get this out of the way – open source software is not inherently more private and secure than its proprietary counterparts. OSS can be private and secure, but not all OSS is designed with high privacy and security standards by default. One of the primary reasons why so many in the Audacity community were upset with the changes is their assumption that OSS would not engage in data collection and tracking. However, several other popular OSS projects engage in collecting some level of user data, such as collecting data for crash reporting. Having other major OSS players collect user data doesn’t automatically make this practice okay. Instead, the practice reminds those who make software decisions for their libraries that OSS projects should be subject to the same rigorous data privacy and security review as proprietary products.

A strength of OSS is the increased level of control users have over the data in the software – libraries who have the in-house skills and knowledge can modify OSS to increase the level of privacy and security of patron data in those systems. The library OSS community can provide privacy-preserving options for libraries. Other libraries have already shared their experiences adopting privacy-preserving OSS at the library, such as Matomo Analytics and Tor. Ultimately, libraries who want to protect patron privacy must choose any software that might touch patron data with care and with the same level of scrutiny regardless of software licensing status.

To Build or to Target?

It’s been a busy couple of weeks in the privacy world. First, Colorado is poised to be the newest state to join the patchwork of US state data privacy laws. Next, OverDrive acquires Kanopy. And then there’s what happened when a patron submitted an FOIA request for their data. Privacy forgot that it’s supposed to be summer vacation! Today we’re setting aside those updates and talking about one of the most requested topics for the blog.

You or your colleagues might be scanning through the last couple months of American Libraries in preparation for ALA Annual later this month, only to come across the “Target Acquired” article in the May 2021 issue (pages 52-53), profiling three libraries in their use of marketing and data analytics products. The profiles seem harmless enough, from email newsletter management to collection analysis. These libraries want to understand their patrons to serve their communities better. These profiles give three different ways these products can help other libraries do the same.

Did you notice, though, that none of the profiles talked about patron privacy?

There’s a reason for that. Marketing and data analytics products such as customer relationship management systems (CRMS) rely on personal data – the more, the better. The more data you feed into the system, the more accurate the user profile becomes, and the better it can support a personalized experience or more effective marketing campaigns. CRMS are increasingly integrated into the ILS – OCLC Wise is an example of such an integration, and other ILS companies plan to release their own versions or create better integrations with existing products on the market. The libraries using Engage and Wise are excited about the possibilities of better understanding their patrons through the data generated by patron use of the library. However, we wonder if these libraries considered the consequences of turning patrons into data points to be managed in a vendor system.

It should be no surprise to our readers that LDH’s approach to marketing and data analytics in libraries does not place data above all else. Data ultimately does not replace the relationship-building work that libraries must do through meeting with community members. However, advertisement pieces such as the one in American Libraries aim to normalize user profiles in CRMS and other analytics products in libraries. As the article states at the beginning, data plays a large part in library outreach. With the pressure to prove their value to the community, library administration and management will reach for data to secure their library’s future in the community. The cost of over-relying on data to prove a library’s value, however, is usually left unexamined in these situations.

With that said, let’s do a little exercise. We have the chance to write a sequel to the advertisement piece. Instead of questions about the products, our questions will turn the tables and focus on the libraries themselves:

What are the privacy risks and potential harms to different patron groups from using the product?

Increased patron surveillance via data collection and user profiling can lead to disproportionate privacy risks for several patron groups. In addition, the business models of several vendors create additional harm by targeting specific minoritized groups, such as reselling data to data brokers or providing data to government agencies such as ICE.

What business need(s) does the product meet? What other products can meet the same need without creating a user profile or requiring increased patron surveillance?

Sometimes libraries buy one system that doesn’t match the actual business need for the library. For example, several collection management systems on the market do not require individual-level data to provide analysis as to how to spend collection budgets or meet patron demand. In addition, libraries do not need market segmentation products to perform collection usage analysis.
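
For readers who like to see the idea in code, here is a minimal sketch of collection usage analysis that never touches individual-level data. It is only an illustration, not how any particular vendor product works: the record fields (`collection`, `format`), the sample checkouts, and the `usage_by` helper are all hypothetical names we made up for this example.

```python
from collections import Counter

# Hypothetical checkout records. The field names are illustrative only.
# No patron identifier appears anywhere in the input.
checkouts = [
    {"collection": "Adult Fiction", "format": "ebook"},
    {"collection": "Adult Fiction", "format": "print"},
    {"collection": "Children's Picture Books", "format": "print"},
    {"collection": "Adult Fiction", "format": "print"},
    {"collection": "Large Print", "format": "print"},
]


def usage_by(records, field):
    """Count checkouts grouped by a single non-personal field."""
    return Counter(record[field] for record in records)


collection_usage = usage_by(checkouts, "collection")
format_usage = usage_by(checkouts, "format")

# List collections from most to least used to inform budget decisions.
for name, count in collection_usage.most_common():
    print(f"{name}: {count}")
```

The point of the sketch is that the question "where is demand highest?" can be answered from counts per collection or format, assuming the checkout records are stored or exported without patron identifiers in the first place.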

How does the library reconcile the use of the product with Article III of the ALA Code of Ethics, Article VII of the ALA Library Bill of Rights (and the accompanying Privacy Interpretation document), and other applicable library standards and best practices around patron privacy?

This one is self-explanatory. FYI – “Other libraries are doing the same thing” is not an answer.

What social, economic, and cultural biases are encoded into the product? What biases and assumptions are in the data collection and analysis processes?

Library services and systems are not free from bias, including vendor systems. One bias that some libraries miss is that the data in these systems do not reflect the community but only those who use the library. Even the list of inactive users in the system does not fully reflect the community. Moreover, data alone doesn’t tell you why someone in your community doesn’t have a relationship with the library. Data doesn’t tell you, for example, that some patrons view the library as a governmental agency that will pass along data to other agencies. Data also won’t fix broken relationships, such as libraries violating patron trust or expectations.

What is the library doing to inform patrons about the use of the product? Do patrons fully understand and consent to the library using their data in the product, including pulling data from data brokers and creating profiles of their library use?

More likely than not, your library does not give patrons proper or sufficient notice, nor give patrons the chance to explicitly consent for their data to be collected and used in these products. Refer to the Santa Cruz Civil Grand Jury report on what happens when the public calls out a library using a product in the advertisement article without full patron notification or consent.

Keep these questions in mind the next time you read about marketing and data analytics products in professional magazines such as American Libraries. These advertisement articles are designed to fly under the radar for readers who might not be thinking about the privacy implications of highlighted products and practices. Building relationships with the community requires a considerable amount of time and care from the library. Data might seem to be a shortcut in speeding up the process. Nonetheless, choosing to view patrons as targets and metrics can ultimately undermine the foundation of any sustainable relationship.

Reader Survey Open Until June 15th

Thank you to everyone who has filled out the reader survey. If you haven’t filled out the survey yet, we want to hear from you! Take five minutes to help shape the future of the blog by filling out our short survey.

A Quick Chat About Patron Data Privacy During Company Acquisitions and Mergers

Another week, another acquisition. The latest news in the library vendor world came last Monday, with Clarivate purchasing ProQuest for the small sum of $5.63 billion. Academic libraries that subscribe to Web of Science and EndNote with Clarivate and Alma and Primo with ProQuest face the reality that now all of these products are owned by one company. We can’t forget that ProQuest has had its fair share of mergers and acquisitions, though, as illustrated in Marshall Breeding’s ProQuest mergers and acquisitions chart.

This latest acquisition continues the trend of consolidation in the library vendor marketplace. With this consolidation of products and services comes the ability for companies to create more complete profiles of library patrons through increased data collection and tracking capabilities. In fact, during the company call regarding the acquisition on May 17th, company representatives commented that with the ProQuest acquisition, the company “can serve the entire research value chain, early stage and K12 setting, thru postgrad.” Put another way by another company representative, “We can touch every student in K through doctoral degrees everywhere. There is no product overlap.” Combine that quote with phrases from the press release such as “long-term predictive and prescriptive analytics opportunities from the enhanced combination of ProQuest’s data cloud with the billions of harmonized data points in the Clarivate Research Intelligence Cloud” (emphasis mine). You start to understand why this acquisition is a patron privacy concern.

This isn’t the first time a merger or acquisition brought up library privacy concerns. However, the size of this acquisition is cause for all libraries to stop and review their vendor management practices. The vendor relationship lifecycle can assist libraries in reviewing some of their vendor management practices. It’s difficult to determine if a vendor will still be around as an independent company in a few years when you’re shopping for a product or service. Nonetheless, it’s still worthwhile to do some research around the company. For example, you can find the latest vendor news in various library industry publications and sites such as Computers in Libraries and Library Technology Guides. Doing some research ahead of time (including asking around your professional network) can flag potentially problematic or unsustainable businesses to remove from consideration in the selection process.

The onboarding stage provides opportunities for libraries to mitigate privacy risks throughout the rest of the vendor lifecycle. Contracts usually do the heavy lifting when determining the fate of customer data after an acquisition, merger, or bankruptcy. We won’t get into the detailed legal aspects of mergers and acquisitions – we are not lawyers at LDH. Still, you can read a two-part blog series about pre- and post-closing liabilities around privacy and acquisitions/mergers if you want the nitty-gritty legal details. Nonetheless, vendor contracts should state what will happen to patron data in the case of a merger, acquisition, or bankruptcy. Though the concept of data ownership is fraught because it equates data to a commodity, having the library retain ownership of patron data addresses some of the risks, such as patron data being included in the list of company assets during a sale or bankruptcy. Another contract negotiation point is reserving the right to withdraw the library’s data from the company after a sale or bankruptcy. This withdrawal needs to address how the data should be securely transferred and deleted from the vendor’s systems, treating this process as the separation process at the end of a business relationship. Yet another control strategy is requiring explicit and affirmative informed consent from patrons if the vendor wants to include the patrons’ data in the acquisition or merger. The more control the library has over the fate of the data after a company is bought or goes under, the better the library’s chances of mitigating privacy risks.

Thanks to the trend toward monopolies in the library marketplace, libraries subscribing to ProQuest or Clarivate products and services have limited options outside of using the contract in controlling data flows and disclosures during a merger or acquisition. When discussed with your legal staff, the contract strategies mentioned earlier can mitigate data privacy risks when the vendor eventually becomes part of a giant conglomerate. Conglomerates (or monopolies) can go beyond the basic user profiles and analytics with more invasive behavioral tracking and analytic practices traditionally absent in libraries. Until there is a critical mass of libraries combining their political capital to push vendors to engage in privacy-preserving data management, individual libraries will need to continue navigating contract language and “what if” scenarios on a vendor-by-vendor basis.