Teaching Privacy in Information Literacy Sessions

Welcome to this week’s Tip of the Hat!

Summer is over, and for many library workers, the start of the fall season means an uptick of library instruction sessions and programs. Academic and school library workers who already face the challenge of creating and teaching “one-shot” instructional sessions have the added challenge of moving these sessions to online instruction during a pandemic. With this move to online comes the increased use of learning management systems and other online tools and applications that collect, process, and share student data. This increase in use translates into an increased risk to student privacy, particularly while interacting with the library’s online services and programs, and this risk might not be readily apparent to students who are facing many stressors and challenges in their first few weeks into the new school year.

Navigating “one-shot” library instruction sessions or other short interactions between the library and the student is not easy; however, these instruction sessions and interactions also present the opportunity to raise awareness about data privacy and security. One way to take advantage of this opportunity is to move away from the mindset of approaching data privacy in library instructional sessions as “yet-one-more-thing” to teach in an already packed session. That’s not an easy task for anyone, even for those of us who are privacy advocates.

In their article “Privacy literacy instruction practices in academic libraries: Past, present, and possibilities”, Sarah Hartman-Caverly and Alexandria Chisholm surveyed academic library workers and their experiences incorporating privacy into their instructional sessions. Out of 80 respondents, over one-third reported not including privacy topics in their library instruction sessions. Even those who include privacy topics in their instruction were not satisfied with privacy instruction at their institutions, with the majority being neutral or somewhat dissatisfied. This dissatisfaction stems from a variety of factors, with 80% of 55 respondents (n=44) stating that they do not have enough instructional time to cover privacy. This is the reality of many library instructors overall and requires a radical departure from how libraries traditionally deliver library instruction to students, as well as working with faculty and staff in developing and delivering this instruction.

What caught our attention at LDH is the second factor that almost 62% of survey respondents (n=34) identified as to why they are dissatisfied with privacy instruction – “Privacy is not a priority learning outcome for IL sessions”. What can make privacy a priority, then? Again, this requires a radical departure from how libraries approach information literacy (IL), but it also requires an examination of the priorities of the individual library as well as the professional frameworks library workers use to inform their approach to IL and pedagogy. While ALA’s Library Bill of Rights explicitly states privacy as a patron right, the ACRL Framework for Information Literacy for Higher Education only includes one mention of privacy concerning “issues related to privacy and the commodification of personal information.” Privacy is much more than the commodification of personal information, but the Framework does not reflect this reality. The lack of guidance in the Framework, as well as the dearth of concrete case studies of privacy in IL in the LIS literature noted by Hartman-Caverly and Chisholm, leave IL instructors little to work with in a time when privacy instruction is more vital than ever.

Hartman-Caverly and Chisholm give their readers some guidance in their privacy literacy case study as well as their recommendations for addressing the barriers noted by survey respondents. The literature review of the article is another resource from which to glean strategies for bringing privacy into IL practices.

For those who are still struggling in thinking about how to incorporate privacy into an already packed lesson plan, think about this – what library resources and apps are you teaching to your students? Library systems and applications, particularly third-party apps and resources, also collect, process, and share patron data. Talking about digital data privacy and security in the context of using library services and resources can be one way to introduce students to privacy literacy while educating patrons about the library’s privacy practices. This approach to privacy literacy in “one-shot” instructional sessions can be strengthened by offering patron data privacy services such as the services provided by Cornell University; nonetheless, using the library’s own resources and tools when talking about privacy is a start for library instructors who are short on time.

So You Want to Work With Patron Data… De-identification Basics

Welcome to this week’s Tip of the Hat! This week’s post is a “back to basics” about de-identification and patron data. Why? After reading a recent article published in the Code4Lib Journal where patron data was not de-identified before combining it with external data sets, now’s as good a time as any to remind library workers about de-identification. [1]

De-identification Definitions

Before we talk about de-identification, we must talk about anonymization and the differences between the two:

  • De-identification is when you remove the connection between the data and any identifiable individual in the real world. Sometimes a de-identified dataset replaces the personally identifiable information (PII) attached to data points with a unique identifier, which is then called pseudonymization.

De-identification provides a way for some to work with data to track individual trends with a reduced risk of re-identification and other privacy risks. Why “for some” and “reduced”? We’ll get into the whys of the issues with de-identification later in this post.

De-identification Method Basics

PII comes in two forms: data about a person and data about a person’s activities that can be linked back to the person. The methods and level of work needed to sufficiently de-identify patron data depend on the type of PII in the data set. The methods commonly used to de-identify PII include truncation, obfuscation, and aggregation.

  • Obfuscation moves the reference point of the data up a few levels of granularity. An example is using a birth year or age instead of the person’s full birth date.
  • Truncation strips the raw data to a small subsection general enough that it cannot be easily connected to an identifiable person. A real-world example of truncation is HIPAA’s guidance on physical address de-identification, truncating the address to the first three digits of the zip code.
  • Aggregation further groups individual data points creating a more generalized data set. Going back to the obfuscation example, individual ages can be aggregated into age ranges.
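As a rough illustration of the three methods above, here is a minimal Python sketch. The record layout, field names, and sample values are made up for this example and do not come from any real library system; treat it as a starting point, not a complete de-identification process.

```python
from collections import Counter
from datetime import date

# Hypothetical patron records; the field names are illustrative only.
patrons = [
    {"birth_date": date(1985, 3, 14), "zip": "98101"},
    {"birth_date": date(1990, 7, 2), "zip": "98109"},
    {"birth_date": date(1952, 11, 30), "zip": "98052"},
]


def obfuscate_birth_date(birth_date):
    """Obfuscation: keep only the birth year instead of the full birth date."""
    return birth_date.year


def truncate_zip(zip_code):
    """Truncation: keep only the first three digits of the zip code."""
    return zip_code[:3]


def age_range(birth_year, as_of_year, width=10):
    """Aggregation: place an approximate age into a range such as '30-39'."""
    age = as_of_year - birth_year
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record, as_of_year):
    """Apply all three methods to one record and drop the raw values."""
    year = obfuscate_birth_date(record["birth_date"])
    return {
        "zip3": truncate_zip(record["zip"]),
        "age_range": age_range(year, as_of_year),
    }


deidentified = [deidentify(p, as_of_year=2020) for p in patrons]

# Count how many records share each combination of generalized values.
counts = Counter((r["zip3"], r["age_range"]) for r in deidentified)
```

Note that the third sample record ends up alone in its group, which is the kind of outlier that these methods by themselves do not protect.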

There are more methods to de-identify data, some of which can get quite complex, such as differential privacy. The three methods mentioned above, nonetheless, are some of the more accessible de-identification methods available to libraries.

Before You De-identify…

Remember in the first section that we mentioned that de-identification only works for some data sets and only reduces privacy risk? There are two main reasons why this is:

  1. De-identification does not protect outliers in data or for small population data sets. There are equations and properties that can help you determine if your dataset cannot be re-identified, but for most libraries, de-identification is not possible due to the type or size of the data set they wish to de-identify.
  2. De-identified data can still be re-identified through the use of external data sets, particularly if the data in the de-identified dataset was not properly de-identified. An evergreen example is the AOL data set that retained identifying data in the search queries, even though AOL scrubbed identifying data about the searcher.

It is possible to have a de-identified data set of patron data, but the process is not fool-proof. De-identification requires multiple sample de-identification runs, along with analysis to determine how easy it is to reconnect the data to an individual.
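One way to approach that kind of analysis is to look for small groups in the de-identified data. The sketch below assumes a k-anonymity-style check, which is one commonly used property; the post does not say which properties or equations it has in mind, so this choice, the field names, and the threshold are all assumptions for illustration.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k):
    """Return each combination of quasi-identifier values shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical de-identified records (truncated zip code and age range).
records = [
    {"zip3": "981", "age_range": "30-39"},
    {"zip3": "981", "age_range": "30-39"},
    {"zip3": "981", "age_range": "30-39"},
    {"zip3": "980", "age_range": "60-69"},
]

# Any combination listed here is shared by fewer than k people and is easier to re-identify.
risky = small_groups(records, ["zip3", "age_range"], k=3)
```

A non-empty result suggests the data needs more generalization, or that those records should be suppressed, before the data set is used or shared. An empty result does not prove the data is safe, since external data sets can still enable re-identification.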

Overall, de-identification is a tool to help protect patron privacy, but it should not be the only privacy tool used in the patron data lifecycle. The most effective privacy tools and methods in the patron data lifecycle are the questions you ask at the beginning of the lifecycle:

  • Why are you collecting this data?
  • Does this reason tie to a demonstrated business need?
  • Are there other ways you can achieve the business need without collecting high-risk patron PII?

If you want to learn more about de-identification and privacy risks, check out the resources below:

[1] The article contains additional privacy and security concerns that we will not cover in this post, including technical, administrative, and ethical concerns.

Summer Homework – Understanding Your State’s Library Privacy Law

Welcome to this week’s Tip of the Hat!

Have you always dreamed of spending countless hours reading legal regulations and reviews? If so, you might be suited for legal life! Reading laws is probably not high on your list of things to do; nonetheless, it’s always good to know how to navigate the text of a legal regulation when you are researching what laws could apply to you or to the third parties that you do business with. Even though we’re not lawyers, knowing how to read legal regulation text enables people to have more productive conversations with legal staff.

Here are three questions that can help you start understanding a law or statute:

  1. Who is covered by this law?
    • Does your state library privacy law cover only publicly funded libraries, or does the scope include other types of libraries, no matter the funding source? Does it include third parties acting on behalf of the library?
  2. What types of information (and what uses of information) are covered?
    • What does the law mean when it says “patron data”? Are there any definitions or descriptions of specific data points covered by the law?
  3. What exactly is required or prohibited?
    • In particular, what exemptions are listed in the law?

You might not be able to answer all the questions depending on what law you choose to study. However, not being able to answer a question might be a topic of discussion with legal staff, particularly around the specifics of who is within the scope of the law. There’s also the question of preemption between different governmental levels of legal regulation (or even within the same level of government). Sometimes a lower government’s law is stricter than a higher government’s law, but if the higher government’s law states that their law preempts any laws from lower governments, then you are not bound to follow the lower government’s law in that specific matter.

Now it’s time to take what you learned and put it into practice. Find your state’s library privacy law and read the law while trying to answer the questions above. Let us know if these questions help you through the legal text! Don’t be afraid to let us know if this exercise brings up more questions than it answers – we’ll do our best to address them, or at least help you prepare to ask these questions of your legal staff.

[Legal questions source: Swire, Peter, and DeBrae Kennedy-Mayo. (2018). U.S. Private-Sector Privacy: Law and Practice for Information Privacy Professionals, 2nd ed.]

Libraries, Privacy, and… Tropes?

Welcome to this week’s Tip of the Hat!

A popular way to procrastinate at LDH is to dig through the pile of articles and other literature about all facets of privacy: regulations, ethics, practices, current events… the current events pile is at overcapacity at the moment. In these piles of articles, we come across one particular trope that we’d like to address – libraries as exemplars of privacy ethics and practices.

This trope is similar to others in other mainstream stories that use libraries as exemplars for other things, such as community engagement, democracy, and learning centers. The “library as privacy exemplar” trope coexists with these other tropes, sometimes in the same story. Other times the trope is front and center of an article. An example of this is an IAPP article about general privacy practices at the library. At best, this article demonstrates the attitude and tone of how many writers think about the library as an enlightened entity with their focus on privacy. Near the end of the article comes another trait that these articles tend to share, which is modeling privacy practices off of the library profession: “While library culture tilts heavily in favor of protecting the ‘citizen from state’ intrusion, that same culture can be mobilized to advocate for ‘customer’ privacy as well in relation to third-party service providers.”

All of this leads us to a hidden danger in the “library as privacy exemplar” trope, which is unquestioned trust in libraries in all matters of privacy and data ethics. Some of that trust has been earned – there are several library privacy initiatives, such as the Library Freedom Institute, that are very active in the greater community in their advocacy and education around data privacy. In addition, LDH’s conversations with technology workers in other fields have made it clear that professionals in other industries wished that they had strong professional ethics and standards like the library profession.

Nonetheless, others from outside the library profession take this trust too far. For example, in Emma Trotter’s “Patron Data Privacy Protection at Public Libraries: The Ethical Model Big Data Lacks”, Trotter proposes that libraries should become personal data stores (PDS) where people can gather their data in one secure place and then manage the processing of their data by third parties. Trotter is very confident that libraries can become the ethical role model for Big Data with this marriage between PDS and library privacy ethics. Overall, Trotter believes that the ethical issues around Big Data would be negated once libraries become front and center in the overall management of Big Data.

While libraries do have a strong ethical basis around advocacy and adoption of privacy practices, libraries also have their fair share of privacy issues and gaps. Libraries are not immune to the same threats and vulnerabilities as other professions and industries, such as data leaks and breaches, ransomware attacks, phishing, and even underfunding or undertraining staff in ways to protect patron privacy. Librarianship also deals with ethical issues around its collection and processing of patron data, particularly for marketing and user profiling, as well as working with vendors who also collect and process patron data without giving the patron control over what is collected and processed. One doesn’t need to search too far to find an example – one being the Santa Cruz Public Library’s Civil Grand Jury Report about the numerous ethics breaches surrounding their use of patron data without full patron notice and consent, among other violations of patron privacy.

Yes, other industries can learn from libraries about how to approach privacy in their daily work, including ethics and advocacy, but libraries also have to be honest about the profession’s struggles around data privacy, both on a practical and ethical level. Part of that is being public with these struggles in the public discourse, be it with patrons or with people from other industries who are looking for a model to base their professional privacy ethics and practices on. Another part is re-evaluating how we, as a library profession, market ourselves as privacy experts and safe-keepers of data to our patrons. Again, libraries set themselves apart from other industries regarding privacy ethics and advocacy, but they cannot set themselves apart from the reality that is working with data in the real world that has real needs that fall into ethical gray areas and real data security and privacy risks.

Summer Homework – Requesting Your Data

Welcome to this week’s Tip of the Hat!

Have you ever wondered what data OverDrive collects while you’re reading the latest ebook? Or what Kanopy collects when you’re watching a documentary? As library workers, we have some sense as to what vendors are collecting, but we are also patrons – what exactly are vendors collecting about *us*?

GDPR and CCPA both give different sets of users (EU residents and CA consumers, respectively) the right to access the data collected by organizations and businesses; however, some organizations extended that right to all users, regardless of geographic residency. Below are some of the more well-known library vendors who are offering some form of data request process for their users (aka library patrons, including you!):

  • Cengage
  • Elsevier
  • Kanopy’s data request appears only to apply to CA consumers: “Under California Civil Code Section 1798.83, if you are a California resident and your business relationship with us is primarily for personal, family or household purposes, you may request certain data regarding our disclosure, if any, of personal information to third parties for the third parties’ direct marketing purposes. To make such a request, please send an email to privacy@kanopy.com with “Request for California Privacy Information” in the subject line. You may make such a request up to once per calendar year. If applicable, we will provide to you via email a list of the categories of personal information disclosed to third parties for their direct marketing purposes during the immediately-preceding calendar year, along with the third parties’ names and addresses. Please note that not all personal information sharing is covered by Section 1798.83’s requirements.”
  • LexisNexis
  • OverDrive
  • ProQuest
    • ExLibris, owned by ProQuest, appears to have a different data request process: “You may request to review, correct or delete the personal information that you have previously provided to us through the Ex Libris Sites. For requests to access, correct or delete your personal information, please send your request along with any details you may have regarding the method by which the information was submitted to privacy@exlibrisgroup.com. Requests to access, change, or delete your information will be addressed within a reasonable timeframe.”

What is surprising is that more library vendors do not offer this option or extend it to all users. This might change over time, depending on how the newest data privacy ballot initiative in California goes in November, or if additional regulations are passed in other states or even in the federal government. If more companies provide this right to access for all users, then it’s more likely that this practice will become a standard practice industry-wide. LDH will provide the latest updates around data access options from library vendors when they come along!

Black Lives Matter

Hello everyone,

Black Lives Matter.

If your library or archive is thinking about collecting photographs, videos, or other materials from the protests around George Floyd’s death caused by Minneapolis police, what are you doing to protect the privacy of the protesters? Black Lives Matter protesters and organizers, as well as many protesters and organizers in other activist circles, face ongoing harassment due to their involvement. Some have died. Recently Vice reported on a website created by white supremacists to dox interracial couples, illustrating how easy it is to identify and publish personal information with the intent to harm people. This isn’t the first website to do so, and it won’t be the last.

Going back to our question – if your response to the protests this weekend is to archive photos, videos, and other materials that contain personally identifiable information about living persons, what are you doing to protect the privacy and security of those people? There was a call made this weekend on social media to archive everything into the Internet Archive, but this call ignores the reality that these materials will be used to harass protesters and organizers. Here is what you should be considering:

  • Scrubbing metadata and blurring faces of protesters – a recently created tool is available to do this work for you: https://twitter.com/everestpipkin/status/1266936398055170048
  • Reading and incorporating the resources at https://library.witness.org/product-tag/protests/ into your processes and workflows
  • Working with organizations and groups such as Documenting The Now
    A tweet that summarizes some of the risks that you bring onto protesters if you collect protest materials: https://twitter.com/documentnow/status/1266765585024552960

You should also consider if archiving is the most appropriate action to take right now. Dr. Rachel Mattson lists what archives and libraries can do to contribute right now – https://twitter.com/captain_maybe/status/1267182535584419842

Archives, like libraries, are not neutral institutions. The materials archivists collect can put people at risk if the archives do not adopt a duty of care in their work in acquiring and curating their collections. This includes protecting the privacy of any living person included in these materials. Again, if your archive’s response is to archive materials that identify living people at these protests, how are you going to ensure that these materials are not used to harm these people?

Black Lives Matter.

Just Published! Library Data Risk Assessment Guide

Welcome to this week’s Tip of the Hat!

To build or to outsource?

Building an application or creating a process in a library takes time and resources. A major benefit of keeping it local, though, is that libraries have the greatest control over the data collected, stored, and processed by that application or system. Conversely, a major drawback of keeping it local is the sheer number of moving parts to keep track of in the building process. Some libraries have the technical know-how to build their own applications or have the resources to keep a process in house. Keeping track of privacy risks is another matter. Risk assessment and management must be addressed in any system or process that touches patron data, so how can libraries with limited privacy risk assessment or management experience make sure that their local systems and processes mitigate patron privacy risks?

Libraries have a new resource to help with privacy risk management! The Digital Library Federation’s Privacy and Ethics in Technology Working Group (formerly known as the Technologies of Surveillance Working Group) published “A Practical Guide to Performing a Library User Data Risk Assessment in Library-Built Systems”. This 28-page guide provides best practices and practical strategies in conducting a data risk assessment, including:

  • Classifications of library user data and privacy risk
  • A table of common risk areas, including probability, severity, and mitigation strategies
  • Practical steps to mitigate data privacy risks in the library, ranging from policy to data minimization
  • A template for readers to conduct their own user data inventory and risk assessment

This guide joins the other valuable resources produced by the DLF Privacy and Ethics in Technology Working Group:

The group also plans to publish a set of guidelines around vendor privacy in the coming months, so be sure to bookmark https://wiki.diglib.org/Privacy_and_Ethics_in_Technology and check back for any updates!

Contact Tracing At The Library

Welcome to this week’s Tip of the Hat!

Contact tracing has been used in the past with other diseases, where it helped curb infection rates in populations, so health and government officials are looking at contact tracing once again as a tool to help control the spread of disease, this time with COVID-19. There have been various reports and concerns about contact tracing through mobile apps, including ones developed by Google and Apple. However, mobile contact tracing will not stop local health and government officials from taking other measures when it comes to other contact tracing methods and requirements, and libraries should be prepared when their local government or health officials require contact tracing as part of the reopening process.

While there are no known cases of libraries doing contact tracing as part of their reopening process, there are some ways in which libraries can satisfy contact tracing requirements while still protecting patron privacy.

Collect only what you absolutely need

What is the absolute minimum you need to contact a patron? Name, email address, and/or telephone number are all options. Sometimes patrons do not have a reliable way to be contacted outside the library – health and government officials should have recommendations for handling those cases.

But what about having patrons scan in with their library card and using that as the contact tracing log? What seems to be a simple technological solution is, in reality, one that introduces complexity in the logging process as well as privacy risks:

  • Some of the people visiting the library will not have their library card or are not registered cardholders.
  • Contact logs can be subject to search or request from officials – maintaining the separation between the contact log and any other patron information in the library system will minimize the amount of patron data handed over to officials when there is a request for information.

Paper or digital log?

Some libraries might be tempted to have patrons scan in with their barcodes (see above section as to why that’s not such a good idea) or keep an electronic log of patrons coming in and out of the building. However, an electronic log introduces several privacy and security risks:

  • Where is the digital file being stored? Local drive on a staff computer that isn’t password protected? Network storage? Google Drive (yikes!)?
  • Who has access to the digital file? All staff in the library?
  • How many other copies of the file are floating around the library’s network, drives, or even printed out?

In this instance, however, a paper log will provide better privacy and security protections when you take the following precautions:

  • The paper log should be securely stored in a locked cabinet or desk in a secured area, preferably a locked office or other controlled entry space.
  • During business hours, the paper log should be filled out by designated staff members tasked to collect information from patrons. Do not leave the paper log out for patrons to sign – not only do you give patrons the names of others in the building (for example, a law enforcement agent can read the log and see who’s in the building without staff knowledge), but you also potentially expose patrons and staff to health risks by having them share the same hard surfaces and pen.
  • Restrict access to the paper log to only staff who are designated to keep logs, and prohibit copying (both physical and electronic copies) of the log.

Equitable service and privacy

Some patrons might not have reliable contact information or might refuse to give information when asked. If the local government or health officials state that someone can’t enter a building if they don’t provide information, how can your library work with your officials in addressing the need for libraries to provide equitable service to all patrons who come to the library?

Retention and disposal

Keep the contact tracing logs for only as long as the government or health officials require. If there is no retention period, ask! Your logs should be properly disposed of – a paper log should be shredded and the shredded paper should go to a secured disposal area or service.
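For staff who track log sheets by date, the retention rule above comes down to simple date arithmetic. The sketch below assumes a 14-day retention period and made-up log dates purely for illustration; use whatever period your government or health officials actually require.

```python
from datetime import date, timedelta


def disposal_date(log_date, retention_days):
    """Return the first date on which a log sheet may be disposed of."""
    return log_date + timedelta(days=retention_days)


def sheets_due_for_shredding(log_dates, retention_days, today):
    """Return the dates of log sheets whose retention period has ended."""
    return [d for d in log_dates if disposal_date(d, retention_days) <= today]


# Hypothetical log sheet dates and retention period.
log_dates = [date(2020, 6, 1), date(2020, 6, 10), date(2020, 6, 20)]
due = sheets_due_for_shredding(log_dates, retention_days=14, today=date(2020, 6, 24))
```

Only the sheet dates are tracked here, not their contents, so the schedule itself does not become another copy of the contact log.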

Keeping a log of visits to the library is something not to be taken lightly – you are creating a log of a patron’s use of the library. Several other privacy concerns might be specific to your library that could affect how you go about contact tracing, such as unaccompanied minors. Contact tracing has been an effective tool in containing disease outbreaks in the past, but it doesn’t have to come at the expense of losing personal privacy entirely if the library works with its staff and government officials in creating a process that minimizes patron data collection, access, and retention.

Choose Privacy Week Recap

Welcome to this week’s Tip of the Hat!

This weekend was hot in Seattle, with temperatures near 90 F. While the Executive Assistant took this time to bask in this heat, we at LDH tried to find a cool spot in the home office to work, away from the Executive Assistant’s gaze.

Last week was a busy week on the Choose Privacy Every Day site for Choose Privacy Week! Here’s what you might have missed:

  • Virtual Programming and Patron Privacy – Jaime Eastman along with the ALSC Children and Technology committee give much-needed guidance for library workers who are moving children-oriented programs and services online due to the pandemic. The post goes into the Children’s Online Privacy Protection Act (COPPA), and what library workers need to do to protect the privacy of children while keeping in compliance with COPPA. Bookmark the ALSC Virtual Storytime Services Resource Guide for additional guidance (coming soon!).
  • Protecting Privacy In A Pandemic: A Resource Guide – On Friday, May 8th, OIF hosted a Privacy Town Hall about patron privacy. While we wait for the recording of the Town Hall event, the blog post lists the main topics and resources covered by the panelists in the Town Hall.
  • When libraries become medical screeners: User health data and library privacy – Some libraries are now giving medical screenings to patrons who want to enter the library building. What privacy risks are there in collecting health data of your patrons? Read the article by LDH to find out why library workers might not be the best choice in handling health data.

Finally, if you have that one library privacy topic that you’ve been meaning to write about or if you want to share your privacy thoughts to a wide audience, Choose Privacy Every Day is looking for blog authors! There are some requirements for being an author for the blog, but this is a great opportunity to get your ideas and thoughts out into the library world.

That’s a wrap! Or, at least, the computer core temperature says it’s time to put the computer in the freezer. If you’re on the West Coast, stay cool, and for those of you who got snow on the East Coast, stay warm!

Week Roundup – In The News and What Would You Do?

Welcome to this week’s Tip of the Hat! Last week was a busy week. Here’s a recap of what you might have missed.

LDH in the News

What Would You Do?

One public library in New Jersey has been finding various ways to support their community while the library building is closed, but one strategy has started a debate on Library Twitter – using patron data to do welfare checks:

Recently, the Library decided to take more direct action to help the Roxbury community. Armed with its enormous patron database, library staffers are going through the list and, literally in descending order, calling the oldest and most vulnerable of Roxbury’s residents to inquire on their well-being, let them know someone cares and will listen, and when need be to connect them to vital resources to get them through this difficult time.

The article goes on to describe how this strategy led to an increase in requests for masks to be distributed by the library.

While this single instance seems to have had a positive outcome, the use of the data collected by the library to do wellness checks brings up the question of “we could, but should we?” concerning using patron data in this manner. Some of the issues and considerations brought up on Library Twitter include:

  • Scope creep – several library workers serve as de facto social workers in their communities. How can libraries in this position support their community while working with local community organizations and local government departments who are better suited for social work? How can this work be done while honoring patron privacy?
  • Data quality – the article stated that the library staff used the age listed in the patron database. How reliable is that data? ILS migrations and even the move to an automated library system can introduce data quality issues in the patron record, including age.
    • For example – one library that moved from a paper-based system to an ILS in the mid-1990s still found patrons whose birthdays were listed as the date of the migration years later.
  • Notice and consent – patrons have certain expectations when giving data to libraries. Some of these expectations come from what the library states in their privacy and confidentiality notices, as well as other communications to patrons from the library. It’s safe to say that libraries don’t list “wellness checks” in their patron privacy notices as one potential use of patron data. This gets into the issue of using data outside of the stated purposes when the data was exchanged between the patron and the library. Recent data privacy legal regulations and best practices address this by requiring businesses to inform people about the new use and to get affirmative consent before using the data for said new use.
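The data quality concern in the list above can be checked before any age-based decision is made. The sketch below flags birth dates that look like placeholders; the record layout, migration date, and age cutoff are hypothetical and would need to match your own system and history.

```python
from datetime import date


def flag_suspect_birth_dates(patrons, migration_date, as_of, max_age=110):
    """Return (patron id, reason) pairs for birth dates that look unreliable."""
    suspects = []
    for patron in patrons:
        birth_date = patron.get("birth_date")
        if birth_date is None:
            reason = "missing"
        elif birth_date == migration_date:
            # Likely a default value filled in during the system migration.
            reason = "matches migration date"
        elif birth_date > as_of:
            reason = "in the future"
        elif as_of.year - birth_date.year > max_age:
            reason = "implausibly old"
        else:
            continue
        suspects.append((patron["id"], reason))
    return suspects


# Hypothetical patron records and migration date.
patrons = [
    {"id": 1, "birth_date": date(1940, 2, 10)},
    {"id": 2, "birth_date": date(1995, 6, 1)},
    {"id": 3, "birth_date": None},
    {"id": 4, "birth_date": date(1900, 1, 1)},
]
suspects = flag_suspect_birth_dates(
    patrons, migration_date=date(1995, 6, 1), as_of=date(2020, 5, 1)
)
```

A check like this only catches obvious placeholder values; it cannot confirm that the remaining birth dates are accurate.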

There are some other items brought up in the Twitter discussion, such as different expectations from patrons, the size of the community, and patron-staff relationships. Some patrons chimed in as well! Like many other real-world data privacy conundrums, this one is not as clear cut in terms of how to best approach addressing the issue at hand – making sure that patrons in under-supported or vulnerable community groups get the support that they need.

We want to hear from you – what would you do in this situation? Email us at newsletter@ldhconsultingservices.com and we’ll discuss the results in a future newsletter. We will not post names or institutions in the newsletter results, so email away and we’ll do the rest to protect your privacy as we discuss patron privacy. Let us know what you think!