#ChoosePrivacy 2019: Privacy and Equity

Welcome to this week’s Tip of the Hat! This week marks the start of Choose Privacy Week, hosted by the ALA Office for Intellectual Freedom. We briefly covered CPW in our National Library Week newsletter, but we couldn’t pass up the chance to join in the festivities of a week dedicated to library privacy.

An image of two arms coming together in front of a padlock, with the text "Inclusive Privacy, Closing the Gap" below it. To the right of the image is the text "May 1-7 Choose Privacy Week #chooseprivacy".

This year’s Choose Privacy Week is focusing on how privacy in libraries is vital for those who are otherwise targeted for surveillance and data-based discrimination elsewhere in the US. Library workers stress privacy as a core tenet of Intellectual Freedom; however, this focus can be very narrow with regard to protecting a subset of patron information from specific unauthorized uses and access, e.g. a government entity accessing a patron’s circulation records. This narrow interpretation of the role privacy plays in the library does not take into account the evolution of the role of data in libraries and in society. Data has taken its place as a critical tool in ensuring funding and continued operations. We see this evolution with the increasing prevalence of customer relations management systems, learning analytics, and identity-based services (such as RA21) in the library environment.

With the rise of data as a valuable asset comes its dark side or, taking a cue from Bruce Schneier, the toxicity of data. Data has been used to target marginalized populations via surveillance and other means. How can data harm vulnerable populations? Taking a look around the Seattle area, here are two recent cases in which data collection inflicted real-world harm on people:

Another resource highlighting past and potential harms is http://neveragain.tech/. This pledge site started when the current US president proposed a registry of Muslims in the US. The page highlights some of the ways that technology has been used against marginalized populations in recent history, as well as the harms that come with data collection.

Reframing the conversation about why privacy is important in libraries requires rethinking the field’s approaches surrounding privacy practices and policies. Privacy with regard to pursuing intellectual interests needs to take into account the social factors that come into play when someone from a vulnerable population uses the library. Many libraries market themselves as a “third place” or a place where the community can gather together for a variety of reasons, be it studying, meetings, programs, or even a safer space to spend free time outside of the home, work, or school. While data is useful in relation to building and maintaining operations that best benefit all patrons in the library’s third place role, care needs to be taken to ensure that the same data is not used to harm patrons as demonstrated in the cases above.

If you are looking for how to approach your privacy practices with an equity lens, you will hear from people with a variety of backgrounds and viewpoints during this year’s CPW. Maybe you’ll find something that you haven’t considered in relation to your privacy practices, or find an opportunity to be proactive in building trust with patrons. In either case, we’re looking forward to finding out more about how libraries can align privacy with equity during Choose Privacy Week!

[REDACTED] – Redacting PII From Digital Collections

Welcome to this week’s Tip of the Hat! The Executive Assistant is back, and you know what that means…

A sitting black cat looking up at the camera, meowing loudly.
We’re back in business, newsletter-wise!

This week’s topic comes from a recent post to the Code4Lib mailing list. A library is planning to scan a batch of archival documents to PDF format, and is looking for ways to automate the process of identifying personally identifiable information [PII] in the documents and redacting said PII. The person mentioned that the documents might contain Social Security Numbers or credit card numbers.

Many libraries and archives have resources – digital and physical – that contain some form of PII in the source. While physical resources can be restricted to specific physical locations (unless someone copies the source via copier, pencil and paper, camera, etc.), digital resources that are available through a digital repository can increase the risk of privacy harm if that digital resource contains unredacted PII.

When libraries and archives are incorporating personal collections, research data sets, or other resources that may contain PII, here are some considerations to keep in mind to help through the process of mitigating the risk of data breaches and other privacy harms:

Who is included or mentioned in the resource – Some archival collections contain PII surrounding the individual who donated their materials. When dealing with institutional/educational records or research data sets, however, you might be dealing with different types of PII regulations and policies depending on who is included in the resource and what type of PII is present.

What PII is in the resource – When most folks think about PII, they think about information about a person: name, Social Security number, financial information, addresses, and so on. What tends to be overlooked is PII that is information about an activity surrounding a person that could identify that person. Think library checkout histories, web search histories, and purchase histories. You will need to decide what types of PII need redacting, but keep both facets of PII in mind when deciding.

What is the redaction workflow – This gets into the question from the mailing list. The workflow of redacting PII depends on several factors, including what PII needs to be redacted, the number of resources needing to be redacted, and what format the resource is in. Integrating redaction into a digitization or intake workflow reduces the time spent retroactively redacting PII by staff. Here I’d like to offer a word of caution – while automating workflows for efficiency can be positive, sub-optimizing a part of a workflow can lead to a less efficient overall workflow as well as have negative effects on work quality or resources.

What tools and resources are available – While looking at the overall workflow for redacting PII, the resources and knowledge available to your organization to build and maintain a redaction workflow will greatly shape said workflow, or even the ability to redact PII in a systematic manner. There are many commercial tools that automate data classification and redaction workflows, and there are options to “roll your own” identification and redaction tool using various programming languages and regular expressions. If you work at a library or archive that is part of a bigger institution, there might be tools or resources already available through central IT or through departments that oversee compliance or information security and privacy. Don’t be afraid to reach out to these folks!
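To give a sense of what the “roll your own” route with regular expressions can look like, here is a minimal Python sketch. It assumes the text has already been extracted from the scanned documents (for example, through OCR), and it only looks for the two kinds of PII mentioned on the mailing list. The patterns, the `redact_pii` function name, and the `[REDACTED]` mask are all hypothetical choices for illustration, not a vetted tool. Real collections will almost certainly need tuning and a human review step.

```python
import re

# Illustrative patterns only (assumptions, not a vetted rule set).
# SSNs written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_pii(text: str, mask: str = "[REDACTED]") -> str:
    """Mask SSN-like strings and Luhn-valid card-like numbers in text."""
    text = SSN_PATTERN.sub(mask, text)

    def mask_card(match: "re.Match[str]") -> str:
        digits = re.sub(r"\D", "", match.group())
        # Leave long digit runs that fail the checksum untouched,
        # to cut down on false positives such as accession numbers.
        return mask if luhn_valid(digits) else match.group()

    return CARD_PATTERN.sub(mask_card, text)


if __name__ == "__main__":
    sample = "SSN 123-45-6789, card 4111 1111 1111 1111."
    print(redact_pii(sample))
```

The checksum step is a design choice: it trades a little recall for fewer false alarms. A pattern-only approach like this will still miss PII that is misread by OCR or written in an unexpected format, which is one more reason to treat it as a first pass and not a replacement for staff review.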

If you’re wondering where to begin or how other organizations approach redaction, here are some resources to start with:

Quick Tips

A one eyed black cat in a carrier, looking upwards in despair about her current predicament.

Welcome to this week’s Tip of the Hat! This last week proved to be a harrowing one for our Executive Assistant. She has now found more places to hide in order to avoid copy-editing newsletter drafts; hence, this week’s letter will be shorter than usual. We will be back to normal operations by next week, or when all the hiding spots have been located.

LDH in the News

LDH received a mention in last week’s CNET article about libraries and privacy in the era of ebooks. The article gave a brief overview of various technologies libraries are adopting or have adopted, and the privacy implications that come with said adoption. Customer relations management systems (CRMs) receive a mention, and we will expand on CRMs and libraries in a future newsletter.

Your Browser’s Privacy Settings Are Changing For The Worse (Unless You’re Using Firefox)

Last week Bleeping Computer reported that the majority of browsers are disabling or plan to disable the setting that lets users prevent websites from tracking link clicks. The article explains in depth what information is being tracked if you click on a link on a site that uses this type of link auditing to track its users’ behavior. Don’t lose hope yet – if you use Firefox or Brave, you still have the ability to control this setting!

While other browsers are reducing the ability for users to protect their privacy, Firefox is working on blocking browser fingerprinting. Read more about browser fingerprinting and how it can compromise your online privacy, and maybe even test your browser to see how well your current browser protects you from tracking.

A Link A (Work) Day: Five Links for National Library Week 2019

Three children and an adult man dancing in front of a staircase in a library. 6 other adults are dancing on the stairway.

Welcome to this week’s Tip of the Hat, and happy National Library Week! To celebrate all things library, here are some library privacy resources to check out* this week:

Choose Privacy Everyday
Every year, ALA’s Office for Intellectual Freedom hosts Choose Privacy Week, usually around the first week in May. This site contains a multitude of information and resources for training, event and outreach planning, and implementing better privacy practices at libraries. This year’s theme for Choose Privacy Week is Inclusive Privacy: Closing the Gap.

How did we get here: A zine about privacy at the library
This is a great primer for those looking for an overview of the various factors playing into the relationship between libraries and privacy. Print it out or download the file and send it around your workplace! If folks are interested in learning more, the companion site gives folks a more detailed overview of the history of the privacy-library relationship.

Library Values and Privacy Summit Report
2018 was a bumper crop year for IMLS-funded projects and forums surrounding library privacy. One of the first in 2018 was the Library Values and Privacy Summit in New York City. As the final meeting in the “Library Values & Privacy in our National Digital Strategies: Field guides, Convenings, and Conversations” project, the summit brought together 30 participants from various library, civic, and privacy organizations to focus on issues surrounding library privacy in several areas, including vendor relations, learning analytics, and staff and patron training.

National Web Privacy Forum Report
Another IMLS-funded forum took place in September 2018, bringing together 40 library, privacy, education, and technology professionals to take the first steps in creating practical action plans surrounding the enhancement of web privacy. The action plans range from dashboards and values-based assessment to privacy leadership training and education and outreach in tribal educational institutions.

Library Freedom Project
Last but not least we end with the Library Freedom Project. You might have first heard of the Project through their work surrounding Tor as well as the HTTPS Pledge. Recently the LFP co-launched the Library Freedom Institute, a six-month program educating library workers to become Privacy Advocates at their library as well as in their community.

The Executive Assistant’s entry – The Library Cats Map
This page is no longer kept up to date, but is a good historical snapshot of the various library cats that have graced the stacks of various libraries.

*Pun semi-intended.

Baking Privacy Into Your Library: The What, How, and Why of Privacy by Design

Welcome to this week’s Tip of the Hat!

This week’s topic comes to you thanks to the endless hours spent last week cleaning up inactive online accounts as part of our #dataspringcleaning efforts at LDH. It’s frustrating as a user not to have the ability to choose how applications and third parties collect and process your data. Not having the ability to delete your account is one example – some systems were not built to delete data. You all have run across other examples, such as the inability to opt out of having certain personal data collected, processed, or shared with other third parties. To an extent, library workers and vendors can determine how much control patrons have over how applications and services collect and process personal data. However, these controls are put into place after the fact, leaving patrons in the lurch with limited privacy options. How can library workers and vendors avoid this lurch?

Enter Privacy by Design (PbD). First created by Ann Cavoukian, then refined by various international organizations, PbD advocates making privacy a priority throughout the lifecycle of a service or application, including the planning and implementation stages. PbD has made a major impact in the privacy and systems development worlds, as well as the legal realm – GDPR is the latest regulation where PbD has made an appearance.

There are seven foundational principles of PbD:

  1. Proactive not reactive; preventative not remedial
  2. Privacy as the default setting
  3. Privacy embedded into design
  4. Full functionality – positive-sum, not zero-sum
  5. End-to-end security – full lifecycle protection
  6. Visibility and transparency – keep it open
  7. Respect for user privacy – keep it user-centric

What would PbD look like for library workers and vendors? An example is turning off any features that might share user activity to others by default. Users who want to share their activity would have the option to turn on the feature, giving the application their consent in doing so. Another example comes from our tale of woe at the beginning of the newsletter – building a system so that a user can delete their account or personal data without consequence to the system’s integrity. It is much easier to create a system that can handle such deletions than to try to retroactively get a legacy system to learn a new trick!
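For the more code-minded readers, the two examples above can be sketched in a few lines of Python. Everything here is hypothetical: the `PrivacySettings`, `PatronAccount`, and `AccountStore` names are made up for illustration and do not come from any real library system. The sketch only shows the shape of the idea, with sharing turned off unless the patron opts in and with deletion supported from the start.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PrivacySettings:
    # Privacy as the default setting: sharing stays off until
    # the patron explicitly opts in.
    share_activity: bool = False


@dataclass
class PatronAccount:
    patron_id: str
    settings: PrivacySettings = field(default_factory=PrivacySettings)
    reading_history: List[str] = field(default_factory=list)

    def opt_in_to_sharing(self) -> None:
        """Record the patron's consent to share their activity."""
        self.settings.share_activity = True


class AccountStore:
    """Hypothetical in-memory store that supports deletion by design."""

    def __init__(self) -> None:
        self._accounts: Dict[str, PatronAccount] = {}

    def create(self, patron_id: str) -> PatronAccount:
        account = PatronAccount(patron_id)
        self._accounts[patron_id] = account
        return account

    def delete(self, patron_id: str) -> bool:
        """Remove the account and its data; True if something was deleted."""
        return self._accounts.pop(patron_id, None) is not None


if __name__ == "__main__":
    store = AccountStore()
    patron = store.create("patron-001")
    print(patron.settings.share_activity)  # sharing is off by default
    store.delete("patron-001")
```

A real system would also need to remove the patron's data from backups, logs, and any third-party services, which is where planning for deletion early pays off the most.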

Both examples highlight the user’s ability to control what data is collected, stored, and shared. Notice that privacy by default does not mean not collecting or processing data at all, but instead takes the position of letting users decide what level of privacy they are most comfortable with. On another level, PbD’s integrated approach to privacy in the development lifecycle guides all those involved in the development and planning processes in assessing how systems can protect user privacy and meet business needs at the same time. Discussing data collection and processing, privacy features, and how to address potential user concerns early in the development process can save both time and headaches when the system launches to users.

Below are a few resources to get you started with PbD:

[H/T to Chad Nelson for the inspiration for this week’s newsletter title!]