Doxing: How to Protect Yourself and Patrons

Welcome to this week’s Tip of the Hat!

The Executive Assistant has her paws full this week with rescheduling and shifting various project timelines around thanks to recent events. She was batting objects off of ledges (er, redoing Gantt charts) when she came across a small list of privacy-related things to do on a rainy day and promptly knocked the list off the pile and onto the floor. While this is not a rainy day, a few of us could use a distraction, so what can be a better distraction than protecting your privacy?

Today we’ll explore doxing: what it is, how it can harm you and your patrons, and what you can do to protect yourself and patrons from being doxed.

Doxing and You

Doxing is the act of publishing private or otherwise identifying information about a person to the public. This can include your home address, phone number, private email address, or bank account details, but it can also involve publishing private information about those close to you, like family members, along with your private information. Most of the time, doxing is used as a tactic to intimidate or to harm a person or their loved ones – an infamous example of doxing in action is Gamergate, where online harassers doxed several games journalists, researchers, and others in the gaming industry.

Being doxed can mean a stranger showing up at your home or otherwise harassing you as you try to go about your daily life, but it can also mean that your identity can be stolen. With just a few pieces of private personal information, a doxer can social engineer their way past customer service staff and help desk representatives to get access to critical accounts, potentially destroying the financial and reputational aspects of a person’s life in the process.

How to Dox Yourself (@ the Library)

The scary part about doxing is that anyone with a little time and effort can get access to private information. The New York Times recently published a guide on how to dox yourself, describing the various places where you can find information that you thought was not available to the public. Search engines, social media, and data brokers are all potential sources for doxers looking for your private information. Take some time to study their resource guide and perform some searches on your favorite search engine. You might be (un)pleasantly surprised as to what you can find about yourself.

Libraries are not exempt from being potential targets for doxers to gain information about a person. Library patrons routinely contact library staff with requests or questions about their patron account or another person’s patron account. What can be in the patron record that can potentially be used to dox someone? Legal name, home address, and birth date are three pieces of patron data that come to mind. Chances are, though, that your patron record includes much more, including telephone numbers, email addresses, and even government or organization-issued identification numbers, such as driver’s license numbers or student or employee ID numbers.

Library workers also face the possibility of being doxed and harassed. An article by American Libraries recounted the experiences of two library school professors who were doxed for their research on racial microaggressions in academic libraries. Library workers are subject to the same harassment and doxing that their patrons face in daily life, as documented in the article. Any private information of both patrons and library workers is fair game to a doxer, even at the library.

Dox Defenses

How can you protect yourself and others from doxing?
On the personal front:

On the library front, review policies and procedures surrounding patron data confidentiality, particularly surrounding requests to disclose patron information:

  • Do you have a procedure in place to verify the patron’s identity if they request access to information in their patron record? What are the procedures regarding identity verification in-person versus over the phone versus online?
  • What information is used in the verification process?
  • What information do you disclose in the patron record in person? Over the phone? Online?
  • What is the procedure when the patron doesn’t have this information for verification?
  • What is the procedure if the patron requests access to another patron’s record?
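One way to pressure-test the answers to these questions is to write them down as an explicit rule table. The following is a minimal sketch, not a recommended policy: every channel name, proof type, field name, and rule in it is a hypothetical placeholder that a library would replace with its own decisions.

```python
from dataclasses import dataclass

# Hypothetical policy table: all channels, proofs, and fields below are
# illustrative assumptions, not recommendations for any particular library.
@dataclass(frozen=True)
class ChannelPolicy:
    required_proofs: frozenset      # what the requester must present
    disclosable_fields: frozenset   # what staff may share once verified

POLICIES = {
    "in_person": ChannelPolicy(
        required_proofs=frozenset({"library_card", "photo_id"}),
        disclosable_fields=frozenset({"checkouts", "holds", "fines", "contact_info"}),
    ),
    "phone": ChannelPolicy(
        required_proofs=frozenset({"card_number", "pin"}),
        disclosable_fields=frozenset({"holds", "fines"}),
    ),
    "online": ChannelPolicy(
        required_proofs=frozenset({"card_number", "pin"}),
        disclosable_fields=frozenset({"checkouts", "holds", "fines"}),
    ),
}

def fields_to_disclose(channel, proofs_presented, requested_fields, is_own_record):
    """Return the set of record fields staff may disclose for this request."""
    # In this sketch, requests for another patron's record are always refused.
    if not is_own_record:
        return set()
    policy = POLICIES.get(channel)
    if policy is None:
        return set()
    # If any required proof is missing, nothing is disclosed.
    if not policy.required_proofs <= set(proofs_presented):
        return set()
    # Only fields allowed for this channel are released.
    return set(requested_fields) & policy.disclosable_fields
```

Writing the rules out this way tends to surface gaps quickly, such as a channel with no verification step or a field that is disclosed over the phone but probably should not be.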

Employee information also needs protection; however, a different set of regulations, policies, and procedures apply. Check with your human resources staff as well as legal counsel to determine what information is private, what is public, and when employers are allowed to disclose employee information to others.

Doxing is scary and can lead to harassment and other dangerous situations. The best personal defense against doxing is to be proactive in limiting the amount of private information a random person off the street can access through a data broker, your online presence, or other places where private information can be accessed by someone with a little bit of time and resources. The best library defense is making sure that there are policies and procedures in place for verification of the patron’s identity before disclosing patron information in certain situations, as well as protecting the privacy of library worker information, whether by not publishing private information such as home addresses or by protecting the data from unauthorized access.

Leaving Platforms and Patrons Behind

Welcome to this week’s Tip of the Hat!

Remember when the online library catalog was just a telnet client? Some of you might even remember the process of moving from the card catalog to an online catalog. The library catalog has seen many different forms in recent decades.

The most recent wave of transitions is the migration from an old web catalog – in most cases an OPAC that came standard with an ILS – to a newer discovery layer. This discovery layer is typically hosted by the vendor and offers the ability to search for a wider array of collections and materials. Another main draw of the discovery layers in the market is the enhanced user experience. Many discovery layers allow users to add content to the site, including ratings and comments, and to share their reading lists with others on the site.

While being able to provide newer services to patrons is important, this also brings up a dilemma for libraries. Many discovery layers are hosted by vendors, and many have separate Terms of Service and Privacy Policies attached to their products outside of the library’s policies. The majority of library catalogs that the discovery layers are meant to replace are locally hosted by the library, and fall under the library’s privacy policies. Libraries that made the transition to the discovery layer more often than not left their older catalog up and running, marketed as the “classic” catalog. However, the work necessary to keep up two catalogs can be substantial, and some libraries have retired their classic catalogs, leaving only the discovery layer for patrons to use.

The dilemma – How will the library provide a core library service to patrons objecting to the vendor’s TOS or privacy policy when the library only offers one way to access that core service?

We can use the Library Bill of Rights [LBR] interpretations from ALA to help guide us through this dilemma. The digital access interpretation of the LBR provides some guidance:

Users have the right to be free of unreasonable limitations or conditions set by libraries, librarians, system administrators, vendors, network service providers, or others. Contracts, agreements, and licenses entered into by libraries on behalf of their users should not violate this right… As libraries increasingly provide access to digital resources through third-party vendors, libraries have a responsibility to hold vendors accountable for protecting patrons’ privacy. [Access to Digital Resources and Services: An Interpretation of the Library Bill of Rights]

Moving core services to third-party vendors can create a barrier between patrons and the library, particularly when that barrier is the vendor’s TOS or privacy policy. The library then needs to decide what next steps to take. One step is to negotiate with the vendor for changes to the TOS and privacy policy that address patron concerns. Another step, one that several libraries have opted for, is keeping the classic catalog available to patrons alongside the discovery layer. Each step has its advantages and disadvantages in terms of resources and cost.

The classic catalog/discovery layer dilemma is a good example of how offering newer third-party platforms to provide core library services can create privacy dilemmas for your patrons and potentially lock them out from using core services. If your library finds itself making such a transition – be it the library catalog or another core service platform – the ALA Privacy Checklists and the interpretations of the LBR can help guide libraries through the planning process. Regardless of the actions taken by the library, ensuring that all patrons have access to core library services should be a priority, and that includes taking privacy concerns into account when replacing core service platforms.

AI, Read The Privacy Policy For Me

Welcome to this week’s Tip of the Hat! Last week we took a deep dive into ALA’s privacy policy to figure out where our information was going if we agreed to receive information from exhibitors while registering for the Annual Conference.

[Which, ICYMI, LDH will be exhibiting at Annual! Let us know if you want to meet up and talk about all things privacy and libraries!]

As we encountered last week, privacy policies are not the most exciting documents to read. In fact, you can test out this theory by checking out the impressive list of electronic resource vendor privacy policies generated by the folks at York University (the code is available on GitHub). Try picking out a couple of privacy policies and read them from start to finish now. We’ll be here waiting for you.

…..

……. all done?

Chances are, you probably found yourself skimming the policies if you made it all the way to the bottom. If so, you’re not alone – studies have shown that the majority of folks do not read these policies, which could lead to surprises and confusion when your data is collected, shared, or breached. The fact is that it takes a long time to get through long, detailed documents – a recent study showed that many privacy policies require a high reading level and up to around a half hour to read. What’s a busy person to do?
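If you are curious how a particular policy stacks up, both numbers are easy to estimate yourself. The sketch below is a rough back-of-the-envelope calculator, not the method used in the study mentioned above: it assumes a reading speed of 250 words per minute, counts syllables with a crude vowel-group heuristic, and applies the standard Flesch-Kincaid grade level formula. The function names are made up for this example.

```python
import re

WORDS_PER_MINUTE = 250  # assumed average reading speed; adjust to taste

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def policy_stats(text):
    """Return (word_count, minutes_to_read, flesch_kincaid_grade) for text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0, 0.0, 0.0
    syllables = sum(count_syllables(w) for w in words)
    minutes = len(words) / WORDS_PER_MINUTE
    # Flesch-Kincaid grade level:
    # 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    grade = (0.39 * (len(words) / len(sentences))
             + 11.8 * (syllables / len(words))
             - 15.59)
    return len(words), minutes, grade
```

Paste the text of a policy into `policy_stats` to get a word count, an estimated reading time in minutes, and an approximate US school grade level. Treat the grade as a ballpark figure, since the syllable count is only a heuristic.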

One way some folks are addressing this is to let the machines do the reading for you. The last few years have seen several tools that use AI and machine learning (ML) to analyze privacy policies, picking out the most important parts that users should know. For example, the Usable Privacy Policy Project, an NSF-funded project, used a collection of 115 privacy policies annotated by law students to train machine classifiers to annotate over 7000 privacy policies. Another group of researchers used the same 115 annotated privacy policies for ML training, creating two different tools for AI-generated analysis of policies. The first is Polisis, which creates a Sankey diagram based on the AI’s analysis of the policy, while the second is Pribot, a chatbot that allows users to explore and ask questions about specific privacy policies.
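To give a feel for what "training a classifier on annotated policies" means, here is a toy sketch. It is not the approach used by the Usable Privacy Policy Project, Polisis, or Pribot; it is a small naive Bayes text classifier written from scratch, trained on four made-up sentences with two hypothetical labels ("collection" and "sharing"). Real tools rely on far more data and more capable models.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing, for illustration only."""

    def fit(self, segments, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of segments
        self.vocab = set()
        for segment, label in zip(segments, labels):
            tokens = tokenize(segment)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, segment):
        tokens = tokenize(segment)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(doc_count / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up "annotated" policy segments; the labels are hypothetical examples.
TRAINING = [
    ("We collect your name and email address when you register.", "collection"),
    ("We collect device information and log files automatically.", "collection"),
    ("We share your information with third party advertisers.", "sharing"),
    ("Your data may be shared with partners and service providers.", "sharing"),
]

model = TinyNaiveBayes().fit(
    [text for text, _ in TRAINING],
    [label for _, label in TRAINING],
)
```

Calling `model.predict(...)` on a new sentence returns whichever label best matches the words seen in training. Human annotators supply the labeled examples, and the classifier then extends those labels to text nobody has annotated.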

Each AI privacy analysis tool takes a different approach in displaying the results to the end users. Let’s use OverDrive’s privacy policy as our test policy. [1] The Usable Privacy site uses different colored fonts to indicate which parts of the policy belong to 10 different categories. The site also directs us to another policy analysis of OverDrive’s Privacy Policy for Children. Users can click on a category to only show the colored sections of the policy, or to exclude it.

A screenshot of Usable Privacy's analysis of the OverDrive privacy policy.

For Polisis’ analysis of OverDrive’s policy, the site takes the same ten categories and creates separate visualizations for most of them. Users can click on a stream to highlight it in the diagram – for example, showing what information is shared and for what reason.

A screenshot of Polisis' analysis of OverDrive's privacy policy.

We are still a ways away from widespread adoption of AI-annotated privacy policies; however, the possibilities are promising. With GDPR, CCPA, and other upcoming privacy regulations, AI and ML could help end users keep up with all the changes in policies, as well as dig through mountains of text in a fraction of the time it would have taken to manually read all of the text. It will still take a considerable human role in training the AI and supervising the ML to ensure proper analysis, though, as well as human labor in creating effective and accessible interfaces. Perhaps one day there could be an API service that can have AI analyze the privacy policies listed on the York University page.

[1] Both sites are analyzing older versions of OverDrive’s privacy policy. The most up to date privacy policy is at https://company.cdn.overdrive.com/policies/privacy-policy.htm.

Monday Mystery: Conference Information Sharing

Welcome to this week’s Tip of the Hat! It seems that spring has just arrived for many of us in the US; however, the calendar tells us that we are only weeks away from the ALA Annual Conference in Washington DC in June. Our Executive Assistant was going through the PDF registration form the other day and noticed the following question:

A text box with the following text: "Attendees may receive exciting advance information from exhibitors like invitations, contests and other hot news. COUNT ME IN!" Yes/No checkboxes are next to the last sentence.

The above question on the registration form asks if the person (or in this case, cat) wants to receive information from conference exhibitors. The Executive Assistant paused. What exactly does checking the “Yes” box entail? Since we’re in the data privacy business, this is a perfect Monday Mystery for us to investigate.

After a quick search of the conference website, we land on ALA’s Privacy Policy at http://www.ala.org/privacypolicy. If you haven’t spent time with a privacy policy, it can seem daunting or downright boring. Let’s walk through this policy to find out what happens when we check the “Yes” box.

The “Information Collection & Use” section lays out what information is collected and when. They define “personal data” as information that can be used to identify someone: name, email, address, etc. The section breaks down some common actions and situations when ALA collects data, including event registration. We already guessed that ALA was collecting our information for event registration purposes, but we need to dig deeper into the policy to answer our question.

We then find a section labeled “Information Sharing” in which we might find our answer! The section lists who ALA shares information with in detail, including the type of data and the circumstances under which the data is shared. “Services Providers” seems promising – that is until we get into the details. The data listed as shared with service providers is mostly technical data – location data, log files, and cookies – and the section says nothing about handing over information to receive updates from exhibitors. Back to square one.

Moving down the policy, we arrive at the “Your Rights and Choices Regarding Your Information” section, which lists the following right:

Object to processing – You have the right to object to your Personal Data used in the following manners: (a) processing based on legitimate interests or the performance of a task in the public interest/exercise of official authority (including profiling); (b) direct marketing (including profiling); and (c) processing for purposes of scientific/historical research and statistics;

Okay, we have the right to ask ALA not to use our personal data for marketing purposes. That’s a very important right to have, though that doesn’t exactly solve the mystery of what happens when we click on the “Yes” box.

This, readers, is where we are going to cheat in this investigation. It’s time to put our exhibitor hat on!

Exhibitors at major conferences are usually offered some form of registrant/member list as a means to promote their business before the conference. ALA does the same with Annual, and exhibitors can rent attendee lists. From https://2019.alaannual.org/list-rental, exhibitors have the option to “[t]arget buyers by industry segment, demographic profile or geographic area.” So, not just names and emails are shared!

On the exhibitor side, having that information would allow for targeted marketing – instead of blasting the entire attendee list, exhibitors can reach out to those most likely to be receptive to their service or product. On the attendee side, some want to have this type of targeted marketing to plan their time at the conference efficiently, or to do homework before hitting the exhibit hall. For other attendees, though, it means more emails that they’ll just delete or unsubscribe from. And then there’s the question about what happens to that attendee data after the conference…

In the end, we still have a bit of a mystery on our hands. The only reason we got this far in our little Monday Mystery investigation is that LDH has been bombarded with emails trying to sell us attendee lists, which tipped us off to start looking at the exhibitor section of the conference site. Your average conference attendee wouldn’t have that information and would be left scratching their head, because the registration form says nothing about what information is shared on these attendee lists. While we don’t have a clear answer to end today’s investigation, we hope that this gives our readers a little reminder to do some research the next time they are asked a similar question on a registration form.

Speaking of ALA Annual, LDH Consulting Services is excited to announce that we will be exhibiting in DC in booth 844! Many thanks to Equinox Open Library Initiative for making exhibiting at ALA Annual possible for LDH. Give us a ping if you will be at Annual and would like to talk more about what LDH can do for your organization.