Hello everyone! It’s been a while since our last post in April, and a lot has happened. A Supreme Court ruling that will change how courts interpret an individual’s right to privacy, a bipartisan federal data privacy bill gaining momentum, ICE dipping into LexisNexis data much more than initially thought – and all of that is just within the past month. A lot is going on in the privacy world right now! While we won’t be back on our regular post schedule for a little longer, we will have time to bring you analysis and updates as they come along.
Speaking of updates, we have a big one to announce – the publication of our first book! Managing Data for Patron Privacy: Comprehensive Strategies for Libraries breaks down what library workers need to do to protect the privacy of their patron’s data. In this book, Kristin Briney, Biology & Biological Engineering Librarian at the California Institute of Technology, and LDH founder Becky Yoose cover key topics as:
succinct summaries of major U.S. laws and other regulations and standards governing patron data management;
information security practices to protect patrons and libraries from common threats;
how to navigate barriers in organizational culture when implementing data privacy measures;
sources for publicly available, customizable privacy training material for library workers;
the data life cycle from planning and collecting to disposal;
how to conduct a data inventory;
understanding the associated privacy risks of different types of library data;
why the current popular model of library assessment can become a huge privacy invasion;
addressing key topics while keeping your privacy policy clear and understandable to patrons; and
data privacy and security provisions to look for in vendor contracts.
Managing Data for Patron Privacy is a great place to start for library workers and libraries looking to cultivate a sustainable, holistic approach to their data privacy practices. Come for the case studies and practical advice; stay for the cats, glitter, and pasty recipe. 😉 We hope you enjoy the book, and please let us know if you have any questions or comments as you dive into our new book!
People sometimes ask what keeps privacy professionals up at night. What is that one “worst-case scenario” that we dread? Personally, one of the scenarios hanging over my head is insider threat – when a library employee, vendor, or another person who has access to patron data uses that data to harm patrons. A staff person collecting patron addresses, birthdays, and names to steal the patrons’ identities is an example of insider threat. Another example is a staff person accessing another staff’s patron records to obtain personal information to harass or stalk the staff member.
Last week, an IT employee at NCSU was doxed as a local leader of a white supremacist group. This person, who worked IT for the libraries in the past, doxed individuals, including students in his own university, to harass and, in some cases, incite violence toward the people being doxed. As an IT employee, this person most likely had unchecked access to students, staff, and faculty personal information. It wouldn’t be a stretch to say that he still had access to patron information, given his connections to the library and his IT staff position.
Libraries spend a lot of time and attention worrying about external threats to patron privacy: vendors, law enforcement, even other patrons. We forget that sometimes the greatest threat to patron privacy works at the library. Library workers who have access to patron data – staff, administration, board members, volunteers – can exploit patrons through the use of their data for financial gain in the case of identity theft or harm them through searching for specific library activity, checkouts of certain materials, or even names or other demographic information with the intent to harass or assault. The reality is that there might not be many barriers, if at all, to stop library workers from doing so.
The good news is that there are ways to mitigate insider threat in the library, but the library must be proactive in implementing these strategies for them to be the most effective:
Practice data minimization – only collect, use, and retain data that is necessary for business operations. If you don’t collect it, it can’t be used by others with the intent to harm others.
Implement the Principle of Least Privilege – who has access to what data and where? Use roles and other access management tools to provide staff (and applications!) access to only the data that is absolutely needed to perform their intended duty or function.
Regularly review internal access to patron data – set up a scheduled review of who has what access to patron data. When an employee or other library worker/affiliate changes roles in the organization or leaves the library, develop and implement policies and procedures in revoking or changing access to patron data at the time of the role change or departure.
Confidentiality Agreements For Library Staff, Volunteers, and Affiliates – your privacy and confidentiality policy should make it clear to staff that patrons have the right to privacy and confidentiality while using library resources and services. Some libraries go further in ensuring patron privacy by using confidentiality agreements. These confidentiality agreements state the times when patron data can be access and the acceptable uses for patron data. Violation of the agreement can lead to immediate termination of employment. Here are some examples of confidentiality agreements to start your drafting process:
Regularly train and discuss about privacy – ensure that everyone who is involved with the library – staff, volunteers, board members, anyone that might potentially access patron data as part of their role with the library – is up to date on current patron privacy and confidentiality policies and procedures. This is also an opportunity to include training scenarios that involve insider threat to generate discussion and awareness of this threat to patron privacy.
A note about IT staff, be it internal library IT staff or an external IT department (campus IT, city government IT, or another form of organizational IT) – Do not automatically assume that IT staff are following privacy/security standards and policy just because they are IT. Now is the time to discuss with your IT connections about their current access is and what is the minimum they need for daily operations. However, even if the IT department practices good security and privacy hygiene (such as making sure they follow the Principle of Least Privilege), any IT staff member who works with the library in any capacity must also sign a confidentiality agreement and be included in training sessions at the very minimum.
A data inventory is a good place to start if you are not sure who has access to what data in the library. The PLP Data Privacy Best Practices for Libraries project has several templates and resources to help with creating a data inventory, assessing privacy risks, and practical actions libraries can take in reducing the risk of an insider threat.
Libraries serve everyone. We serve patrons who are already at high risk for harassment and violence. Libraries must do their part in mitigating the risk that insider threat creates for our patrons who depend on the library for resources and support. Otherwise, we become one more threat to our patrons’ privacy and potentially their lives or the lives of their loved ones.
What does the toolkit cover? The topics range from the data lifecycle and managing vendor relationships to creating policies and procedures to protect patron privacy. The toolkit covers specific privacy concerns in the library, including law enforcement requests, surveillance, and data analytics. We also get to meet Mel and Rafaël, two library patrons who have unique privacy issues that libraries need to consider when thinking about patron privacy. At the end of the toolkit is an extensive resource section with library privacy scholarship, professional standards, and regulations for further reading.
This toolkit is part of a larger group of resources, including templates and examples libraries can use to develop contract addendums, privacy policies and procedures, and data inventories and privacy risk assessments. In short, there are a lot of resources that are freely available for you to use in your library! Please let us know if you have any questions about the project resources.
Finally, stay tuned – the project is going into its second year, focusing on “train the trainer” workshops for both data privacy and cybersecurity. We’ll keep you updated as more materials are published!
Building an application or creating a process in a library takes time and resources. A major benefit of keeping it local, though, is that libraries have the greatest control over the data collected, stored, and processed by that application or system. Conversely, a major drawback of keeping it local is the sheer number of moving parts to keep track of in the building process. Some libraries have the technical know-how to build their own applications or have the resources to keep a process in house. Keeping track of privacy risks is another matter. Risk assessment and management must be addressed in any system or process that touches patron data, so how can libraries with limited privacy risk assessment or management experience make sure that their local systems and processes mitigate patron privacy risks?
Libraries have a new resource to help with privacy risk management! The Digital Library Federation’s Privacy and Ethics in Technology Working Group (formerly known as the Technologies of Surveillance Working Group) published “A Practical Guide to Performing a Library User Data Risk Assessment in Library-Built Systems“. This 28-page guide provides best practices and practical strategies in conducting a data risk assessment, including:
Classifications of library user data and privacy risk
A table of common risk areas, including probability, severity, and mitigation strategies
Practical steps to mitigate data privacy risks in the library, ranging from policy to data minimization
A template for readers to conduct their own user data inventory and risk assessment
This guide joins the other valuable resources produced by the DLF Privacy and Ethics in Technology Working Group:
[Author’s note – this posts uses the term “Dark Data” which is an outdated term. Learn more about the problem with the term’s use of “dark” at Intuit’s content design manual.]
Welcome to this week’s Tip of the Hat!
Also, welcome to the first week of Daylight Savings Time in most of the US! To celebrate the extra hour of daylight in the morning (we at LDH are early birds), we will shed light on a potential privacy risk at your organization – dark data.
The phrase “dark data” might conjure up images of undiscovered data lurking in the dark back corner of a system. It could also bring to mind a similar image of the deep web where the vast amount of data your organization has is hidden to the majority of your staff, with only a handful of staff having the skills and knowledge to find this data.
The actual meaning of the phrase is much less dramatic. Dark data refers to collected data that is not used for analysis or other organizational purposes. This data can appear in many places in an organization: log files, survey results, application databases, email, and so on. The business world views dark data as an untapped organizational asset that will eventually serve a purpose, but for now, it just takes up space in the organization.
While the reality of dark data is less exciting than the deep web, the potential privacy issues of dark data should be taken seriously. The harm isn’t that the organization doesn’t know what it’s collecting – dark data is not unknown data. One factor that leads to dark data in an organization is the “just in case” rationale used to justify data collection. For example, a project team might set up a new web application to collect patron demographic information such as birth date, gender, and race/ethnicity not because they need the data right now, but because that data might be needed for a potential report or analysis in the future. Not having the data when the need arises means that you could be out on important insights and measures that could sway decision-makers and the future of operations. It is that fear of not having that data, or data FOMO, that drives this collection of dark data.
When you have dark data that is also patron or other sensitive data, you put your organization and patrons at risk. Data sitting in servers, applications, files, and other places in your organization are subject to being leaked, breached, or otherwise subject to unauthorized access by others. This data is also subject to disclosure by judicial subpoenas or warrants. If you choose to collect dark data, you choose to collect a toxic asset that will only become more toxic over time, as the risk of a breach, leak, or disclosure increases. It’s a matter of when, not if, the dark data is compromised.
Dark data is a reality at many organizations in part because it’s very easy to collect without much thought. The strategies in minimizing the harms that come with dark data require some forethought and planning; however, once operationalized, these strategies can be effective in reducing the dark data footprint in your organization:
Tying data collection to demonstrated business needs – When you are deciding what data to collect, be it through a survey, a web application, or even your system logs, what data can be tied back to a demonstrated business need? Orienting your data collection decisions to what is needed now for operational purposes and analysis shifts the mindset away from “just in case” collection to what data is absolutely needed.
Data inventories – Sometimes dark data is collected and stored and falls off the radar of your organization. Conducting regular data inventories of your organization will help identify any potential dark data sets for review and action.
Retention and deletion policies – Even if dark data continues to persist after the above strategies, you have one more strategy to mitigate privacy risks. Retention policies and proper deletion and disposal of electronic and physical items can limit the amount of dark data sitting in your organization.
The best strategies to minimize dark data in your organization happens *before* you collect the data. Asking yourself why you need to collect this data in the first place and looking at the system or web application to see what data is collected by default will allow you to identify potential dark data and prevent its collection.