#DataSpringCleaning 2021 – Email and Patron Data

A white and brown short-haired dog places their right front paw on top of an open laptop keyboard. The laptop screen shows a blurred Gmail inbox window.
Image source: https://www.flickr.com/photos/karenbaijens/16241866468/ (CC BY 2.0)

Welcome to the first week of Spring in the Northern Hemisphere! This month marks one year of working from home for some library workers and the hybrid remote/onsite work limbo for others. In both cases, this anniversary also marks a year’s worth of patron data collected and stored all over the place due to the abrupt switch to remote work and virtual services. It’s safe to say that many disaster or business continuity plans didn’t plan for a pandemic, and the resulting scramble to virtual or reduced physical services/work created new or exacerbated existing data privacy gaps. Last year’s #DataSpringCleaning focused on setting up the home office to address a common privacy problem – the over-retention of patron data. Check out the post and the companion workshop materials about protecting patron privacy while working from home if you haven’t already done so.

This year’s #DataSpringCleaning project is as ambitious as it is daunting. It is the Sisyphean task of data cleanup projects – no matter how many times we try and fail, we keep coming back to it in hopes of finally completing it. Let us go back once more into the breach, friends. It’s time to scrub our work email.

Email as a Major Risk to Patron Privacy

While many library workers are aware that their emails can contain patron data, they might not be aware of how much patron data is stored in their accounts. Personally identifiable information, or PII, includes data about a patron as well as data about a patron’s activity. The former can be easy to identify and easy to email without much thought about the privacy risk of doing so:

  • Name
  • Physical and email addresses
  • Birthdate or age
  • Patron record number
  • Username and password

A patron’s activities, on the other hand, can be harder to identify once you factor in the types of emails a library worker can receive or send in any given day:

  • Help desk ticket threads
  • Reference form or chat tickets or transcripts
  • Direct email from patrons
  • System or application reports or alerts
  • Vendor service desk tickets or reports

This list is just a small selection of the types of emails that can contain data around a patron’s activities such as:

  • Reference questions
  • Search and circulation histories
  • IP addresses
  • Electronic resource authentication and access history
  • Library computer and wifi logs and activity

And that’s just the start of how much patron data is in staff emails!

The ease of storing and sharing data through email makes it difficult to control data sharing and retention once the data hits the email system. The risk to patron privacy compounds once an email containing patron data leaves the library’s email system and lands in a third-party email account, be it a vendor’s or even a personal email account. Another risk for many libraries is that staff emails are subject to public disclosure requests. Several state and local regulations protect patron record data from disclosure, but in some cases, this protection might not extend to patron data in staff email. If your library’s emails can be publicly requested, don’t assume that you’ll get a chance to redact patron data before the emails are released to the public.

Starting the Long Journey of Protecting Patron Privacy in Staff Email

Scrubbing patron data from library email is a Sisyphean task. You can tell patrons not to email PII only to have patrons send over their logins for the financial website they can’t log into on a public computer. You can tell staff not to store patron data in work email, only to have staff use email as their primary knowledge base for reference chat questions and answers. However, you have more control over how staff use library email than over how patrons use it – this is where we start our scrubbing journey.

We’ll break this journey into two parts: the short and long term. The following are some actions workers and organizations can take to mitigate patron privacy risk in library emails:

Short term (individual) actions

  • First, get familiar with your email system’s filter and search capabilities! These will make the deletion process less painful.
  • Find and delete system-generated emails that contain patron data. These can be found through searching by a shared email address or subject line.
  • Search for emails with attachments and delete attachments if they contain patron data
  • Before deleting the email, migrate patron data that absolutely must be retained for a demonstrated operational need from email to a secured storage area designated by work (if one is available)
  • Create email rules to automatically delete incoming system-generated emails containing patron data
  • Learn how to use the ticketing system or other help desk or information desk systems as the primary mode of communication with other library staff about tickets and other
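As a rough illustration of the "search by shared email address or subject line" step above, here is a minimal Python sketch that flags likely system-generated messages and lists attachment filenames for review. The sender address and subject prefixes are hypothetical placeholders, and the sketch assumes you already have messages loaded as standard-library `EmailMessage` objects; most people will do this step with their email client's built-in search and filter tools instead.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical values: substitute the shared addresses and subject prefixes
# that your own systems actually use.
SYSTEM_SENDERS = {"ils-reports@library.example"}
SUBJECT_PREFIXES = ("[Overdue Report]", "[Help Desk]")


def is_system_generated(msg: EmailMessage) -> bool:
    """Return True if the message matches a known sender or subject prefix."""
    # parseaddr strips any display name, leaving just the address.
    sender = parseaddr(str(msg["From"] or ""))[1].lower()
    subject = str(msg["Subject"] or "").strip()
    return sender in SYSTEM_SENDERS or subject.startswith(SUBJECT_PREFIXES)


def attachment_names(msg: EmailMessage) -> list[str]:
    """List attachment filenames so a person can review them before deleting."""
    return [part.get_filename() or "(unnamed)" for part in msg.iter_attachments()]
```

The sketch only identifies candidates. Deciding what to delete, and migrating anything that must be retained, stays a human decision.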

Long term (organizational) actions

  • Create policies and procedures around restricting the use of staff email to transmit or store certain types of patron data based on data classification level and/or privacy risk
  • Create secured data/file transfer options for sharing patron data, particularly between staff and authorized third parties
  • Set up applications and systems to not include patron data in system-generated reports and emails
  • Set up retention policies in email systems to automatically delete email based on organizational retention schedules or retention schedules set by legal regulation
  • Create procedures or processes to use the ticketing system or other help desk or information desk systems as the primary mode of communication between staff as well as between staff and patrons
  • Create secured storage outside of staff email for patron data that absolutely must be retained for a demonstrated operational need, and create retention schedules for the data retained in storage
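To make the idea of an automatic retention policy concrete, here is a minimal Python sketch of the underlying check: is a message older than the retention period for its category? The categories and day counts are hypothetical. Real periods come from your organization's retention schedule or applicable regulation, and in practice this rule would usually be configured in the email system itself rather than scripted.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schedule for illustration only.
RETENTION_DAYS = {"system_report": 30, "help_desk": 90}


def is_past_retention(received: datetime, category: str, now: datetime) -> bool:
    """Return True when a message has outlived its category's retention period."""
    days = RETENTION_DAYS.get(category)
    if days is None:
        # Unknown category: leave the message for a person to review.
        return False
    return now - received > timedelta(days=days)
```

For example, `is_past_retention(received, "system_report", datetime.now(timezone.utc))` asks whether a system report is more than 30 days old under this made-up schedule.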

The short-term actions can take a while with manual reviewing of attachments and individual emails. But, with the magic of search and filter options, you can quickly eliminate a good portion of privacy risks by deleting the archive of system-generated emails. The long-term actions require a team effort in the organization, from administration drafting policies to IT creating automatic retention policies and secured storage and transmission options.

None of us want to spend more time dealing with email than we have to, and trying to keep up with the current email inbox count is near impossible as it is. Nonetheless, we need to keep in mind that work email can put patron privacy at risk, and we must address that risk as part of our library duties. It’s a #DataSpringCleaning project that never ends, but as long as we have email, there will always be the need to clean our inboxes to protect patron privacy.

#dataspringcleaning, Home Office Edition

Welcome to this week’s Tip of the Hat!

The trees outside the LDH office are now covered in leaves, the tulips and daffodils are blooming, and the grass has started growing again. All of which means one thing – allergy season... er, Spring Cleaning Season! Or, as we at LDH like to call it, #dataspringcleaning season.

We covered the basics of #dataspringcleaning in a previous newsletter; however, determining if your data sparks joy might be a challenge this year given the state of current affairs. For this year’s #dataspringcleaning season, here’s a short cleaning list for your newly minted home office to help you in your data cleaning efforts.

Paper documents

Shred! If you don’t have a shredder at home, you have a couple of options:

  • Store documents for shredding at the office in a secured place in your home away from housemates.
  • Buy a shredder for your home. Look for a shredder that can shred at or above Level P-4. Having a shredder at home not only helps you protect patron privacy but also your privacy now that you have a convenient way to shred your personal documents and files.

Shredded paper should not go into your recycling bin – it’s most likely that your recycling center cannot accept shredded paper. In King County (where LDH is located) residents are instructed to use shredded paper for composting. You can also take a few handfuls of shredded paper to top off any garbage cans before closing up the garbage bag when you take the garbage out. Check with your local solid waste and recycling departments for more guidance about disposing of shredded paper.

Electronic equipment

  • Store patron data on work storage or equipment when necessary. Do not use personal hard drives, flash drives, or other personal storage devices to store patron data.
  • Do a quick data inventory of any personal cloud storage services you use, such as Google Drive or Evernote.
    • What patron data do you have stored in those services?
    • Can you migrate that data to work storage?
    • What data do you need to keep, and what data can be deleted?
  • If you have your work computer at home, now would also be a good time to do a data inventory of what’s stored on the local drive.
  • Remember, deleting a file doesn’t mean that the file is deleted! There are many programs available to help you permanently delete files.
  • If you do end up having to retire a physical disk or drive that held patron data, what tools do you have in your home toolbox? You most likely have a hammer, but you can also get creative depending on what’s available… we’ve mentioned power drills before, but perhaps you might want to try out the nail gun. Remember – safety first!
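For the local-drive data inventory mentioned in the list above, a small script can give you a starting list of what is sitting in a folder. This is a minimal Python sketch, not a full inventory tool: it only records each file's relative path, size, and last-modified time, and it cannot tell you which files contain patron data. That review is still up to you. The function name `inventory` is made up for this example.

```python
from datetime import datetime, timezone
from pathlib import Path


def inventory(root: Path) -> list[dict]:
    """List every file under root with its size and last-modified time."""
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            rows.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": stat.st_size,
                "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
            })
    return rows
```

Pointing it at a work documents folder, for example `inventory(Path.home() / "Documents")`, produces a list you can sort by date to spot old exports and reports that are candidates for migration or deletion.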

#dataspringcleaning at home is a good way to spend the time between meetings or to begin or end your workdays at home. A little bit of cleaning each day adds up to help protect patron privacy 🙂 Happy cleaning!

Shining a Light on Dark Data

Welcome to this week’s Tip of the Hat!

Also, welcome to the first week of Daylight Saving Time in most of the US! To celebrate the extra hour of daylight in the morning (we at LDH are early birds), we will shed light on a potential privacy risk at your organization – dark data.

The phrase “dark data” might conjure up images of undiscovered data lurking in the dark back corner of a system. It could also bring to mind a similar image of the deep web where the vast amount of data your organization has is hidden to the majority of your staff, with only a handful of staff having the skills and knowledge to find this data.

The actual meaning of the phrase is much less dramatic. Dark data refers to collected data that is not used for analysis or other organizational purposes. This data can appear in many places in an organization: log files, survey results, application databases, email, and so on. The business world views dark data as an untapped organizational asset that will eventually serve a purpose, but for now, it just takes up space in the organization.

While the reality of dark data is less exciting than the deep web, the potential privacy issues of dark data should be taken seriously. The harm isn’t that the organization doesn’t know what it’s collecting – dark data is not unknown data. One factor that leads to dark data in an organization is the “just in case” rationale used to justify data collection. For example, a project team might set up a new web application to collect patron demographic information such as birth date, gender, and race/ethnicity not because they need the data right now, but because that data might be needed for a potential report or analysis in the future. Not having the data when the need arises means that you could miss out on important insights and measures that could sway decision-makers and the future of operations. It is that fear of not having that data, or data FOMO, that drives this collection of dark data.

When you have dark data that is also patron or other sensitive data, you put your organization and patrons at risk. Data sitting in servers, applications, files, and other places in your organization can be leaked, breached, or otherwise accessed without authorization. This data is also subject to disclosure through judicial subpoenas or warrants. If you choose to collect dark data, you choose to collect a toxic asset that will only become more toxic over time, as the risk of a breach, leak, or disclosure increases. It’s a matter of when, not if, the dark data is compromised.

Dark data is a reality at many organizations in part because it’s very easy to collect without much thought. Strategies for minimizing the harms that come with dark data require some forethought and planning; however, once operationalized, these strategies can be effective in reducing the dark data footprint in your organization:

  • Tying data collection to demonstrated business needs – When you are deciding what data to collect, be it through a survey, a web application, or even your system logs, what data can be tied back to a demonstrated business need? Orienting your data collection decisions to what is needed now for operational purposes and analysis shifts the mindset away from “just in case” collection to what data is absolutely needed.
  • Data inventories – Sometimes dark data is collected and stored and falls off the radar of your organization. Conducting regular data inventories of your organization will help identify any potential dark data sets for review and action.
  • Retention and deletion policies – Even if dark data continues to persist after the above strategies, you have one more strategy to mitigate privacy risks. Retention policies and proper deletion and disposal of electronic and physical items can limit the amount of dark data sitting in your organization.
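One way to put "tie data collection to demonstrated business needs" into practice is an allowlist: a field is stored only if someone has written down why it is needed. The Python sketch below is a hypothetical illustration of that idea. The field names and stated needs are invented, and a real form or application would enforce this where the data is collected.

```python
# Hypothetical mapping of each collected field to the documented need behind it.
DOCUMENTED_NEEDS = {
    "email": "send hold pickup notices",
    "pickup_branch": "route held items",
}


def minimize(submission: dict) -> dict:
    """Keep only fields with a documented need; drop "just in case" fields."""
    return {key: value for key, value in submission.items() if key in DOCUMENTED_NEEDS}
```

With this approach, adding a new field such as birth date means first adding an entry explaining the need, which builds the "why do we need this?" question into the process.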

The best strategies to minimize dark data in your organization happen *before* you collect the data. Asking yourself why you need to collect this data in the first place and looking at the system or web application to see what data is collected by default will allow you to identify potential dark data and prevent its collection.

#dataspringcleaning

Welcome to this week’s Tip of The Hat!

This week’s newsletter is inspired by last week’s #ChatOpenS Twitter chat about patron privacy, where the topic of #dataspringcleaning made its appearance.

I’m starting the hashtag #dataspringcleaning — I need to do this in my personal life, too! https://t.co/ueVfafKDQ0
— Equinox OLI (@EquinoxOLI) March 13, 2019

Springtime is around the corner, which means Spring Cleaning Time. While you are cleaning your physical spaces, take some time to declutter your data inventory. By getting rid of personally identifiable data that you no longer need, you are scrubbing some of the toxicity out of your data inventory, and lessening the privacy risks to patrons.

When you are done with data, what do you do with it? First, you need to check in to see if you are truly done with that data. Unfortunately, we cannot use Marie Kondo’s approach by asking if the data sparks joy, but here are some questions to ask instead:

  • Is the dataset no longer needed for operational purposes?
  • Are you done creating an aggregated dataset from the raw data?
  • Is the dataset past the record retention period set by policy or regulation? Don’t forget about backup copies as well!

Once you have determined that you no longer need the data, it’s time to clean up! For data on paper – surveys, signup or sign in sheets, reservation sheets – shred the paper and dispose of it through a company that securely disposes of shredded documents. Resist the temptation of throwing the shredding into the regular recycling bin – if your shredder shreds only in long strips, or otherwise doesn’t turn your documents into tiny bits of confetti, dumpster divers can piece together the shredded document.

Electronic data requires a bit more scrubbing. When you delete electronic data, the data is still there on the drive; you’ve just deleted the pointer to that file. Using software that can wipe the file or the entire drive will reduce the risk of someone finding the deleted file. There are free and paid software options to complete the task, depending on your system and your needs (hard drive, USB sticks, etc.).
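To show the difference between deleting a pointer and overwriting the data itself, here is a minimal Python sketch that writes zeros over a file's contents before removing it. Treat it as an illustration only, under one big assumption: that an in-place overwrite lands on the same physical storage as the original data. That assumption often does not hold on SSDs, copy-on-write filesystems, or cloud-synced folders, and it does nothing about backups or other copies. Purpose-built wiping tools remain the better choice for real cleanup. The function names are made up for this example.

```python
import os
from pathlib import Path


def overwrite_file(path: Path, chunk_size: int = 1024 * 1024) -> None:
    """Overwrite a file's contents in place with zeros and flush them to disk."""
    remaining = path.stat().st_size
    with open(path, "r+b") as handle:
        while remaining > 0:
            step = min(chunk_size, remaining)
            handle.write(b"\x00" * step)
            remaining -= step
        handle.flush()
        os.fsync(handle.fileno())


def overwrite_and_delete(path: Path) -> None:
    """Overwrite the file first, then remove its directory entry."""
    overwrite_file(path)
    path.unlink()
```

A plain delete would only perform the final `unlink` step, leaving the original bytes on the drive until something else happens to reuse that space.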

And now we get to the fun part of deleting data. Any disc drives, CDs, floppy disks, or (where I give my age away) backup tape drives that held patron data need to be disposed of properly as well. Sometimes you are close to a disk disposal center where you can destroy your drives via degaussing machines. If you can’t find a center, then you have to literally take matters into your own hands. Remember that scene from Office Space with the printer?

A man beating a printer with a baseball bat.
That is what you are going to do, but with safety gear. Hammers, power drills, anything that will destroy the platters in the drive or the disk itself – just practice safety while doing so!

And who says that cleaning can’t be fun?

Resources to get you started: