Summer Homework – Requesting Your Data

Welcome to this week’s Tip of the Hat!

Have you ever wondered what data OverDrive collects while you’re reading the latest ebook? Or what Kanopy collects when you’re watching a documentary? As library workers, we have some sense as to what vendors are collecting, but we are also patrons – what exactly are vendors collecting about *us*?

GDPR and CCPA both give different sets of users (EU residents and CA consumers, respectively) the right to access the data collected by organizations and businesses; however, some organizations extended that right to all users, regardless of geographic residency. Below are some of the more well-known library vendors who are offering some form of data request process for their users (aka library patrons, including you!):

  • Cengage
  • Elsevier
  • Kanopy’s data request appears only to apply to CA consumers: “Under California Civil Code Section 1798.83, if you are a California resident and your business relationship with us is primarily for personal, family or household purposes, you may request certain data regarding our disclosure, if any, of personal information to third parties for the third parties’ direct marketing purposes. To make such a request, please send an email to privacy@kanopy.com with “Request for California Privacy Information” in the subject line. You may make such a request up to once per calendar year. If applicable, we will provide to you via email a list of the categories of personal information disclosed to third parties for their direct marketing purposes during the immediately-preceding calendar year, along with the third parties’ names and addresses. Please note that not all personal information sharing is covered by Section 1798.83’s requirements.”
  • LexisNexis
  • OverDrive
  • ProQuest
    • ExLibris, owned by ProQuest, appears to have a different data request process: “You may request to review, correct or delete the personal information that you have previously provided to us through the Ex Libris Sites. For requests to access, correct or delete your personal information, please send your request along with any details you may have regarding the method by which the information was submitted to privacy@exlibrisgroup.com. Requests to access, change, or delete your information will be addressed within a reasonable timeframe.”

What is surprising is that more library vendors do not offer this option, or do not extend it to all users. This might change over time, depending on how the newest data privacy ballot initiative in California goes in November, or if additional regulations are passed in other states or even at the federal level. If more companies provide this right to access for all users, then it’s more likely that this will become a standard practice industry-wide. LDH will provide the latest updates around data access options from library vendors when they come along!

Last Minute Panic: A CCPA Update

Welcome to this week’s Tip of the Hat!

We hate to break it to you, but there are only a few weeks left in 2019. Do you know what that means? That’s right – only a few more weeks before the California Consumer Privacy Act comes into effect. A lot has happened since our first newsletter about the CCPA in March, so let’s take some time to catch everyone up on the need-to-knows about CCPA as we head into 2020.

Everything and nothing have changed

Lawmakers introduced almost 20 amendments in the past few months in the State Legislature, ranging from grammatical edits to substantial changes to the CCPA. In the end, only a handful of amendments were signed by the state governor, none of which substantially change the core of CCPA. There are now a few exceptions to CCPA with the amendments, such as employee data, but that’s the extent of the changes introduced into the Act going into 2020.

However, this doesn’t mean that we won’t see some of the stalled or dead amendments come back in the next legislative session. Expect additional amendments in the coming year, including new amendments that might affect regulation and scope of the Act.

What you need to know about regulation and enforcement

In October 2019, the California Attorney General’s office published a draft set of regulations on how the office will enforce CCPA. While the public comment period is open until December 6th, many businesses are taking the regulations as their new playbook in preparing for CCPA compliance.

“Household” dilemma

The problematic definition of “personal information” remains… problematic. The amendment that sought to remove “household” from the definition stalled in the State Legislature. The regulations address the handling of household information to a small extent. If someone requests access to personal information, including household information, the business has the option to give aggregated data if they cannot verify the identity of the requester.

Again, this broad definition has ramifications regarding patrons requesting information from library vendors. Libraries should work with library vendors in reviewing confidentiality and privacy policies and procedures and discuss the possible impact this definition will have on patron privacy.

Hello, COPPA!

One of the major elements of CCPA is the regulations surrounding collecting and processing personal information from anyone under 16 years of age. CCPA requires businesses to get affirmative authorization from anyone 13 years old up to 16 years old before the business can sell their personal information. To comply with the new requirement, many businesses might now have to collect or otherwise verify the age of the online user. This leads into the realm of the Children’s Online Privacy Protection Act (COPPA) – now that the business has actual knowledge of the online user’s age, more businesses could be subject to liability under COPPA.
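To make the age thresholds concrete, here is a minimal sketch of how a business might map a verified age to the kind of authorization needed before selling personal information. The function name and return labels are hypothetical, and the under-13 branch (authorization from a parent or guardian) reflects the text of the Act rather than anything stated in the paragraph above. This is an illustration, not legal guidance.

```python
def sale_consent_rule(age: int) -> str:
    """Map a verified age to the authorization needed before a sale.

    The labels returned here are hypothetical names for illustration only.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 13:
        # Under 13: a parent or guardian must affirmatively authorize the sale.
        return "parent_or_guardian_opt_in"
    if age < 16:
        # 13 up to (but not including) 16: the consumer must opt in themselves.
        return "consumer_opt_in"
    # 16 and older: sale is allowed unless the consumer opts out.
    return "opt_out_available"


for age in (12, 13, 15, 16):
    print(age, sale_consent_rule(age))
```

Note that the sketch only works once the business knows the user's age, which is exactly the point where the COPPA "actual knowledge" concern comes in.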

This could lead to another tricky conversation for libraries – library vendors who fall under CCPA collecting additional patron data for compliance. Collecting and processing patron data is sometimes unavoidable due to operational needs, but it’s still worthwhile to ensure that the data is properly secured, processed, and deleted.

Do Not Track, for real this time

Do the browsers on your library’s public computers have “Do Not Track” turned on by default, or have other browser plugins that prevent tracking by third parties? If not, here’s another reason to do so – the regulations state that “If a business collects personal information from consumers online, the business shall treat user-enabled privacy controls, such as a browser plugin or privacy setting or other mechanism, that communicate or signal the consumer’s choice to opt-out of the sale of their personal information as a valid request…” So get installing those privacy plugins already!
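For those curious what honoring such a signal might look like on the receiving end, here is a minimal sketch. Browsers with Do Not Track enabled send the HTTP request header `DNT: 1`. The draft regulations do not name a specific signal, so treating this header as an opt-out request is an assumption made for illustration, and the function name is hypothetical.

```python
def signals_opt_out(headers: dict[str, str]) -> bool:
    """Return True if the request headers carry a Do Not Track signal.

    Assumption: "DNT: 1" is treated as a request to opt out of the sale
    of personal information. HTTP header names are case-insensitive,
    so they are normalized to lowercase before the lookup.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


# A browser with Do Not Track turned on would send something like this:
request_headers = {"User-Agent": "ExampleBrowser/1.0", "DNT": "1"}
print(signals_opt_out(request_headers))
```

A site that follows this approach would check the signal before any data is passed along to third parties, rather than after the fact.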

Do we have to comply with CCPA?

It depends on who the “we” is in this question. As of now, most California libraries are most likely out of the scope of CCPA (though, as Joshua Metayer pointed out, the CCPA gives no guidance as to what is considered a “for profit” business). Library vendors will most likely have to comply if they do business in California. Some businesses are trying to keep CCPA compliance strictly to CA residents by setting up a separate site for California, while other businesses, such as Microsoft, plan to give all US residents the same rights CA residents have under CCPA.

We’ve only covered a portion of what’s going on with CCPA – there’s still a lively debate as to what is entailed by the definition of “sale” in regard to personal information, which is a newsletter in itself! We also could have an entire newsletter on CCPA 2.0, which is slated to be on the November 2020 ballot. California continues to be a forerunner in privacy law in the US, and the next year will prove to be an important one not only for everyone under the scope of CCPA but for other states looking to implement their own CCPA-like state laws.

Leaving Platforms and Patrons Behind

Welcome to this week’s Tip of the Hat!

Remember when the online library catalog was just a telnet client? For some of you, you might even remember the process of moving from the card catalog to an online catalog. The library catalog has seen many different forms in recent decades.

The most recent wave of transitions is the migration from an old web catalog – in most cases an OPAC that came standard with an ILS – to a newer discovery layer. This discovery layer is typically hosted by the vendor and offers the ability to search for a wider array of collections and materials. Another main draw of the discovery layers in the market is the enhanced user experience. Many discovery layers allow users to add content to the site, including ratings, comments, and sharing their reading lists with others on the site.

While being able to provide newer services to patrons is important, this also brings up a dilemma for libraries. Many discovery layers are hosted by vendors, and many have separate Terms of Service and Privacy Policies attached to their products outside of the library’s policies. The majority of library catalogs that the discovery layers are meant to replace are locally hosted by the library, and fall under the library’s privacy policies. Libraries who made the transition to the discovery layer more often than not left their older catalog up and running, marketed as the “classic” catalog. However, the work necessary to keep up two catalogs can be substantial, and some libraries have retired their classic catalogs, leaving only the discovery layer for patrons to use.

The dilemma – How will the library provide a core library service to patrons objecting to the vendor’s TOS or privacy policy when the library only offers one way to access that core service?

We can use the Library Bill of Rights [LBR] interpretations from ALA to help guide us through this dilemma. The digital access interpretation of the LBR provides some guidance:

Users have the right to be free of unreasonable limitations or conditions set by libraries, librarians, system administrators, vendors, network service providers, or others. Contracts, agreements, and licenses entered into by libraries on behalf of their users should not violate this right… As libraries increasingly provide access to digital resources through third-party vendors, libraries have a responsibility to hold vendors accountable for protecting patrons’ privacy. [Access to Digital Resources and Services: An Interpretation of the Library Bill of Rights]

Moving core services to third-party vendors can create a barrier between patrons and the library, particularly when that barrier is the vendor’s TOS or privacy policy. The library then needs to decide what next steps to take. One step is to negotiate with the vendor regarding changes to the TOS and privacy policy to address patron concerns. Another step, one that several libraries have opted for, is keeping the classic catalog available to patrons alongside the discovery layer. Each step has its advantages and disadvantages in terms of resources and cost.

The classic catalog/discovery layer dilemma is a good example of how offering newer third-party platforms to provide core library services can create privacy dilemmas for your patrons and potentially lock them out from using core services. If your library finds itself making such a transition – be it the library catalog or another core service platform – the ALA Privacy Checklists and the interpretations of the LBR can help guide libraries through the planning process. Regardless of the actions taken by the library, ensuring that all patrons have access to core library services should be a priority, and that includes taking privacy concerns into account when replacing core service platforms.

Privacy in the News: LinkedIn and the “Like” Button

Welcome to this week’s Tip of the Hat! We have various updates from around libraryland and beyond, so let’s start the week by catching you up on important news and developments.

LinkedIn Learning Stalemate

Last week we learned that negotiations between ALA representatives and LinkedIn Learning stalled over the proposed changes the company plans to implement later this year that would require users to create a LinkedIn profile to access LinkedIn Learning resources. ALA released a public statement to LinkedIn Learning to reconsider their changes, while a petition on EveryLibrary is collecting signatures of libraries and library staff who will not renew (or will consider not renewing) their contracts with LinkedIn Learning in light of this upcoming change. The list of libraries committed to not renewing the service grows, with state libraries getting into the fray.

The story has also found its way to various news outlets:

LinkedIn Learning has directed those seeking comment for the recent statements from ALA and libraries to a blog post from June 2019, which doesn’t give much in the way of addressing the concerns raised in the recent weeks.

Time to rethink the embedded “Like” button?

Today, the Court of Justice of the European Union delivered a ruling that could have ripple effects in the US. The Court ruled that websites that embed the Facebook “Like” button are responsible for the privacy of the users on the website. According to the Court, a website that has the “Like” button must follow the same consent and data processing regulations laid out in European law, even though that data is being transferred to Facebook. This is not the first time that the embedded “Like” button has gotten into trouble in the EU – a recent example comes from 2016, when a German court ruled that a site with the embedded button violated user privacy.

Many libraries and vendor products include the “Like” button on websites, catalogs, and other patron-facing applications and services. Embedding social media buttons such as the “Like” button already presents several privacy issues. For example, this 2013 article from Mother Jones explains how companies can track users through websites that have the “Tweet” button embedded into their pages. These buttons and widgets collect patron information, and this information can be sent back to those social media sites even if the patron doesn’t use the buttons on the page.

With US states looking toward the EU and GDPR as a foundation to build their own state data privacy laws, this ruling may influence how US law interprets the responsibility for user privacy when a website embeds social media buttons that have been known to track users. Even if no laws come to pass, it would still be worthwhile to revisit your organization’s use of these types of social media buttons on your websites and if that use aligns with your privacy policy and patron privacy expectations.

Caring Who Is Sharing Your Patron Data

Welcome to this week’s Tip of the Hat! Last week Tom Boone stated his intent to boycott two vendors – Thomson Reuters and RELX Group – at the American Association of Law Libraries annual conference based on the current business relationships that both companies have with U.S. Immigration and Customs Enforcement [ICE]. While the objections are based on the relationships themselves, the boycott post brings us back to a question posed by Jason Griffey about LexisNexis’s interest in assisting ICE in building an “extreme vetting” system for immigrants to the US – what role would data collected from libraries that subscribe to those vendors’ products play in building such a system? For this week’s letter, we’ll broaden the question – what do vendors do with library patron data and what say do libraries have in the matter?

Patron data is as valuable to vendors as it is to libraries. To vendors, patron data can be used to refine existing services while building newer services based on patron needs and behaviors. The various recommendation systems in several library products are powered partially by patron borrowing activity, for example. However, vendors do more than use patron data for their own products and services: many share patron data with other service providers and third-party businesses for a variety of reasons. For example, some vendors run their applications on commercial cloud servers, which could mean storing or transferring patron data to and from these servers. Depending on the agreement between the vendor and the commercial cloud service, the service might also have access to the data for performance tracking and analysis purposes.

How do you find out what vendors are doing with your patron data? One of the first places to look is their privacy policy. Like libraries, vendors too should inform patrons how they are handling patron data. The library should have a separate privacy policy that indicates how library data is shared with vendors, but vendors also need a privacy policy that clearly communicates to patrons using the vendor service how the data is handled by the vendor, including any sharing of data with service providers or other third parties. LexisNexis’ privacy policy provides some of this information in their How We Use Your Information and Sharing of Your Information sections (which, BTW, you should read if you do use LexisNexis!).

If you can’t find the information you need in the privacy policy, the vendor contract might have some information regarding the collection, use, and sharing of patron data by the vendor. The vendor contract can also serve another purpose, particularly when you are at the contract negotiation or contract renewal stages. The contract can be a good place to lay out expectations to the vendor as to what level of data collection and sharing is permissible. Some data sharing is unavoidable or necessary, such as using aggregated patron data for analyzing infrastructure performance, so if you come to the negotiation table with a hardline “no reuse or sharing with third parties” position, you will most likely be making some compromises. This is also a good place to bring up the question about “selling” vs “sharing” data with service providers – while some vendors state in their privacy policy that they do not sell patron data, they might not mention anything about sharing it with others. Setting expectations and requirements at the point of negotiations or renewal can mitigate any surprises surrounding data use and sharing down the road for all parties involved.

Having the discussion about patron data use and sharing by the vendor will not only allow you to find out what exactly happens to your patrons’ data when they use vendor products, but it also opens up the opportunity for your library to introduce language in the contract that will protect your patrons’ data. You can do this through line edits, or through a contract addendum that has been vetted by your local legal team. Before going to the negotiation table with your proposed changes and requests, you will need to determine which points you will be willing to compromise on, and which points are dealbreakers. Ideally negotiations provide a workable outcome for all, but in reality, sometimes the best outcome for your patrons and staff is to leave the negotiations. Not giving a vendor your library’s business is a valid option – an option that could signal to the vendor that some of their practices need to change if enough libraries choose to follow suit.

CRMS 101

Welcome to this week’s Tip of the Hat! Today we have a brief overview of a tool that is becoming popular in libraries – the customer relationship management system [CRMS] – and how this new player in the library field affects patron privacy. While some folks know about CRMS, there might be others who are not exactly sure what they are, and what they have to do with libraries. Below is a “101”-type guide to help folks get up to speed on the ongoing conversation.


What is a CRMS?

A customer relationship management system [CRMS] manages an organization’s interactions with customers with the goal of growing and maintaining customer relationships with the organization. CRMS products have been used in other fields outside of librarianship for decades, mostly in commercial businesses, but the increased importance of data analysis and improving customer experiences has led to wider adoption of CRMS products in other fields, including libraries.

What is a CRMS used for?

Many organizations use CRMS products to track various communications with customers (email, social media, phone, etc.) as well as data about a customer’s interests, demographics, and other data that can be used for data analysis. This analysis is then used to improve and customize the user experience (targeted marketing, personal recommendations, invitations, etc.) as well as to make business decisions surrounding products, services, and organization-customer relations. This analysis can also be used to create user profiles or for market segmentation research.

What are some examples of CRMS?

There are many proprietary and open source options, though Salesforce is one of the most recognized CRM companies in the overall field. In the library world, several library vendors sell standalone CRMS products, such as OrangeBoy’s Savannah. Other library vendors have started offering products that integrate the CRMS into the Integrated Library System [ILS]. OCLC’s WISE is one such example of this integration, while other library vendors plan to release their versions in the near future.

What data is collected in a CRMS?

A CRMS is capable of collecting a large quantity of very detailed data about a customer. Types of patron data that can be collected with a library CRMS include (but are not limited to):

  • Demographic information
  • Circulation information like total checkouts, types of materials checked out, and physical location of checkouts
  • Public computer reservation information
  • Electronic resource usage
  • Program attendance

In addition to library supplied data, other data sets from external sources can be imported into the CRMS ranging from US Census data to open data sets from cities and other organizations that could include other demographic information by geographical area (such as zip code) or by other indicators.

How is patron privacy impacted by CRMS?

The amount of information that can be collected by a CRMS is akin to the type of information collected by commercial companies who sell services and products. By creating a user profile, the company can use that information to personalize that customer’s experience and interactions with the company, with the ultimate goal of creating and maintaining return customers. Traditionally libraries do collect and store some of the same information that CRMS products collect; however, it is usually not stored in one central database. Creating a profile of a patron’s use of the library leaves both the library and the patron at high risk for harm on both a personal and organizational level. This user profile is subject to unauthorized access by library staff, data breaches and leaks, or intentional misuse by staff or by the vendor that is hosting the system. This user profile can also be subject to a judicial subpoena, which puts patrons who are part of vulnerable populations at higher risk for personal harm if the information is collected and stored in the CRMS.

Further reading on the conflict between the CRMS, data collection, and library privacy:

What can we do to mitigate privacy risks if we use a CRMS?

If your library chooses to use a CRMS:

  • Limit the type and amount of patron data collected by the system. For data that is collected and stored in the CRMS, consider de-identification methods, such as aggregation, obfuscation, and truncation
  • Perform risk assessments to gauge the level of potential harm connected to collecting and storing certain types of patron information as well as matching up patron information with imported data sets from external sources
  • Negotiate at the contract signing or renewal stage with the vendor regarding privacy and security policies and standards around the collection, storage, access, and deletion/retention of patron data, as well as who is responsible for what in case there is a data breach
  • Perform regular privacy and security audits for both the library and the vendor
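For the first item in the list above, here is a minimal sketch of what truncation, obfuscation, and aggregation could look like in practice. The field names, sample records, and secret key are all hypothetical, and this is not how any particular CRMS product works. Note also that replacing an ID with a keyed hash only pseudonymizes the record; it does not make the data anonymous on its own.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key; in practice it would be stored outside the CRMS.
SECRET_KEY = b"replace-with-a-locally-stored-secret"


def truncate_zip(zip_code: str, keep: int = 3) -> str:
    """Truncation: keep only the first `keep` digits of a zip code."""
    return zip_code[:keep]


def obfuscate_id(patron_id: str, key: bytes = SECRET_KEY) -> str:
    """Obfuscation: replace a patron ID with a keyed hash (a pseudonym)."""
    return hmac.new(key, patron_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate_checkouts(records: list[dict]) -> dict[str, int]:
    """Aggregation: count checkouts per truncated zip code, dropping patron IDs."""
    return dict(Counter(truncate_zip(record["zip"]) for record in records))


# Made-up sample checkout records for illustration.
records = [
    {"patron_id": "21234000000001", "zip": "98101"},
    {"patron_id": "21234000000002", "zip": "98104"},
    {"patron_id": "21234000000003", "zip": "95060"},
]

summary = aggregate_checkouts(records)
print(summary)
```

Small groups in an aggregated summary can still point to individual patrons, so counts like these are worth reviewing as part of the risk assessment before they are stored or shared.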

We hope that you find this guide useful! Please feel free to forward or pass along the guide in your organizations if you are having conversations about CRMS adoption or implementation. LDH can also help you through the decision, negotiation, or implementation processes – contact us to learn more!

Data Analytics @ Your Library: An Executive Summary of the Santa Cruz Report

Welcome to this week’s Tip of the Hat!

Last week was a busy week in the world of library privacy, and not just because there were a variety of privacy-related presentations and events at ALA Annual. While folks were wrapping up and traveling back from DC, a Santa Cruz county civil grand jury published a report that will shape the library and vendor data analytics landscapes. Running short on time due to ALA travel last week and this week’s holiday schedule? Here’s an executive summary so you can get a head start on thinking about how to approach the report at your own organizations.


What was the report about?

The report, “Patron Privacy at Santa Cruz Public Libraries: Trust and Transparency in the Age of Data Analytics,” is the result of an investigation by the Civil Grand Jury about the Santa Cruz Public Library’s (SCPL) use of a commercial analytics program, Gale’s Analytics on Demand (AoD), to analyze patron data.

Who wrote the report?

The report was written by the Civil Grand Jury. The county of Santa Cruz has a Civil Grand Jury comprised of 19 private citizens. One of their roles in the county is to examine and investigate government operations and to recommend actions to improve said operations. The Consolidated Final Report for 2018-2019 lists other investigations undertaken by the Jury, including detention facilities and public defense contracts.

What did the report find?

The report found that the Santa Cruz Public Library did not adequately inform patrons about the use of AoD at SCPL or do a thorough privacy risk analysis on using AoD at SCPL. The major themes in the Grand Jury’s findings are:

  • Mismatch between use of AoD and SCPL confidentiality and privacy policy
  • Lack of communications between SCPL and library patrons regarding use of data analytics, including giving the patrons the option to give consent to the library to use their data for data analytic use
  • Failure on SCPL’s part to thoroughly investigate the risks, effectiveness, and best practices in using data analytics in processing patron data
  • Lack of contract language with the vendor that protects the interest of both SCPL and library patrons

What are the recommendations?

The Grand Jury recommendations to SCPL include:

  • Updating the SCPL confidentiality and privacy policy to reflect the use of data analytic tools to process patron data
  • Create a system that allows patrons to consent to having their data used for data analytics
  • Follow professional and industry best practices around patron privacy
  • Create a data privacy officer role whose responsibility will be implementing and enforcing the privacy policy
  • Review and amend vendor contracts to protect the interests of both the library and library patrons

What’s next?

ALA will most likely release a response to the report in the near future; however, the next major updates will most likely come when the library submits its responses to the Grand Jury’s findings and recommendations later in the year.

We use analytics software – based on this report, what do we do?

The recommendations provide a good outline of where to begin. If you need a place to start, here are four key actions to focus on:

  • Review privacy policies – does your policy clearly tell patrons that you use analytics to process patron data?
  • Review current patron communications – how are you communicating with patrons about how the library uses their data? Can your patrons give consent to having their data processed by analytics software? Is there a way they can opt-out?
  • Review your privacy practices – Go through the ALA Library Privacy Checklists and make a plan of action for any areas in the Priority 1 Actions sections of the lists that your organization has not implemented
  • Review vendor contracts – pay close attention to areas in which contracts can be amended to shore up patron privacy protections including reflecting local and state regulations surrounding patron data and responsibilities of the vendor in the event of a data breach.

Feel free to forward this summary to folks in your organization! We highly recommend giving the full report a read, but we recognize that time is scarce during the summer season, so we hope that the above summary can help you start conversations at your organization. LDH will keep you updated as the official responses from SCPL, ALA, and others are published in the coming months.

As a reminder, LDH Consulting Services can assist your organization in reviewing privacy policies and practices in addition to risk assessments, staff training, and data inventories. If you have any questions, or would like to discuss how LDH can help your organization’s privacy practices, give us a ping!

To Renew Or Not To Renew

Welcome to this week’s Tip of the Hat! We at LDH are furiously getting ready for ALA Annual next week, and the Executive Assistant is bummed that she was not able to register for the conference. It appears that the only cats that are allowed at Annual are Baker and Taylor. Worry not, for the Executive Assistant has lined up someone to go in her place. You will get a chance to meet this new team member if you are heading to Annual. Stay tuned…

In the meantime, it’s Monday, and Mondays are the best days to talk contract renewals, right?

(Right?)

Last week Samantha Lee wrote about the upcoming changes to Lynda.com’s authentication process for library patrons, which would require patrons to either create or link a LinkedIn account to use their library’s Lynda.com subscription. Lee details the various issues surrounding patron privacy with this upcoming change:

LyndaLibrary had access to library card numbers for verification purposes. With the proposed change to require patrons to get LinkedIn accounts to access the Lynda resources, LinkedIn Learning would have access to more personally identifiable information than they would have as LyndaLibrary. To get a LinkedIn account, patrons would need to provide an email address and their first and last names. This is more PII than other library e-content vendors would require (OverDrive requires library card numbers only, Hoopla requires a library card and email). After a user creates an account, they are prompted to then add employment history and import their email contacts – under the presumption to help users expand their professional network. So LinkedIn would not only have patron information, but also information for others who did not agree to use its platform. [emphasis added]

In the post, Lee pointed out that several libraries have already decided not to renew their Lynda subscriptions. In the comments section, two commenters related their less-than-positive experiences in asking their vendor representative about the proposed changes, and one commenter, a vendor representative, explained why the changes were being made.

This recent change highlights the long-standing tension between libraries and vendors regarding patron data. As Lee mentioned, other vendors do use some patron data to verify that the patron is with that particular library and can use the service. This tension is complicated by a number of factors, from the administrative (what data is being collected and why) to the technical (what data is needed for the service to function). Cloud-based applications add another layer of complicating factors, particularly if third-party contractors (sub-contractors) are involved in providing the infrastructure or other services for the application, which then increases the number of potential people that have access to patron data.

Some libraries use the contract negotiations and/or renewal phases to include contract clauses holding vendors to privacy and confidentiality policies set by the library, along with other privacy and security requirements surrounding patron data. Other times vendors work with libraries to create privacy-driven development and practices, closely aligning their applications to the standards of privacy laid out by libraries. And then there are times when vendors are proactive in creating a service or application with patron privacy in mind!

The Lynda.com change seems to be following the usual conflict pattern if you read through the comments – libraries pushing vendors for changes, vendors pushing back on libraries about why the changes are necessary. Sometimes, though, one party leaves the negotiations in hopes of gaining an advantage over the other party. This is not without risk. Considering that many library patrons use Lynda.com for professional development and to learn much-valued technical skills, some libraries might hesitate to leave the Lynda.com contract on the table. Nonetheless, some libraries are taking that risk in hopes that if there is a critical mass of unsigned contract renewals, then the vendor would have to respond to their requests. As Lee states, “If LinkedIn Learning cannot take our profession’s concerns seriously… then we can and will take our business elsewhere. Maybe then they will be willing to adopt the changes we require to protect patron privacy.” There is already some momentum for this strategy as mentioned by Lee and the commenters, and perhaps we might observe a critical mass sooner rather than later.

AI, Read The Privacy Policy For Me

Welcome to this week’s Tip of the Hat! Last week we took a deep dive into ALA’s privacy policy to figure out where our information was going if we agreed to receive information from exhibitors while registering for the Annual Conference.

[Which, ICYMI, LDH will be exhibiting at Annual! Let us know if you want to meet up and talk about all things privacy and libraries!]

As we encountered last week, privacy policies are not the most exciting documents to read. In fact, you can test out this theory by checking out the impressive list of electronic resource vendor privacy policies generated by the folks at York University (the code is available on GitHub). Try picking out a couple of privacy policies and read them from start to finish now. We’ll be here waiting for you.

…..

……. all done?

Chances are, you probably found yourself skimming the policies if you made it all the way to the bottom. If so, you’re not alone – studies have shown that the majority of folks do not read these policies, which could lead to surprises and confusion when your data is collected, shared, or breached. The fact is that it takes a long time to get through long, detailed documents – a recent study showed that many privacy policies require a high reading level and up to around a half hour to read. What’s a busy person to do?

One way some folks are addressing this is to let the machines do the reading for you. The last few years have seen several tools that use AI and machine learning (ML) to analyze privacy policies, picking out the most important parts that users should know. For example, the Usable Privacy Policy Project, an NSF-funded project, used a collection of 115 privacy policies annotated by law students to train machine classifiers to annotate over 7000 privacy policies. Another group of researchers used the same 115 annotated privacy policies for ML training, creating two different tools for AI-generated analysis of policies. The first is Polisis, which creates a Sankey diagram based on the AI’s analysis of the policy, while the second is Pribot, a chatbot that allows users to explore and ask questions about specific privacy policies.
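To give a rough feel for what “annotating” a policy means, here is a toy sketch in Python. It is not how the Usable Privacy Policy Project, Polisis, or Pribot work: those tools use classifiers trained on the annotated policies, while this stand-in just looks for keywords. The category names and keyword lists are made up for illustration and are not the categories those projects use.

```python
import re
from collections import defaultdict

# Hypothetical categories and keywords, invented for this sketch.
# A real tool would learn these distinctions from annotated policies.
CATEGORY_KEYWORDS = {
    "collection": ("collect", "gather", "receive"),
    "third-party sharing": ("share", "third part", "disclose"),
    "retention": ("retain", "store", "keep"),
    "user choice": ("opt out", "opt-out", "unsubscribe", "consent"),
}


def split_sentences(text):
    """Split text into sentences on ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def label_sentence(sentence):
    """Return every category whose keywords appear in the sentence."""
    lowered = sentence.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def annotate_policy(text):
    """Group a policy's sentences under each category they match."""
    by_category = defaultdict(list)
    for sentence in split_sentences(text):
        for category in label_sentence(sentence):
            by_category[category].append(sentence)
    return dict(by_category)


if __name__ == "__main__":
    sample = (
        "We collect your email address. "
        "We may share data with third parties. "
        "You can opt out at any time."
    )
    for category, sentences in annotate_policy(sample).items():
        print(category, "->", sentences)
```

Keyword matching like this misses anything phrased in an unexpected way, which is exactly why the real projects rely on human-annotated training data instead.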

Each AI privacy analysis tool takes a different approach in displaying the results to the end users. Let’s use OverDrive’s privacy policy as our test policy. [1] The Usable Privacy site uses different colored fonts to indicate which parts of the policy belong to 10 different categories. The site also directs us to another policy analysis of OverDrive’s Privacy Policy for Children. Users can click on a category to only show the colored sections of the policy, or to exclude it.

A screenshot of Usable Privacy's analysis of the OverDrive privacy policy.

For Polisis’ analysis of OverDrive’s policy, the site takes the same ten categories and creates separate visualizations for most of them. Users can click on a stream to highlight it in the diagram – for example, showing what information is shared and for what reason.

A screenshot of Polisis' analysis of OverDrive's privacy policy.

We are still a ways away from widespread adoption of AI-annotated privacy policies; however, the possibilities are promising. With GDPR, CCPA, and other upcoming privacy regulations, AI and ML could help end users keep up with all the changes in policies, as well as dig through mountains of text in a fraction of the time it would take to read all of it manually. It will still take a considerable human role in training the AI and supervising the ML to ensure proper analysis, though, as well as human labor in creating effective and accessible interfaces. Perhaps one day there could be an API service that can have AI analyze the privacy policies listed on the York University page.

[1] Both sites are analyzing older versions of OverDrive’s privacy policy. The most up-to-date privacy policy is at https://company.cdn.overdrive.com/policies/privacy-policy.htm.

Monday Mystery: Conference Information Sharing

Welcome to this week’s Tip of the Hat! It seems that spring has just arrived for many of us in the US; however, the calendar tells us that we are only weeks away from the ALA Annual Conference in Washington DC in June. Our Executive Assistant was going through the PDF registration form the other day and noticed the following question:

A text box with the following text: "Attendees may receive exciting advance information from exhibitors like invitations, contests and other hot news. COUNT ME IN!" Yes/No checkboxes are next to the last sentence.

The above question on the registration form asks if the person (or in this case, cat) wants to receive information from conference exhibitors. The Executive Assistant paused. What exactly does checking the “Yes” box entail? Since we’re in the data privacy business, this is a perfect Monday Mystery for us to investigate.

After a quick search of the conference website, we land on ALA’s Privacy Policy at http://www.ala.org/privacypolicy. If you haven’t spent time with a privacy policy, it can seem daunting or downright boring. Let’s walk through this policy to find out what happens when we check the “Yes” box.

The “Information Collection & Use” section lays out what information is collected and when. They define “personal data” as information that can be used to identify someone: name, email, address, etc. The section breaks down some common actions and situations when ALA collects data, including event registration. We already guessed that ALA was collecting our information for event registration purposes, but we need to dig deeper into the policy to answer our question.

We then find a section labeled “Information Sharing” in which we might find our answer! The section lists in detail who ALA shares information with, including the type of data and the circumstances under which the data is shared. “Services Providers” seems promising – that is, until we get into the details. The data listed as shared with service providers is mostly technical data – location data, log files, and cookies – and the section says nothing about sharing information so that we can receive updates from exhibitors. Back to square one.

Moving down the policy, we arrive at the “Your Rights and Choices Regarding Your Information” section, which lists the following right:

“Object to processing – You have the right to object to your Personal Data used in the following manners: (a) processing based on legitimate interests or the performance of a task in the public interest/exercise of official authority (including profiling); (b) direct marketing (including profiling); and (c) processing for purposes of scientific/historical research and statistics;”

Okay, we have the right to ask ALA not to use our personal data for marketing purposes. That’s a very important right to have, though that doesn’t exactly solve the mystery of what happens when we click on the “Yes” box.

This, readers, is where we are going to cheat in this investigation. It’s time to put our exhibitor hat on!

Exhibitors at major conferences are usually offered some form of registrant/member list as a means to promote their business before the conference. ALA does the same with Annual, and exhibitors can rent attendee lists. From https://2019.alaannual.org/list-rental, exhibitors have the option to “[t]arget buyers by industry segment, demographic profile or geographic area.” So, not just names and emails are shared!

On the exhibitor side, having that information would allow for targeted marketing – instead of blasting the entire attendee list, exhibitors can reach out to those most likely to be receptive to their service or product. On the attendee side, some want to have this type of targeted marketing to plan their time at the conference efficiently, or to do homework before hitting the exhibit hall. For other attendees, though, it means more emails that they’ll just delete or unsubscribe from. And then there’s the question about what happens to that attendee data after the conference…

In the end, we still have a bit of a mystery on our hands. The only reason we got this far in our little Monday Mystery investigation is that LDH has been bombarded with emails trying to sell us attendee lists, which tipped us off to start looking at the exhibitor section of the conference site. Your average conference attendee wouldn’t have that information and would be left scratching their head, since the registration form does not explain what information is shared on these attendee lists. While we don’t have a clear answer to end today’s investigation, we hope that this gives our readers a little reminder to do some research the next time they are asked a similar question on a registration form.

Speaking of ALA Annual, LDH Consulting Services is excited to announce that we will be exhibiting in DC in booth 844! Many thanks to Equinox Open Library Initiative for making exhibiting at ALA Annual possible for LDH. Give us a ping if you will be at Annual and would like to talk more about what LDH can do for your organization.