On April 5, 2023, the Mitigating Unauthorized Scraping Alliance (MUSA) partnered with the International Association of Privacy Professionals (IAPP) to organize the panel “Web Scraping: Understanding Compliance Risks and Hidden Costs” for the IAPP’s Global Privacy Summit 2023. The panel featured perspectives from Eric Null, the Director of the Privacy & Data Project at the Center for Democracy and Technology (CDT); Chelsea Reckell, an attorney at Venable LLP; and Lindsay Vogel, the Lead U.S. Counsel for Privacy at Bumble. It was moderated by Cobun Zweifel-Keegan, the Managing Director for IAPP D.C.

The discussion opened with an analysis of the complex legal and ethical environment around unauthorized data scraping. Speakers outlined the challenges of combating the issue, given that web scraping is not currently clearly addressed or defined under the law. The panel highlighted some of the avenues of redress that have been tested, particularly the Computer Fraud and Abuse Act, the principal federal anti-hacking statute, along with the limitations of those avenues. Speakers identified the need for a civil remedy and a legal framework that draws clear distinctions around liability for unauthorized scraping, particularly of public data. Current U.S. privacy laws contain exceptions for publicly available data that leave platforms with no recourse to pursue threat actors. Speakers therefore stressed that defining the limits on how “publicly available” data may be used is a necessary first step in improving current privacy legislation.
The panel also highlighted distinctions between authorized and unauthorized scraping and explored the privacy risks to users, particularly for sensitive and personally identifiable information. Speakers cited Clearview AI and Weight Watchers’ Kurbo as notable examples of personal images and information being collected without authorization, and noted the difficulty of removing information once it has been scraped and packaged into datasets. Enforcement options like cease and desist letters and litigation can be challenging to pursue if the actors cannot be identified, or if the unauthorized data is already being used to train an algorithm or has been widely dispersed through other mechanisms. The panel therefore emphasized the importance of coordinated action to prevent unauthorized scraping, whether through technical measures, such as modifying APIs, or contractual ones, such as updating terms of service. Panelists urged attendees to turn to the Industry Practices to Mitigate Unauthorized Scraping document as a useful resource.
Finally, the panel emphasized the need to start conversations around unauthorized scraping and the importance of industry collaboration. Panelists highlighted that scraping affects many companies and that involving regulators in conversation with industry members on a global scale is crucial for generating meaningful action. The panel raised a number of unresolved questions, including the boundaries of permissible scraping for First Amendment-protected activities like journalism or research, and the absence of a legal distinction between authorization and authentication in current law, suggesting that there is still much work to be done. What is clear, however, is that continued dialogue about unauthorized scraping is necessary to tackle this complex issue.