David Newberger

February Newsletter

Welcome to the Mitigating Unauthorized Scraping Alliance newsletter, where we highlight topics of interest related to unauthorized data scraping. Unauthorized data scraping is the automated collection of user data at scale that violates a platform’s Terms of Service.

MUSA International Data Privacy Day 2023 Event: The State of Unauthorized Scraping

In observance of International Data Privacy Day, the Mitigating Unauthorized Scraping Alliance (MUSA) hosted an event on January 31, 2023, featuring industry, legal, and academic experts who examined the landscape and impacts of unauthorized data scraping.

As threat actors seek to collect public personal data on a large scale for their own gain, how best to regulate data scraping and enforce against unauthorized scraping has become a critical challenge for industry, government, and policy makers. Unauthorized data scraping and the unauthorized use of scraped data have far-reaching impacts on both users and industry that challenge user privacy expectations and can lead to harmful actions, such as spamming, fraudulent communication, or identity theft. In recent years, high-profile court cases, regulatory fines, and newly introduced legislation involving data scraping have drawn media coverage and public attention.

You can watch a replay of the event below:

The State of Unauthorized Scraping and Its Impact on Users & Industry (full event)

Discussion Panel #1: The State of Unauthorized Scraping Enforcement

Discussion Panel #2: The Impact of Unauthorized Scraping on Users

Discussion Panel #3: The Impact of Unauthorized Scraping on Industry

Building the Conversation Around Unauthorized Scraping

On January 31, 2023, the Mitigating Unauthorized Scraping Alliance (MUSA) held its inaugural public event in observance of International Data Privacy Day. Over 175 people, both in-person and online, attended this three-panel discussion and networking opportunity, entitled “The State of Unauthorized Scraping and Its Impacts on Users and Industry.” The event featured perspectives from leading academic, legal, and industry representatives who discussed the impacts of unauthorized scraping on users and industry as well as the legal and regulatory landscape.

The first panel of the day focused on current and prospective laws and regulations that protect publicly available data and could be used to address unauthorized scraping. Panelists pointed out that we should challenge the assumption that we cannot have laws that regulate scraping and highlighted that current hacking and privacy laws such as the Computer Fraud and Abuse Act (CFAA) do not address unauthorized scraping or the technical aspects of authentication, authorization, and access control. In their discussion, panelists agreed that there is a need for more comprehensive laws and enforcement mechanisms that go beyond the CFAA and other hacking statutes to better address unauthorized scraping.

The next panel kicked off with a discussion on how to define an unauthorized scraping incident. Notably, speakers stressed that unauthorized scraping incidents are not data breaches. With this clarification in mind, panelists dug into the varied impacts of unauthorized scraping, including the loss of users’ ability to change or delete data once it has been scraped and the loss of trust in platforms. For example, panelists highlighted how threat actors can scam, stalk, or blackmail individuals using information scraped from online dating and social media sites. In addition, panelists highlighted that companies need to implement both legal and technical solutions to mitigate the impacts of unauthorized scraping, particularly because laws often cannot evolve fast enough to keep up with technological innovation. However, speakers expressed that technological prevention measures alone are not enough to deter unauthorized scrapers. 

The second panel also discussed the need for balancing user data protection and privacy expectations with public research interests. Speakers highlighted privacy laws that try to address these issues, like the Digital Services Act (DSA) in the EU and the proposed Platform Accountability and Transparency Act (PATA) in the US, which includes a protection for researchers using scraping to gain access to data. Panelists emphasized the importance of building understanding around the potential misuses of public data among regulatory bodies in the US and abroad and strengthening regulatory enforcement capacities to combat unauthorized scraping.

The final panel centered around the growing need for fostering dialogue on unauthorized scraping and building a unified front to combat data misuse. By collaborating through organizations like MUSA to create industry partnerships and share anti-scraping practices, companies of all sizes can work to mitigate unauthorized scraping based on their needs and capacities. As the third panel highlighted, there is no singular solution to preventing or combating unauthorized scraping. However, as the market for unauthorized scraped data continues to grow, regulatory action is needed to combat threat actors. By building awareness around the impact of unauthorized scraping and fostering public-private collaboration to ensure that there is an expectation of consequence for unauthorized scrapers, MUSA can protect data from unauthorized scraping and misuse.

The Mitigating Unauthorized Scraping Alliance will continue to inspire public-private dialogue and increase awareness around unauthorized scraping by engaging with policy makers and the industry, legal, media, and academic communities. MUSA is currently working with members to align on industry practices to publish in March 2023. MUSA will hold additional public conversations and develop opportunities for collaboration to combat unauthorized scraping.

Panel Summary:

Panel 1 “The State of Unauthorized Scraping Enforcement”

Moderated by Julia Tama (Venable LLP) with panelists Timothy Edgar (Brown University, Harvard Law School), Megan Iorio (EPIC), Chelsea Reckell (Venable LLP), and Cobun Zweifel-Keegan (IAPP DC).

Panel 2 “The Impact of Unauthorized Scraping on Users”

Moderated by Tejas Narechania (UC Berkeley School of Law) with panelists Brandie Nonnecke (CITRIS Policy Lab), Calli Schroeder (EPIC), Hannah Shimko (ODA), and Sarah Wight (LinkedIn). 

Panel 3 “The Impact of Unauthorized Scraping on Industry”

Moderated by Hemu Nigam (Venable LLP) with panelists Mike Clark (Meta), Doug Hudson (Etsy), and Veronica Torres (Jumio Corporation).

You can view the recording of the full event below:

2023 Data Privacy Day Event: The State of Unauthorized Scraping and Its Impact on Users & Industry

Safeguarding User Data: Building a United Front Against Unauthorized Scraping

The Mitigating Unauthorized Scraping Alliance (MUSA) sets out an explainer on unauthorized web scraping, as well as its impact on industry, individuals, and privacy.

The Problem Space

User data has become a valuable commodity which threat actors seek and platforms protect. Threat actors have turned to automated mass collection of user data to create and sell datasets, replicate existing legitimate webpages, or exploit information for purposes such as stalking or surveillance. In order to raise awareness about the importance of safeguarding data, it is valuable to understand the rise of unauthorized scraping and its impact.

Defining Unauthorized Scraping

‘Authorized scraping’ is the automated collection of data with express permission. ‘Unauthorized scraping’ is the automated collection of data that violates a platform’s Terms of Service. This involves the collection of data that a user shares with other users or that is accessible as a result of a user unwittingly sharing access to their account. Therefore, unauthorized scraping is not considered a breach of a platform’s security protections. The use of unauthorized scraping to access user data creates the possibility of data misuse. Given the threat of unauthorized scraping, it is important to highlight its implications and raise awareness around safeguarding data and user protection.

How Scraped Data is Used

Demand for data that informs marketing, business development, and personal targeting has significantly increased over the past decade and has fueled the growing market for user data. Simultaneously, companies have limited the supply of data by restricting access to it to protect against user data misuse. As a result, there has been an unprecedented rise in the number of unauthorized scraping incidents with negative implications for both companies and users.

Threat actors are motivated to engage in unauthorized scraping for their own personal and financial gain. Some threat actors scrape to create datasets and databases of aggregated scraped user information that can be bought, sold, or posted online by third-party actors for profit. Depending on the nature of the scraped data, it may be possible to facilitate phishing or spamming attacks, plant spyware, or steal credentials to further exploit individuals. Threat actors can also use unauthorized scraped data to create clone sites, which impersonate legitimate webpages.

In addition, they can aggregate scraped data into datasets for sale on data broker websites or for targeted advertising and marketing purposes. Often legitimate businesses or researchers are not aware that the services they rely on use unauthorized scraped data. Threat actors also access user data for political value by using targeted datasets for purposes such as reconnaissance or surveillance. Enemy nation states can also take advantage of unauthorized scraped data for their own gain. It is important to note that not all instances of unauthorized scraping lead to the aforementioned impacts.

The Impact of Unauthorized Scraping

The impacts of unauthorized scraping are far-reaching. Both unauthorized scraping and the subsequent use of the data decrease public trust and threaten industry reputations. They can also lead to system slowdowns, increased costs, and the loss of control over data. For users, unauthorized scraping reduces user control over information and can lead to spamming, fraudulent communication, identity targeting, surveillance, and unexpected disclosures of content intended to be temporary.

Combating Unauthorized Scraping

Currently, there are no industry standards for combating unauthorized scraping. A recent study conducted by NewtonX highlighted that nearly 90% of experts surveyed believe unauthorized scraping prevention is either important or very important, but only 42% of respondents have established strategies to address the practice. To address these gaps, NewtonX concluded that effectively tackling unauthorized scraping requires a collaborative and multi-stakeholder effort. While there is no singular approach to combating unauthorized scraping, there is an array of practices that companies engage in to mitigate it. Consequently, there is a demonstrable need to foster public-private dialogue and to mitigate the current lack of industry-wide collaboration to combat unauthorized scraping.

About The Mitigating Unauthorized Scraping Alliance

The Mitigating Unauthorized Scraping Alliance (MUSA) brings together industry members to address these challenges and offer a unified front against unauthorized scraping and data misuse. MUSA is working with member companies and experts to publish industry-aligned practices for unauthorized scraping mitigation with the goal of making unauthorized scraping more difficult across member platforms, reducing the attack vector for unauthorized scraping threat actors, and serving as a resource for media and policymaker engagement.

MUSA provides insight, knowledge, and expertise to the public on unauthorized scraping by hosting public education events, like the International Data Privacy Day Panel Event on January 31, 2023, and by publishing a monthly newsletter highlighting news and events related to unauthorized scraping.

If you would like to learn more about MUSA and stay informed about unauthorized scraping, visit our website and connect on LinkedIn. If you are interested in joining a diverse group of industries and experts in combating unauthorized scraping and want to get involved with MUSA, contact us or fill out the Membership Inquiry Form.

This article was originally published at techUK.org.

Special Event: “The State of Unauthorized Scraping and its Impact on Users and Industry.”

The regulation and enforcement around unauthorized scraping have become critical issues for industry, government, and policy makers to address as threat actors seek to collect public personal data on a large scale for personal and professional gain. The Mitigating Unauthorized Scraping Alliance will convene leading industry, legal, and academic experts in Washington DC on January 31, 2023 to examine the landscape and impacts of unauthorized data scraping. This free special event will take place in observance of International Data Privacy Day and will also be streamed online. 

Unauthorized data scraping and the unauthorized use of scraped data have far-reaching impacts on both users and industry that challenge user privacy expectations. Unauthorized data scraping can lead to a number of harmful actions, such as spamming, fraudulent communication, or identity theft. In recent years, high-profile court cases, regulatory fines, and newly introduced legislation involving data scraping have drawn media coverage and public attention. 

This three-panel event aims to discuss these challenges and contribute to a growing dialogue on the impacts of unauthorized scraping. The panels will address the legal and regulatory landscape surrounding data scraping enforcement, the impacts of unauthorized data scraping on users and industry, and the industry tools employed for user data protection. 

Register now to join us for an afternoon of engaging conversation and networking with leaders on this new frontier of fighting data misuse.

Networking event to follow immediately after.

Panelists:

Julia Tama (Venable LLP)

Megan Iorio (Electronic Privacy Information Center)

Chelsea Reckell (Venable LLP)

Tejas Narechania (UC Berkeley School of Law)

Hannah Shimko (Online Dating Association)

Calli Schroeder (Electronic Privacy Information Center)

Brandie Nonnecke (CITRIS Policy Lab)

Timothy Edgar (Brown University)

Hemu Nigam (Venable LLP)

Mike Clark (Meta)

Veronica Torres (Jumio Corporation)

Paul Girardi (Cybersecurity Growth Partners)

Doug Hudson (Etsy)

Sarah Wight (LinkedIn)

Cobun Zweifel-Keegan (IAPP)