Facebook Shuts the Gate after the Horse Has Bolted, and Hurts Real Research in the Process

Mark Zuckerberg appears before the US Congress, 10 Apr. 2018. (Image: Prachatai on Flickr, CC-BY-NC-ND)

A public response from leading members of the Internet research community.

In reaction to the Cambridge Analytica controversy, Facebook has recently announced a substantial tightening of access restrictions to the Application Programming Interfaces (APIs) of Facebook, Instagram, and other platforms it owns. Other platform providers are likely to follow suit. The APIs are the means through which third parties access data on these platforms, such as when banking, retail, or even dating apps like Tinder access Facebook data to verify the identity of their users.

While these changes may generate some positive publicity for the company and its beleaguered CEO Mark Zuckerberg, they are likely to compound the real problem, further diminishing transparency and opportunities for independent oversight. The net effect of the new API restrictions is to lock out third parties and consolidate Facebook’s position as the main analytics and advertising broker. Contrary to popular belief, these changes are as much about strengthening Facebook’s business model of data control as they are about actually improving data privacy for users.

Collateral Damage

Up to now, thousands of social scientists around the world have also been using API data from Facebook and other social media platforms to study various online communities and to independently and rigorously investigate the impact of such platforms on our media and society. Such research is undertaken in the public interest and is often overseen by the research ethics review boards of universities and/or by national data protection agencies.

Indeed, research ethics have been a consistent concern in the Internet research community for the past two decades. The leading international community of researchers in the field, the Association of Internet Researchers (AoIR), has regularly published detailed, gradually evolving research ethics guidelines since 2002, paying particularly close attention to the ethics of social media research.

As researchers at leading international research organisations, we are deeply concerned about the collateral impacts of the new API access rules implemented by Facebook and other platforms.

An informal call by one of us, Danish researcher Anja Bechmann, to list the published research that has relied on API data from various platforms resulted in an impressive collection of articles within just a few hours, and documents the societal significance and scale of such research; the future of this important work is now at risk. It is this work — in the public interest, independent, and sometimes critical of Facebook and other social media platforms — that is most likely to suffer from API lock-downs. In spite of, and even because of, the recent troubles with data misuse by malevolent third parties, better API access for legitimate data users is urgently necessary.

Providing API Access for Scholarly Research

Facebook has now also announced a new initiative “to help provide independent, credible research about the role of social media in elections, as well as democracy more generally”. This is delivered in partnership with several philanthropic foundations and overseen by a hand-picked group of scholars, mainly from the United States, who will “define the research agenda” and manage a peer-review process for research project proposals. In principle, we welcome this new engagement with researchers — but we suggest that the engagement model Facebook has chosen for this initiative falls well short of what is required, and fails to provide sufficient support for free and independent scientific research. (Note that the freedom of science is recognised as a human right.)

The narrow terms of reference for this initiative (elections and democracy), the requirement to adhere to a research agenda defined by the selection panel, and the selection process itself inherently exclude a much broader range of research that investigates the impact of Facebook on all aspects of society. Indeed, we have seen the consequences of such selection processes before: in 2013, for example, Twitter selected only six of the more than 1,300 applications for its ‘Twitter Data Grants’ programme. The projects chosen through such processes may well be worthy and important, but they represent only a minuscule subset of all the scholarly research that could and should be conducted, in the public interest, on platforms such as Facebook and Twitter.

By being so selective about which research they actively support, the platforms exclude the critical voices to which they should be paying keen attention; they also tend to privilege US research over broader international collaboration. This creates an unacceptably imbalanced environment for social media research. Facebook’s new initiative is set up in such a way that it will select projects that address known problems in an area known to be problematic; it is unlikely to provide data access to research that addresses yet-unrecognised problems, or research that deals with issues broader than elections and politics.

We therefore argue that the platform providers — and the research advisors they collaborate with — cannot be allowed to position themselves as the gatekeepers for the research that investigates how their platforms are used. Instead, we need far more transparent data access models that clearly articulate to platform users who may be accessing their data, and for what purposes.

Such data access is crucial because independent, critical, public-interest research that is conducted in university contexts and is overseen by ethics review boards can diagnose emergent problems and suggest possible remedies. Locking out such research doesn’t make the problems go away, but simply hides them from view. Had Facebook and Twitter listened to scholarly concerns about undifferentiated third-party data access, political bots, and ‘fake news’, for instance, they could already have acted on these issues well before the political upheavals of 2016.

As some of us have argued for some time, if such research is locked out as a result of the coming API changes, all that will remain is the shallow, commercially focussed analysis provided by the major market research companies that are strategic partners of, or commercially dependent on, Facebook and Twitter. This is neither in the interest of the users nor ultimately good for the platforms themselves. Now more than ever, strong independent research on these platforms is urgently needed: rigorous, ethical research access to platform APIs actually protects users and enhances evidence-based social media literacy.

A Different Approach to Granting API Access

So how should API access be managed to ensure that such independent, critical research in the public interest can be conducted while protecting ordinary users’ privacy? We see four key points here: 1) Straightforward scholarly data access policies; 2) Custom APIs for research purposes; 3) Accept the use of research data repositories; 4) Open and transparent engagement with the research community.

Straightforward Scholarly Data Access Policies

First, social media platforms must provide broad-based data access to scholarly researchers at universities, if those researchers can demonstrate that their work is approved and closely monitored by ethics review boards or national data protection agencies that secure frameworks for compliant research solutions. Platforms, or their intermediaries, cannot pick favourites here: this would almost certainly lead to the exclusion of critical research that points to the problems and not just the benefits of social media. Furthermore, it would create a significant Matthew effect, where only researchers from well-known universities are granted access to API data, because of the legitimisation and PR this would generate for the platform in question.

At a time of heightened concerns about user privacy, substantial API-based access to public communication on these platforms is crucial for scholars precisely because it is only such research that can provide a transparent and independent assessment of the problems that the social media platforms are facing. Unlike the platforms and commercial research companies, universities can be trusted to take an independent perspective and to manage research ethics with great care and nuance: incorrect assessments, overt bias towards the platforms, and unethical engagement with social media data would seriously damage their public standing and destroy future careers.

Custom APIs for Research Purposes

Second, this also requires API-based data access services that are specifically tailored to research rather than commercial uses. At present, most social media platforms insist on providing one undifferentiated API offering for all users, including third-party end-user applications, commercial data analytics companies, and scholarly research. The platforms’ terms of service are typically written with commercial uses in mind, and rarely address use by researchers.

It would be more sensible to provide a dedicated API for research purposes, whose terms of service explicitly require that the research conducted using the data is in the public interest, and for public rather than commercial benefit. This would encourage a focus on the rights of platform users in such research projects, and would in turn provide a strong prohibition against problematic collaborations and data sharing with companies like Cambridge Analytica.

Accept the Use of Research Data Repositories

Third, current API terms of service prevent the sharing and publication of social media data alongside the peer-reviewed scholarly analysis, even in appropriately aggregated and anonymised form and through the safe repositories now available for research data. This is deeply problematic for the cross-checking and replication of research results: much like the platforms themselves, our research, too, becomes a black box whose internal logics cannot be critically examined. This is already a crucial issue for outputs from the internal research teams of the platforms themselves, whose accuracy simply cannot be independently reviewed.

Instead, then, what is necessary is an acknowledgment by the platforms that researchers have a legitimate need to share their data with each other, in a controlled and ethical way, and that the best way to do so is openly and transparently through the carefully controlled data repositories that already serve other academic fields, from genomics to econometrics. Rather than preventing the use of such safe, managed facilities for data sharing through their terms of service, platforms should work with university researchers to determine meaningful, workable approaches to sharing data that protect both the privacy of users and the integrity of the data. We acknowledge Facebook’s new initiative as an important first step on this path — but this must go much further still, and involve the relevant scholarly communities more fully.

Open and Transparent Engagement with the Research Community

Finally, then, this also requires the platforms to engage fully, openly, and transparently with the research community — beyond a lucky few researchers, and especially also beyond the narrow field of computer science that Silicon Valley firms have mainly engaged with. Social media are named so for very good reason, yet the platforms’ history of engaging with publicly funded social science, information, and media and communication studies, and related fields has been patchy to date.

This focus on the technological over social aspects of social media is arguably the root cause of the platforms’ persistent difficulties in understanding and reacting to user concerns, and it needs to be addressed urgently. We are pleased to see some progress on this front in recent announcements and activities — but much more work remains to be done. There are no easy technological remedies to the problems that the platforms are currently experiencing: it is now the task of social science to develop ideas not just for how the platforms might respond but, more importantly, also for how society itself may address these problems.

Time for a Rethink

If the current trend in API policy continues, only two forms of research that draw on ‘big data’ from social media will be available: self-interested, non-public research by the platforms and their commercial partners that is usually not subjected to independent scholarly review on the one hand; and on the other hand, nefarious, big-data social media intelligence and influence operations by unscrupulous actors who have long since learnt how to bypass any API limitations by exploiting technical loopholes and tricking users into weakening their own privacy protections.

In that scenario, the problems with the major social media platforms — including how they are used and abused — will not go away, but the independent, critical, public-interest scholarly research that alerts society to these problems will be severely hampered. To prevent this, we need a considerable reorganisation of the relationships between platforms and academic researchers — if necessary facilitated by relevant legislative and regulatory frameworks.

For better or for worse, social media are now a fundamental part of society: they are how many of us follow the news, socialise, consume entertainment, engage in politics and activism, teach and learn, fall in and out of love, and so much more. Facebook, Twitter, and other platforms may come and go over time, but it is unlikely that social media’s role in society will diminish any time soon.

Only strong, independent, data-enabled scholarly research can help society understand how these platforms are being used for these and other purposes. Such research is necessary both so that we can each make informed choices about our own social media use and the role we want these technologies to play in our everyday personal and professional lives, and so that we can decide collectively, through our democratic processes, how the platforms should be held to account. The platforms are now the principal gateway not only to social networking but also to all research into social networking, and that gateway must be kept open for all independent, public-interest researchers around the world, as long as they adhere to the strict ethical standards of scholarly research.

(Would you like to sign this call? Please add your name and affiliation here.)

Signatories (at the time of publication)

Professor Axel Bruns
Digital Media Research Centre, Queensland University of Technology
President, Association of Internet Researchers

Research Director Anja Bechmann
DATALAB & Fellow at Aarhus Institute of Advanced Studies, Aarhus University
Co-chair, Association of Internet Researchers’ Ethics Working Group and project, “Internet Research Ethics 3.0”

Professor Jean Burgess
Professor of Digital Media and Director, Digital Media Research Centre
Queensland University of Technology

Professor Andrew Chadwick
Centre for Research in Communication and Culture
Loughborough University

Professor Lynn Schofield Clark
Professor, Chair, and Director, Estlow Center for Journalism and New Media, University of Denver
Affiliate Professor, University of Copenhagen

James H. Quello Professor William H. Dutton
Director of the Quello Center for Media and Information Policy
Michigan State University

Dr. Charles M. Ess
Professor in Media Studies; former Director, Centre for Research in Media Innovation, University of Oslo
Co-chair, Association of Internet Researchers’ Ethics Working Group and project, “Internet Research Ethics 3.0”
Past President, Association of Internet Researchers

Professor Anatoliy Gruzd
Canada Research Chair in Social Media Data Stewardship
Director of Research, Social Media Lab
Associate Professor, Ted Rogers School of Management, Toronto Metropolitan University, Canada

Professor Susan Halford
Executive Director, Web Science Institute
University of Southampton, UK

Dr. Alfred Hermida
Associate Professor and Director, School of Journalism, University of British Columbia
The Conversation Canada

Professor Jeanette Hofmann
Berlin Social Science Center
Alexander von Humboldt Institute for Internet and Society
Weizenbaum Institute for the Networked Society

Professor Phil Howard
Director, Oxford Internet Institute, Oxford University

UIC Distinguished Professor Steve Jones
Professor of Communication, University of Illinois at Chicago
Co-Founder, Association of Internet Researchers

Professor Christian Katzenbach
Senior Researcher, Alexander von Humboldt Institute for Internet and Society, Berlin
Interim Professor, Institute for Media and Communication Studies, Freie Universität Berlin

Assistant Professor Hai Liang
School of Journalism and Communication
The Chinese University of Hong Kong

Dr. Seth C. Lewis
Shirley Papé Chair in Emerging Media, School of Journalism and Communication
University of Oregon

Associate Professor Winson Peng
Chair, Computational Interest Group, International Communication Association
Michigan State University

Dr. Cornelius Puschmann
Senior Researcher
Hans-Bredow-Institute for Media Research

Professor Jack Qiu
School of Journalism and Communication
The Chinese University of Hong Kong

Dr. Kelly Quinn
Clinical Assistant Professor, University of Illinois at Chicago
Treasurer, Association of Internet Researchers

Professor Richard Rogers
Digital Methods Initiative, University of Amsterdam
Academic Director, Netherlands Research School for Media Studies

Associate Professor Luca Rossi
Data Science & Society Lab, IT University of Copenhagen

Professor Adrienne Russell
Mary Laird Wood Professor
Department of Communication, University of Washington

Professor Jennifer Stromer-Galley
Professor and Director, Center for Computational and Data Sciences, Syracuse University
Past President, Association of Internet Researchers

Professor José van Dijck
Distinguished University Professor
Utrecht University, The Netherlands

Dr. Katrin Weller
Senior Researcher, Computational Social Science Department
GESIS — Leibniz Institute for the Social Sciences

Professor Oscar Westlund
Oslo Metropolitan University, Volda University College, University of Gothenburg
Digital Journalism

Professor Jonathan J.H. Zhu
Chair Professor of Computational Social Science
Director of the Centre for Communication Research, City University of Hong Kong

Professor Michael Zimmer
Associate Professor and Director, Center for Information Policy Research, University of Wisconsin-Milwaukee
Co-chair, Association of Internet Researchers’ Ethics Working Group and project, “Internet Research Ethics 3.0”