Social Media Data Stewardship

Interdisciplinary academic research on social media is growing at a rapid pace.

For many researchers, social media data (both user- and system-generated) is a rich source of behavioral data that can reveal how we communicate and interact with each other online and what that might mean for our society as we continue to speed towards an increasingly computer-mediated future. That future holds many promises, but also some perils. In the aftermath of a highly debated Facebook mood experiment in 2014, questions around how industry and academic researchers should handle and use social media data are more relevant than ever.

Currently, there is still something of a Wild West mentality when it comes to the handling and usage of social media data. Many key questions about proper data management processes are still unsettled. For example, what can and can’t you do with social media data? When is it appropriate for a researcher to mention the name of a person in their data set and when is it not? As part of my recent appointment as a Canada Research Chair in Social Media Data Stewardship, I aim to examine these and other questions in detail and to conduct studies that will help to settle some of them.

So what exactly is ‘Social Media Data Stewardship’? It is a new concept that I am proposing that touches on many of the data management processes that social media researchers have to navigate today in relation to collecting, storing, analysing, visualizing, publishing and reusing social media data. A working definition for the concept of ‘Social Media Data Stewardship’ is below.


To understand the new concept of ‘Social Media Data Stewardship’, we must first understand what we mean when we refer to ‘data management processes’. These are processes such as collection, retrieval, re-use, sharing, archiving, preservation, and disposal. They are often grouped under a recently emerged umbrella term, data curation, or under an even broader concept, data stewardship. The key functions of data curation are to “enable data discovery and retrieval, maintain its quality, add value, and provide for reuse over time”. Data stewardship expands the focus of data curation to also include preservation and long-term data management (Lazorchak, 2011).

The Archives, Library and Information Science field has been actively tackling issues of data and information management for decades. The rapid accumulation of scientific research data (primarily in the “hard” sciences) has brought questions about the stewardship of research data to the forefront of the research community, including some recent work by Palmer and by Chao. However, questions about the stewardship of social media data, and the ethical considerations and implications that come with it, remain largely unanswered and understudied.

Defining Social Media Data Stewardship (SMDS)

As a way to offer a common framework to handle social media data and discuss data- and user-driven challenges associated with it, I propose to expand the original notion of data stewardship and apply it to the management of social media data. So here is a working definition:

Social Media Data Stewardship (SMDS) is a set of data- and user-driven principles to guide all aspects of managing social media data including its collection, storage, analysis, publication, reuse, sharing and preservation. 

Social Media Data + Data Stewardship = Social Media Data Stewardship: processes related to all aspects of managing social media data.

In order to study SMDS, we at the Social Media Lab are launching a new 5-year initiative whose goal is to develop a new social media data stewardship framework to inform the future development of digital research infrastructure. As SMDS is a multi-faceted notion, getting a full picture requires studying different stakeholders and their practices:

  1. Data consumers (researchers, policy and decision makers working in the private and public sectors);
  2. Data producers (social media users);
  3. Data intermediaries (social media platforms).

Table 1 below outlines some of the initial research questions and dimensions that we plan to study as part of this initiative. Our hope is that the resulting SMDS framework will allow both data consumers and producers to unlock the full potential of social media data while still considering the ethical implications of using available data.