Conclusion First
In this tutorial, we conducted an exploratory toxicity analysis of a Twitter dataset consisting of 298,172 replies to Donald Trump’s tweet announcing his positive COVID-19 diagnosis. We found that most of the toxicity scores produced by Perspective API are highly correlated with one another. We also found that tweets blocked by Twitter have, on average, higher toxicity scores than (1) tweets that are still publicly available on the platform or (2) tweets posted by suspended or protected accounts. Interestingly, there was no clear connection between posting toxic tweets and an account being suspended by Twitter.
Introduction
The aim of this tutorial is to introduce users to a few analyses that can be performed with Communalytic and a Python library called twarc. This post will also show users how to explore the potential relationship between the toxicity scores of individual tweets and the likelihood that a tweet will still be publicly available days after the original data collection. As this is a tutorial, please note that any findings noted in this post are for illustrative purposes only.
Dataset
On October 2nd, 2020, U.S. President Donald Trump tweeted that he and the First Lady of the United States (FLOTUS), Melania Trump, both tested positive for COVID-19 (see Figure 1). Within seconds, his tweet received thousands of replies on Twitter.
To collect and analyze this Twitter thread, we used Communalytic, a research tool for studying online communities and online discourse. Communalytic can collect and analyze public data from various social media platforms including Reddit, Twitter, and Facebook/Instagram (via CrowdTangle). It uses advanced text and social network analysis techniques to automatically pinpoint toxic and anti-social interactions, identify influencers, map shared interests and the spread of misinformation, and detect signs of possible coordination among seemingly disparate actors. In this case, our dataset consists of 298,172 replies to Donald Trump’s tweet announcing his COVID-19 diagnosis, posted on October 2nd between 6:00am and 12:30pm (ET). Each tweet in the dataset includes a number of metadata attributes provided by the Twitter API, such as who posted it and when, in what language it was written, and how many other users engaged with it (retweets or favorites). (For information about how to collect Twitter data with Communalytic, see our online tutorial here, and for the full list of available metadata elements for a tweet, see the Twitter API documentation.)
Toxicity Analysis with Perspective API
Communalytic uses Google’s Perspective API to calculate and assign toxicity scores. This API relies on machine learning models to score the perceived impact a post might have on a conversation, assigning toxicity scores to each post. Currently, Perspective API can analyze posts in one of 7 languages (English, Spanish, French, German, Portuguese, Italian, and Russian). When you run this analysis in Communalytic, you will need to specify the primary language of the posts in your dataset. Since the majority of posts in our dataset (83%, 247,975 of 298,172) were written in English (as detected by Twitter), we selected English in the drop-down menu in Communalytic. (If the bulk of the posts in your dataset are not in one of these 7 languages, you should not run a toxicity analysis on your data, as the toxicity scores it generates will be highly unreliable. See our online tutorial on how to run a Perspective API toxicity analysis in Communalytic.)
Once tweets in the thread were collected, Communalytic assigned the following toxicity scores/attributes (with values ranging from 0 to 1) to each tweet in the dataset, as calculated by Perspective API:
- Toxicity score represents the degree to which the comment is rude, disrespectful, or unreasonable.
- Severe Toxicity score represents how hateful, aggressive, and disrespectful the comment is.
- Profanity score indicates if swear words or other profane language is used.
- Identity Attack score indicates if a post contains hateful language targeting someone because of their identity.
- Insult score helps to identify insulting or inflammatory posts.
- Threat score represents the degree to which a post displays an intention to inflict pain or violence against an individual or group.
- Sexually Explicit score indicates if a post contains references to sexual acts, body parts, or other lewd content.
- Flirtation score indicates if a post contains language commonly used in pickup lines, compliments regarding appearance, or subtle sexual innuendo.
For more information about how these scores are calculated and evaluated, see the Perspective API documentation.
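If you are curious what these requests look like under the hood, below is a minimal sketch of querying Perspective API directly in Python. (Communalytic handles these calls for you; the API key here is a placeholder, and you would need to request your own from Google.)

```python
# A minimal sketch of requesting toxicity scores from Perspective API directly.
import requests

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # placeholder: request a key from Google
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

ATTRIBUTES = ["TOXICITY", "SEVERE_TOXICITY", "PROFANITY", "IDENTITY_ATTACK",
              "INSULT", "THREAT", "SEXUALLY_EXPLICIT", "FLIRTATION"]

def score_text(text, lang="en"):
    """Return a dict mapping each attribute to its summary score (0 to 1)."""
    payload = {
        "comment": {"text": text},
        "languages": [lang],
        "requestedAttributes": {attr: {} for attr in ATTRIBUTES},
    }
    resp = requests.post(URL, json=payload)
    resp.raise_for_status()
    scores = resp.json()["attributeScores"]
    return {attr: scores[attr]["summaryScore"]["value"] for attr in ATTRIBUTES}

print(score_text("You are a wonderful person."))
```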
Toxicity Scores
To explore the range of toxicity scores in the dataset, we built eight histograms to visualize the distribution of scores for each of the 8 toxicity attributes. From Figure 2, we can see that the values for all eight toxicity attributes are concentrated at the low end of the scale (i.e., the distributions are right-skewed), with the majority of posts having scores of less than 0.5. So, even though we expected a higher level of toxicity in replies to Trump’s tweet due to the highly polarized political environment leading up to the 2020 US Presidential election, this exploratory analysis suggests that the majority of replies in this thread were not as toxic as we expected.
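Histograms like those in Figure 2 can be reproduced with pandas and Matplotlib. In this rough sketch, the file name and column names are assumptions; adjust them to match your Communalytic export.

```python
# A sketch of plotting eight toxicity-score histograms with pandas/Matplotlib.
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical column names; rename to match your Communalytic export.
score_cols = ["toxicity", "severe_toxicity", "profanity", "identity_attack",
              "insult", "threat", "sexually_explicit", "flirtation"]

df = pd.read_csv("replies_with_toxicity.csv")  # hypothetical file name

fig, axes = plt.subplots(2, 4, figsize=(16, 7), sharex=True)
for ax, col in zip(axes.flat, score_cols):
    df[col].plot.hist(ax=ax, bins=20, range=(0, 1))  # scores fall in [0, 1]
    ax.set_title(col)
plt.tight_layout()
plt.show()
```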

While Perspective API calculates 8 different scores representing different attributes of toxicity, some of them might be closely related, for example ‘toxicity’ and ‘severe toxicity’. To find out to what extent the toxicity scores are correlated with each other, we can use a correlation matrix, as shown in Figure 3. For each pair of toxicity scores, the matrix shows a correlation value between 0 and 1, where 0 corresponds to no correlation and 1 corresponds to a perfect correlation. The values are represented in the correlation matrix in shades of red: from light red (values closer to 0) to dark red (values closer to 1).
As shown in Figure 3, five of the toxicity attributes (toxicity, severe toxicity, profanity, identity attack, and insult) are highly correlated with each other (correlation values of 0.7 or higher). The ‘threat’ and ‘sexually explicit’ attributes correlate moderately with the remaining toxicity attributes (correlation values between 0.3 and 0.7).
This means that in general, a high value in any one of these toxicity attributes will correspond to a high value in the other toxicity attributes, except for the ‘flirtation’ toxicity attribute which captures a somewhat different aspect of online exchanges.
For the purposes of our exploratory analysis, we will only examine the ‘toxicity’ attribute, because it can serve as a proxy for the majority of the toxicity attributes provided by Perspective API. (Note: the correlation results are likely dataset/domain specific, so we suggest checking for multicollinearity among the toxicity attributes when analyzing a different dataset.)
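A correlation matrix like Figure 3 can be sketched with pandas and Matplotlib, reusing the df and score_cols variables from the histogram sketch above:

```python
# A sketch of a pairwise correlation matrix for the eight toxicity attributes.
import matplotlib.pyplot as plt

corr = df[score_cols].corr()  # pairwise Pearson correlations

fig, ax = plt.subplots(figsize=(8, 7))
im = ax.imshow(corr, cmap="Reds", vmin=0, vmax=1)  # light red = low, dark red = high
ax.set_xticks(range(len(score_cols)))
ax.set_xticklabels(score_cols, rotation=45, ha="right")
ax.set_yticks(range(len(score_cols)))
ax.set_yticklabels(score_cols)
fig.colorbar(im, ax=ax)
plt.tight_layout()
plt.show()
```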
Checking Tweet Availability After the Fact
To further enrich our dataset, we used a Python library called twarc to collect information about the availability status of each tweet. The purpose of this particular exercise is to identify whether the tweets and users in the dataset are still publicly available or whether they have been removed by Twitter for violating its policies. After running our twarc-based script, a new attribute called tweet_status was added to the dataset with the following possible values:
- user_suspended – the account is suspended by Twitter;
- user_protected – the account’s privacy setting has been changed to ‘protected’;
- user_deleted – the account has been deleted, presumably by the user;
- tweet_blocked – the tweet has been blocked by Twitter;
- tweet_deleted – the tweet has been deleted;
- tweet_ok – the tweet is still available.
This analysis can be done anytime after the original data collection; just keep in mind that as time passes, more tweets/accounts may become unavailable for various reasons other than being blocked/suspended by the platform. In our case, we ran it twelve days after the original data collection.
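The simplified sketch below illustrates the general idea with twarc v1 (it is not our full script, which is linked at the end of this post). The credential values are placeholders; twarc’s hydrate() returns only tweets that are still publicly retrievable, so IDs missing from its output are unavailable for one reason or another. Distinguishing the finer-grained statuses listed above requires per-tweet lookups and inspection of the Twitter API’s error responses, which we omit here for brevity.

```python
# A simplified sketch using twarc v1 to flag which tweets are still retrievable.
import pandas as pd
from twarc import Twarc

# Placeholder credentials: substitute your own Twitter API keys.
t = Twarc("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

df = pd.read_csv("replies_with_toxicity.csv", dtype={"TweetId": str})

# hydrate() yields full tweet objects only for IDs that are still available.
available_ids = {tweet["id_str"] for tweet in t.hydrate(df["TweetId"])}

# Coarse two-way status; the finer categories need per-tweet error codes.
df["tweet_status"] = df["TweetId"].apply(
    lambda tid: "tweet_ok" if tid in available_ids else "unavailable")
print(df["tweet_status"].value_counts())
```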
Table 1 shows the counts of tweets based on their availability status, with the majority of English tweets being still available (96%, 238,272 out of 247,975).
Tweet Status
With the new tweet status information, we can now combine it with the toxicity scores from Perspective API to investigate how likely Twitter is to take action against those who post toxic messages (by either suspending an account or blocking a tweet). In other words, are blocked tweets (including tweets posted by accounts that are now suspended) more toxic on average than tweets that are still available? To examine this, we can look at the average toxicity scores in relation to the availability status of a tweet or account.
The bar chart (Figure 4) shows the average toxicity scores for tweets with different availability statuses as provided by the Twitter API. Tweets that are still available have the lowest average toxicity value of 0.34, while tweets that have been blocked by the platform have an average toxicity value of 0.82, more than twice as large as the values for the other categories. This suggests that Twitter does act on tweets that are especially toxic by blocking them. However, since only a small proportion of tweets (691) have been blocked by the platform, this finding must be taken with some caution.
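A chart like Figure 4 can be approximated with a simple groupby, reusing the annotated df from the availability-check sketch (assuming it has ‘toxicity’ and ‘tweet_status’ columns):

```python
# A sketch of comparing average toxicity across tweet availability statuses.
import matplotlib.pyplot as plt

means = df.groupby("tweet_status")["toxicity"].mean().sort_values()
ax = means.plot.bar(figsize=(8, 5), rot=45)
ax.set_xlabel("Tweet availability status")
ax.set_ylabel("Average toxicity score")
plt.tight_layout()
plt.show()
```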
It is important to note that ANOVA (Analysis of Variance) can be used here to test for differences across means more formally (see examples: here and here; also read this post to learn more about how to handle data that is not normally distributed). However, this more formal analysis is outside the scope of this tutorial.
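For readers who want to try such a test, here is a sketch using scipy; since toxicity scores are bounded and skewed, the non-parametric Kruskal-Wallis test may be a safer choice than one-way ANOVA here.

```python
# A sketch of testing for differences in mean toxicity across status groups.
from scipy import stats

# One list of toxicity scores per availability status.
groups = [g["toxicity"].values for _, g in df.groupby("tweet_status")]

f_stat, p_anova = stats.f_oneway(*groups)  # one-way ANOVA
h_stat, p_kw = stats.kruskal(*groups)      # Kruskal-Wallis (non-parametric)
print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.4f}")
print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_kw:.4f}")
```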
We also expected that tweets posted by accounts that are now suspended would have much higher toxicity scores relative to tweets that are still available. However, the difference was only marginal (0.4 for tweets by suspended accounts versus 0.34 for tweets that are still available; see Figure 4). This is likely because users can be suspended not only for attacking others with toxic tweets, but also for violating other community norms, such as copyright or impersonation policies. Thus, toxicity scores are not the only variables that may predict whether an account will be suspended. Furthermore, since toxicity scores are calculated based only on the textual part of a tweet, they do not account for potentially toxic media files or URLs attached to the tweet.
Explore & Practice with This Dataset
If you are interested in exploring this dataset on your own and practicing your data science skills, you can retrieve our dataset from Dataverse. The dataset contains the toxicity scores and tweet status values that we added for this analysis; however, following Twitter’s API policy, we stripped metadata associated with each tweet except TweetId (unique identifier). So, if you’d like to examine potential relationships between other metadata elements, you would need to recollect the original tweets using a tool like DocNow’s Hydrator first. The only downside of this approach is that tweets that have been blocked or deleted will not be recollected. To help you get started, we also shared our Python script here.
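Once recollected, the rehydrated tweets can be joined back to the shared file on TweetId. The minimal sketch below uses hypothetical file and column names; tweets that could not be rehydrated will simply drop out of the inner merge.

```python
# A sketch of merging the shared Dataverse file with rehydrated tweets.
import pandas as pd

# Hypothetical file names; TweetId is read as a string to avoid precision loss.
scores = pd.read_csv("dataverse_toxicity_scores.csv", dtype={"TweetId": str})
hydrated = pd.read_csv("hydrated_tweets.csv", dtype={"id_str": str})  # e.g., a Hydrator export

merged = scores.merge(hydrated, left_on="TweetId", right_on="id_str", how="inner")
print(merged.shape)  # blocked/deleted tweets will be missing from the merge
```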
Happy Data Exploration!
*By Anatoliy Gruzd and Shahnawaz Attarwala with editorial contributions from Philip Mai and Alyssa Saiphoo.
Additional Resources
- Matplotlib, a Python library used for plotting graphs
- Sample Python script for exploratory data analysis
- Exploratory Data Analysis with PySpark on Databricks
- Example of using twarc library