{"id":14288,"date":"2019-02-25T15:37:16","date_gmt":"2019-02-25T15:37:16","guid":{"rendered":"https:\/\/socialmedialab.ca\/web\/?p=14288"},"modified":"2024-12-11T00:55:00","modified_gmt":"2024-12-11T00:55:00","slug":"heres-how-to-check-to-see-how-many-tweets-in-your-dataset-have-been-deleted","status":"publish","type":"post","link":"https:\/\/socialmedialab.ca\/web\/2019\/02\/25\/heres-how-to-check-to-see-how-many-tweets-in-your-dataset-have-been-deleted\/","title":{"rendered":"Here&#8217;s how to check to see how many tweets in your dataset have been deleted."},"content":{"rendered":"\n<figure class=\"wp-block-image\"><img decoding=\"async\" width=\"1024\" height=\"162\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-1024x162.jpg\" alt=\"\" class=\"wp-image-14257\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-1024x162.jpg 1024w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-300x47.jpg 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-768x122.jpg 768w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-696x110.jpg 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner-1068x169.jpg 1068w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/Netlytic-banner.jpg 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>Netlytic is a cloud-based social media data collector, text analyzer and social network visualizer. It&#8217;s designed for social media researchers and educators to study public discourse on social media. 
[Available data sources:  Twitter, Facebook, YouTube, RSS Feed, or text\/csv file.]<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">If you are using Netlytic to collect Twitter data for your research, here&#8217;s how to check to see how many tweets in your dataset have been deleted (by a user or the platform).  <\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 1:<\/strong> Install a free text editor that supports regular expressions, such as <a rel=\"noreferrer noopener\" aria-label=\"Notepad++ (opens in a new tab)\" href=\"https:\/\/notepad-plus-plus.org\/\" target=\"_blank\">Notepad++<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 2:<\/strong> Download your dataset from <a href=\"https:\/\/netlytic.org\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"Netlytic  (opens in a new tab)\">Netlytic<\/a> as an Excel or CSV file and open it in Excel.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" width=\"1024\" height=\"92\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-1024x92.png\" alt=\"\" class=\"wp-image-14296\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-1024x92.png 1024w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-300x27.png 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-768x69.png 768w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-696x62.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1-1068x95.png 1068w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/download1.png 1130w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 3: <\/strong>Copy the column called \u201c<em>guid<\/em>\u201d from the Excel file into a new text file in 
Notepad++<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" width=\"1024\" height=\"160\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid-1024x160.png\" alt=\"\" class=\"wp-image-14295\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid-1024x160.png 1024w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid-300x47.png 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid-768x120.png 768w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid-696x109.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/excel-guid.png 1044w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 4:<\/strong> In Notepad++:<br><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Delete the first row, which contains the column header \u201c<strong>guid<\/strong>\u201d.<\/li><li>Using the <strong>Search &amp; Replace<\/strong> function, find \u201c<strong>https:\/\/twitter.com\/<\/strong>\u201d (no quotes) and replace it with nothing (leave the replacement field empty).<\/li><li>Using the <strong>Search &amp; Replace <\/strong>function, find \u201c<strong>\/statuses<\/strong>\u201d (no quotes) and replace it with nothing (leave the replacement field empty).<\/li><li>Using the <strong>Search &amp; Replace <\/strong>function in <strong>regular expression <\/strong>mode (check the appropriate check box), find the regular expression <strong>^[^\\n\\t\\r\\\/]+\\\/<\/strong> and replace it with nothing. This removes everything on each line up to and including the first slash, leaving only the tweet ID:<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" width=\"1024\" height=\"410\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-1024x410.png\" alt=\"\" class=\"wp-image-14294\" 
srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-1024x410.png 1024w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-300x120.png 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-768x308.png 768w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-696x279.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-1068x428.png 1068w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall-1048x420.png 1048w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-replaceall.png 1392w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 4 (continued):<\/strong> In Notepad++, save the result as a new text file (let\u2019s call it \u201c<strong>ids.txt<\/strong>\u201d); this file will contain a list of unique tweet IDs.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" width=\"1024\" height=\"568\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-1024x568.png\" alt=\"\" class=\"wp-image-14293\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-1024x568.png 1024w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-300x166.png 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-768x426.png 768w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-696x385.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-1068x593.png 1068w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids-757x420.png 757w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/npp-ids.png 1258w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 
5:<\/strong> Install the free <a rel=\"noreferrer noopener\" aria-label=\"Hydrator app (opens in a new tab)\" href=\"https:\/\/github.com\/DocNow\/hydrator\" target=\"_blank\">Hydrator app<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 6:<\/strong> In Hydrator:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Link your Twitter account to Hydrator (under Settings).<\/li><li>Open the &#8220;<strong>ids.txt<\/strong>&#8221; file in Hydrator by adding it as a new dataset.<\/li><li>Click the &#8220;<strong>Add Dataset<\/strong>&#8221; button.<\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1.png\"><img decoding=\"async\" width=\"220\" height=\"300\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1-220x300.png\" alt=\"\" class=\"wp-image-14292\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1-220x300.png 220w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1-696x951.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1-307x420.png 307w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator1.png 731w\" sizes=\"(max-width: 220px) 100vw, 220px\" \/><\/a><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 7: <\/strong>Continue in Hydrator:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Run the collector by clicking the &#8220;<strong>Start<\/strong>&#8221; button. <\/li><li>Once it finishes, save the dataset by clicking the &#8220;<strong>CSV<\/strong>&#8221; button. The resulting file will include all original tweets\/retweets (minus those that have been deleted either by the users or the platform). 
A side benefit of this process is that the dataset is now enriched with additional metadata that Netlytic did not originally collect (such as the number of times a tweet has been retweeted). <\/li><\/ul>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2.png\"><img decoding=\"async\" width=\"300\" height=\"195\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2-300x195.png\" alt=\"\" class=\"wp-image-14291\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2-300x195.png 300w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2-696x453.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2-645x420.png 645w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator2.png 719w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 8:<\/strong> Once the re-collection process is complete, click on the green bar to see how many tweets have been deleted from your original dataset. For example, in the sample dataset, 4% of the tweets had been deleted. 
<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><a href=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3.png\"><img decoding=\"async\" width=\"263\" height=\"300\" src=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3-263x300.png\" alt=\"\" class=\"wp-image-14290\" srcset=\"https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3-263x300.png 263w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3-696x793.png 696w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3-369x420.png 369w, https:\/\/socialmedialab.ca\/web\/wp-content\/uploads\/2019\/02\/hydrator3.png 732w\" sizes=\"(max-width: 263px) 100vw, 263px\" \/><\/a><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">\n\nP.S. To check whether a tweet was deleted by its author or by the platform, try the following Python script: <a href=\"https:\/\/github.com\/DocNow\/twarc\/blob\/main\/utils\/deletes.py\">https:\/\/github.com\/DocNow\/twarc\/blob\/main\/utils\/deletes.py<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This post was originally published on the Netlytic.org website at <a href=\"https:\/\/netlytic.org\/home\/?p=11627\">https:\/\/netlytic.org\/home\/?p=11627<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you are using Netlytic to collect Twitter data for your research, here&#8217;s how to check to see how many tweets in your dataset have been deleted (by a user or the platform). 
Step 1: Install a free text editor that supports regular expressions such as Notepad++ Step 2: Download your dataset from Netlytic as [&hellip;]<\/p>\n","protected":false},"author":28,"featured_media":14291,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[491,494,490,264],"tags":[488,487,362,71,18],"class_list":["post-14288","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog-post","category-misinformation","category-online-communities","category-web-apps","tag-docnow-regexpr","tag-hydrator","tag-netlytic","tag-netlytic-org","tag-twitter"],"_links":{"self":[{"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/posts\/14288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/users\/28"}],"replies":[{"embeddable":true,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/comments?post=14288"}],"version-history":[{"count":12,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/posts\/14288\/revisions"}],"predecessor-version":[{"id":18630,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/posts\/14288\/revisions\/18630"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/media\/14291"}],"wp:attachment":[{"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/media?parent=14288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/categories?post=14288"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/socialmedialab.ca\/web\/wp-json\/wp\/v2\/tags?post=14288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}