Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/12946
Title: Understanding web archiving services and their (mis)use on social media
Authors: Zannettou, Savvas 
Blackburn, Jeremy 
De Cristofaro, Emiliano 
Sirivianos, Michael 
Stringhini, Gianluca 
Other Contributors: Σιριβιανός, Μιχάλης
Major Field of Science: Engineering and Technology
Field Category: Computer and Information Sciences
Keywords: Ad revenue;Web Archiving;Large-scale analysis;Information ecosystems
Issue Date: Jun-2018
Source: 12th International AAAI Conference on Web and Social Media, 2018, Stanford, California, USA, 26-28 June
Project: EnhaNcing seCurity And privacy in the Social wEb: a user centered approach for the protection of minors 
Conference: International AAAI Conference on Web and Social Media 
Abstract: Web archiving services play an increasingly important role in today's information ecosystem, by ensuring the continuing availability of information, or by deliberately caching content that might get deleted or removed. Among these, the Wayback Machine has been proactively archiving, since 2001, versions of a large number of Web pages, while newer services like archive.is allow users to create on-demand snapshots of specific Web pages, which serve as time capsules that can be shared across the Web. In this paper, we present a large-scale analysis of Web archiving services and their use on social media, shedding light on the actors involved in this ecosystem, the content that gets archived, and how it is shared. We crawl and study: 1) 21M URLs from archive.is, spanning almost two years; and 2) 356K archive.is plus 391K Wayback Machine URLs that were shared on four social networks: Reddit, Twitter, Gab, and 4chan's Politically Incorrect board (/pol/) over 14 months. We observe that news and social media posts are the most common types of content archived, likely due to their perceived ephemeral and/or controversial nature. Moreover, URLs of archiving services are extensively shared on “fringe” communities within Reddit and 4chan to preserve possibly contentious content. Lastly, we find evidence of moderators nudging or even forcing users to use archives, instead of direct links, for news sources with opposing ideologies, potentially depriving them of ad revenue.
URI: https://hdl.handle.net/20.500.14279/12946
Type: Conference Papers
Affiliation: Cyprus University of Technology
University College London 
University of Alabama at Birmingham 
Publication Type: Peer Reviewed
Appears in Collections: Δημοσιεύσεις σε συνέδρια / Conference papers or poster or presentation

Files in This Item:
File              Description    Size       Format
1801.10396.pdf    Fulltext       1.14 MB    Adobe PDF

Items in KTISIS are protected by copyright, with all rights reserved, unless otherwise indicated.