Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/23112
Title: Identifying Sensitive URLs at Web-Scale
Authors: Matic, Srdjan 
Iordanou, Costas 
Smaragdakis, Georgios 
Laoutaris, Nikolaos 
Major Field of Science: Natural Sciences
Field Category: Computer and Information Sciences
Keywords: Data privacy;HTTP;Websites
Issue Date: 27-Oct-2020
Source: ACM Internet Measurement Conference, 2020, 27–29 October
Journal: ACM Internet Measurement Conference 
Abstract: Several data protection laws include special provisions for protecting personal data relating to religion, health, sexual orientation, and other sensitive categories. Having a well-defined list of sensitive categories is sufficient for filing complaints manually, conducting investigations, and prosecuting cases in courts of law. Data protection laws, however, do not define explicitly what type of content falls under each sensitive category. Therefore, it is unclear how to implement proactive measures such as informing users, blocking trackers, and filing complaints automatically when users visit sensitive domains. To empower such use cases we turn to the Curlie.org crowdsourced taxonomy project for drawing training data to build a text classifier for sensitive URLs. We demonstrate that our classifier can identify sensitive URLs with accuracy above 88%, and even recognize specific sensitive categories with accuracy above 90%. We then use our classifier to search for sensitive URLs in a corpus of 1 billion URLs collected by the Common Crawl project. We identify more than 155 million sensitive URLs in more than 4 million domains. Despite their sensitive nature, more than 30% of these URLs belong to domains that fail to use HTTPS. Also, in sensitive web pages with third-party cookies, 87% of the third parties set at least one persistent cookie.
URI: https://hdl.handle.net/20.500.14279/23112
ISBN: 9781450381383
DOI: 10.1145/3419394.3423653
Rights: © owner/author(s).
Type: Conference Papers
Affiliation: TU Berlin 
Cyprus University of Technology 
IMDEA Networks Institute 
Appears in Collections: Conference papers or poster or presentation

Files in This Item:
File: 3419394.3423653.pdf
Description: Fulltext
Size: 497.22 kB
Format: Adobe PDF

Scopus citations: 11 (checked on 14 Mar 2024)
Page views: 234; last week: 0; last month: 10 (checked on 14 May 2024)
Downloads: 742 (checked on 14 May 2024)

This item is licensed under a Creative Commons License.